
The ChatGPT of Finance Arrives

Oct 8, 2025

Bloomberg recently revealed BloombergGPT, a finance-focused AI model: a 50 billion parameter large language model (LLM). It is aimed at improving performance on existing natural language processing (NLP) tasks in finance, such as news classification, named entity recognition, sentiment analysis, and question answering, among others. What sets BloombergGPT apart is the high-quality financial data, curated by Bloomberg over multiple decades, that was fed into the model for training.

Training data for the Bloomberg LLM spans March 1, 2007 to July 31, 2022, and is referred to as "FinPile"; it consists of financial web content, financial news, company filings, press releases, and Bloomberg News, amounting to 363 billion tokens (54.2% of the training data). A token is simply a fragment of the original text that is fed into the machine learning model. Bloomberg tokenized its financial data using a unigram model, which discards conditioning context and estimates each token independently, for efficiency.
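To make this concrete, here is a minimal sketch of unigram tokenization using the open-source sentencepiece library. It illustrates the technique only, not Bloomberg's actual tokenizer, and "corpus.txt" is a placeholder file name.

    # Minimal sketch of unigram tokenization with the open-source
    # sentencepiece library -- illustrative only, not Bloomberg's
    # actual tokenizer. "corpus.txt" is a placeholder corpus file.
    import sentencepiece as spm

    # Train a unigram model: each subword is scored independently,
    # with no conditioning on surrounding context.
    spm.SentencePieceTrainer.train(
        input="corpus.txt",          # one sentence per line
        model_prefix="fin_unigram",  # writes fin_unigram.model / .vocab
        vocab_size=8000,
        model_type="unigram",
    )

    sp = spm.SentencePieceProcessor(model_file="fin_unigram.model")
    print(sp.encode("AAPL shares rose 3% after earnings.", out_type=str))
    # e.g. ['▁AAPL', '▁shares', '▁rose', '▁3', '%', ...]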

The financial training data was augmented with public datasets for general-purpose training, such as The Pile, the Colossal Clean Crawled Corpus (C4), and Wikipedia, primarily for good performance on general-purpose NLP benchmarks such as BIG-bench Hard, knowledge assessments, reading comprehension, and other linguistic tasks. Training BloombergGPT required roughly 53 days of computation on 64 servers, run in partnership with NVIDIA and Amazon Web Services. Assuming a cost of $33 per server instance per hour, a crude estimate puts the bill at about $2.7 million to produce the model! The model was trained with PyTorch, Meta's (formerly Facebook's) deep learning framework.
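The $2.7 million figure is easy to sanity-check with back-of-the-envelope arithmetic:

    # Back-of-the-envelope check of the ~$2.7M training-cost estimate.
    servers = 64       # AWS server instances used for training
    hourly_rate = 33   # assumed cost per server instance, USD/hour
    days = 53          # approximate training duration

    total_cost = servers * hourly_rate * 24 * days
    print(f"${total_cost:,}")  # $2,686,464, roughly $2.7 million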

BloombergGPT is capable of few-shot learning: given only a handful of examples, it can learn to perform the task at hand. For example, it can generate valid Bloomberg Query Language (BQL) snippets from just three examples, drawing on its knowledge of stock tickers and financial terminology to produce valid queries.

[Figure omitted. Source: Bloomberg paper on arXiv]
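As a rough illustration of how such a 3-shot prompt might be assembled, here is a sketch in Python. The example query strings are invented placeholders, not guaranteed to be valid BQL, and ask_llm() is a hypothetical stand-in for whatever completion call is used.

    # Sketch of few-shot prompting: prepend a handful of worked
    # examples so the model can infer the task format. The query
    # strings are invented placeholders (not guaranteed-valid BQL),
    # and ask_llm() below is a hypothetical completion call.
    EXAMPLES = [
        ("Get the last price of IBM",
         "get(px_last) for(['IBM US Equity'])"),
        ("Get the market cap of Apple",
         "get(cur_mkt_cap) for(['AAPL US Equity'])"),
        ("Get the EPS of Microsoft",
         "get(is_eps) for(['MSFT US Equity'])"),
    ]

    def build_prompt(question: str) -> str:
        """Assemble a 3-shot prompt from (question, query) pairs."""
        shots = "\n".join(f"Q: {q}\nA: {a}" for q, a in EXAMPLES)
        return f"{shots}\nQ: {question}\nA:"

    print(build_prompt("Get the dividend yield of Tesla"))
    # The assembled prompt would then be sent to the model, e.g.:
    #   query = ask_llm(build_prompt("Get the dividend yield of Tesla"))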

Some potential financial applications of the model would include:

  • Generating drafts of an SEC filing
  • Summarizing a paragraph of financial information into a headline (see the sketch after this list)
  • Mapping an organization's executive-level structure and the linkages among its subsidiaries
  • Drafting routine market reports and summaries automatically
  • Retrieving specific sections of financial statements
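To get a feel for the headline-summarization use case, here is a sketch using an open-source stand-in model from Hugging Face's transformers library, since BloombergGPT itself is not available; the input paragraph is invented for illustration.

    # Sketch of headline-style summarization with an open-source
    # stand-in model (BloombergGPT itself is not released).
    from transformers import pipeline

    summarizer = pipeline("summarization",
                          model="sshleifer/distilbart-cnn-12-6")

    paragraph = (
        "The company reported quarterly revenue of $4.2 billion, up 12% "
        "year over year, driven by strong demand in its cloud segment, "
        "and raised its full-year guidance."
    )

    result = summarizer(paragraph, max_length=20, min_length=5)
    print(result[0]["summary_text"])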


BloombergGPT outperforms existing LLMs on financial tasks by a wide margin while performing on par with them on general-purpose tasks. Because LLMs are susceptible to data-leakage attacks, in which model weights are used to extract large segments of the underlying training text, Bloomberg has decided not to release its model, even as an API or a chat interface. Even so, the future of finance and fintech is already seeing a major upswing!

[Figure omitted. Source: Bloomberg]
