On 27 March 2024, the Alan Turing Institute, in collaboration with HSBC and the UK Financial Conduct Authority (FCA), published a research report (Report) on the impact and potential of large language models (LLMs) in the financial services sector.

As LLMs (the most high-profile of which is OpenAI’s ChatGPT) can analyse large amounts of data quickly and generate coherent text, there is obvious potential for their use in financial services. The Report suggests that the UK financial sector, despite its scale, has so far taken a cautious approach to AI adoption, but that adoption may come in a rush over the next two years, for which the sector seems not fully prepared.

LLMs in the Finance Sector

Examples of LLMs being developed for the financial sector include:

  • BloombergGPT: BloombergGPT is a ‘closed source’ 50 billion parameter LLM developed by Bloomberg to generate financial news and analysis, as well as to develop new financial products and services. Bloomberg says that the quality of the AI model comes down to the quality of the data on which it is trained: “Thanks to the collection of financial documents Bloomberg has curated over four decades, we were able to carefully create a large and clean, domain-specific dataset to train a LLM that is best suited for financial use cases.”

  • FinGPT: FinGPT is an ‘open source’ LLM trained on a significant dataset of financial data. It can be used for a variety of tasks including generating financial reports, performing financial analysis, and developing new financial products and services. FinGPT’s developers argue that greater use of open source models will see “a shifting trend towards democratizing Internet-scale financial data.”

  • TradingGPT: TradingGPT is a multi-agent system with layered memory capabilities. In a multi-agent system, individual agents within the AI model each undertake their own computation and then work collaboratively, often ‘debating’ each other, to arrive at a single answer. TradingGPT has three agents with separate memory streams focused on short-term, medium-term and long-term financial data, which its developers say more closely matches how human traders think (a minimal sketch of this pattern follows the list below).
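
The Report does not set out TradingGPT’s implementation, so the following Python sketch is illustrative only: the class and function names, the memory horizons, and the simple averaging used as the ‘debate’ step are all assumptions, not TradingGPT’s actual design. It shows the general shape of the pattern described above: three agents, each holding a memory stream of a different length, whose individual signals are aggregated into a single answer.

```python
from dataclasses import dataclass, field
from statistics import mean

@dataclass
class Agent:
    """One agent in a multi-agent system, with its own memory horizon."""
    name: str
    horizon_days: int                 # how far back this agent 'remembers'
    memory: list = field(default_factory=list)

    def observe(self, daily_returns: list) -> None:
        # Keep only the window of recent data this agent is responsible for.
        self.memory = daily_returns[-self.horizon_days:]

    def propose(self) -> float:
        # Toy stand-in for an LLM call: this agent's 'view' is simply the
        # mean return over its memory window.
        return mean(self.memory) if self.memory else 0.0

def debate(proposals: list) -> float:
    """Aggregate the agents' individual proposals into a single answer.
    Real multi-agent systems have agents critique each other's reasoning;
    averaging is a placeholder for that process."""
    return mean(proposals)

# Three agents with short-, medium- and long-term memory streams,
# mirroring the layered design described in the Report.
agents = [
    Agent("short-term", horizon_days=5),
    Agent("medium-term", horizon_days=30),
    Agent("long-term", horizon_days=250),
]

daily_returns = [0.01, -0.02, 0.005] * 100    # placeholder market data
for agent in agents:
    agent.observe(daily_returns)

signal = debate([agent.propose() for agent in agents])
print(f"combined trading signal: {signal:+.5f}")
```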

Methodology

Building on a prior report by the Alan Turing Institute and HSBC on the impact of LLMs in banking, the team invited 43 participants, drawn from banks, regulators, investment banks, insurers, payment firms, government and the legal profession, to a consensus-building workshop. Participants were asked about the likelihood, significance, timing and expected impact of LLMs on the financial services sector.

Opportunities

The workshop participants shared their experiences and views on the opportunities of LLMs in the financial services sector.

Adoption of LLMs

Just over half of the participants use LLMs to enhance performance in information-orientated work tasks, while only 29% employ them to boost critical thinking skills. Additionally, 16% use language models to break down complex tasks, and 10% use these tools to improve team collaboration. Perhaps surprisingly, 35% said they do not currently incorporate any LLMs into their tasks.

The cautious approach UK financial firms have so far taken to LLMs can be seen in the following responses:

  • Current use of LLMs is mainly for risk-free tasks with heavy human assistance, such as text summarisation and increasing the speed of analysis.

  • While many participants reported that their firms had already incorporated LLMs into their ‘financial insight generation services’, ‘financial service safety’ and ‘public communication and engagement services’, LLMs are mainly being used internally in these lines of business, for example in training scenarios to improve the customer-facing skills of staff, rather than for direct customer service provision. ‘Money-economics related services’ was the only category in which a majority of participants were not currently advancing LLMs, although they expect integration within two years.

That said, the UK financial sector seems to be on the cusp of substantial change in the use of LLMs:

  • The workshop participants all agreed that LLMs would be integrated into all functional service areas well within five years, and possibly within as little as two.

  • With LLMs already embedded within most financial firms, the move from internal to customer-facing use was also expected to come quickly.

The Future

As to future opportunities, the workshop participants again tended to focus on the potential for LLMs to simplify internal processes within financial institutions, including streamlining decision-making, risk profiling, benefit quantification and prioritisation, as well as improving investment research and back-office operations.

However, "the two most promising opportunities" advanced by workshop participants were:

  • Financial advisory: the use of personalised robo-advice fed by a broad range of data could enhance strategic and advisory services and help experienced professionals do more. Leveraging advanced NLP capabilities, financial institutions could also integrate multiple media types, such as images, into comprehensive internal assessments.

  • Financial literacy: LLMs could be used to power financial literacy educational environments and improve financial inclusion, with personalised support based on an individual’s literacy level. For example, LLMs could educate children about investments and pensions so that they can take advantage of the benefits of compounding sooner (a simple illustration follows this list).
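
On the compounding point, a short worked example may help. The Report gives no figures, so the numbers below are assumptions for illustration only: two savers each contribute 1,000 a year at an assumed 5% annual return, but one starts ten years earlier.

```python
def future_value(annual_contribution: float, years: int, rate: float = 0.05) -> float:
    """Future value of a series of equal annual contributions compounding
    at a fixed rate (the standard ordinary-annuity formula)."""
    return annual_contribution * (((1 + rate) ** years - 1) / rate)

early_start = future_value(1_000, years=45)   # saving from age 20 to 65
late_start = future_value(1_000, years=35)    # saving from age 30 to 65

print(f"start at 20: {early_start:,.0f}")     # roughly 159,700
print(f"start at 30: {late_start:,.0f}")      # roughly 90,320
```

The earlier saver contributes only 10,000 more in total but ends up with almost 70,000 more, which is the point participants were making about starting financial education early.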

Risks

The participants overwhelmingly identified legal and reputational harm as the most significant risks of integrating LLMs into financial services.

Participants’ concerns centred on two main areas of potential loss:

  • Loss of confidence, should a catastrophic failure occur or should heavy reliance on LLMs create a systemic issue. Given the public’s hypersensitivity to AI mishaps, firms also face the challenge of proving a negative when accused of AI-related harms. For example, Apple’s algorithm was accused of giving men and women different credit limits for the Apple Card; the accusation was disproved by the New York State Department of Financial Services; and

  • Loss of core skills, if firms reduce their workforce through over-reliance on LLMs, which can also contribute to ‘herding behaviour’ leading to misinformed investment decisions.

At a more granular level, some of the risks identified by the workshop participants were:

  • Privacy risk: This was one of the highest priorities for workshop participants, with nearly half of the participants concerned about privacy vulnerabilities. A major challenge is determining which jurisdiction (and therefore which national regulation) applies if the developer and deployer of an LLM, or the data subjects and their data used by the LLM, are in different countries.

  • Lack of assurance and traceability: Workshop participants noted that, while regulators are increasingly focused on transparency obligations in the context of LLMs, information about the data used for training is currently almost impossible to retrieve or validate, especially for downstream users of LLMs developed by third parties. This problem is only getting worse with technological developments: as noted above, multiple AI agents increasingly work together to produce outputs, compounding the complexity and lack of transparency.

  • Robustness of internal guidelines: Only 19% of participants rated their own internal guidelines for using LLMs as highly adequate. On average, they rated the robustness of their internal measures at 3.7 on a scale of 1 to 5. Specifically, participants identified the deficiencies in current guidelines as follows:

A crucial requirement for improving internal guidelines is the provision of clearer examples illustrating the maturity level of LLM-based systems and their limitations. This extends to offering guidance on human factors and providing examples of user journeys, applicable to both internal and third-party development. The guidelines should include warnings about the limitations inherent in LLMs and set out when and why using LLMs would be preferable to other similar tools.

Towards Safe Adoption of LLMs

The Report then outlined a range of views of participants at the workshop on the safe adoption of LLMs across the finance sector, taking into account these opportunities and risks.

Robustness and resilience

Participants emphasised the need for robustness and resilience in LLMs. ISO/IEC TS 5723:2022 defines ‘robustness’ as the ‘ability of a system to maintain its levels of performance under a variety of circumstances’. Resilience depends on building processes and systems robust enough to withstand adversity and recover quickly.

Participants pointed out that, while financial firms are well versed in requirements to establish and verify robust internal processes, there is a fundamental mismatch between the traditional approach of financial regulators and LLM technology: current financial services regulation tends to drive ‘robustness checklists’ built on detailed use-case evaluations (described as ‘atomic’ level testing), whereas the “disruptiveness of [LLMs] comes from their ability to comprehend a wide variety of data and generate outputs in a variety of styles.” In the Australian context, the Chair of the Australian Securities and Investments Commission has stated in a speech that the existing legal and regulatory framework may be insufficient to prevent AI-associated harm.

Data Asymmetry

Data asymmetry between big tech and financial services firms is a growing concern, and the FCA has called for input on its competition implications.

Another concern is the data asymmetry between bigger and smaller players within the financial services sector. The Report highlights the Open Banking initiative, which empowers banking customers to share their banking data with third parties, as an illustration of data sharing working well between entities of different sizes.

Digital identity, represented in the UK by the Data Protection and Digital Information Bill currently being debated in the House of Lords, is another means of enhancing data accessibility and levelling the playing field between different sized players in the sector, while ensuring data subject consent and control.

Security

The Open Web Application Security Project (OWASP) has highlighted the security challenges and attack surfaces created by the integration of LLMs into business applications. Participants highlighted three priority areas for managing security risks:

  • Third-party vendors: Third-party infrastructure may introduce security risks. Financial institutions should engage in robust due diligence of vendors, but there are obvious limits to this diligence, for example where small financial services firms with limited influence are negotiating with large multinational providers.

  • Open vs closed models: Closed models are generally more secure than open models. The legal implications of using less secure open models, built on top of other open architectures, are a topic of current debate.

  • Trade-off between security and openness: An over-protectiveness of data can restrict opportunities for innovation. Open Banking was again cited in the Report as an initiative that proves that data can be shared between stakeholders with the consent of the consumer in an open ecosystem which is also secure.

Fairness

Workshop participants agreed that fairness and bias-free principles need to be applied, but they also acknowledged that some data is already biased and that it is unclear how to de-bias unstructured textual data. There was a general acknowledgement that the use of LLMs in the financial services sector should be subject to a higher standard of scrutiny with continuing heavy human involvement, with participants noting that “fairness and bias-free principles need to be baked into human agent scenarios.”

Explainability

The workshop participants noted the growing regulatory requirements for explainability (e.g. under the EU’s AI Act), and that the lack of explainability in decision making is preventing the financial services industry from using LLMs in many applications. They “advocated that having accurate but intuitively explainable models are more important than having complete explainability… [and] that proportionality and therefore different levels of explainability and transparency may be appropriate for different use cases.”

Accountability

Workshop participants acknowledged that, given that people’s wealth is involved, accountability (and the associated customer trust) is even more important in the financial services sector than explainability and transparency. While there are established accountability policies and internal standards, different standards may be needed for LLMs given the breadth of their use, their risks, and the teams involved.

The ‘key stone’, participants said, needed to be a legal document setting out “frameworks of accountability based on use-cases, supported by a model owner who is accountable for defined roles, defined tasks, procurement, and ensuring risk assessments are passed.”

But workshop participants also said that accountability for LLMs is not just an issue for IT teams but for the whole organisation, extending upstream to vendors. From a regulatory perspective, however, accountability will often be sheeted home to one party: for example, under the GDPR, the entity that is the data controller will bear the highest level of accountability.

Integrity

Connected with accountability, workshop participants acknowledged that the concept of integrity in safely adopting LLMs is vital:

  • While determining the ethical use of LLMs is connected with the financial institution’s brand and stakeholder management, “building a trustworthy product is distinct from building overall trust.” Consumer trust needs to be considered at a granular level in every aspect of an LLM, such as the design of the user interface and the personalisation steps.

  • The ethical considerations in data integrity are sometimes overlooked, and “[a] disconnect arises between third-party developers and companies regarding the quality and integrity of training data for LLMs, with companies emphasising a greater need for data integrity.”

  • The current preference for ‘human-in-the-loop’ over complete human replacement underscores the importance of maintaining integrity.

Skills

Education on privacy, security and misinformation in the context of LLMs has been integrated into internal cybersecurity and digital training courses in most financial institutions. However, workshop participants noted that “in some cases, training, governing and maintaining LLMs can be assessed to take [a] considerable amount of time and money compared to realised or anticipated productivity gains.”

Participants at the workshop noted that there is a gap in training for executives who will need to understand these models to support the development of accountability and assignment of responsibilities.

Going forward, individual accountability regimes such as the UK Senior Managers Regime and the Australian Financial Accountability Regime (FAR) may require executives and directors to undergo specific LLM training in order to be approved as fit and proper for their roles by the relevant regulators. The latest Gilbert + Tobin insights on the final FAR rules and guidance may be found here.

Concluding Observations

The Report concludes with three recommendations:

  1. Develop use-case dependent, sector-wide analysis of LLM assessments: Cross-industry forums and calls for input should be used to share safety concerns, adversarial incidents, and best practices.

  2. Explore opportunities emerging with open-source models: Industry should undertake collaborative research into open-source models, in line with data protection requirements and while mitigating security and privacy concerns.

  3. Develop an LLM stakeholder community: spanning academia, financial institutions, regulators and policymakers.

The Alan Turing Institute next plans to run a red-teaming event specifically targeting financial LLMs, to facilitate stress testing, inform confidence levels and identify opportunities to strengthen the integrity and resilience of open-weight and open-source LLMs.

Read more: The Impact of Large Language Models in Finance: Towards Trustworthy Adoption