OWASP has a top 10 list for every occasion, and this post looks at one of the newest additions: the OWASP Top 10 for LLM applications. Given the runaway popularity of large language models for building innovative software features and products, it’s worth taking a closer look at the security risks to software that integrates with generative AI.
The Open Web Application Security Project (OWASP) has compiled the Top 10 for LLM applications as another list-style awareness document to provide a systematic overview of the application security risks, this time in the rapidly growing realm of generative AI. While everyone is aware of some of the risks related to large language models (LLMs), few have a full picture of where AI security fits into cybersecurity overall. It’s common to see people either underestimating the risk (typically in the rush to deploy a new AI-enabled feature) or vastly overestimating it and dismissing anything that mentions AI out of hand.
LLMs have become the poster child of the current AI boom, but they are just one small category of artificial intelligence overall. They are also only one component of anything termed an LLM application, so before diving into the top 10 risks and the wider security picture, let’s start by clarifying the terminology:
Before you ask: Invicti does not use data obtained from large language models in any of its products. For automated application and API security testing with DAST, the need for accurate, repeatable, and reliable results rules out LLMs as a viable solution.
To learn how Invicti uses machine learning to get the benefits of AI without the shortcomings of LLMs, see our post on the technical side of Predictive Risk Scoring.
As with other OWASP Top 10 efforts, this one is not intended as a simple checklist but as a document to raise awareness of the main sources of risk to application security. Especially for LLMs, these risks are all interlinked and originate from more general security weaknesses. Similar to the treatment we’ve given the OWASP API Security Top 10, let’s look at the broader themes behind the top 10 LLM risks and see what they tell us about the current LLM gold rush.
Prompt injection attacks are undoubtedly the biggest security concern when it comes to using LLMs, so it’s no surprise they top the list. Even so, they are only one symptom of more fundamental issues. LLMs are a new type of data source in many ways due to their black-box nature: they generate rather than retrieve their results, they are non-deterministic, there is no way to explain how a specific result is generated, and their output relies on training data that is usually outside the user’s control. The unpredictable nature of LLMs accounts for three of the top 10 risk categories:
We’ve all laughed at some of the things ChatGPT and other conversational LLM apps can produce, but the greatest potential of LLMs lies with automation—and that’s no laughing matter. Once generative AI data sources are integrated through APIs and automated, blindly trusting the results and forgetting they need special care and attention opens up three more risk avenues:
The models themselves can also be targeted. Any LLM-based application depends on a specific model being operational and responsive, so taking that model offline will also affect any software built on top of it. Often extremely costly to train and run, commercial models are also prized intellectual property, which can make them the direct target of attacks. The two risk categories for model abuse are:
LLMs are built, trained, refined, and operated using a complex chain of tools, often including other models for fine-tuning, which makes their supply chain at least as much of a security risk as for any other piece of software, if not more. To address novel use cases and help integrate LLMs into ever more systems and applications, entire ecosystems of open-source and commercial plugins and extensions have also sprung up. You can think of these two categories as upstream and downstream security risks:
Large language model applications aren’t inherently less secure than any other software, but they do come with added caveats on top of typical AppSec considerations like access control or input validation and sanitization. The main risk is that LLMs, like other types of generative AI, are fundamentally different from more traditional data sources, and the only way to build and use them securely is to keep this in mind at all times.
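To make that concrete, here’s a minimal sketch (not taken from the OWASP document) of what treating LLM output as untrusted input can look like in practice. It assumes a hypothetical call_llm() helper standing in for whatever model API you actually use, validates the model’s answer against an allowlist before it gets anywhere near a database query, and binds user-supplied values as parameters instead of concatenating them:

```python
import sqlite3

def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM API call; returns whatever the model generates."""
    return "username; DROP TABLE users; --"  # simulated, potentially hostile output

# Allowlist of column names the application is willing to sort by.
ALLOWED_SORT_FIELDS = {"username", "email", "created_at"}

def pick_sort_field(natural_language_request: str) -> str:
    """Let the model map a user request to a column name, but treat the answer
    as untrusted input and validate it before it goes anywhere near a query."""
    suggestion = call_llm(
        f"Return only a column name to sort by for this request: {natural_language_request!r}"
    ).strip()
    if suggestion not in ALLOWED_SORT_FIELDS:
        raise ValueError(f"Rejected unexpected model output: {suggestion!r}")
    return suggestion

def list_users(conn: sqlite3.Connection, request: str, search_term: str):
    sort_field = pick_sort_field(request)  # validated identifier from the allowlist
    # Only the allowlisted identifier is interpolated into the query;
    # the user-supplied value is bound as a parameter, never concatenated.
    query = f"SELECT username, email FROM users WHERE username LIKE ? ORDER BY {sort_field}"
    return conn.execute(query, (f"%{search_term}%",)).fetchall()
```

The point isn’t this specific allowlist but the habit: whatever comes back from the model gets the same scrutiny as any other untrusted input before it can trigger queries, commands, or further API calls.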
The sometimes near-magical capabilities of large language models come at the price of accepting that your results are coming from a black box that’s never guaranteed to work the way you expect or generate precisely what you were hoping for. So, in a way, the OWASP Top 10 for LLM applications is a list of reasons why you shouldn’t blindly trust generative AI as the data source for your app.