What the OWASP Top 10 for LLM applications tells us about generative AI security

OWASP has a top 10 list for every occasion and this post looks at one of the newest additions: the OWASP Top 10 for LLM applications. Given the runaway popularity of large language models for building innovative software features and products, it’s worth taking a closer look at the security risks to software that integrates with generative AI.

What the OWASP Top 10 for LLM applications tells us about generative AI security

The Open Web Application Security Project (OWASP) has compiled the Top 10 for LLM applications as another list-style awareness document to provide a systematic overview of the application security risks, this time in the rapidly growing realm of generative AI. While everyone is aware of some of the risks related to large language models (LLMs), few have a full picture of where AI security fits into cybersecurity overall. It’s common to see people either underestimating the risk (typically in the rush to deploy a new AI-enabled feature) or vastly overestimating it and dismissing anything that mentions AI out of hand.

LLMs have become the poster child of the current AI boom, but they are just one small category of artificial intelligence overall. They are also only one component of anything termed an LLM application, so before looking at the top 10 risks to understand the wider security picture, let’s start by clarifying the terminology:

  • A large language model is essentially a massive piece of code (sometimes literally a single multi-gigabyte file) that takes text instructions and generates a result. Internally, LLMs are complex multi-layered neural networks with billions of parameters that are preset by processing vast amounts of training data. The biggest models require so much computing power that only a handful of companies can train and operate them.
  • An LLM application is any piece of software that sends data to an LLM and receives results from it. To take the most obvious example, ChatGPT is a chat application that interacts with the GPT model. LLM-based functionality is being built into everything from business software to operating systems and phones, so the meaning of “LLM application” is expanding rapidly.

Before you ask: Invicti does not use data obtained from large language models in any of its products. For automated application and API security testing with DAST, the need for accurate, repeatable, and reliable results rules out LLMs as a viable solution.

 

To learn how Invicti uses machine learning to get the benefits of AI without the shortcomings of LLMs, see our post on the technical side of Predictive Risk Scoring.

Reframing the Top 10 for LLM apps by risk areas

As with other OWASP Top 10 efforts, this one is also not intended as a simple checklist but as a document to raise awareness of the main sources of risk to application security. Especially for LLMs, these risks are all interlinked and originate from more general security weaknesses. Similar to the treatment we’ve given the OWASP API Security Top 10, let’s look at the broader themes behind the top 10 LLM risks and see what they tell us about the current LLM gold rush.

The dangers of working with black boxes

Prompt injection attacks are undoubtedly the biggest security concern when it comes to using LLMs, so it’s no surprise they top the list, but they are only one symptom of more fundamental issues. LLMs are a new type of data source in many ways due to their black-box nature: they generate rather than retrieve their results, they are non-deterministic, there is no way to explain how a specific result is generated, and their output relies on training data that is usually outside the user’s control. The unpredictable nature of LLMs accounts for three of the top 10 risk categories:

  • LLM01: Prompt Injection. LLMs operate on natural language, so their instructions always mix commands and user-supplied data, allowing for attacks that directly or indirectly modify the system prompt (see our ebook for a detailed discussion).
  • LLM03: Training Data Poisoning. Setting the internal parameters of an LLM requires vast amounts of valid, permitted, and accurate training data. By infiltrating custom datasets or modifying publicly available data, attackers can influence LLM results. 
  • LLM06: Sensitive Information Disclosure. There is no way to verify that an LLM wasn’t trained on sensitive data. If such data was included, you can never be completely sure that it won’t be revealed in some context, potentially resulting in a privacy violation.

When you trust LLMs too much

We’ve all laughed at some of the things ChatGPT and other conversational LLM apps can produce, but the greatest potential of LLMs lies with automation—and that’s no laughing matter. Once generative AI data sources are integrated through APIs and automated, blindly trusting the results and forgetting they need special care and attention opens up three more risk avenues:

  • LLM02: Insecure Output Handling. If LLM outputs are directly used as inputs to another application (including another LLM) and not sanitized, a suitable prompt may cause the LLM to generate an attack payload that is then executed by the application. This may expose the app to attacks like XSS, CSRF, SSRF, and others.
  • LLM08: Excessive Agency. The latest LLMs can trigger external functions and interface with other systems in response to a prompt. If this ability is not tightly controlled or control is bypassed, an LLM could perform unintended actions, either on its own or under an attacker’s control.
  • LLM09: Overreliance. Some LLM responses and suggestions can superficially seem valid but lead to severe problems if used verbatim or acted upon. Examples include making the wrong decisions based on false information or introducing software bugs and vulnerabilities by accepting incorrect or insecure suggestions from AI code assistants.

Model abuse

The models themselves can also be targeted. Any LLM-based application relies on a specific model being operational and responsive, so taking that model offline will also affect any software that relies on it. Often being extremely costly to train and run, commercial models are also prized intellectual property, which can make them the direct target of attacks. The two risk categories for model abuse are:

  • LLM04: Model Denial of Service. Attackers can bombard an LLM with sequences of malicious requests to overwhelm the model or its hosting infrastructure. Examples include extremely long or deliberately difficult prompts as well as abnormally high request volumes.
  • LLM10: Model Theft. Apart from directly accessing and exfiltrating proprietary models, attackers can also attempt to extract their internal parameters to create an equivalent model. A large number of precisely targeted (and uncapped) queries and responses may also provide enough data to train or refine a copycat model.

Weaknesses in LLM implementations and integrations

LLMs are built, trained, refined, and operated using a complex chain of tools, often including other models for fine-tuning, making their supply chain a security risk as much as with any other piece of software (if not more). To address novel use cases and help integrate LLMs into ever more systems and applications, entire ecosystems of open-source and commercial plugins and extensions have also sprung up. You can think of these two categories as upstream and downstream security risks:

  • LLM05: Supply Chain Vulnerabilities. A vulnerable dependency could allow attackers to compromise an LLM system, for example to access user prompts and account data. Many AI projects use open-source Python packages from the PyPi registry, so poisoned, backdoored, or simply vulnerable packages from the registry are a serious risk.
  • LLM07: Insecure Plugin Design. Security vulnerabilities in LLM plugins and extensions may open up new attack avenues that are beyond the control of both application and LLM developers. For example, a plugin might fail to validate query inputs and thus allow attacks such as SQL injection, or it may even allow attackers to gain unauthorized access to backend systems through remote code execution.

To get the most out of generative AI, understand the risks first

Large language model applications aren’t inherently less secure than any other software, but they do come with added caveats on top of typical AppSec considerations like access control or input validation and sanitization. The main risk is that LLMs, like other types of generative AI, are fundamentally different from more traditional data sources and the only way to build and use them securely is to keep this in mind at all times.

The sometimes near-magical capabilities of large language models come at the price of accepting that your results are coming from a black box that’s never guaranteed to work the way you expect or generate precisely what you were hoping for. So, in a way, the OWASP Top 10 for LLM applications is a list of reasons why you shouldn’t blindly trust generative AI as the data source for your app.

Zbigniew Banach

About the Author

Zbigniew Banach - Technical Content Lead & Managing Editor

Cybersecurity writer and blog managing editor at Invicti Security. Drawing on years of experience with security, software development, content creation, journalism, and technical translation, he does his best to bring web application security and cybersecurity in general to a wider audience.