A Voyage of Discovery: Talking APIs With Frank Catucci and Dan Murphy

What’s with all the buzz around API security? It’s becoming the top concern in application security as everyone is looking for faster and more reliable ways to secure their ever-growing API ecosystem. In Postman’s 2023 State of the API Report, 92% of respondents said they planned to increase their investments in APIs through 2024, up from 89% the previous year. With API usage surging in software development, the line between APIs and applications is getting blurred, even as the security industry seems to treat them as completely separate things.

Invicti recently released API discovery as part of its API Security product to help companies proactively address API-related risks in their application environments—but how does it all work under the hood and what makes it so special? We sat down for an interview with Invicti’s CTO, Frank Catucci, and Chief Architect, Dan Murphy, to clear up some API misconceptions, get closer to the technical side of building API security into an application security platform, and learn why it’s so important to treat APIs not as a separate entity but as an integral part of your attack surface.

*Frank Catucci, CTO and Head of Security Research*

This might seem a very obvious question to start with, but we’re seeing a lot of confusion about the differences between web applications and APIs. Especially in the security industry, you see a lot of dedicated API security products and vendors, so it sometimes feels like applications and APIs are two separate things with different security requirements. So what’s your practitioner’s eye view on applications vs. APIs in terms of architecture and, of course, security?

Dan Murphy: I come from a software engineering background and have spent a lot of my career thinking about APIs and web applications. But for folks who don’t necessarily have the same background, it’s sometimes hard to visualize, so it’s valid to ask: What is an API? How does it differ from a web app? And the answer is those things are a little blurred. Many modern applications are single-page applications (SPAs) that are simply invoking APIs as the user clicks around the app, so they’re a kind of hybrid of GUI and API. But with a traditional API, the thing on the other end of the request is not the web browser—it’s a piece of code. It may be some other web service invoking a webhook, some backend code or systems talking to each other, but it’s definitely not a human clicking inside of a browser.

One of the metaphors I like to use is that APIs are like the service elevators in buildings—people coming in the front door don’t see them, but they carry a lot of cargo behind the scenes, in this case all the internals of a web app. They don’t have a GUI that you can see and interact with. As in a real physical building, because those service APIs stay out of sight, it might not be clear if they’re being maintained and updated and kept secure.

Frank Catucci: That’s a great metaphor—APIs are the part of an application that does the heavy lifting in terms of data access and processing, but because they often aren’t visible, they can slip through testing and inventory efforts. So when people ask me what’s so special about APIs and API security, I like to start with an example of an API-based attack, such as the Optus data breach. Now that one was only possible because of an exposed API endpoint that let an attacker download the data of over 10 million customers without any authorization or authentication.

So that Optus API, that service elevator if you like, would allow anybody who figured out the URL to enter a customer number and get confidential information back, and just enumerate those customers without any limits. It was what we call a shadow API that was never intended to be accessible in production, so it didn’t have all the security controls we’d normally expect. And because it was this heavy-lifting service elevator, it allowed the attacker to automatically exfiltrate huge amounts of data that they probably wouldn’t be able to get so easily if they were, say, manually hacking a web form.
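To make the missing controls concrete, here is a minimal sketch of the three checks a customer-lookup endpoint would normally apply: authentication, rate limiting, and per-record authorization. It is not a reconstruction of the Optus API, whose internals are not described here. The `CUSTOMERS` store, the `Caller` type, the `FixedWindowLimiter` class, and the `get_customer` handler are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical in-memory customer store, for illustration only.
CUSTOMERS = {
    "1001": {"name": "Alice Example"},
    "1002": {"name": "Bob Example"},
}


@dataclass
class Caller:
    # None models an anonymous request with no valid credentials.
    customer_id: Optional[str]


class FixedWindowLimiter:
    """Allows at most `limit` requests per key within each time window."""

    def __init__(self, limit: int, window_seconds: float):
        self.limit = limit
        self.window_seconds = window_seconds
        self._windows = {}  # key -> (window_start, request_count)

    def allow(self, key: str, now: float) -> bool:
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # window expired, start a new one
        if count >= self.limit:
            self._windows[key] = (start, count)
            return False
        self._windows[key] = (start, count + 1)
        return True


def get_customer(caller, requested_id, limiter, now):
    """Return (HTTP status, body) for a customer lookup."""
    if caller.customer_id is None:
        return 401, None  # authentication: who is calling?
    if not limiter.allow(caller.customer_id, now):
        return 429, None  # rate limiting: slows down bulk enumeration
    if caller.customer_id != requested_id:
        return 403, None  # authorization: callers may read only their own record
    record = CUSTOMERS.get(requested_id)
    return (200, record) if record else (404, None)
```

With none of these checks in place, a loop over customer numbers returns a record on every request, which is the enumeration scenario described above.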

Could you talk a bit more about shadow APIs? We see that term thrown around a lot, so what practical security problems come up with shadow APIs and, more generally, when doing API security rather than securing that more visible part of applications?

Dan: It’s pretty easy for an API, which doesn’t have a user-visible manifestation, to be ignored and go out of date. With a website, a developer or security person can often simply click around and they will quickly notice if anything looks really sketchy. In fact, this is what we do automatically with our Predictive Risk Scoring. But APIs are a lot more difficult for that kind of quick analysis because they don’t have anything that you can directly interact with. They are a catalog of invisible operations that could be performed on a computer. And if you don’t keep track of what’s in that catalog and who’s allowed to do these operations, you can get shadow APIs creeping in, like these hidden service doors that might not be easy to find but aren’t locked or monitored for when somebody rattles all the locks and eventually gets in.

Frank: I’m glad you used the word “catalog” because those catalogs or inventories are really the sticking point for API security. So, ideally, you want to keep track of all your API specifications. In reality, they can live in various places and formats, formal and informal. You might have your “official” specs in OpenAPI (aka Swagger) files or Postman collections or your API management system like MuleSoft or whatever else you’re using, but you can also have proxy exports from Fiddler or even a Burp or Invicti scan. I’ve even seen them in Excel sheets. But all of these essentially need to be inventoried and tracked in order to be able to secure them and understand exactly what their context and purpose is.

In a perfect world, you would have everything tracked in your API gateways and management systems. Reality, though, tends to get a bit messy, and most companies I’ve seen and spoken to use a mix of different methods and systems.

Dan: It’s the sprawl that gets you. The unknown APIs that are out there are the ones that I would consider to be the riskiest. And that really speaks to the need for discovery because APIs tend to be organic; they tend to be created to connect to business opportunities, and they don’t always have a ton of oversight when they’re deployed. If you think of APIs as data pipes, it’s very hard to swap out a pipe that has active users from a lot of different places, so just like a pipe, they tend to get buried under the street, they do their job, and people forget about them. Until they burst, of course!

You mentioned discovery, which is a key part of Invicti’s API Security product and of the approach we’re proposing to help organizations secure their applications, APIs included. You have both been deeply involved in the intense development effort to design and implement that feature. To close out, could you talk a little about how Invicti’s API discovery works under the hood and how it fits into the wider API security picture?

Dan: Discovery is needed to find all those pipes that people put in overnight for an urgent project and didn’t necessarily catalog anywhere. And because organizations tend to keep their API information in different places, we decided to build out API discovery in layers. So we’re starting by finding all the spec files we can because these often live in predictable locations or in places that our crawler can get to, and we add those to all the specs that the organization knows and can deliver upfront. Then the next layer is API management platforms like MuleSoft that we can plug into and get more specs. And once we’ve found all the specs we could, we do traffic analysis to find APIs that are deployed and passing traffic but not cataloged.
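The first layer, looking for spec files in predictable locations, could be sketched roughly as follows. This is an illustration under stated assumptions, not Invicti's implementation: the entries in `COMMON_SPEC_PATHS` are assumed examples of conventional locations, and `candidate_spec_urls` and `looks_like_api_spec` are hypothetical helper names. The sketch only builds candidate URLs and classifies a response body, leaving the HTTP fetching to whatever crawler is in use.

```python
import json
from urllib.parse import urljoin

# Assumed examples of conventional spec locations; not an official or complete list.
COMMON_SPEC_PATHS = [
    "openapi.json",
    "swagger.json",
    "v3/api-docs",
    "swagger/v1/swagger.json",
]


def candidate_spec_urls(base_url):
    """Build the URLs to probe for a spec file under a given base URL."""
    base = base_url if base_url.endswith("/") else base_url + "/"
    return [urljoin(base, path) for path in COMMON_SPEC_PATHS]


def looks_like_api_spec(body):
    """Heuristic: a JSON object with a spec version key and a `paths` object."""
    try:
        doc = json.loads(body)
    except ValueError:
        return False  # not JSON at all, e.g. an HTML error page
    if not isinstance(doc, dict):
        return False
    # OpenAPI 3.x documents carry an "openapi" key, Swagger 2.0 a "swagger" key.
    has_version = "openapi" in doc or "swagger" in doc
    return has_version and isinstance(doc.get("paths"), dict)
```

The check is deliberately loose. YAML specs, or documents that omit `paths`, would need extra handling.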

In engineering terms, one of the really cool things we’ve built is the ability to discover APIs from real traffic. For example, one of our discovery features lets us plug into a Kubernetes cluster and analyze the traffic to find API requests. So if, heaven forbid, somebody quietly slipped into production that big water main that happens to make an entire project work, you could now find it by looking at traffic and say, “Oh, wow, you know what? We have these six sets of well-documented APIs, and then we’ve got this one that’s doing two million queries per day that is not on the map.” But we can now build that map, reconstruct the endpoints based on the traffic, build a regular OpenAPI spec file, and feed that to the scanner for testing.
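As a rough illustration of the idea of rebuilding a map from traffic, the sketch below groups observed requests into templated endpoints, flags those missing from the documented set, and emits a skeleton OpenAPI document. It assumes the traffic has already been captured as `(method, path)` pairs and that an ID-like segment is numeric or a UUID. All function names here are hypothetical, and this is not a description of how Invicti's discovery works internally.

```python
import re
from collections import Counter

# Assumption: path segments that are all digits or UUIDs are resource identifiers.
ID_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}"
    r"-[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def templatize(path):
    """Drop the query string and replace ID-like segments with {paramN}."""
    path = path.split("?", 1)[0]
    segments, n = [], 0
    for segment in path.split("/"):
        if ID_SEGMENT.match(segment):
            n += 1
            segments.append(f"{{param{n}}}")
        else:
            segments.append(segment)
    return "/".join(segments)


def summarize_traffic(requests):
    """Count observed (METHOD, templated path) pairs."""
    return Counter((method.upper(), templatize(path)) for method, path in requests)


def undocumented(observed, documented):
    """Endpoints seen in traffic but absent from the documented set.

    Assumes `documented` was normalized with the same placeholder naming.
    """
    return sorted(key for key in observed if key not in documented)


def build_openapi_skeleton(observed, title):
    """Emit a minimal OpenAPI 3.0 document for the observed endpoints."""
    paths = {}
    for (method, path), hits in sorted(observed.items()):
        params = [
            {"name": name, "in": "path", "required": True,
             "schema": {"type": "string"}}
            for name in re.findall(r"\{(\w+)\}", path)
        ]
        paths.setdefault(path, {})[method.lower()] = {
            "summary": f"Reconstructed from {hits} observed request(s)",
            "parameters": params,
            "responses": {"default": {"description": "Not yet documented"}},
        }
    return {
        "openapi": "3.0.3",
        "info": {"title": title, "version": "0.0.0"},
        "paths": paths,
    }
```

The request counts kept by `summarize_traffic` are what would surface an endpoint like the "two million queries per day" one in the example: high traffic, no entry in any spec.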

Frank: That’s the other big piece of it—we’re doing discovery to find or reconstruct all those specs, and that is crucial because you can’t secure what you don’t know exists. But once you have all those specs, you need to make sure the APIs are not vulnerable to attack. This is kind of where tools that only focus on discovery can falter because once you have that inventory, you need to test it using some other tool. So at Invicti, we have what many consider to be the best DAST scanner in the world, and we’ve been using it to scan APIs for years, currently supporting 16 different API spec formats. Now that we have API discovery on the same platform, all those specs, known and discovered, can go straight to the scanner and be automatically tested for vulnerabilities without the need for additional tools.

Dan: And the cool thing is we can take many of the hundreds of security checks we designed for testing websites and apply them to scanning APIs. At a very high level, you can think of a DAST scan as just clicking through all the things on a site, trying to open every single door, go through all the links, submit all the forms, and then mess around with parameter values until something pops and you get a little bit of cross-site scripting inside the browser. When we have an API spec, we can do something similar and attack all the normal places that we would if we came across this API in the course of a regular web browsing session.

But if you try to test an API and you just give it a low-effort payload, you can end up not getting deep enough into the app, and you just get this 400 error that says bad input. Usually, the really juicy code happens a little bit deeper than that, so during scans we’ll also try to mutate things and create representative payloads that match the expected input and get the scanner past input validation. You want to get to the point where you’re querying that SQL table, where you’re making that call out to the command-line tool, so it’s very important to get as proper-looking inputs as you possibly can. Some things like cross-site scripting probably don’t make sense outside a browser, but you can totally go through an API to steal an AWS identity token via SSRF.
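A minimal sketch of the "representative payload" idea: build a baseline request body that satisfies the declared schema, then swap in an attack value one field at a time so every other field still passes validation. It handles only a small subset of JSON Schema. The `sample_value` and `mutate_string_fields` helpers and the example schema are hypothetical, and this is not a description of how the Invicti scanner generates its payloads. The probe URL uses the link-local address of the AWS instance metadata service, a typical SSRF target.

```python
import copy


def sample_value(schema):
    """Build a value that satisfies a small subset of JSON Schema."""
    if "enum" in schema:
        return schema["enum"][0]
    kind = schema.get("type")
    if kind == "object":
        props = schema.get("properties", {})
        return {name: sample_value(sub) for name, sub in props.items()}
    if kind == "array":
        return [sample_value(schema.get("items", {"type": "string"}))]
    if kind == "integer":
        return schema.get("minimum", 1)
    if kind == "number":
        return schema.get("minimum", 1.0)
    if kind == "boolean":
        return True
    # Everything else is treated as a string.
    if schema.get("format") == "email":
        return "user@example.com"
    if schema.get("format") == "uri":
        return "https://example.com/"
    return "x" * max(1, schema.get("minLength", 1))


def mutate_string_fields(schema, baseline, attack):
    """Return (field, payload) pairs, each with one free-form string replaced."""
    variants = []
    for name, sub in schema.get("properties", {}).items():
        if sub.get("type") == "string" and "enum" not in sub:
            payload = copy.deepcopy(baseline)  # leave the baseline untouched
            payload[name] = attack
            variants.append((name, payload))
    return variants


# Hypothetical request schema for illustration.
ORDER_SCHEMA = {
    "type": "object",
    "properties": {
        "email": {"type": "string", "format": "email"},
        "quantity": {"type": "integer", "minimum": 1},
        "callback_url": {"type": "string", "format": "uri"},
        "tier": {"type": "string", "enum": ["basic", "pro"]},
    },
}

# Link-local address of the AWS instance metadata service.
SSRF_PROBE = "http://169.254.169.254/latest/meta-data/"
```

Calling `mutate_string_fields(ORDER_SCHEMA, sample_value(ORDER_SCHEMA), SSRF_PROBE)` yields one payload per free-form string field. Sending only `{"callback_url": SSRF_PROBE}` would likely be rejected for missing fields before the URL-fetching code ever ran.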

Frank: I think it’s also important to add that we’re continuing work on discovering and testing APIs so we can find more endpoints, reconstruct more specs, find more vulnerabilities, and ultimately help our customers close those gaps faster.

Want to learn more about API Security, API discovery, and the Invicti platform? Check out our webinar to learn about API security challenges, understand the benefits of comprehensive API discovery, and see the Invicti platform with API Security in action!
