How AI-Generated Code Could Kill Your API

In a recent post, we recapped a keynote speech by former Gartner analyst Paul Dumas at our Austin API Summit. The gist of the talk was that, as the practice of using AI to generate code grows, we will become stewards of that content; in other words, we’ll be prompting, testing, and curating AI-created code.

According to research by GitHub, in 2023, as many as 92% of US developers were using AI coding tools “both in and outside of work.” Another study conducted by GitHub found that developers using its Copilot tool completed tasks up to 55% faster than those working without it.

However, this AI reliance brings potential consequences. Due to the “sheer amount of code produced… without guardrails, generative AI will likely lead to more technical debt,” writes Bill Doerrfeld. Why reuse and refine code when you can just ditch it and generate something new?

Problems can also arise when developers, say, use AI to generate code in a language that they’re not familiar with. Do they understand the relevant best practices? Security risks? The fixes for problems that arise? The answer to these questions is often a resounding “no.”

Still, the implicit message of the research above is to get on board with AI coding tools or get left in the dust, which, for some skeptics, could be a scary prospect…

AI Coding Tools vs. AI-Generated Code

It’s worth pointing out that both of the studies above refer to “AI coding tools” (check out our article on AI tools for API developers), which isn’t necessarily the same thing as AI-generated code. Such tools might, for example, suggest code completions or fixes for known bugs.

But there certainly are people out there, both developers and laypersons, creating tools and products purely from AI-generated code. At Austin API Summit 2024, we spoke with Katie Paxton-Fear, API Security Educator at Traceable AI, about some of the implications of people using AI-generated code at this scale.

“It’s great that they are able to get their app ideas out there, but they have no idea how to actively look at that code to see if it’s secure or if this is the most appropriate way of doing things,” says Paxton-Fear. “If you’re using AI-generated code, you may not even be aware of what your regulatory requirements are.”

Let alone whether or not you’re meeting them. And, even if you are aware of the security requirements around storing specific data, you may not know how to handle breaches.

Then again, Paxton-Fear observes, the opposite might also be true. “People who don’t know anything about security probably can learn some common pitfalls by questioning ChatGPT or Copilot. And I can see that coming in handy for junior developers who don’t feel comfortable or confident asking senior engineers for fear of wasting their time.”

We’ve previously covered how tools like Hacking APIs GPT might be used for API security testing. Other companies are out there chasing funding for their own tools, focusing on using AI to test security. Such tools should certainly, for the time being at least, be considered supplemental measures rather than a replacement for typical code reviews and testing.

With the right security protocols and observability in place, using AI-generated or AI-assisted code doesn’t have to be a problem. It can even have significant benefits. But in situations where there might already be cracks, relying on AI-created code could be downright dangerous.

The Risks of Using AI-Generated Code

Before we get into the API space, let’s look at the big picture. Using AI-generated code has various associated risks, such as the possibility of introducing inaccuracies or errors. The fact is, thousands of tutorials and posts written 10+ years ago (now thoroughly out of date) may end up being used as training data.

Take PHP’s documentation, suggests Paxton-Fear. The documentation contains notes around deprecation, but there are comments from 14 years ago that may no longer be relevant. “As a result, we’re going to see the return of things, like vulnerabilities and exploits, that we thought we had already dealt with,” she says.
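To make that concrete, here’s a minimal sketch (in Python rather than PHP, purely for illustration) of the kind of pattern an assistant trained on decade-old tutorials might still reproduce: building SQL queries through string interpolation, a classic injection risk, alongside the parameterized query that current documentation recommends.

```python
import sqlite3


def find_user_insecure(conn: sqlite3.Connection, username: str):
    # The pattern still common in old tutorials: the query is assembled by
    # string interpolation, so a crafted username can inject arbitrary SQL.
    query = f"SELECT id, email FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_user_parameterized(conn: sqlite3.Connection, username: str):
    # The long-recommended fix: let the database driver bind the value safely.
    return conn.execute(
        "SELECT id, email FROM users WHERE username = ?", (username,)
    ).fetchall()
```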

Hallucinations are also a cause for concern. Consider ChatGPT recommending the OpenCage Geocoding API’s phone lookup functionality…which doesn’t exist. It’s a perfect example of, in Paxton-Fear’s words, “AI telling us what we want to hear” rather than providing accurate and reliable information.
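One pragmatic defense, sketched below in Python with a placeholder spec URL and endpoint path, is to check any AI-suggested endpoint against the provider’s published OpenAPI description before building on it.

```python
import requests


def endpoint_exists(spec_url: str, path: str, method: str = "get") -> bool:
    """Return True if the given path/method pair appears in an OpenAPI document.

    spec_url is a placeholder here; point it at the provider's real spec location.
    """
    spec = requests.get(spec_url, timeout=10).json()
    operations = spec.get("paths", {}).get(path, {})
    return method.lower() in operations


# Hypothetical usage: an assistant suggests a "/phone/lookup" endpoint.
# If the provider's published spec doesn't list it, treat it as a hallucination:
#
#   if not endpoint_exists("https://api.example.com/openapi.json", "/phone/lookup"):
#       raise RuntimeError("Suggested endpoint is not in the published spec")
```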

In addition to accidental missteps, there’s also actively malicious activity to consider. In a recent case, cybercriminals pushed malware via a malicious package suggested as a fix for various problems on Stack Overflow.

Paxton-Fear suggests that “we can absolutely expect to see the same thing with AI-generated responses to prompts or questions, especially because AI is non-deterministic. A smart attacker could include malicious packages in training data and sneak them in as responses to really specific problems that won’t be asked often.”
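A lightweight precaution is to vet any dependency an assistant suggests before installing it. The Python sketch below uses PyPI’s public JSON API to confirm that a package actually exists and to show how long it has been around; the second lookup uses a made-up name purely for illustration.

```python
import requests


def vet_package(name: str) -> None:
    """Rough sanity check on a dependency suggested by an AI assistant.

    Confirms the package exists on PyPI and shows how long it has been around;
    a missing or brand-new package deserves manual scrutiny before use.
    """
    resp = requests.get(f"https://pypi.org/pypi/{name}/json", timeout=10)
    if resp.status_code != 200:
        print(f"'{name}' is not on PyPI; the suggestion may be hallucinated or squatted.")
        return
    data = resp.json()
    uploads = [
        f["upload_time_iso_8601"]
        for files in data.get("releases", {}).values()
        for f in files
    ]
    first_seen = min(uploads) if uploads else "never (no uploaded files)"
    print(f"'{name}' first published: {first_seen}; releases: {len(data.get('releases', {}))}")


vet_package("requests")          # long-established, widely used package
vet_package("requestz-helperz")  # hypothetical name an attacker might squat
```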

Why Is Gen AI Particularly Dangerous to APIs?

The general risk of developers using AI-generated code is compounded in the API space, Paxton-Fear believes, because of the nature of writing APIs. Creating and documenting endpoints can be repetitive, even boring, so the incentive to automate the process is high. In that respect, APIs are more susceptible to these risks than other projects might be.

As more products integrate AI into their offerings, we can expect further adoption of external tools like OpenAI’s API. Beyond the documented risks of over-reliance on AI, which cracks OWASP’s Top 10 for LLM Applications, there’s always a risk of interactions between applications and such APIs failing in some way or another.

“We’re never going to solve prompt injection,” Paxton-Fear worries. “If you’re relying on AI input to generate blog posts or code, you have to look to the usual suspects. You can’t necessarily trust user input, and AI input is just another form of that.” Whether or not you trust AI input over something created manually by an engineer is another issue entirely…
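In practice, that means validating model output just as strictly as you would a request body from an unknown client. Here’s a minimal Python sketch of that idea; the allowed actions and field names are invented for illustration.

```python
import json

# Hypothetical allow-list of actions the application is prepared to perform.
ALLOWED_ACTIONS = {"create_invoice", "send_reminder"}


def handle_model_output(raw: str) -> dict:
    """Treat LLM output like any other untrusted user input.

    Parse it strictly, validate it against an allow-list, and reject anything
    unexpected rather than passing it straight to downstream systems.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("Model output is not valid JSON") from exc

    action = payload.get("action")
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"Refusing unexpected action: {action!r}")
    if not isinstance(payload.get("customer_id"), int):
        raise ValueError("customer_id must be an integer")
    return payload
```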

As more and more apps come to rely on generative AI, we had better hope that their providers are doing a stellar job of maintaining and securing those products.

Can We Mitigate Risks and Embrace Gen AI Coding?

On the face of it, the viewpoints of a self-confessed AI skeptic like Paxton-Fear and Gartner’s former analyst Paul Dumas — who predicts that AI will soon produce APIs for you — are very different.

In fact, though, they share common ground. Paxton-Fear told us that, despite her reservations, “I do think it’s a powerful creative tool, especially for developers who know what they’re doing. They can look at that code and actually evaluate its quality.” That echoes Dumas’ sentiments.

However, Paxton-Fear also observes that a workplace’s culture plays a big part in mitigating the risk of AI-generated code. “Developers of all levels aren’t measured on security outcomes, they’re measured on lines of code, features submitted, and tickets resolved.”

As such, they are incentivized towards efficiency, and we’ve seen that using AI can be a big help by cutting down the work they have to do. “But there’s a real risk there, especially for organizations that don’t really have security teams, if they find [or don’t find] that they’re missing documentation explaining that something is deprecated or sunsetted.”

One possible solution to this problem is giving senior engineers real balance: a chance to think about applications more broadly and engage with big-picture questions, not just review requests. That kind of work presents more challenging problems that they’ll want to tackle themselves, rather than outsourcing to gen AI tools.

Ultimately, Paxton-Fear concludes that AI-generated code should be “a tool used alongside humans, not a replacement for them,” much as Paul Dumas argued in his Austin API Summit keynote. Good documentation is regularly updated with guidance, security issues, and so on. Because of training-data lag, AI tools won’t have access to that information for a long time. A skilled human developer will.

“There’s a lot of clear benefits in cutting corners,” Paxton-Fear observes offhand. “Run every red light, and you’ll get there quicker… most of the time. The one time you don’t? It could be bad.”