# How AI Can Be Used In API Security

Posted in Security by Kristopher Sandoval, September 27, 2023

The tech space goes in cycles. The newest innovation often becomes the answer for everything, whether or not it's the right fit, and right now, AI is riding exactly that hype cycle. AI has seen massive popularity in the media and is framed as the next major wave in tech computing. Given the current trends, what are the takeaways for API providers? What can APIs actually use AI for?

Let's look at a few ways AI can be used in API security. We'll highlight some specific areas where AI could benefit API security, along with some of its limitations.

## How AI Can Be Used in API Security

### Identify Vulnerabilities in Real Time

One of the best use cases for AI in API security is in the realm of heuristics. If you can establish the expected use case within the set of parameters and states you've defined, you can create a baseline against which every other action is tested. If a new action matches the expected baseline behavior, it's likely legitimate. If it doesn't, that's a red flag requiring special attention.

The problem with this approach is that it depends entirely on how good your heuristics are. Heuristics are loosely coupled rules that rely on a firm understanding of the codebase in question, and this has posed a problem in past years. By their very nature, human-defined heuristics will be less complete than machine-defined ones. Machine-defined heuristics can be significantly more complete, but they depend on large amounts of input to establish good baseline metrics.

AI represents an excellent middle ground. With machine learning and language models, we have something halfway between human and machine. These technologies are great at filtering through massive amounts of data to find potential points of interest, and they can group data into categories and hierarchies for heuristics purposes. In essence, AI can learn what is appropriate for a given input, which is exactly the kind of logical system that can be leveraged for successful real-time heuristic analysis. A minimal sketch of this idea follows the access control discussion below.

### AI-Powered Access Control

Access control is a huge part of securing an API. Unfortunately, it is also a very common threat domain, exposing massive amounts of data and functionality through both simple misconfigurations and complex attack vectors. AI can help with this particular issue through a couple of different approaches.

Firstly, AI can be a wonderful addition to risk-based access control. AI's pattern recognition comes to the forefront here, allowing systems to weigh a variety of signals, including attempted credential stuffing, device information, historical data for the IP in question, and more. These findings could justify refusing access even when the requester presents valid credentials.

Secondly, this access control can be made much more granular. Say the AI detects a pattern that suggests something unexpected is happening. Instead of rejecting access outright, why not increase the intensity of access control through additional factors or stepped-up authentication? AI is wonderful at implementing granularity in data sets, so this is relatively easy to do with current models.

Finally, access control revocation or limitation can be applied within the same system when otherwise accepted accounts begin to perform unacceptable actions. Not all attacks come from the outside. Insider threat actors can do considerably more damage in a shorter amount of time, and by the time you know the attack is underway, it could be too late. AI-driven access control systems could help detect the beginning of damaging behavior and cut it off at the source.
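To make the heuristics idea concrete, here is a minimal sketch of a machine-learned baseline using scikit-learn's IsolationForest. The feature set (request rate, payload size, endpoints touched) and the sample values are illustrative assumptions, not a production design.

```python
# Minimal sketch: learn a behavioral baseline from historical API traffic,
# then flag requests that deviate from it. Features are illustrative.
import numpy as np
from sklearn.ensemble import IsolationForest

# Hypothetical per-request features:
# [requests_per_minute, payload_bytes, distinct_endpoints_hit]
baseline_traffic = np.array([
    [12, 450, 3], [8, 512, 2], [15, 390, 4], [10, 600, 3],
    [9, 480, 2], [14, 420, 3], [11, 505, 3], [13, 465, 4],
])

# Fit the baseline; "contamination" is our assumed rate of anomalous traffic.
model = IsolationForest(contamination=0.05, random_state=42)
model.fit(baseline_traffic)

def check_request(features: list[float]) -> str:
    """Score an incoming request against the learned baseline."""
    verdict = model.predict([features])[0]  # 1 = inlier, -1 = outlier
    return "likely legitimate" if verdict == 1 else "flag for review"

# A burst of requests sweeping many endpoints stands out from the baseline.
print(check_request([11, 470, 3]))   # likely legitimate
print(check_request([300, 90, 40]))  # flag for review
```

In practice the model would be retrained as traffic evolves, but the shape of the flow is the same: learn what normal looks like, then test everything else against it.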
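Likewise, here is a hedged sketch of the risk-based, step-up flow described above. The risk signals and thresholds are assumptions for illustration; in practice the score would come from a trained model rather than hand-set cutoffs.

```python
# Sketch: map a model-produced risk score to graduated access decisions.
from dataclasses import dataclass

@dataclass
class RequestContext:
    valid_credentials: bool
    risk_score: float  # 0.0 (benign) to 1.0 (hostile), e.g. from a classifier

def access_decision(ctx: RequestContext) -> str:
    if not ctx.valid_credentials:
        return "deny"
    if ctx.risk_score < 0.3:
        return "allow"            # matches the learned baseline
    if ctx.risk_score < 0.7:
        return "step-up-auth"     # require an additional factor
    return "deny-and-review"      # valid credentials, but behavior looks hostile

print(access_decision(RequestContext(True, 0.1)))  # allow
print(access_decision(RequestContext(True, 0.5)))  # step-up-auth
print(access_decision(RequestContext(True, 0.9)))  # deny-and-review
```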
### Proactive Threat Detection

By design, AI models are adept at finding complex patterns hidden within large data sets. To this end, AI can be used to proactively hunt for threats by analyzing data that API developers already have on hand. Training models on API logs, network traffic data, user heuristics, and general threat libraries from other vendors builds a solid starting point from which actual API threats can be discovered.

This analysis can be taken one step further by comparing the understanding the AI system has generated against actual, real-world implementations. By monitoring real-world traffic and comparing it with the patterns learned in training, AI systems can detect threats proactively, finding both ongoing patterns of danger and the categories of threat that are most likely given how the API intrinsically behaves. For instance, an audio processing API has very different threat vectors than an API reporting weather changes, and an AI-driven system can compare the threats it expects against the realities of the existing codebase. In that comparison, threat categories that aren't obvious may begin to show themselves, exposing potential vectors before they even develop.

AI-driven solutions in this field are great at uncovering patterns, correlated threats, and indications of compromised code that might otherwise fly under the radar in traditional systems. This is largely because, unlike static security systems, AI systems are constantly evolving and learning, meaning their understanding stays tightly coupled to the reality of new threats and modalities. A small sketch of log-based threat classification follows the next section.

### Preempt Intrusive Behaviors Through Modeling

AI language models are unique in that, through proper instruction, they can put on masks, "modeling" behavior as if the AI were another entity. A model can portray itself however it is asked to: as a musician, a teenager, a famous author, and many other permutations. As part of this, you can instruct a model to act in a way that helps preempt intrusive behaviors.

Understanding a threat actor's methodology and thought process is perhaps the best way to preempt that threat. We can instruct an AI model to act as a security expert running penetration tests. In doing so, we ask the model to think like the enemy and reveal that logic, surfacing it to those who can use it to counteract the eventual attack.

It is important to note that AI models, especially language models, are trained on specific data sets. Accordingly, the threats these models surface will be limited by the data they have observed. For many AI companies, this represents an opportunity: training systems on intrusion methods from previously known vectors is a great chance to find similar vectors that could be abused by threat actors, allowing developers to fix issues before they are used for nefarious purposes.
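As a toy illustration of the log-training idea above, the sketch below fits a classifier on labeled request features and applies it to live traffic. The features, labels, and sample values are invented for illustration; a real system would train on far richer log and threat-library data.

```python
# Sketch: learn threat patterns from labeled historical logs, then score live traffic.
from sklearn.ensemble import RandomForestClassifier

# Hypothetical features per client: [failed_auths_last_hour, unusual_headers, req_size_kb]
historical_logs = [
    [0, 0, 2], [1, 0, 3], [0, 1, 2], [0, 0, 4],      # benign traffic
    [25, 1, 1], [40, 1, 1], [30, 0, 1], [55, 1, 2],  # credential-stuffing attempts
]
labels = ["benign"] * 4 + ["credential-stuffing"] * 4

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(historical_logs, labels)

# Score live traffic against the learned threat patterns.
live_request = [33, 1, 1]
print(clf.predict([live_request])[0])  # credential-stuffing
```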
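To illustrate the role-playing approach from the modeling discussion above, here is a hedged sketch using the OpenAI Python client; any chat-capable LLM client would work the same way. The model name, prompt wording, and API summary are assumptions, and the output should be treated as a brainstorming aid for human experts, not an authoritative pentest.

```python
# Sketch: ask a language model to adopt an attacker's perspective on an API spec.
# Assumes the OpenAI Python client with an API key in the environment; the model
# name and prompt are illustrative, and the output needs expert review.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

api_summary = """
POST /v1/orders      - creates an order; takes user_id, item_id, quantity
GET  /v1/orders/{id} - returns an order by numeric id
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": (
            "You are a penetration tester. Think like an attacker and list "
            "the most likely abuse paths for the API described, such as "
            "broken object-level authorization or parameter tampering."
        )},
        {"role": "user", "content": api_summary},
    ],
)

print(response.choices[0].message.content)
```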
### Coding Securely from Scratch

Another great use case for AI is in developing the API itself. AI can be used once a product is live, but it can also be used at the start of development to produce a more secure codebase. Since AI is designed to find the most efficient and sensible response to an input, we can take two basic tracks here.

The first is leveraging AI to generate more secure API code. Using AI to generate strong boilerplate based on existing efforts, then iteratively testing attack vectors against it, lays a strong foundation upon which developers can build additional products, and it reduces overall time to market. However, this doesn't always result in perfect code. AI is only as good as the data it was trained on, so AI-generated code should be reviewed more strictly than usual for potential errors.

The second track is to use AI to test each chunk of code before it goes live as part of an ongoing security audit. While security testing should be part of every development cycle, it often happens on entire builds rather than individual components because of the time and resources involved. With AI, testing can happen at a smaller level and with faster results, allowing real-time checks on each component ahead of an overall security test. A sketch of what such a per-component check might look like follows the next section.

### Automation Detection and Mitigation

Not all threat actors are people sitting in front of a screen; many potential threats to APIs come from other machines. These attacks can be hard to detect because they can operate in ways that are technically allowed but form the precursor to an attack. In such cases, it's hard to separate genuine user behavior from bot reconnaissance. Is this user just really interested in this series of endpoints, or is this a bot probing for weaknesses in the armor?

AI can help detect and mitigate these kinds of automations. By using data sets to learn what bot behavior looks like and then running those models against current access patterns, AI systems can differentiate between machine and human users and, through this differentiation, apply different strategies for mitigation and threat prevention. Approaches can include changing rate limits, redirecting large requests to other endpoints, deploying CAPTCHA or similar challenges, and more. To be clear, not all bots are bad, but figuring out which are good and which are not is a huge first step toward making an API more secure.
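Picking up the per-component testing idea from the previous section, here is a rough sketch of a pre-merge gate. The `ai_security_review` function is a hypothetical stand-in for whatever model or service performs the analysis; the point is the shape of the workflow, with component-by-component checks ahead of a full audit.

```python
# Sketch: gate each changed component on an AI security review before merge.
# ai_security_review() is a hypothetical stand-in for a model-backed analysis.
import subprocess
import sys

def changed_components() -> list[str]:
    """List Python files changed on this branch (assumes a git checkout)."""
    out = subprocess.run(
        ["git", "diff", "--name-only", "main"],
        capture_output=True, text=True, check=True,
    )
    return [f for f in out.stdout.splitlines() if f.endswith(".py")]

def ai_security_review(path: str) -> list[str]:
    """Hypothetical: send one component to a model, return a list of findings."""
    # In a real pipeline this would call an LLM or analysis service.
    return []

def main() -> int:
    findings = {p: ai_security_review(p) for p in changed_components()}
    failures = {p: f for p, f in findings.items() if f}
    for path, issues in failures.items():
        print(f"{path}: {issues}")
    return 1 if failures else 0  # a nonzero exit code blocks the merge

if __name__ == "__main__":
    sys.exit(main())
```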
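And for the bot-versus-human question just raised, a minimal sketch of graduated mitigation. The scoring step is stubbed with a simple heuristic; in practice it would be a trained model like those sketched earlier, and the thresholds are illustrative.

```python
# Sketch: score a client as bot-like, then pick a graduated response.
from dataclasses import dataclass

@dataclass
class ClientStats:
    requests_per_minute: int
    distinct_endpoints: int
    solved_challenge: bool

def bot_likelihood(stats: ClientStats) -> float:
    """Stub scorer; a real system would use a trained model here."""
    score = 0.0
    if stats.requests_per_minute > 120:
        score += 0.5
    if stats.distinct_endpoints > 25:  # broad sweeps look like reconnaissance
        score += 0.4
    if stats.solved_challenge:
        score -= 0.3
    return max(0.0, min(1.0, score))

def mitigation(stats: ClientStats) -> str:
    p = bot_likelihood(stats)
    if p < 0.3:
        return "allow"
    if p < 0.6:
        return "tighten-rate-limit"
    return "serve-captcha"

print(mitigation(ClientStats(40, 5, False)))    # allow
print(mitigation(ClientStats(200, 10, False)))  # tighten-rate-limit
print(mitigation(ClientStats(200, 40, False)))  # serve-captcha
```

Note that the mitigation ladder matters as much as the detection: good bots pass unharmed, ambiguous traffic is merely slowed, and only strongly bot-like sweeps get challenged.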
## Caveat Emptor

AI is a great technology, but it is also dangerous for a number of reasons. When it comes to security, the biggest problem is the quality of output. AI is trained on massive amounts of data, and from that processing, natural language and learned behaviors can be mimicked in machine output. The problem is that this output is only as good as the data it's fed, and given the large amount of incorrect information online, the quality of AI output is unpredictable.

Accordingly, you should think of AI as one more tool in the large toolset of API development. Nobody would use a blowtorch to remove a nail from a piece of wood. That doesn't diminish the value of the blowtorch, but it does show that not every tool is right for every job. While AI is appropriate for many implementations, it must be carefully applied. Output should be reviewed for accuracy, compared against known quantities, and tested before being accepted. AI systems should be heavily vetted and, where possible, deployed locally, avoiding external dependencies that might be skewed by poor or intentionally poisoned data sets.

That being said, AI can be genuinely helpful in the realm of API security, and it remains an emerging technology worth paying attention to.