API hacking is, unfortunately, part of the modern API landscape. Whenever you have resources exposed to the greater internet, those resources are going to be attacked in some way.
Thankfully, half of the fight is just being aware of the threats against your API. Knowing that a threat exists and preparing your solutions ahead of time can negate the threat when it rears its ugly head.
To that end, today we’re going to discuss five common methods of API hacking, how they work, and how you can prepare to handle them.
1: Reverse Engineering
We often view our APIs in terms of developer experience: from start to end, how the average developer is going to experience the offering. This view is flawed, because it only considers the API as it is intended to be used.
That’s not always the way the API interaction is going to go, of course. In faulty user experiences, the API might operate in unexpected ways, and in the case of reverse engineering, that’s exactly what the hacker is trying to provoke. They approach the API’s flow in reverse, starting from where it ends rather than where it is meant to begin, to discover weaknesses that might otherwise be obscured during normal use.
For example, let’s assume we have an API in which user account data can be requested when a repeat order is made. To the user, and to the developer, the flow looks something akin to:
- Order requested
- Order associated with Account
- Order fulfilled.
For someone reverse engineering the API, of course, this flow has some possible points for misuse. If the order fulfillment system is entered in reverse, it’s possible that the internal API which links orders to accounts could be broken into, allowing for the browsing and exposure of user account data.
While the typical use flow may not expose this flaw, specific issues with the process might be easier to see when looking in reverse. Of course, by the time most developers look at this, it’s often too late, and an exposure has already occurred.
How to Fight It
One solution to this problem is baseline encryption. By encrypting data in transit and at rest, you’re obscuring the nature of the data being handled by any call and the relationships within it, while still allowing the functionality to operate as expected. It is this relationship information that drives most reverse engineering; simply having the data doesn’t mean much if you don’t know how to decrypt it or use it.
The main problem with this sort of defense is that an attacker can often pose as a trusted agent. Therefore, obfuscation isn’t exactly effective, as either end of the equation is prone to spoofing and impersonation. Accordingly, while it’s helpful to encrypt data (and to put it bluntly, should be as good as a requirement for most APIs), your defenses can’t just stop at “encrypt and hope for the best.”
One further solution is to change the way your URIs are actually structured. If a URI has coded information in the call, such as specific directory formats that expose where the resource lives and the organizational storage for related resources, then the URIs themselves could be giving away a ton of valuable information. Changing these to be less obvious can go a long way to negating such discovery.
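As a rough illustration of less revealing URIs, here is a minimal sketch that hands out random public identifiers and keeps the real storage path on the server. The class name `OpaqueResourceMap`, the `/v1/resources/` prefix, and the storage path are all hypothetical, and a real service would persist the mapping rather than hold it in memory.

```python
import secrets


class OpaqueResourceMap:
    """Maps random public identifiers to internal storage paths (hypothetical)."""

    def __init__(self):
        self._by_public_id = {}

    def register(self, internal_path: str) -> str:
        # The public ID is random, so it says nothing about where the
        # resource lives or how related resources are organized.
        public_id = secrets.token_urlsafe(16)
        self._by_public_id[public_id] = internal_path
        return public_id

    def resolve(self, public_id: str):
        # Unknown IDs simply resolve to None instead of hinting at the layout.
        return self._by_public_id.get(public_id)


resources = OpaqueResourceMap()
public_id = resources.register("/storage/accounts/1042/orders/77.json")
public_uri = f"/v1/resources/{public_id}"  # what the caller actually sees
```

The caller only ever sees `public_uri`; the account number and directory layout stay internal. This hides structure but is not a substitute for access checks on the resolved resource.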
Machine learning and heuristics have made detecting these types of attacks much easier, and there’s a certain amount of defense that can be found in such tech. Leveraging data learned from user behavior and aggregating this data can help identify outlier behavior. For instance, by tracking the average user interactions on your API, you can set a baseline that can help identify extreme deviations. If your API typically has one or two normal calls per session, but a single user is sending constant probes into your media server’s login system with random credentials, you can be relatively sure that this activity is not valid.
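The baseline idea can be sketched with nothing more than a mean and a standard deviation. This is a deliberately simple stand-in for the heuristics described above, not a machine learning system; the function names, the sample history, and the threshold of 3 standard deviations are assumptions chosen for illustration.

```python
from statistics import mean, stdev


def build_baseline(historical_counts):
    """Summarize normal traffic as (average, spread) of calls per session."""
    return mean(historical_counts), stdev(historical_counts)


def is_outlier(call_count, baseline, threshold=3.0):
    """Flag a session whose call count sits far above the baseline."""
    average, spread = baseline
    if spread == 0:
        # Every historical session looked identical; anything above is unusual.
        return call_count > average
    return (call_count - average) / spread > threshold


# Hypothetical history: sessions normally make one or two calls.
baseline = build_baseline([1, 2, 1, 2, 2, 1, 1, 2, 1, 2])
suspicious = is_outlier(250, baseline)  # a session hammering a login endpoint
ordinary = is_outlier(2, baseline)      # a typical session
```

Building the baseline from past traffic, rather than from the batch being inspected, keeps a single extreme session from inflating the spread and hiding itself.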
These systems can be paired with live obfuscation systems as well, such as a published endpoint that forwards traffic to one of a set of randomly named endpoints. This may help fight and deter these attacks, and when you combine it with heuristics-based detection, you can largely mitigate them.
Of course, it could be argued that the best approach is to separate functionality you don’t want reverse engineered from common functions you want to be exposed. By separating these out into microservices, you can remove the threat vector to your normal services and heavily secure your major service points from these types of attacks.
In the context of an API, spoofing is when a party masquerades as someone they are not. This can take a variety of forms, two of which are detailed in the next two sections.
2: User Spoofing
User spoofing is when an attacker pretends to be someone they’re not. Often, the attacker will attempt to portray themselves as a trusted user in order to pivot to additional users, allowing them free access to data and the ability to deal more damage without being readily discovered. These attacks often use data discovered through phishing or other such credential leaks in order to prevent other alarms, such as those found in reverse engineering, from going off.
Once the attacker has breached the system, they often attempt some sort of privilege escalation by directing URI functions to other URIs (as is the case in media encoding APIs), inserting code disguised as text (as in the case of translation APIs), or just flooding APIs with more data than they can handle, forcing an overflow failure.
3: Man in the Middle Attack
In this type of attack, the attacker will pose as an element either in the chain of communication to the server, or the server itself. The attacker’s aim here is to act as if they are some trusted link in the API chain, intercepting data either for morphing or offloading.
Sometimes, this attack can be done by squatting on a domain that is similar to the API URI scheme and copying the format of the API request/resource location (or at least, making it seem the same). In this case, a user might be requesting a call using a resource located at API.io/media/function, and a squatter might sit on APO.io/media/function. A single character’s difference could make all the difference in the world, and could lead the requester to send their credentials to the wrong server.
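To show how small that difference is in practice, here is a minimal client-side sketch that refuses to send credentials unless the host is on a short allowlist. The set `TRUSTED_API_HOSTS` and the function name are hypothetical, and the sketch assumes the API is served over HTTPS, which the example URIs above do not state.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist, using the example domain from the text.
TRUSTED_API_HOSTS = frozenset({"api.io"})


def is_trusted_api_url(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is explicitly trusted."""
    parts = urlsplit(url)
    # urlsplit lowercases the hostname, so "API.io" and "api.io" compare equal.
    host = parts.hostname or ""
    return parts.scheme == "https" and host in TRUSTED_API_HOSTS


genuine = is_trusted_api_url("https://API.io/media/function")
lookalike = is_trusted_api_url("https://APO.io/media/function")
```

An exact-match allowlist is blunt, but it catches the one-character squat that a quick visual check would miss.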
Other times, the attack could show itself in the form of establishing a node between the user and the data requested. If the resolution service is breached, then a secondary call could easily be added to the server function, automatically sending data received to an external service.
Providers should note that this attack is often transparent – the attacker wants to appear as a valid part of the chain, and so it might still respond with the correct data, passing on the data to the API itself and responding with the response package. This is done so that the user has no idea that they’ve been compromised. Advanced versions of this attack could see data changed mid-transfer, forcing your deposit to be placed in a different bank account, or your purchase to be shipped to a different address.
How to Fight It
One solution to this problem is certificate pinning. Certificate pinning means configuring the API client ahead of time with the server certificate it should trust. When the handshake process is initiated, the certificate received is tested against the trusted certificate. If they don’t match, the communication is invalidated and the server connection is rejected.
Of course, this hinges upon trusting the certificate authority, and thereby assuming the authority is not part of the false loop. That being said, demanding a very specific, pre-configured certificate makes it so that every single part of the chain would have to be corrupted in order for any detrimental spoofing to occur.
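One common way to implement the comparison is to pin a SHA-256 fingerprint of the server certificate rather than the whole certificate. The sketch below takes that approach using only Python's standard library; the function names are hypothetical, and pinning a fingerprint of the full certificate (as opposed to the public key) is an assumption made to keep the example short.

```python
import hashlib
import hmac
import socket
import ssl


def certificate_fingerprint(der_cert: bytes) -> str:
    """SHA-256 fingerprint of a DER-encoded certificate, as lowercase hex."""
    return hashlib.sha256(der_cert).hexdigest()


def matches_pin(der_cert: bytes, pinned_fingerprint: str) -> bool:
    """Compare the received certificate against the pre-configured pin."""
    actual = certificate_fingerprint(der_cert).encode()
    expected = pinned_fingerprint.lower().encode()
    return hmac.compare_digest(actual, expected)


def open_pinned_connection(host: str, pinned_fingerprint: str, port: int = 443):
    """Open a TLS connection and reject it if the certificate is not the pinned one."""
    context = ssl.create_default_context()  # normal CA validation still applies
    sock = socket.create_connection((host, port), timeout=10)
    try:
        tls = context.wrap_socket(sock, server_hostname=host)
    except Exception:
        sock.close()
        raise
    der_cert = tls.getpeercert(binary_form=True)
    if der_cert is None or not matches_pin(der_cert, pinned_fingerprint):
        tls.close()
        raise ssl.SSLError("server certificate does not match the pinned fingerprint")
    return tls
```

The trade-off is operational: when the server certificate is rotated, every client holding the old pin must be updated, so pins are often shipped with a backup value.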
Another fix is encrypting all traffic in transit. While the attack might still capture data, the data should be rendered useless with proper encryption, ensuring that what is captured is essentially “noise”. You can also add salting to the data stream in order to make this data even harder to use. The problem here is that the data is still being captured, and the system is assuming that all encryption is going to stay in the current state. History tells us of course that this is not reality, as major, strong encryption standards of yesteryear are now largely considered insecure. Public Key cryptography helps in this somewhat, but ultimately, you’re going to need a group of solutions to effectively mitigate your end of this threat.
You can also utilize services like two-factor authentication to prevent these types of attacks from the user perspective. If a user is required to use two-factor authentication, and a man-in-the-middle attack is attempting to be transparent, the calls will be separate from each other. Even if the calls are captured, they will be encrypted and separate – if you enforce session sanitation properly, this two-factor authentication will prevent significant damage from being done, and by the time it could theoretically be cracked, the transaction window will have long passed.
4: Session Replays
Session replay attacks specifically target websites and other systems that generate and store sessions. While proper RESTful design should not deal with state, that’s not always the reality of the situation. Many APIs, whether for valid reasons or not, have state as part of their core flow, even if they call themselves “RESTful”. When sessions are part of the equation, this type of attack is designed to capture the session and replay it to the server. In effect, the attacker is rewinding time and forcing the server to divulge data as if the same interaction were occurring once more.
In many cases, if this type of attack isn’t stopped, it can enable the attacker to act just like the user, and could lead to additional attacks, especially those depending on pivoting using the user credentials. If this is just a normal user session, the attack can be bad – if the session is from an administrator or elevated user, however, it could be catastrophic.
How to Fight It
Proper session management is the key here. First and foremost, server state and session may not always be the same thing, but if you’re doing RESTful API communication, the difference isn’t really that important – you should be avoiding sessions in general. Sessions and states in many applications serve the exact same purposes and open up the API to huge risk.
If you have to use sessions, ensure those sessions are invalidated once you get past an idle timeout period or the user logs out. You can also set the session lifespan to terminate at a certain point, which will invalidate the session and prevent this type of attack.
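Here is a minimal sketch of those three rules (idle timeout, logout, and a hard lifespan) as an in-memory store. The class name `SessionStore` and the default timeouts are assumptions; the clock is injectable only so the behavior is easy to exercise without waiting.

```python
import secrets
import time


class SessionStore:
    """In-memory sessions with an idle timeout and a maximum lifetime."""

    def __init__(self, idle_timeout=900, max_lifetime=8 * 3600, clock=time.monotonic):
        self._idle_timeout = idle_timeout
        self._max_lifetime = max_lifetime
        self._clock = clock
        self._sessions = {}  # session_id -> [created_at, last_seen]

    def create(self) -> str:
        now = self._clock()
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = [now, now]
        return session_id

    def is_valid(self, session_id: str) -> bool:
        record = self._sessions.get(session_id)
        if record is None:
            return False
        now = self._clock()
        created_at, last_seen = record
        idle_too_long = now - last_seen > self._idle_timeout
        too_old = now - created_at > self._max_lifetime
        if idle_too_long or too_old:
            # Expired sessions are removed so a replay cannot revive them.
            del self._sessions[session_id]
            return False
        record[1] = now  # activity resets the idle timer, not the lifetime
        return True

    def logout(self, session_id: str) -> None:
        self._sessions.pop(session_id, None)
```

Note that activity extends the idle window but never the overall lifespan, so even a continuously replayed session eventually dies.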
You could also encrypt the session data if a session is required. Ensure that once a session is connected, some piece of encrypted code is used as a sort of token for that session. Without it, if the session is replayed in the future, it is essentially useless, as the token itself is what makes the session valid.
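One way to approximate this is a short-lived token that the server signs and ties to the session. Note the substitution: the sketch below uses an HMAC signature rather than encryption, so the token contents are readable but cannot be forged or reused after expiry. `SECRET_KEY`, the function names, and the token layout are all hypothetical.

```python
import hashlib
import hmac
import secrets
import time

SECRET_KEY = secrets.token_bytes(32)  # hypothetical server-side secret


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(session_id: str, ttl_seconds: int = 300, now=None) -> str:
    """Create a token of the form session_id:expiry:signature."""
    issued_at = int(time.time() if now is None else now)
    payload = f"{session_id}:{issued_at + ttl_seconds}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, session_id: str, now=None) -> bool:
    """Accept the token only if it is untampered, unexpired, and for this session."""
    current = time.time() if now is None else now
    try:
        payload, signature = token.rsplit(":", 1)
        token_session, expires_at = payload.rsplit(":", 1)
        expires = int(expires_at)
    except ValueError:
        return False  # malformed token
    if not hmac.compare_digest(signature.encode(), _sign(payload).encode()):
        return False  # signature mismatch: forged or altered
    return token_session == session_id and current < expires
```

A captured session replayed after the token expires fails verification. To block replays inside the validity window as well, the server would also need to track tokens it has already accepted.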
5: Social Engineering
While this is not in and of itself technically an “API hack”, it directly affects the API. Social engineering is attacking not the machine code and the API itself, but the weakest element of all – the human element. Humans are fallible, and they can be tricked – often very easily. Social engineering takes advantage of this in a multitude of ways.
Phishing is the process of sending mass messages to known users, often using cleverly crafted emails providing links to reset a password or validate a security incident. The catch is that these links aren’t real, and instead result in the attacker grabbing credentials. Spear phishing is much the same but focuses on one high-value target, often providing additional data, typically stolen in some sort of security incident, to instill trust in the user that the communication is indeed valid.
Once the attacker has access to these resources, they can commit any of the above attacks that much more easily, and with a greater chance of success. It’s hard to detect an attacker when they’ve come into the network using legitimate credentials.
How to Fight It
While you could argue this can be solved through education and instilling common sense in your users, that’s really just a dream in many ways. You can’t enforce common sense, and attempting to do so is really a fool’s errand.
The best thing you can do to stop social engineering is to enforce API level security. You can use opt-in heuristic systems to determine when a user is coming from an unknown machine, unknown location, or other variation in known behavior. In this way, you’re treating the symptom, not the cause, but the effect is still protective.
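A very small version of such a heuristic might look like the following. The `UserProfile` fields, the signal names, and the rule that any single signal triggers an extra check are all assumptions made for illustration; a real system would weigh many more signals.

```python
from dataclasses import dataclass, field


@dataclass
class UserProfile:
    """What we have previously seen for a user (hypothetical fields)."""
    known_devices: set = field(default_factory=set)
    known_countries: set = field(default_factory=set)


def risk_signals(profile: UserProfile, device_id: str, country: str) -> list:
    """List the ways this login deviates from the user's known behavior."""
    signals = []
    if device_id not in profile.known_devices:
        signals.append("unknown_device")
    if country not in profile.known_countries:
        signals.append("unknown_location")
    return signals


def requires_step_up(profile: UserProfile, device_id: str, country: str) -> bool:
    """Ask for extra verification whenever anything looks unfamiliar."""
    return bool(risk_signals(profile, device_id, country))


profile = UserProfile(known_devices={"laptop-01"}, known_countries={"SE"})
usual_login = requires_step_up(profile, "laptop-01", "SE")
odd_login = risk_signals(profile, "unknown-phone", "BR")
```

Stolen credentials still work from the attacker's machine, but the login no longer looks like the real user, which is the signal this check relies on.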
You could absolutely use two-factor authentication in this case as well. This type of authentication can stop many social engineering attempts in their tracks by requiring the attacker to have something they don’t have, typically an authentication app on someone’s phone. While this doesn’t make such an attack impossible per se, it raises the cost in time and effort enough that the attack is often no longer worth it.
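Authentication apps typically generate time-based one-time passwords (TOTP, specified in RFC 6238). The sketch below shows the core calculation with the standard library so you can see what the server checks; the function names are hypothetical, and a production setup would normally rely on a maintained library, tolerate small clock drift, and rate-limit attempts.

```python
import hashlib
import hmac
import struct
import time


def totp(secret: bytes, at_time=None, step: int = 30, digits: int = 6) -> str:
    """Compute a time-based one-time password (HMAC-SHA1 variant of RFC 6238)."""
    timestamp = time.time() if at_time is None else at_time
    counter = int(timestamp // step)  # number of time steps since the epoch
    digest = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F  # dynamic truncation, as in RFC 4226
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, at_time=None) -> bool:
    """Check a submitted code against the code expected for the current step."""
    expected = totp(secret, at_time=at_time)
    return hmac.compare_digest(expected.encode(), submitted.encode())


# Test vector from RFC 6238, Appendix B (SHA-1, 8 digits, T = 59 seconds).
rfc_code = totp(b"12345678901234567890", at_time=59, digits=8)  # "94287082"
```

Because the code changes every 30 seconds and depends on a secret stored on the user's device, a phished password alone is not enough to log in.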
Ultimately, API security is always going to be a game of cat and mouse. The solutions offered here are very much a starting point; a combination of these solutions will need to be put in place for any meaningful protection. Also keep in mind that this list is not exhaustive – there are as many ways to hack an API as there are hackers to utilize them. Accordingly, your best bet is to simply be aware and be cognizant of your design choices.
What do you think are the most common ways an API might be hacked? Do you think APIs face different types of threats than other online resources? Let us know below.