Token Design for a Better API Architecture

Little details like tokens can sometimes help structure complex API architectures. In this piece we’re going to look at different architectures, and ultimately see how a better way to design tokens can lead to a more performant result.

Consider the role of tokens within two facets of API design: access control and data stability. Every API design must manage access control and provide data stability. These two pillars can be broken down in the following way:

  • Access control
    • User identity (Authentication)
    • Permission (Authorization)
  • Data stability
    • Accurate documentation
    • Consistent response format (API versioning)
    • Service uptime and availability

The maxim here is:

Design access control in a way that facilitates data stability.

With this in mind we can now review three archetypal API architectures.

1: The Happy Accident

This is the most common design, which is in fact no design at all. Basically, if you have a website, you have an API! With just a simple cURL request and some scraping, you can get the data you need directly from the website.
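To make this concrete, here’s a toy scrape in Go (the language choice is ours, and the URL and markup pattern are invented for illustration):

```go
// Fetch a page and scrape a value out of its HTML: a "happy accident" API.
package main

import (
	"fmt"
	"io"
	"net/http"
	"regexp"
)

func main() {
	// Hypothetical page; any real target would have its own URL and markup.
	resp, err := http.Get("https://example.com/prices")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		panic(err)
	}

	// Fragile by design: any change to the markup silently breaks this "API".
	re := regexp.MustCompile(`<span class="price">([^<]+)</span>`)
	if m := re.FindSubmatch(body); m != nil {
		fmt.Println("price:", string(m[1]))
	}
}
```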

There are in-depth articles elsewhere on how to get started with web scraping that explain its traps and pitfalls in detail. Though this seems very simple and quick, we must be very careful with unhappy accidents. In fact there are two main problems with this “design”:

  1. Data format changes
  2. Working with private data (user & pass)

The first point is a problem of data stability: any change to the website’s implementation can suddenly break our API. The second point, though it can usually be worked around, makes the architecture much more complicated. When authentication is involved (usually username/email and password), simple web scraping won’t immediately work; we’ll have to deal with signup or login pages first (and hope there’s no captcha!).

2: The Front Desk

The front desk design represents a gateway model architecture. That is, all API requests must go through the gateway first, which then calls the actual service requested. The gateway usually manages authentication, rate limiting, and possibly other concerns as well (e.g. analytics, performance monitoring, etc.). This is sometimes dubbed gateway-as-a-service, and there are quite a few providers on the web. One interesting example is Tyk, which is developed in Go and is completely open source.

Like any architecture, using a gateway approach has both good and bad properties.

Pros:
– Easy to iterate on
– Easy to develop quickly
– Shortens go-to-market time

Cons:
– Single point of failure
– Potentially costly
– Rigid architectural constraints


Watch R. Kevin Nelson discuss this topic at a Nordic APIs event

3: Metropolis

A service-oriented architecture (SOA) is an architectural pattern in computer software design in which application components provide services to other components via a communications protocol, typically over a network. The principles of service-orientation are independent of any vendor, product or technology.

MSDN

This third API architectural type is service-oriented architecture, or SOA. With SOA, each service handles its own authentication, rate limiting, API credentials, and so on. We can think of this architecture as each service having its own front desk.

The idea behind Metropolis is that if each service can handle the gateway’s responsibilities, including its own authentication, then API requests can hit the services directly (both internally and externally).

Here’s where token design comes in handy to help the services handle their own authentication. But what exactly is authentication? When we talk about authentication we generally want to ask these two questions:

  1. Can this user perform this action?
  2. Can this app perform this action on this user’s behalf?

But there’s more to it than that. Check out our eBook dedicated to API Security to understand the subtleties behind authentication, authorization, delegation, and federation.


Token Design

Let’s start by reviewing what a token is. Here’s the definition from the Oxford Dictionary:

Token – A thing serving as a visible or tangible representation of a fact, quality, feeling, etc.

In the realm of API design, a token is a simple string that represents a user we have previously validated. Now, there are two ways we can go about actually using a token. We’ll start with the old way.

The Old Way:

This would be a typical way to handle tokens for an API, and if you’ve worked with any kind of API you’re probably familiar with this design:

  1. Generate a random session or token key
  2. Store the payload data in a datastore
  3. Use the session or token key to look up the payload

Although this design may sound familiar, there’s an obvious problem with it: we need to validate user credentials each time, which means sending our token and then performing a lookup for every request. This is basically going to the front desk for each request. It works, but it’s probably not the most efficient approach. How can we remove this bottleneck and reach a more performant design?
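As a sketch, the per-request work in this old design looks something like the following (Go is our choice here, and the Store interface is hypothetical):

```go
package session

import (
	"context"
	"errors"
)

// Payload is whatever session data we stored at login time.
type Payload struct {
	UserID string
	Name   string
}

// Store is a stand-in for the datastore (Redis, a SQL table, etc.).
type Store interface {
	Get(ctx context.Context, key string) (Payload, error)
}

// Authenticate resolves an opaque token key to its stored payload.
// Every single API request pays for this datastore round trip: the
// trip to the front desk described above.
func Authenticate(ctx context.Context, store Store, key string) (Payload, error) {
	p, err := store.Get(ctx, key)
	if err != nil {
		return Payload{}, errors.New("invalid or expired token")
	}
	return p, nil
}
```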

With this in mind we can look at how token design and cryptography can lead us to a better architecture.

The Better Way:

We can use encryption to create a token design that avoids database lookups entirely. Let’s start with the complete process and then we’ll review it in detail:

  1. Minimize payload data. The payload should only contain minimal information, like user id, timestamps, and possibly other identification codes.
  2. Serialize the payload.
  3. Sign the payload. There are different ways to go about this, but it’s typically done with HMAC.
  4. Encrypt the payload + signature combo
  5. Base64-encode the encrypted result
  6. The encoded result is your token!
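Here’s a minimal sketch of such a library in Go (the language and primitives are our choice, since the article doesn’t prescribe any), using JSON for serialization, HMAC-SHA256 for the signature, and AES-GCM for the encryption step. Key management and error handling are deliberately simplified:

```go
package token

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/base64"
	"encoding/json"
	"errors"
	"time"
)

// Payload carries only minimal identification data (step 1).
type Payload struct {
	UserID    string    `json:"uid"`
	SessionID string    `json:"sid"`
	IssuedAt  time.Time `json:"iat"`
	Expires   time.Time `json:"exp"`
}

// Generate serializes, signs, encrypts, and base64-encodes the payload.
func Generate(p Payload, macKey, encKey []byte) (string, error) {
	plain, err := json.Marshal(p) // step 2: serialize
	if err != nil {
		return "", err
	}

	mac := hmac.New(sha256.New, macKey) // step 3: sign
	mac.Write(plain)
	signed := append(plain, mac.Sum(nil)...)

	block, err := aes.NewCipher(encKey) // step 4: encrypt payload + signature
	if err != nil {
		return "", err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return "", err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return "", err
	}
	sealed := gcm.Seal(nonce, nonce, signed, nil)

	return base64.RawURLEncoding.EncodeToString(sealed), nil // steps 5-6
}

// Parse reverses the steps, verifying the HMAC before trusting the payload.
func Parse(tok string, macKey, encKey []byte) (Payload, error) {
	var p Payload

	sealed, err := base64.RawURLEncoding.DecodeString(tok)
	if err != nil {
		return p, err
	}
	block, err := aes.NewCipher(encKey)
	if err != nil {
		return p, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return p, err
	}
	if len(sealed) < gcm.NonceSize() {
		return p, errors.New("token too short")
	}
	signed, err := gcm.Open(nil, sealed[:gcm.NonceSize()], sealed[gcm.NonceSize():], nil)
	if err != nil {
		return p, err
	}
	if len(signed) < sha256.Size {
		return p, errors.New("missing signature")
	}
	plain, sig := signed[:len(signed)-sha256.Size], signed[len(signed)-sha256.Size:]
	mac := hmac.New(sha256.New, macKey)
	mac.Write(plain)
	if !hmac.Equal(sig, mac.Sum(nil)) {
		return p, errors.New("invalid signature")
	}
	if err := json.Unmarshal(plain, &p); err != nil {
		return p, err
	}
	if time.Now().After(p.Expires) {
		return p, errors.New("token expired")
	}
	return p, nil
}
```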

For all this we basically just need a library with a generate-token and a parse-token method, much like the sketch above. There are three obvious benefits to this design:

  • No database lookups
  • No storage requirements for most tokens
  • Constant-time token parsing

And there are only two requirements to be able to parse the tokens correctly:

  • Shared encryption and MAC keys between internal services
  • Properly implemented crypto libraries

Revocation & Logout

One small issue is that revocation and logout become problematic with this approach. Since we haven’t stored the token anywhere, how do we go about revoking its permission?

To answer that, let’s look at the data in a typical payload:

  • User identification
    • ID, name, avatarURL
  • Token metadata
    • issuedAt, expires, sessionID

A token like this has roughly the following characteristics:
– Around 220 bytes in size
– The payload data is encrypted
– Two secret keys are required to parse or generate the tokens. This protects against both spoofing and data leakage.
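As a quick usage sketch, generating a sample token with the Generate function from earlier lands in that size neighborhood (the keys here are zeroed for brevity; in practice they are random secrets shared between internal services):

```go
package token

import (
	"fmt"
	"time"
)

// DemoTokenSize prints a sample token and its length.
func DemoTokenSize() {
	macKey := make([]byte, 32) // shared MAC key (random in practice)
	encKey := make([]byte, 32) // shared encryption key; 32 bytes selects AES-256

	p := Payload{
		UserID:    "42",
		SessionID: "9f1b2c",
		IssuedAt:  time.Now(),
		Expires:   time.Now().Add(24 * time.Hour),
	}
	tok, err := Generate(p, macKey, encKey)
	if err != nil {
		panic(err)
	}
	fmt.Printf("token (%d bytes): %s\n", len(tok), tok)
}
```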

Now back to logging out. For most APIs, revocation is an edge case rather than a common task, so it matters little that with this design the revocation process is a bit slower than usual. In brief, we’ll use a back-channel (e.g. a message queue) to inform all of our services that a specific token has been revoked. Each service must then record, in memory or in a fast cache like Memcached or Redis, that the token has been revoked. It’s in fact much easier and faster to store only the revoked tokens rather than every single one: instead of checking whether a token is valid, we just check whether its payload has been de-authorized.

The process for token permission revocation and logout is therefore:

  1. Propagate the de-authorization to nodes through a back-channel
  2. Wait for propagation to finish before responding
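A minimal sketch of what each service might keep once the back-channel message arrives; the names here are illustrative, and a shared cache like Redis could replace the in-memory map:

```go
package revocation

import "sync"

// List remembers only de-authorized sessions, not every issued token.
type List struct {
	mu      sync.RWMutex
	revoked map[string]struct{} // keyed by the token's sessionID
}

func NewList() *List {
	return &List{revoked: make(map[string]struct{})}
}

// Revoke is called when the back-channel (e.g. a message queue)
// announces that a session has been de-authorized.
func (l *List) Revoke(sessionID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.revoked[sessionID] = struct{}{}
}

// IsRevoked is the only extra check a service performs after parsing
// a token locally.
func (l *List) IsRevoked(sessionID string) bool {
	l.mu.RLock()
	defer l.mu.RUnlock()
	_, ok := l.revoked[sessionID]
	return ok
}
```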

Conclusion

Carefully designing our approach to tokens can have significant effects on our overall API architecture. We started by seeing how any website can become an API through simple web scraping (the happy accident). We then reviewed how a front-desk architecture works like a gateway, and its pros and cons. Then we moved to a real Service-Oriented Architecture, and how a different way of designing tokens using some encryption techniques can lead to a new design completely free of database lookups.

About Giovanni Casinelli

Co-founder & CTO at BonAppetour. Formerly CEO at Asteroid. Passionate about new technologies and how they shape the future. Building and improving APIs at BonAppetour to allow everyone to eat with locals when traveling.

  • Hi Giovanni. My investigations are so far revealing that signature-based authorization is far less secure than old style of verification of session randoms. In short, when an authentication authority is breached, even just read access to the server will reveal the signing keys, which an attacker can subsequently use to forge unexpired tokens. Meanwhile, session randoms can be stored hashed on the server, leaving them relatively secure even following a read-only breach. Companies like to be able to say that even though they were breached, accounts were unlikely compromised due to having hashed passwords, but when servers issue signed JWTs, a breach compromises all accounts despite having hashed passwords.

    That’s the short version. More details on StackExchange: http://security.stackexchange.com/questions/125135/what-scenarios-really-benefit-from-signed-jwts

    • travisspencer

      If a hacker can get in far enough to access the private key used to sign tokens, there’s a good chance he can also get in deep enough to add random session IDs (i.e., opaque tokens) to the session store (esp. if the session store is on the same server machine). Also, if the random session IDs are hashed, that means there’s a symmetric key on the server that’s used to compute those hashes. Read-only access will reveal this secret (though write access will be required to store a new hash). If the hashed session IDs are in a back-end DB and the hacker has found the hash key, he’s probably also found the DB credentials. So, adding something to the DB will probably be doable. Not saying that one type of token is better than the other with these comments, just adding some additional considerations. This stuff is complicated, and the solution is usually a defense in depth. Using opaque tokens and signed, by-value tokens (e.g., JWTs) together with varying expiration times that take into account the network topography, client, user, scopes, etc. is usually the best approach.

      • Your response doesn’t make sense to me. Point-by-point:

        (1) Most accounts on most servers can read fewer files than they can write to. Therefore, attackers will more commonly be able to read but not write. Read-only access is an important case to guard against because it’s necessarily more common. We can’t dismiss the most likely case because of the existence of less likely cases.

        (2) Hash algorithms typically do not use keys. Most use MD5 or SHA-1. These forever map the same set of inputs to the same set of outputs. Their value is that they are not reversible. Nothing that an attacker finds on the server helps with reversing hashes.

        (3) If any tier relies on the JWT-reported user ID or expiration, the tier is trusting that the signing key was not read; it is not providing an “additional” layer of security. Were the tier to both verify the JWT signature and look up the token, it seems that the server would be wasting clock cycles, because the token look up could provide all of the information, and there would have been no need to verify the signature.

        • travisspencer

          I think that both random opaque tokens (what you mean by “session random” IINM) and signed, by-value tokens that contain all identity data (e.g., JWTs) should be used in combination in a way that manages risk and inter-dependencies. My reasoning for using opaque tokens though isn’t due to the “weakness” of digital signing. If done right, signing is secure. I would suggest using opaque, random tokens when you send them out of the compute environments that are under your control and translate them to signed, by-value tokens as the opaque tokens flow into your network by doing a lookup in the token issuer’s DB (as we discussed in the other blog post on this topic). If you are concerned about the signing key being breached and exposing the APIs that use signed JWTs to risk, set up an endpoint where the APIs can query to see if the signing key has been invalidated. This will work effectively like a CRL does in SSL. The APIs can cache the results for the amount of time you’re comfortable with. If you use opaque tokens on the Internet, JWTs within your network, and periodically check if the signing key is still valid, I think you’ll be in very good shape.

      • Wait! I concede that JWT can provide an additional layer of security! Thank you!

        Session randoms can be guessed. If an attacker manages to guess a session random, they get access. If the data in the JWT has to match the data in the DB, then it’s harder for an attacker to properly guess the random. However, taking advantage of this additional layer of security requires a database lookup, nullifying the supposed benefit of JWTs for authorization without DB lookups.

        That’s the first concrete benefit of JWTs that I’ve seen, but authorization without lookups still looks like a naive, insecure use of JWTs.

        BUT… You can accomplish the equivalent by lengthening the session random, so I guess I still don’t see the improvement that JWTs offer.