How to Handle Batch Processing with OAuth 2.0

Recently on the Nordic APIs channel, a few people have asked: how do you handle batch processes that are secured with OAuth 2.0? Batch requests are ones that are executed automatically or scheduled to recur.

Usually we use OAuth to confirm user identity for API calls, but the problem is that OAuth 2.0 isn’t really designed for batch processing. OAuth is typically used more synchronously — as in, you send a request and you get an immediate response. Dealing with OAuth, you usually have the communication channel open.

Batch processes, on the other hand, are often asynchronous, meaning that you may not have an open channel as operations occur over a longer time scale. Though many API providers are moving toward flashy RESTful microservices, asynchronous behavior in mobile APIs and enterprise service architectures still exists, and it needs to be handled in a secure and scalable way that considers the identity behind the request at each step.

A common problem is that when we create requests that trigger a process at a later time, these triggers are often done as batch jobs. For example, if at night a bank’s system triggers all transfers that are waiting in the queue, this is a batch job. If you consider the entire operation, the original request to make the transfer is more of an asynchronous request.

Say you have such an asynchronous environment, where an API triggers another service to perform an action, as in a message queue. When the action is finally triggered, the identity of the final call is often not coupled with the initial call in any way. Using a typical OAuth flow here is awkward, since OAuth access tokens are intended to be very short lived, meaning they shouldn’t sit in the queue for an indefinite batch job. Many organizations just don’t realize how prone a queue or distribution mechanism is to attack, often by internal actors. To address this vulnerability, this article offers a potential solution for carrying OAuth through the entire batch request process to strengthen verification and identity validation.

Example Case: Recurring Bank Deposit

Say you need to trigger a recurring deposit of $50 into a savings account each month for the next year. First, let’s consider how this request is typically handled using a normal OAuth flow in an environment with HTTP RESTful APIs and a microservices style architecture.

Normal OAuth Workflow (Simplified)

  1. The user first initiates a request on the Client website.
  2. The Client will then send an Access Token to an API Gateway.
  3. The API Gateway will then pass the Access Token to the Authorization Server (the OAuth server).
  4. The Authorization Server verifies that the Access Token is correct, and sends a JWT Access Token to the API Gateway.
  5. The API Gateway then passes the JWT Access Token to a microservice which will need to act as a trigger each month for a continuous job. We’ll call this Trigger Savings API.
  6. The job is then inserted into a Queue, which triggers a second microservice that moves money into the account. We’ll call this the Do Savings API.
  7. The Queue is reset, and Step 6 is repeated when the correct time has elapsed.

Essentially, in this model we use two microservices: one to trigger the payment, and one to move the money. What this means is that you must put the job in a queue, so that later, when the right time has elapsed, it will trigger a Do Savings API to transfer money to an account. Then the queue is updated.

Seems fine and dandy? Well it’s not.

The Problem with Batch Requests

One issue with this batch process is that it becomes difficult to verify the identity throughout each step, from the first request to the request initiating the Do Savings API. When you let a batch process rot in the queue, you lose track of identity control altogether, tremendously increasing the risk of attack.

Closely related are the requirements placed on the access token. In this case our access token is likely a bearer token. As the OAuth 2.0 bearer token specification recommends, a bearer token is meant to be short lived; something like 5 to 15 minutes is standard practice.

As we’ve written before, you need to treat bearer tokens like cash: anyone who steals the token can spend it. By sending this token to an API, you implicitly trust that no one else got hold of the token along the way. Since the call is meant to behave synchronously, you trust that it’s been kept safe until it expires, since the client/server communication is secured. The longer the lifetime of the token, however, the higher the chance of it being stolen.

OAuth bearer tokens are the preferred token for maintaining client simplicity. However, we can’t stick a bearer token in the queue, as its lifespan would need to be far too long. With tokens stored alongside asynchronous jobs, a hacker could potentially reshuffle the queue and pair the jobs with different tokens. Or, even more likely, a bug in internal processing could mix up the tokens and the messages they belong to purely by accident. As this infrastructure is fairly common throughout enterprise contexts, using tokens in this way could put a company in severe jeopardy.
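To make the lifetime mismatch concrete, here is a minimal sketch. The `BearerToken` class, the ten-minute lifetime, and the thirty-day delay are illustrative assumptions, not part of any OAuth library. It simply shows that a token valid when the request is made has long expired by the time the queued job first runs.

```python
from dataclasses import dataclass

TEN_MINUTES = 10 * 60            # assumed bearer token lifetime, in seconds
THIRTY_DAYS = 30 * 24 * 60 * 60  # assumed delay before the first queued run


@dataclass(frozen=True)
class BearerToken:
    """Hypothetical, simplified token: only the fields needed for expiry."""
    subject: str
    issued_at: int   # seconds since some reference time
    lifetime_s: int  # validity window in seconds

    def is_valid(self, now: int) -> bool:
        # Valid from issuance until the lifetime has elapsed.
        return self.issued_at <= now < self.issued_at + self.lifetime_s


token = BearerToken(subject="alice", issued_at=0, lifetime_s=TEN_MINUTES)

# One minute after issuance the synchronous call succeeds...
valid_at_request = token.is_valid(now=60)
# ...but the same token is useless when the queue fires a month later.
valid_at_first_run = token.is_valid(now=THIRTY_DAYS)
```

Stretching `lifetime_s` to cover the whole year would make the queued job work, but it would also leave a fully capable bearer token sitting in the queue for that entire time.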

So, how do we create a specialized token that is relevant and continuously tied to batch operations? The solution lies in making the bearer token more bound to the request. Since OAuth 2.0 is all about pushing complexity to the server rather than onto the client, we will likely have to do some maneuvering on the server side… Prepare for an OAuth flow peppered with a little batch processing magic.

OAuth Server + Scopes + Meta Scope to the Rescue

App developers, web developers… nobody really wants to touch security, and many shy away from confronting it. But our solution may not be that complex; it lies in adding another verification layer with the Authorization Server.

When a server receives a JWT Access Token, it may contain permissions of a certain kind, which are called scopes. These scopes are actually very powerful, and can be used to trigger functionalities, delineate certain account access tiers, and more.
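As a rough sketch of what checking a scope looks like, the snippet below reads the payload of a compact JWT and looks for a required scope. It assumes the token carries a space-delimited `scope` claim, and the function names are made up for illustration. It deliberately skips signature verification, which a real service must do before trusting any claim.

```python
import base64
import json


def b64url_encode(data: bytes) -> str:
    """Base64url without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def scopes_of(jwt: str) -> set:
    """Return the scopes in the token payload. Does NOT verify the signature."""
    _header_b64, payload_b64, _signature_b64 = jwt.split(".")
    claims = json.loads(b64url_decode(payload_b64))
    return set(claims.get("scope", "").split())


def has_scope(jwt: str, required: str) -> bool:
    return required in scopes_of(jwt)


# Build an unsigned example token purely for demonstration (assumption:
# real tokens come from the Authorization Server and are signed).
claims = {"sub": "alice", "scope": "openid trigger_continuous_savings", "exp": 900}
example_jwt = ".".join([
    b64url_encode(json.dumps({"alg": "none"}).encode()),
    b64url_encode(json.dumps(claims).encode()),
    "",  # empty signature segment
])

may_trigger = has_scope(example_jwt, "trigger_continuous_savings")
```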

In our savings example, we could define a scope for triggering continuous savings, which permits starting the buffer. This would say that you are allowed to put something in the buffer; it could read your account status, or something similar, to begin processing a payment.

This particular scope would behave like a meta scope. We want to transform our bearer token into something less dangerous — something that can be used to trigger more transactions, so that once it enters the queue, it can only perform one action and cannot be altered in any way.

In order to process this, we need something like a Trigger Savings API, which would check to see if the token has a trigger payment scope. If so, it would prepare a new token to send to the queue. To do so, it will call the OAuth server with the incoming access token, as well as with the content that details the request.

The goal is to downscope the permissions of the access token into a single action that can rest safely in the queue. This does require a little more work on the part of the server. The OAuth server takes the savings operation data and signs it with its own key, creating a signature out of that content. It then responds with an access token that is more Long Lived (LL), which we’ll call JWT Access Token LL.

Within the JWT Access Token LL is the signature of the specific operation. Now, we can send this request with the JWT Access Token LL, and store it in the queue. In doing so, we can specify the Time to Live (TTL) of the operation to be long lived. Essentially, we use the short-lived access token to sign a message; the JWT Access Token LL contains metadata in addition to a signature so that it can be associated with the queued job.
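The exchange described above could be sketched roughly as follows. This is not a definitive implementation: it uses HMAC-SHA256 from the Python standard library as a stand-in for the OAuth server’s real signing key, and the `op_sig` claim, the `issue_ll_token` function, and the shape of the operation data are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json

SERVER_KEY = b"demo-oauth-server-key"  # hypothetical stand-in for the server key
TRIGGER_SCOPE = "trigger_continuous_savings"
ONE_YEAR = 365 * 24 * 60 * 60


def b64url(data: bytes) -> str:
    """Base64url without padding, as used in compact JWTs."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def canonical(obj: dict) -> bytes:
    """Serialize deterministically so the same content always signs the same."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def sign(data: bytes, key: bytes) -> str:
    return b64url(hmac.new(key, data, hashlib.sha256).digest())


def issue_ll_token(access_claims: dict, operation: dict, now: int,
                   ttl: int = ONE_YEAR, key: bytes = SERVER_KEY) -> str:
    """Exchange a valid short-lived token for a downscoped, long-lived one."""
    if now >= access_claims["exp"]:
        raise PermissionError("short-lived access token has expired")
    if TRIGGER_SCOPE not in access_claims["scope"].split():
        raise PermissionError("missing scope: " + TRIGGER_SCOPE)
    claims = {
        "sub": access_claims["sub"],
        "scope": "save_money",  # the single action the LL token permits
        "iat": now,
        "exp": now + ttl,
        "op_sig": sign(canonical(operation), key),  # binds token to this job
    }
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = b64url(canonical(header)) + "." + b64url(canonical(claims))
    return signing_input + "." + sign(signing_input.encode(), key)


# Claims of the already-verified short-lived token, and the job to enqueue.
access_claims = {"sub": "alice", "scope": "openid trigger_continuous_savings", "exp": 900}
operation = {"action": "save_money", "amount": 50, "currency": "USD", "interval": "monthly"}
ll_token = issue_ll_token(access_claims, operation, now=100)
```

The resulting token is what would be stored in the queue next to the job. It no longer carries the broad trigger scope, only the single action and a signature over the job’s content.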

Updated Workflow

There are certainly other ways to accomplish this, but here is a step by step workflow using the process described above:

  1. The Client (the website or mobile app) sends an Access Token to the API Gateway. This token carries something like a ‘trigger_continuous_savings’ scope to delineate the type of transaction taking place.
  2. The API Gateway sends the Access Token to the OAuth Server for verification.
  3. The OAuth Server verifies, and sends a signed JWT Access Token along with the Content Request to the API Gateway.
  4. The API Gateway sends the JWT Access Token to a Trigger Savings API.
  5. The Trigger Savings API sends the JWT Access Token and Content Request to the OAuth Server for verification.
  6. The OAuth Server sends a new JWT Access Token LL to the Trigger Savings API that can be long lived for the total time it should be initiating savings. It contains a signature, TTL of 1 year, and a scope to permit the $50 savings. It also sends a Content Request to ensure the token is issued for this particular job.
  7. The Trigger Savings API passes the JWT Access Token LL to the Queue.
  8. At the proper time, the Queue then triggers the Do Savings API, passing the JWT Access Token LL, which the Do Savings API verifies using the OAuth Server’s public key.
  9. The Do Savings API then sends money to the user’s account.


This flow does place some additional complexity on the server to accept additional parameters. Also, the above flow assumes we use dynamic scopes. We would include an action scope like save_money, as well as a scope that determines the amount in the creation of our JWT. In this specific example, we pass a TTL of one year, with the $50 recurring amount to save.
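One way to picture dynamic scopes is as a scope string that carries both the action and its parameter. The `amount:50` format, the `SavingsGrant` type, and the function names below are hypothetical; the article does not prescribe an encoding, so treat this as one possible sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SavingsGrant:
    """What the token allows: one action with one fixed amount."""
    action: str
    amount: int


def parse_dynamic_scopes(scope: str) -> SavingsGrant:
    """Parse a scope string such as 'save_money amount:50' (assumed format)."""
    items = scope.split()
    amounts = [item.split(":", 1)[1] for item in items if item.startswith("amount:")]
    if "save_money" not in items or len(amounts) != 1:
        raise ValueError("expected 'save_money' and exactly one 'amount:<n>' scope")
    return SavingsGrant(action="save_money", amount=int(amounts[0]))


def permits(grant: SavingsGrant, requested_amount: int) -> bool:
    # The queued job may only move exactly the amount that was granted.
    return requested_amount == grant.amount


grant = parse_dynamic_scopes("save_money amount:50")
allowed = permits(grant, 50)
rejected = permits(grant, 5000)
```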

Since the end token can only be used to send a request to the Do Savings API, it becomes intimately tied to that request. You wouldn’t be able to edit the amount in the token, because it carries a signature that only the Authorization Server can produce. Since each API is given the Authorization Server’s public key, they share trust with the OAuth server. In other words, there is an implicit trust between the Trigger Savings API and the Do Savings API.

So, when the Do Savings API gets that request, it can ask the Authorization Server: “is this valid?” You can ask it directly, or have the Authorization Server distribute its public key, so that you can use that key to verify the signature and validate the current data. Thus, you can tie things together in this way.
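The check on the Do Savings side might look something like the sketch below. To stay within the Python standard library it swaps the public key verification described above for a shared HMAC-SHA256 key, which is a simplification: with a real asymmetric key pair, only the Authorization Server could sign. The `op_sig` claim and every function name here are assumptions.

```python
import base64
import hashlib
import hmac
import json

SERVER_KEY = b"demo-oauth-server-key"  # hypothetical; stands in for the key pair


def b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def canonical(obj: dict) -> bytes:
    """Deterministic JSON so identical content yields identical signatures."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode()


def sign(data: bytes, key: bytes) -> str:
    return b64url(hmac.new(key, data, hashlib.sha256).digest())


def make_ll_token(claims: dict, key: bytes) -> str:
    """Fixture: builds what the Authorization Server would have issued."""
    header = b64url(canonical({"alg": "HS256", "typ": "JWT"}))
    signing_input = header + "." + b64url(canonical(claims))
    return signing_input + "." + sign(signing_input.encode(), key)


def verify_queued_job(token: str, operation: dict, now: int,
                      key: bytes = SERVER_KEY) -> bool:
    """True only if the token is authentic, unexpired, scoped, and bound to this job."""
    try:
        header_b64, payload_b64, signature = token.split(".")
        claims = json.loads(b64url_decode(payload_b64))
    except ValueError:
        return False  # malformed token
    if not isinstance(claims, dict):
        return False
    expected = sign((header_b64 + "." + payload_b64).encode(), key)
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False  # token was altered or not issued by the server
    if now >= claims.get("exp", 0):
        return False  # TTL has run out
    if "save_money" not in claims.get("scope", "").split():
        return False  # token does not permit this action
    # Finally, confirm the token was issued for exactly this job's content.
    op_sig = str(claims.get("op_sig", ""))
    return hmac.compare_digest(op_sig.encode(), sign(canonical(operation), key).encode())


job = {"action": "save_money", "amount": 50, "currency": "USD", "interval": "monthly"}
other_job = {**job, "amount": 75}
token = make_ll_token({"sub": "alice", "scope": "save_money", "exp": 1000,
                       "op_sig": sign(canonical(job), SERVER_KEY)}, SERVER_KEY)
other_token = make_ll_token({"sub": "bob", "scope": "save_money", "exp": 1000,
                             "op_sig": sign(canonical(other_job), SERVER_KEY)}, SERVER_KEY)

ok = verify_queued_job(token, job, now=500)
tampered = verify_queued_job(token, {**job, "amount": 5000}, now=500)
swapped = verify_queued_job(other_token, job, now=500)
expired = verify_queued_job(token, job, now=2000)
```

In this sketch, editing the amount in the queue, pairing a job with someone else’s token, or replaying a token after its TTL all cause the check to fail, which is the coupling between job and token that a plain bearer token in the queue lacks.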

Read How to Control User Identity Within Microservices for an introduction on using JWT scopes

Security is Amplified

Sure, you could return a longer lived bearer token with a lesser scope, but the problem is that when data is in the queue, there’s no coupling other than that they’re in the same slot. With our updated flow, what has changed is that the specific enqueued job is now tied to a specific token that has been created from a short-lived access token that had the right to create jobs.

Queues are typically not treated like an OAuth Server. Authorization Servers are typically a highly secured component of your network, while many people may have access to a queue. Some may not even realize that a queue could be a target of attack, but it certainly is.

Without this added verification layer, you would be giving a hacker ample time to enter and rearrange the queue. This could even be a disgruntled employee with direct access. From an auditing perspective, they could easily replace tokens or hide certain operations.

Creating a long-lived access token that is bound to a single operation is more in tune with the OAuth 2.0 specification and couples your data more tightly to avoid identity vulnerabilities. OAuth 2.0 doesn’t specifically use bearer tokens in this way, but by putting the signature of the request into the JWT Access Token LL, we transform it into a bound token, allowing us to still use OAuth 2.0.

Potential Downsides

Every time a transaction needs to occur, there is slightly more weight put on the Authorization Server, but not a whole lot. At the very least tokens are verified. The extra operation is the signature check over the content of the operation it is supposed to perform, which is stored within the access token.

Weigh this added server operation against the alternative: having the client sign the data in the beginning and having key management extend all the way out to the client. This would entail far heavier clients, especially for ones located outside of the organization.

Lastly, there are a lot of details here on who can ask for what. We must recognize that we will need to build the security infrastructure that carries out these workflows. One final risk to consider: you’ll need to trust a middleman, i.e. the Authorization Server, to handle the asymmetric cryptography properly.

Conclusion

This sort of operation should be considered for anything with a service queue or distribution bus. Especially now that subscription services with continuous payment charges are in vogue, learning how to verify identity and permissions throughout each step is vital.

Do you have experience or issues handling API identity management in the context of batch operations? We’d love to read your comments below!