General Techniques for Proper API Caching

Caching is an excellent way to serve data where it is needed, at a level of efficiency that suits both the client and the server. That said, caching is often treated as a magic bullet that automatically delivers greater efficiency and cuts overall data costs for clients and servers alike. This is misleading; only the proper application of caching gets you these gains.

Today, we’re going to look at some general best practices for caching. Keep in mind that a best practice is entirely subjective – what may be the best practice in a specific implementation may not be appropriate in another. That being said, following these general guidelines will get you most of the way to an accurate, effective, and useful caching flow.

Estimate Costs Accurately

One of the first and most important steps any developer should take when deploying a caching system is to estimate what the cache will cost. Cost in this sense is not monetary; it refers to the load on the server and the overall impact on client and server resources collectively.

Every cache requires local storage of some sort. Content needs to be loaded into some sort of memory, and that memory is then unavailable to other processes, needs, and functions. Even a small amount of data chosen for caching can consume rather large blocks of memory once it is replicated across the entire ecosystem of microservices, APIs, and providers.

This is less of an issue for content that rarely changes state, but the cost is magnified for data that changes often. With every state change pushed to the cache and accuracy becoming a “moving target”, caching can be highly expensive. Accordingly, estimate what actually needs to be cached and evaluate the impact on resources at its most logical extreme.
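As a rough illustration of that estimate, the sketch below multiplies entry count, average entry size, and the number of services holding a copy. The function name, the overhead factor, and the example numbers are all assumptions chosen for illustration, not measurements from any real system.

```python
def estimate_cache_bytes(entries: int, avg_entry_bytes: int, replicas: int,
                         overhead_factor: float = 1.5) -> int:
    """Rough worst-case footprint of one cached dataset across the ecosystem.

    overhead_factor is an assumed allowance for keys, metadata, and
    allocator overhead; measure your own cache to replace it.
    """
    return int(entries * avg_entry_bytes * replicas * overhead_factor)


# Hypothetical numbers: 50,000 entries of about 2 KiB, cached by 12 services.
footprint = estimate_cache_bytes(entries=50_000, avg_entry_bytes=2_048, replicas=12)
print(f"{footprint / 1024 ** 2:.0f} MiB")
```

Even with modest per-entry sizes, the replica count dominates the total, which is why the estimate should be run at the largest plausible scale rather than the current one.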

Allocate the Cache Appropriately

We’ve discussed this at length previously, but it bears repeating. Once the cached data has been identified, you must consider where that data should live.

There are functionally two different types of cache – client caching, where data is stored locally, and server caching, where data is stored on the server. There is a hybrid caching solution, but this adds complexity and overhead to the caching system as a whole. Where you place the actual cache in practice matters almost as much as what you cache.

Client caching, where data is stored locally for the client, serves to limit the data cost incurred by the user. Everything the user could need that is routinely referenced and is appropriate for the cache can be kept locally, reducing the need for additional calls. While this obviously boosts efficiency for the user, there is also a boost for the server, as the content is requested less often.

Server caching also reduces the cost incurred but is primarily focused on the server cost. By storing calls in a cache, the API itself is not impacted by repeated calls – no processing has to be done, no manipulation, no transformation; it’s all ready in one form for general purpose use. Unlike client caching, server caching reduces the cost for the server but not for the client – wait time may be reduced as no calculation really needs to be done, but a call still has to be made.
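The difference between the two placements can be sketched with a pair of counters. The `Server` and `Client` classes below are hypothetical stand-ins, with plain dictionaries playing the role of the caches: a server cache avoids repeated processing but still receives every call, while a client cache avoids the call altogether.

```python
from typing import Dict


class Server:
    """Hypothetical API server with an in-memory response cache."""

    def __init__(self) -> None:
        self.requests_received = 0
        self.computations = 0
        self._cache: Dict[str, str] = {}

    def _build_response(self, path: str) -> str:
        self.computations += 1  # stands in for expensive processing
        return f"response for {path}"

    def handle(self, path: str) -> str:
        self.requests_received += 1  # the call still reaches the server
        if path not in self._cache:
            self._cache[path] = self._build_response(path)
        return self._cache[path]


class Client:
    """Hypothetical client that may or may not keep a local cache."""

    def __init__(self, server: Server, use_local_cache: bool) -> None:
        self._server = server
        self._use_local_cache = use_local_cache
        self._cache: Dict[str, str] = {}

    def get(self, path: str) -> str:
        if self._use_local_cache and path in self._cache:
            return self._cache[path]  # served locally, no call is made
        body = self._server.handle(path)
        if self._use_local_cache:
            self._cache[path] = body
        return body


server = Server()

thin_client = Client(server, use_local_cache=False)
for _ in range(3):
    thin_client.get("/catalog")  # three calls, one computation

caching_client = Client(server, use_local_cache=True)
for _ in range(3):
    caching_client.get("/catalog")  # one call, served from the server cache

print(server.requests_received, server.computations)
```

Neither class handles expiry or invalidation; the sketch only shows where the savings land for each placement.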

Hybrid caching offers the best of both worlds, but it also reduces the cost savings for both parties. There is also the fact that a client may not have much space for a local cache, especially when using thin clients. In such cases, where the cached data is large, server caching should take precedence.

Ensure Security and Accuracy

When caching, one of the most important things to consider is the privacy and security of both the codebase and its users. The simple fact is that not everything that can be cached should be cached, and improper caching can result in some major issues. Something as minor as choosing to cache admin credentials on the client side can result in that client being compromised and, through exposure, those credentials being used to attack the server.

That seems a simplistic case, but there are even cases of applications pushing API tokens and such into a cache, and then dumping that information into error reports. Sharing that error report could then expose a lot about the service and decrease the overall safety and security of the system.
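One way to guard against this kind of leak is to strip sensitive fields before anything is written to a cache or copied into an error report. The sketch below is a minimal version of that idea; the key names in `SENSITIVE_KEYS` and the report shape are assumptions, and a real application would need its own list.

```python
from typing import Any, Dict

# Hypothetical field names; extend this to match your own payloads.
SENSITIVE_KEYS = {"password", "api_token", "authorization"}


def redact(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of payload with sensitive values masked, recursing into nested dicts."""
    cleaned: Dict[str, Any] = {}
    for key, value in payload.items():
        if key.lower() in SENSITIVE_KEYS:
            cleaned[key] = "[REDACTED]"
        elif isinstance(value, dict):
            cleaned[key] = redact(value)
        else:
            cleaned[key] = value
    return cleaned


def build_error_report(error: str, cache_snapshot: Dict[str, Any]) -> Dict[str, Any]:
    """Attach only the redacted view of the cache to an error report."""
    return {"error": error, "cache": redact(cache_snapshot)}


snapshot = {
    "user": "ada",
    "api_token": "abc123",
    "session": {"Authorization": "Bearer xyz", "locale": "en"},
}
report = build_error_report("upstream timeout", snapshot)
print(report)
```

Redaction at report time is a second line of defense; the safer choice is still to keep such values out of the cache in the first place.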

There’s also the fact that security and accuracy can be tied hand in hand when there’s a bad caching choice. A source of truth is paramount to ensuring both the consistent, secure operation of an application and the effectiveness of your implementation.

For instance, consider a codebase in which a directory is used to reference customer data. If the cached information about the whereabouts of that data goes out of date or is for some reason incorrect, not only are you compromising the security of the application, you’re also breaking clients that may rely on that data. Such a breaking experience may not always be obvious, either – the client application may not render the problem as a failure and may simply suggest that the data does not exist, leaving the end user to believe that there are no customers or that there is no order data.

This is such a monumentally important aspect of caching that Roy Fielding, the father of the RESTful paradigm, has spoken to both the importance of caching in REST (it’s required) and the importance of ensuring the accuracy of that cache:

“The advantage of adding cache constraints is that they have the potential to partially or completely eliminate some interactions, improving efficiency, scalability, and user-perceived performance by reducing the average latency of a series of interactions. The trade-off, however, is that a cache can decrease reliability if stale data within the cache differs significantly from the data that would have been obtained had the request been sent directly to the server.”
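One common way to bound the staleness Fielding describes is to give every entry a maximum age and go back to the source of truth once that age is exceeded. The sketch below assumes a hypothetical `FreshnessCache` class and uses an injected clock so the behavior can be followed step by step; the 30-second limit is an arbitrary example.

```python
import time
from typing import Callable, Dict, Tuple


class FreshnessCache:
    """Serves cached values only while they are younger than max_age_seconds."""

    def __init__(self, load: Callable[[str], str], max_age_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._load = load
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, key: str) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and now - entry[0] <= self._max_age:
            return entry[1]  # still considered fresh
        value = self._load(key)  # go back to the source of truth
        self._entries[key] = (now, value)
        return value


# Hypothetical directory mapping a dataset to its current location.
directory = {"customers": "db-east"}


def load_from_directory(key: str) -> str:
    return directory[key]


now = [0.0]  # fake clock so the example is deterministic
cache = FreshnessCache(load_from_directory, max_age_seconds=30, clock=lambda: now[0])

first = cache.get("customers")       # loaded from the directory
directory["customers"] = "db-west"   # the source of truth changes
now[0] = 10.0
stale = cache.get("customers")       # within the age limit, old value is served
now[0] = 45.0
fresh = cache.get("customers")       # past the age limit, value is reloaded
print(first, stale, fresh)
```

The middle read shows the trade-off directly: a shorter maximum age narrows the window in which clients see outdated data, at the price of more trips to the source.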

Ensure the Cost is Worth It

Everything in a cached system has a cost. Beyond the actual location of the cache, we have to ask whether or not the cache serves the purpose it is designed for. It’s often quite easy to get into a mindset that the client or the server wants something that it doesn’t want – coding in a cache in those cases may do nothing more than overcomplicate the underlying systems and add complexity to interactions with the codebase.

For instance, it may seem sensible to cache customer order status on the server level. After all, how often does that information change? If we follow the logic, however, we’d see that caching that sort of data is not helpful – order status is not only dynamic in content, it’s dynamic in amount. It may be true that order status doesn’t change quickly on average, but what about during seasonal cycles where ten or fifteen orders at a time may be active? What happens if an order is delayed – do you update the status, or do you leave the status as open? How often is the user checking the order status versus simply waiting for a pushed update in an application?

Ultimately, those aspects need to be considered, and the question needs to be asked – is caching this information, wherever we cache it, worth the cost? If it’s not worth the cost, don’t cache it.
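A simple back-of-the-envelope model can make that question concrete. Under the assumption that every request pays a cache lookup and every miss additionally pays the full origin cost, the cache only pays off when the lookup cost is lower than the hit ratio multiplied by the origin cost. The function names and the millisecond figures below are illustrative assumptions.

```python
def expected_cost_with_cache(hit_ratio: float, lookup_cost_ms: float,
                             origin_cost_ms: float) -> float:
    """Average cost per request when every request checks the cache first."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return lookup_cost_ms + (1.0 - hit_ratio) * origin_cost_ms


def is_cache_worth_it(hit_ratio: float, lookup_cost_ms: float,
                      origin_cost_ms: float) -> bool:
    """True when the cached path is cheaper on average than always hitting the origin."""
    return expected_cost_with_cache(hit_ratio, lookup_cost_ms, origin_cost_ms) < origin_cost_ms


# Hypothetical figures: rarely changing data with a high hit ratio.
print(is_cache_worth_it(hit_ratio=0.90, lookup_cost_ms=2.0, origin_cost_ms=40.0))

# Hypothetical figures: volatile order status that is invalidated almost constantly.
print(is_cache_worth_it(hit_ratio=0.03, lookup_cost_ms=2.0, origin_cost_ms=40.0))
```

The model ignores memory use and invalidation traffic, so it is best read as a lower bar: data that fails even this test is unlikely to be worth caching.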

It’s not always obvious, either – some cases aren’t so cut and dried. What about a user ID? At first blush, it would make sense for a server to cache the user ID for connected clients. In RESTful design, however, there’s no state to be had – caching the user ID during the session and using it to establish a relationship during that client connection is akin to creating state, which flies in the face of RESTful design.

This is especially true if the client is already sending the user ID with every request – if the data is coming in, and the server is checking that data against the cache and tossing it out if unchanged, then what’s the point in storing it anyways? Isn’t that basically the same flow as not having the data cached, but with the extra overhead that cache intrinsically adds?

Microsoft has offered some knowledge on this topic within their Azure documentation. While this is largely the opinion of Microsoft, it does represent a decent practice for most applications – not every implementation is going to have the same requirements, but the lion’s share of applications will. Microsoft suggests utilizing both a cache and a system for persistent data storage:

“Consider caching data that is read frequently but modified infrequently (for example, data that has a higher proportion of read operations than write operations). However, we don’t recommend that you use the cache as the authoritative store of critical information. Instead, ensure that all changes that your application cannot afford to lose are always saved to a persistent data store. This means that if the cache is unavailable, your application can still continue to operate by using the data store, and you won’t lose important information.”
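The pattern Microsoft describes can be sketched as a repository that always writes to the persistent store first and treats the cache as optional. Everything below is a hypothetical stand-in: `FlakyCache` simulates a cache that can go offline, and a plain dictionary plays the role of the persistent data store.

```python
from typing import Dict, Optional


class CacheUnavailable(Exception):
    """Raised by the simulated cache when it is offline."""


class FlakyCache:
    """Hypothetical cache whose availability can be toggled for the example."""

    def __init__(self) -> None:
        self.available = True
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        if not self.available:
            raise CacheUnavailable()
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        if not self.available:
            raise CacheUnavailable()
        self._data[key] = value


class Repository:
    """Treats the store as authoritative and the cache as an optimization."""

    def __init__(self, store: Dict[str, str], cache: FlakyCache) -> None:
        self._store = store
        self._cache = cache

    def save(self, key: str, value: str) -> None:
        self._store[key] = value  # persist first so nothing is lost
        try:
            self._cache.set(key, value)
        except CacheUnavailable:
            pass  # the write is already safe in the store

    def load(self, key: str) -> Optional[str]:
        try:
            cached = self._cache.get(key)
        except CacheUnavailable:
            return self._store.get(key)  # degrade to the store
        if cached is not None:
            return cached
        value = self._store.get(key)
        if value is not None:
            try:
                self._cache.set(key, value)
            except CacheUnavailable:
                pass
        return value


store: Dict[str, str] = {}
cache = FlakyCache()
repo = Repository(store, cache)

repo.save("order:1", "shipped")
cache.available = False
print(repo.load("order:1"))  # still answered, from the store
```

Note that a write made while the cache is down leaves any older cached copy in place, so a real system would also need an expiry or invalidation policy once the cache returns.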

Establish Rules Based Upon Complexity and Usefulness

Much of the issue around caching comes down to dealing with types of content. There’s a sliding scale between “this is cacheable” and “this is not cacheable” that is affected by both who owns the cache and what the purpose of that cache is. For example, something like site CSS is absolutely cacheable, and should be cached. In fact, much of the site experience can be cached with a hybrid approach – locally cached files can store basic information about the user that can then be used in requests, and server cache can store CSS, HTML, etc. for service.

Where we get into more complex caching is when we start considering content that is dynamic, but only over a long period of time. The status of the API is something that is ideally always one value. While there are instances in which the API might be down, or being updated, those are far and away the most uncommon situations compared to the API simply being up, and as such, the state can easily be cached.

Then there are things that shouldn’t be cached for any reason. Things like passwords should not be cached on the server. Server mapping (that is, mapped locations of data beyond public endpoints) should never exist on the client. Basic, common-sense assumptions for cacheability should consistently be tested, validated, or rejected.

Once this happens at scale, we can start to see a set of rules being established concerning our cached content. This is beyond simply seeing whether or not something should be cached or what the relative cost is – once we have a solid idea of where “the line” for caching actually is, we can apply those rules globally and start to create a sort of internal framework for auditing our cache.
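Such a rule set can be as simple as a lookup table that every caching decision passes through. The sketch below encodes the examples from this section; the content names, the lifetimes, the deny-by-default behavior for unknown content, and the choice to forbid password caching everywhere are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Location(Enum):
    CLIENT = "client"
    SERVER = "server"
    NOWHERE = "nowhere"


@dataclass(frozen=True)
class CacheRule:
    location: Location
    max_age_seconds: Optional[int] = None  # None means "not applicable"


# Hypothetical rules; the lifetimes are placeholders, not recommendations.
RULES: Dict[str, CacheRule] = {
    "stylesheet": CacheRule(Location.SERVER, 86_400),
    "user_preferences": CacheRule(Location.CLIENT, 3_600),
    "api_status": CacheRule(Location.SERVER, 60),
    "server_mapping": CacheRule(Location.SERVER, 300),  # never on the client
    "password": CacheRule(Location.NOWHERE),
}


def rule_for(kind: str) -> CacheRule:
    """Look up the rule for a content type, refusing to cache unknown types."""
    return RULES.get(kind, CacheRule(Location.NOWHERE))


def may_cache(kind: str, location: Location) -> bool:
    """True only when the rules place this content type at the given location."""
    return rule_for(kind).location is location


print(may_cache("stylesheet", Location.SERVER))
print(may_cache("server_mapping", Location.CLIENT))
print(may_cache("password", Location.SERVER))
```

Keeping the rules in one table also gives an audit something concrete to review: each entry is an assumption that can be tested, validated, or rejected.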

Audit Your Cache

This is perhaps the most important “final” step in implementing caching. It is a best practice to consistently review and audit your cache framework, assumptions, and practical implementations. What made sense for a codebase a year ago may not make sense today, and certainly won’t make sense in five years’ time. This is especially important over multiple evolutions of the codebase, as caching content that doesn’t even exist anymore can be a huge processing overhead, regardless of the fact that no data is ultimately stored.

Consider a situation in which a microservice caches the shipping status of each order. In order to reduce cost for the server, the developer has moved to using an external API provided by the shipping courier to query the status of the shipment. The service originally worked by first checking the cache, then testing the status of the order against an order status table that is now obsolete, and finally returning this status to the customer. In the new workflow, the user’s request is forwarded to the external API, which grabs the data and returns it to the customer, who caches the content locally.

What happened to the old cached data for order status? What happened to the code that queried the now-obsolete table? If it still exists, we have a problem – the codebase is made more complex, and it’s possible to still call the endpoint, despite the fact that the data is no longer useful. More to the point, if we did not remove the caching from the service, the data may be held until some arbitrary expiry point, as there’s nothing to tell the cache that the data is no longer useful. Over several thousand orders, this can add up to a huge daily operational cost that simply does nothing of value.
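Part of such an audit can be automated by recording when each entry was last read and flagging entries that nobody has asked for in a while. The `AuditedCache` class below is a hypothetical sketch of that bookkeeping, with an injected clock and made-up key names; it reports candidates for removal rather than deciding on its own.

```python
import time
from typing import Callable, Dict, Iterable, List, Optional


class AuditedCache:
    """Cache that remembers when each entry was last read."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._values: Dict[str, str] = {}
        self._last_read: Dict[str, float] = {}

    def set(self, key: str, value: str) -> None:
        self._values[key] = value
        # A new entry starts its idle timer at write time.
        self._last_read.setdefault(key, self._clock())

    def get(self, key: str) -> Optional[str]:
        if key in self._values:
            self._last_read[key] = self._clock()
            return self._values[key]
        return None

    def unread_since(self, max_idle_seconds: float) -> List[str]:
        """Keys that have not been read for longer than max_idle_seconds."""
        now = self._clock()
        return sorted(key for key, last in self._last_read.items()
                      if now - last > max_idle_seconds)

    def evict(self, keys: Iterable[str]) -> None:
        for key in keys:
            self._values.pop(key, None)
            self._last_read.pop(key, None)


now = [0.0]  # fake clock so the example is deterministic
cache = AuditedCache(clock=lambda: now[0])
cache.set("order_status:1001", "open")      # written by the old workflow
cache.set("shipment:1001", "in transit")

now[0] = 100.0
cache.get("shipment:1001")                  # only this entry is still read

now[0] = 120.0
idle = cache.unread_since(max_idle_seconds=60)
print(idle)
cache.evict(idle)
```

A report like this only surfaces the stale entries; the code path that keeps writing them still has to be found and removed by hand.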


Caching seems relatively straightforward, but there are definitely things that need to be considered when implementing caching in a server-client relationship. Properly estimating cost by identifying need, usefulness, and appropriateness can save client and server resources, deliver an optimal user experience, and generally deliver on the promise that caching brings to the table.

What do you think about these best practices? What should we add to this list? Let us know in the comments below!