4 Design Tweaks to Improve API Operations

Bill Doerrfeld
March 21, 2017

We’ve previously discussed best practices for designing an API with a quality developer experience. But what will the long-term operational repercussions be of the design moves we make now? For example, if URLs are designed without metadata to describe actions, product owners will later have a difficult time staring at unintelligible logs. Or, if microservices aren’t orchestrated correctly, you run the risk of long load times as multiple API calls queue up in a mobile environment. These are only two of the many operational consequences that API owners overlook while designing their APIs.

Today, we’ll consider some methods to make web Application Programming Interfaces more operationally efficient and responsive to the way that clients will consume them. Led by Nordic APIs veteran Jason Harmon, we’ll cover the most common operational anti-patterns that could break an API. We’ll offer ways to remedy these poor design situations, which, if not addressed, could cause some serious migraines later on in the API lifecycle.

Common API Operational Errors to Watch Out For

Let’s consider four anti-patterns found in the wild all too often, from simple yet often overlooked issues, such as using the appropriate HTTP method, to more complex composition errors.

“To some extent, the way you design your API can set you up for failure.”

At our last Platform Summit, Jason Harmon spoke on operational API design anti-patterns.

1: HTTP GET instead of POST

API providers commonly implement an HTTP GET call where a POST should have been used instead. A reminder to use the correct HTTP method is by far the most basic piece of advice here, yet surprisingly, it’s a common error that can cause some major damage. Harmon related a story from his experience at Typeform, where a small subset of users in web browsers were hitting the “back” button, causing their sessions to lose all data.
By changing the call type from GET to POST, Typeform was quickly able to solve their caching issue: under the HTTP specification, a response to a POST is not cached unless it carries explicit freshness information, so in practice browsers and proxies leave POST calls alone. The lesson here is to watch out for cached calls from browsers or proxies. If you do encounter unexpected behavior, Harmon recommends that you first “look for the GETs… using POST instead is an easy fix.” If you’re stuck in a situation with erratic cache issues, he adds that appending a query string parameter with a randomly generated cache buster to the GET call (e.g. ?cache_buster=[random]) could solve a recurring issue.

2: Letting clients constantly poll APIs

Many API providers should reconsider how they allow clients to update data. Too often, the client sets up constant polling on API endpoints. When a large dataset is involved and queries occur continually, such as every 30 seconds, the number of calls can really add up. Large volumes of calls to an AWS server can be expensive as well.

Typeform is not immune to the API polling issue. Typeform end users want to know whether there is new data on their form, and therefore set up a constant polling service to check for updates. Since Typeform is among the Zapier-compatible apps, meaning that users can create customizable “zaps” that tie in Typeform functionality, the number of services continually requesting new data skyrockets.

To avoid constant polling, Harmon recommends that you build and launch webhooks for your service, and convince your consumers to use them. But first, find out what the rate of change for your data is. For example, the data behind a typical Typeform form changes only about once every 1–2 weeks. Identifying a pattern for the rate of change in your data can help you design your webhooks to be as lean as possible. Harmon recognizes one catch to this approach: he recommends still offering a polling API alongside the webhooks, to carry out sporadic system checks and perform large downloads.
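To make the trade-off concrete, here is a minimal sketch that first works out how many requests a single client polling every 30 seconds sends between two changes that are a week apart, and then shows the push alternative. The `WebhookRegistry` class, its method names and the event fields are hypothetical and held in memory for illustration only; they are not Typeform's API, and a real service would persist subscriptions and deliver events over HTTPS with retries.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

SECONDS_PER_WEEK = 7 * 24 * 60 * 60


def polls_per_change(poll_interval_s: int, change_interval_s: int) -> int:
    """Number of poll requests one client sends between two data changes."""
    return change_interval_s // poll_interval_s


@dataclass
class WebhookRegistry:
    """Hypothetical in-memory subscription registry (illustration only)."""

    subscribers: Dict[str, List[Callable[[dict], None]]] = field(default_factory=dict)

    def subscribe(self, form_id: str, deliver: Callable[[dict], None]) -> None:
        # Register a delivery callback for one form.
        self.subscribers.setdefault(form_id, []).append(deliver)

    def publish(self, form_id: str, event: dict) -> int:
        """Push the event to every subscriber of form_id; return deliveries made."""
        callbacks = self.subscribers.get(form_id, [])
        for deliver in callbacks:
            deliver({"form_id": form_id, **event})
        return len(callbacks)


if __name__ == "__main__":
    # Polling every 30 seconds while the data changes once a week.
    print(polls_per_change(30, SECONDS_PER_WEEK))  # 20160

    # With webhooks, the same change costs one delivery per subscriber.
    inbox: List[dict] = []
    registry = WebhookRegistry()
    registry.subscribe("form-1", inbox.append)
    print(registry.publish("form-1", {"type": "new_response"}))  # 1
```

Under these assumptions, a weekly change costs about twenty thousand polling requests per client but only one push. A polling endpoint can still sit alongside this for the sporadic checks and large downloads mentioned above.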
3: Rigid hierarchy in microservices causes latency

Lately, many SaaS providers have been transitioning to the microservices architectural style. Microservices are lean, well-bounded components, each dedicated to a specialized piece of functionality. This is great for structural segmentation, but if microservice communication is not orchestrated well, requesting data can cause serious latency.

“Build your microservices to be externalized from the ground up”

The problem with microservices is that a client requesting a large number of functions could end up sending dozens of separate calls, leading to a massive query load. This is especially problematic in mobile environments, where calls are often queued in series instead of being executed in parallel. For some environments, this process is simply too slow.

According to Harmon, the solution lies in a BFF. No, not a best friend forever, but a Backend-for-Frontend that acts as a shim to help compose microservices. A BFF is a lightweight layer that acts as an orchestration API. It could be built as a Node.js service, or however internal developers see fit. The goal of such a shim is for the client to receive, with one call, all the resources packaged together in the way the client wants. That way, to put it in Harmon’s words: “They’re not gluing together a model in the browser, the model’s already given to them the way they wanted it in the first place.”

A BFF has been implemented to solve microservice design issues at both PayPal and Typeform. By shrinking JSON payloads, a BFF can cut down on JavaScript processing in the client. Sam Newman describes using a BFF for each type of client interface.

Allowing clients to call a microservice directly without any composition layer is poor design, as it doesn’t consider the use cases and client limitations. Constructing a BFF layer is one possible solution, though it should be noted that GraphQL is another.
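The composition idea can be sketched in a few lines. The example below is an illustration under stated assumptions, not how PayPal or Typeform built theirs: the three `fetch_*` coroutines stand in for calls to hypothetical microservices, and `dashboard_view` plays the role of a single BFF endpoint that fans out to them in parallel and returns one client-shaped model. It is written in Python for brevity; the same shape applies to a Node.js service.

```python
import asyncio


async def fetch_profile(user_id: str) -> dict:
    # Stand-in for a network call to a hypothetical "profile" microservice.
    await asyncio.sleep(0.01)
    return {"id": user_id, "name": "Ada"}


async def fetch_forms(user_id: str) -> list:
    # Stand-in for a hypothetical "forms" microservice.
    await asyncio.sleep(0.01)
    return [{"form_id": "f1", "title": "Feedback"}]


async def fetch_response_counts(user_id: str) -> dict:
    # Stand-in for a hypothetical "responses" microservice.
    await asyncio.sleep(0.01)
    return {"f1": 42}


async def dashboard_view(user_id: str) -> dict:
    """BFF endpoint: call the services concurrently, return one composed model."""
    profile, forms, counts = await asyncio.gather(
        fetch_profile(user_id),
        fetch_forms(user_id),
        fetch_response_counts(user_id),
    )
    # Shape the payload for the client so it does not have to glue models together.
    return {
        "user": profile["name"],
        "forms": [
            {"title": form["title"], "responses": counts.get(form["form_id"], 0)}
            for form in forms
        ],
    }


if __name__ == "__main__":
    print(asyncio.run(dashboard_view("u1")))
```

The client makes one request and receives only the fields it renders, while the parallel fan-out happens server-side, close to the services.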
4: Generic actions

When we design URLs, Harmon reminds us that detailing actions matters. He recommends noting state transitions within the URL, and passing along short descriptions as metadata. If you don’t describe state transitions beyond the typical CRUD verbs, you run the risk of having very generic, unreadable logs. This problem is often called protocol tunneling: with generic URL names you lose perspective, and when an action breaks, it’s hard to locate the affected calls and analyze trends. If we design URLs as generic phrases that tell us nothing about the full story, then error diagnosis will be difficult.

“URLs are a key component in how you operationalize the API”

Harmon notes this is especially important for product owners who use tools such as the ELK stack to visualize data sources and draw insights from them. When designing URLs, visibility is a plus, so consider a consistent naming scheme for identifying these URLs, along with metadata for the actions being taken.

/resource/:id/generic-name + {action:process}

Harmon-ious Mantras to Live By

We’ve covered much ground with some specific issues to avoid. For more general design advice, the wise Harmon provides some mantras to design by:

Use cases first, then design: Ensure your HTTP methods are correctly assigned, and that user behavior won’t adversely affect the outcome.
Design can influence performance: Put an end to polling. Instead, design with subscription webhooks and insist that clients use them.
Structure is good, but be prepared to blur those lines: Though microservices architecture is about separation, too rigid a structure can result in “unhappy clients and crappy performance.”
Design can put out fires: Taking a DevOps approach to design early on can put out fires in the long term. Try to make performance more visible.
It’s all about the logs: Ultimately, your logs are the transcript of what clients are actually doing. Consider how you can make this record more intelligible by crafting URLs and metadata that will aid metric analysis and make product owners happy.

Developer experience is the mainstay of most API design discussion, but we mustn’t ignore how design moves will affect API operations as well. As Harmon points out: “API design is not just fanciful usability discussions and lot of fluffy emotions about developers… Developer Experience really is just the first layer.”

By following these guidelines, we can improve specific areas and design on the scale of decades, since operational efficiency is closely tied to platform longevity. By planning for operational improvements now, we can “design in a way that writes sentences, to tell a bigger story later on.”

Also watch Nordic APIs veteran Jason Harmon present his tips on scaling API design at the 2014 Summit.