5 KPIs for API Platform Engineering

As the API landscape has evolved, the idea of an API platform as an offering not of a single service but of a collection of applications, microservices, and functions has become more prominent. This development has allowed organizations to create compelling products at scale, bringing new functions to users across the globe.

When considering the success of an API platform, it helps to have some concrete key performance indicators (KPIs) to point to. Below, explore specific metrics that can be used as KPIs for your API platform engineering efforts, both to measure ongoing success and to detect potential pitfalls at scale.

What is API Platform Engineering?

Before diving into specific KPIs, let’s briefly define what API platform engineering is. Platform engineering is the idea of offering a platform that delivers business value to the consumer through the improvement of developer experience, enhanced productivity, and self-service capabilities. These platforms are backed by automation and contained infrastructure.

Put another way, API platform engineering looks at the API platform as a vehicle for business enablement that can be streamlined, improved, and iterated upon. The design, development, and management of API platforms require ample consideration of scalability, extensibility, security, and more. API platform engineering aims to improve the platform as much as possible for as many users as possible.

However, part of the challenge is measuring how successful each of these efforts actually is. While a lot of time and resources can be spent making an incredibly scalable service, this means nothing if the service is not also extensible and secure. API platform engineering is as much a balancing game as it is an iteration one.

5 KPIs for API Platform Engineering

With that in mind, how can developers measure the success of API platform engineering efforts? Let’s examine five key metrics that can help guide these efforts.

1. Availability and Uptime

Measuring a service’s availability and relative uptime for end users is a KPI that is a lot more than the sum of its parts. In the simplest form, this number is just a percentage of how much time an API has been operational and available to the end user. Diving deeper, however, reveals the hidden intricacies of this KPI.

Availability and uptime suggest much about your network health, the connectivity between your user and the service, and the load balancing across clusters and systems. Imagine a coffee shop near you that has amazing coffee but is only sporadically open during posted hours. That would suggest a lot about its business practices and that it is dealing with underlying logistics issues.

Similarly, availability and uptime can serve as a logistical snapshot, showing your platform’s relative health and external usability.
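As a rough illustration of the "simplest form" of this KPI, the sketch below computes availability as the percentage of a reporting window not covered by outages. The function name `availability_percent`, the outage list format, and the sample dates are assumptions made for this example, not something a particular monitoring tool provides.

```python
from datetime import datetime


def availability_percent(window_start, window_end, outages):
    """Return the percentage of the window during which the API was up.

    `outages` is a list of (start, end) datetime pairs. Overlapping
    outages are merged so the same downtime is not counted twice.
    """
    total = (window_end - window_start).total_seconds()
    if total <= 0:
        raise ValueError("window_end must be after window_start")

    # Keep only outages that touch the window, clipped to its bounds.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in outages
        if end > window_start and start < window_end
    )

    down = 0.0
    cur_start = cur_end = None
    for start, end in clipped:
        if cur_end is None or start > cur_end:
            # A new, non-overlapping outage: close out the previous one.
            if cur_end is not None:
                down += (cur_end - cur_start).total_seconds()
            cur_start, cur_end = start, end
        else:
            # Overlaps the current outage: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        down += (cur_end - cur_start).total_seconds()

    return 100.0 * (total - down) / total


# Hypothetical 30-day window with two overlapping outages and one short one.
sample_outages = [
    (datetime(2024, 1, 10, 0, 0), datetime(2024, 1, 10, 1, 0)),
    (datetime(2024, 1, 10, 0, 30), datetime(2024, 1, 10, 1, 30)),
    (datetime(2024, 1, 20, 12, 0), datetime(2024, 1, 20, 12, 30)),
]
pct = availability_percent(
    datetime(2024, 1, 1), datetime(2024, 1, 31), sample_outages
)
print(f"{pct:.2f}% available")
```

Two hours of merged downtime across a 720-hour window works out to roughly 99.72% availability. A single number like this hides where the downtime came from, which is why it is worth reading alongside the other KPIs.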

2. Error Rates in Service

Error rates are another KPI that can help you understand the stability of your code and its underlying systems. Error rates, specifically the share of requests that fail, give you a view of the stability of the codebase and of the resources it depends on. High error rates suggest issues in transit, unreliable nodes, or even poorly formed code. Low error rates suggest high-quality code. However, implausibly good figures, such as a 99.99999% error-free rate, might themselves indicate a problem, such as disconnected systems in production whose failures are never counted.

This metric also has long-term value as a log-based baseline. If 99.5% of your requests succeed (an error rate of 0.5%), which is quite good, all things considered, any deviation from that baseline can set off a klaxon that alerts your developers to potential issues before they become broader in scale and scope.
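The baseline idea can be sketched in a few lines. This example assumes that a "failed request" means an HTTP 5xx response and that an alert should fire when the current error rate rises more than 50% above the baseline. Both choices, and the function names, are assumptions for illustration. The baseline used is 0.5% of requests failing, that is, 99.5% succeeding.

```python
def error_rate(status_codes):
    """Return the fraction of responses that failed.

    Assumption: a failure is any HTTP status code of 500 or above.
    """
    codes = list(status_codes)
    if not codes:
        return 0.0
    failed = sum(1 for code in codes if code >= 500)
    return failed / len(codes)


def exceeds_baseline(current, baseline, tolerance=0.5):
    """True when `current` is more than `tolerance` (relative) above `baseline`."""
    return current > baseline * (1.0 + tolerance)


BASELINE = 0.005  # 0.5% of requests failing, i.e. 99.5% succeeding

# Hypothetical samples of 1,000 responses each.
quiet_hour = [200] * 994 + [503] * 6
bad_hour = [200] * 988 + [500] * 12

for label, sample in (("quiet", quiet_hour), ("bad", bad_hour)):
    rate = error_rate(sample)
    print(label, f"{rate:.3%}", "ALERT" if exceeds_baseline(rate, BASELINE) else "ok")
```

A relative tolerance keeps the check from firing on small fluctuations. A real alerting setup would likely also want a minimum sample size so that a handful of requests cannot trip it.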

Ultimately, error rates can also be used as a form of sentiment prediction. Comparing how error-prone your system is against long-term continued use can help you gauge the perceived value of the product and the desire of your users to keep using it. As such, error rates are highly valuable insight generators.

3. Average Response Time

The time it takes an API to respond to a request indicates the health of the system upon which your API rests. From a platform point of view, average response time exposes infrastructural issues that are often hard to see. Many software projects stumble when it comes time to actually put them into practice, and likewise, a platform that looks perfect on paper might fail horrifically when put into production. When this happens, it can be hard to tell whether the issue results from a poorly iterated code deployment or a poor-quality last-mile network for your users.

Average response time, accordingly, can help detect issues in combination with other metrics. Are your uptime and availability excellent, but your average response time is high? This suggests an issue at the network layer that might be hard to detect without this context. Likewise, the inverse might be true. If your average response time is excellent but your system is rarely available, that's a strong sign that the problem is you, not the infrastructure.
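The reasoning above amounts to a small decision table, sketched here. The thresholds (300 ms average latency, 99.9% availability) and the labels returned are assumptions chosen for the example. Real targets depend on your own service levels.

```python
from statistics import mean


def triage(latencies_ms, availability_pct,
           latency_target_ms=300.0, availability_target_pct=99.9):
    """Combine average response time and availability into a rough hint.

    Returns one of: "healthy", "check-network", "check-platform", "check-both".
    The targets are hypothetical defaults, not recommended values.
    """
    slow = mean(latencies_ms) > latency_target_ms
    unavailable = availability_pct < availability_target_pct

    if slow and unavailable:
        return "check-both"
    if slow:
        # Available but slow: the article's "network layer" case.
        return "check-network"
    if unavailable:
        # Fast when reachable but often down: look at your own deployment.
        return "check-platform"
    return "healthy"


print(triage([100, 900, 800], availability_pct=99.99))  # slow but available
print(triage([80, 120], availability_pct=95.0))         # fast but often down
```

An average can hide a long tail of slow requests, so in practice it may be worth looking at a percentile alongside the mean. The average is used here because it is the metric under discussion.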

4. Developer Engagement and Sentiment

This KPI can be somewhat hard to track, but it is perhaps the most important on this list. Developer engagement and sentiment is the idea of tracking how developers interact with your platform, how many of those interactions are active, the frequencies of those calls, and so forth. This can help build an idea of the desire your developers have for the offering, helping to convert your system from a “nice external tool” into a veritable and desirable API-as-a-product offering that is core to business success.

Likewise, the sentiment piece of this is very important. Finding out how developers feel about the service can be challenging. But when collecting feedback, error reports, and so forth, you can begin to build a picture. Do you have very low instances of errors and failures but a high rate of reporting? You have an engaged and passionate developer audience! Do you have high error rates paired with low reporting? Your developer user base might be pulling back and should be reached out to.
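The two questions above can be expressed as a very rough heuristic. Everything in this sketch is an assumption for illustration: the function name, the idea of measuring reports per 1,000 API calls, and both thresholds. It is meant to show the shape of the comparison, not to serve as a validated sentiment model.

```python
def engagement_signal(error_rate, reports_per_1k_calls,
                      error_threshold=0.01, report_threshold=1.0):
    """Classify a developer audience from error rate and feedback volume.

    Returns "engaged", "at-risk", or "inconclusive". Thresholds are
    hypothetical and would need tuning against real data.
    """
    high_errors = error_rate > error_threshold
    high_reports = reports_per_1k_calls > report_threshold

    if not high_errors and high_reports:
        # Few failures, lots of feedback: developers care enough to write in.
        return "engaged"
    if high_errors and not high_reports:
        # Many failures, little feedback: developers may be pulling back.
        return "at-risk"
    return "inconclusive"


print(engagement_signal(error_rate=0.002, reports_per_1k_calls=3.5))
print(engagement_signal(error_rate=0.04, reports_per_1k_calls=0.1))
```

The "inconclusive" branch matters: high errors with high reporting, or low errors with low reporting, do not say much on their own and are better followed up with direct outreach.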

5. Traffic Volume

Ultimately, an API platform will live or die based on how many users it serves, and as such, this is an incredibly important metric. Monitoring traffic volume helps you understand how many users you have and what kind they might be. Reporting the total number of users is important, but you should also differentiate between paid and free users, as well as between long-term and short-term ones.

Notably, this metric can also help you determine potentially underserved markets as well as markets that may be overserved or overleveraged, indicating opportunities for resource optimization or more efficient throttling. By understanding where your users are coming from and what kind of volume you are servicing, you can tailor your efforts to them specifically. Have a high volume of free Korean users? Maybe you need to adjust your pricing strategy or clarify your value offering. Have a high volume of paid US users but a very low volume of Spanish users? Maybe you need better language support.
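Segmenting traffic this way is mostly a counting exercise. The sketch below assumes each logged request carries a `region` and a `plan` field. Those field names and the sample records are hypothetical. Note that it counts requests, not distinct users, which would additionally require a user identifier.

```python
from collections import Counter


def traffic_by_segment(requests):
    """Count requests per (region, plan) pair."""
    return Counter((r["region"], r["plan"]) for r in requests)


def free_share_by_region(requests):
    """Return, per region, the fraction of requests made on the free plan."""
    totals = Counter()
    free = Counter()
    for r in requests:
        totals[r["region"]] += 1
        if r["plan"] == "free":
            free[r["region"]] += 1
    return {region: free[region] / totals[region] for region in totals}


# Hypothetical request log entries.
sample_requests = [
    {"region": "KR", "plan": "free"},
    {"region": "KR", "plan": "free"},
    {"region": "KR", "plan": "paid"},
    {"region": "US", "plan": "paid"},
    {"region": "US", "plan": "paid"},
    {"region": "ES", "plan": "paid"},
]

for segment, count in traffic_by_segment(sample_requests).most_common():
    print(segment, count)
print(free_share_by_region(sample_requests))
```

A region with a high free share and heavy volume is the kind of segment where a pricing or positioning review might pay off. A region with very little traffic at all is a candidate for better localization.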

Metrics For API Platform Engineering

Luckily, the KPI landscape is not a mystery. Tracking these KPIs is something that has been done — and done well — for a long time now. Figuring out what you want to track is the hardest step. Once you’ve decided on that, it’s simply a matter of enabling the systems required and collecting the data!

What do you think of these KPIs? Did we miss anything that you think should have been included? Let us know in the comments below!