Discover the benefits of API rate limiting and ways to implement it.

As API developers, we need to make sure our APIs run as efficiently as possible. Otherwise, everyone who depends on the service will suffer from slow performance.

Performance isn’t the only reason to limit API requests, either. API limiting, also known as rate limiting, is an essential component of Internet security, as DoS attacks can take a server down by flooding it with unlimited API requests.

Rate limiting also helps make your API scalable. If your API blows up in popularity, there can be unexpected spikes in traffic, causing severe lag time.

So, how exactly do we rate limit our APIs? In this article, we’ll delve into the main strategies and industry standards around rate limiting. We’ll showcase effective rate-limiting libraries and frameworks, and demonstrate sample code needed to implement request queues, throttling, and algorithm-based rate limiting.

How To Limit API Requests And The Importance Of Rate Limiting

Let’s start by looking at what rate limiting is, how it works, and why it matters.

What Is API Rate Limiting?

If you dole out unlimited access to your API, you’re essentially handing over the keys to the kingdom. Anyone can use your API, as much as they want, at any time.

While it’s great that people want to use your API and find it useful, open access can decrease value and limit business success. Rate limiting is a critical component of an API product’s scalability.

API owners typically measure processing limits in transactions per second (TPS). Some systems may also have physical limits on data transfer. Both fall under backend rate limiting.

To prevent an API from being overwhelmed, API owners often enforce a limit on the number of requests, or the quantity of data clients can consume. This is called Application Rate Limiting.

If a user sends too many requests, API rate limiting can throttle client connections instead of disconnecting them immediately. Throttling lets clients keep using your services while still protecting your API.

However, keep in mind that there is always a risk of API requests timing out, and the open connections also raise the risk of DoS attacks.

Best Practices For API Rate Limiting

One approach to API rate limiting is to offer a free tier and a premium tier, with different limits for each. There are many things to consider when deciding what to charge for premium API access; for ideas, read our pieces on API pricing and API business models.

API providers will still need to consider the following when setting up their API rate limits.

  • Are requests throttled when they exceed the limit?
  • Do new calls and requests incur additional fees?
  • Do new calls and requests receive a particular error code and, if so, which one?

To see these principles in practice, let’s look at how a few API providers communicate rate-limiting guidelines to their developer users.

GitHub

From GitHub’s Rate Limiting Documentation for Developers

GitHub’s documentation explains the different rate limits for authenticated and unauthenticated requests. It also defines the headers that are returned and explains what the various rate-limit values mean.

The documentation also lets users know how to check their current rate of usage, explains how to request increased rate limits for specific applications, and outlines the repercussions when clients abuse rate limits.

LinkedIn

From LinkedIn’s rules for developers

LinkedIn’s rate limiting documentation explains how different API endpoints have different limits. LinkedIn’s API features three different kinds of rate limiting: application throttle, user throttle, and developer throttle. The documentation also specifies the time zone used to define the beginning and end of the day.

Bitly

Bitly’s documentation not only defines its rules but also offers guidance on best practices for avoiding rate-limiting issues.

Some of the topics these best practices address include:

  • Security issues
  • Caching
  • Long page load times
  • Batch processing
  • High-volume requests
  • URL encoding

These examples should illustrate how you can help developers avoid common API rate-limiting issues and keep from getting throttled.

How To Throttle API Calls

Since there are numerous ways for a program to connect with an API, there are also various ways to throttle API traffic.

Calls made to third-party APIs and applications are among the most challenging to limit. For example, if your clients call the Google Maps API directly, there’s not much you can do to limit that. You’re just going to have to pay for the appropriate level of data usage.

If the rate-limited API is accessed via some form of backend process, it’s decidedly easier to limit the API queries using the backend code. Say your API only allows 20 requests per second. You can set up a process that only allows 20 requests a second to pass through.

If all of those requests are happening synchronously, it might not make a difference, but you can quickly see a difference when it comes to asynchronous tasks.
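As a rough sketch of such a backend gate, the snippet below lets at most 20 requests per second through to the upstream API (the function names are illustrative, and timestamps are passed in explicitly so the logic is easy to test):

```javascript
// Minimal backend gate: let at most `limit` requests per second through
// to the upstream API. Timestamps (ms) are injected so this is testable.
function createGate(limit = 20) {
  let windowStart = 0; // start of the current one-second window
  let count = 0;       // requests passed through in that window

  return function allow(now = Date.now()) {
    if (now - windowStart >= 1000) {
      windowStart = now; // a new second has begun: reset the counter
      count = 0;
    }
    if (count < limit) {
      count += 1;
      return true; // forward this request to the rate-limited API
    }
    return false;  // over the limit: queue it or try again later
  };
}
```

In a real backend, callers that receive `false` would typically be queued or retried rather than dropped outright.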

If your process is implemented in Node.js, for example, you could use the bottleneck package.

If you’re using Ruby and tools like Sidekiq, you could use plug-ins like Sidekiq::Throttled or Sidekiq Enterprise for rate limiting.

Every programming language will have its own version of throttling or rate-limiting. Look into the libraries and packages available in the language you’re working in to see what’s out there.

What You Need To Know About Rate Limiting

Many services that use REST APIs feature API limiting as a defense against DoS attacks and overloaded servers. Some APIs feature soft limits, which allow users to exceed the limits for a short period. Others take a more hardline approach, immediately returning an HTTP 429 ("Too Many Requests") error and dropping the connection, forcing the user to send a brand new query.

Setting a timeout is the easiest way to limit API requests. Just set the timeout limit and return an error response, such as HTTP 429, to users who exceed it.

This method works like a charm for those looking to get something up and running quickly, such as developers who’ve recently switched from PHP to Node.js. However, because Node.js handles all kinds of asynchronous requests, you might eventually need a more permanent solution. Request queues are one way to achieve this.
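A minimal sketch of this timeout approach might track each client's last accepted request and answer anything inside the cooldown with an HTTP 429 status and a Retry-After header (the names and structure here are illustrative, not a specific framework's API):

```javascript
// "Timeout" sketch: after a client's request is accepted, further requests
// within the cooldown get an HTTP 429 with a Retry-After header telling
// them when to come back.
function createCooldown(cooldownMs) {
  const lastSeen = new Map(); // clientId -> timestamp of last accepted request

  return function handle(clientId, now = Date.now()) {
    const last = lastSeen.get(clientId);
    if (last !== undefined && now - last < cooldownMs) {
      // Still cooling down: tell the client when it may retry (in seconds).
      const retryAfter = Math.ceil((last + cooldownMs - now) / 1000);
      return { status: 429, headers: { "Retry-After": String(retryAfter) } };
    }
    lastSeen.set(clientId, now); // accepted: start a fresh cooldown
    return { status: 200 };
  };
}
```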

Three Methods Of Implementing API Rate-Limiting

There are numerous ways you can rate-limit your API. Here are three of the most popular ways to go about API rate-limiting.

1. Request Queues

There are a lot of request queue libraries out there, and each programming language or development environment has its own options. This means much of the hard work has already been done for you. There are even queue library directories, making it easy to search for pre-written code.

Several request-rate-limiter libraries already exist. One, for example, sets the rate limit at two requests per second and places the rest in a request queue. Libraries like these are about as close to plug-and-play as you can get in API development.

Android Volley

Volley is a particularly popular request queue library for Android developers. Not every Android library can take advantage of Volley, as some require more extensive networking capabilities. Check your Android library’s documentation to make sure it’s compatible with Volley.

Amazon Simple Queue Service (SQS)

Amazon’s Simple Queue Service (SQS) is a fully managed message queuing service that is perfect for request and messaging queues. The service is maintained by Amazon, so you won’t have to constantly debug your own hardware or software to keep it working.

Setting Rules For Request Queues

To illustrate the best way to set rules for rate-limiting libraries, we’ll be using npm, the package manager for JavaScript. The npm registry features a lot of request queue libraries, so you don’t have to code everything from scratch. It also has a healthy development community around it, so plenty of support and guidance is available should you run into any problems.

To start, you can use the request library to make HTTP requests, along with the request-rate-limiter library, which is easy to configure for a variety of uses.

This rate-limiting library automatically limits the number of requests that can be sent to an API. It also sets up the request queue automatically. This also means you don’t have to worry about how many requests are sent to the API, as they’ll be added to the queue.
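To illustrate what such a queue does under the hood, here is a small, self-contained scheduler in the same spirit: given the arrival times of requests, it computes when each one is actually dispatched so the overall rate never exceeds the configured limit. This is a sketch of the idea, not the library's actual API:

```javascript
// Request queue sketch: jobs arriving faster than the allowed rate wait
// in line; dispatches are spaced at least 1000 / perSecond ms apart.
function scheduleQueue(arrivalTimesMs, perSecond) {
  const interval = 1000 / perSecond; // minimum spacing between dispatches
  let nextFree = 0;                  // earliest moment the next job may run
  return arrivalTimesMs.map((arrival) => {
    const start = Math.max(arrival, nextFree); // wait in the queue if needed
    nextFree = start + interval;
    return start; // when this request is actually sent
  });
}
```

At two requests per second, for instance, three requests arriving at once are dispatched 500 ms apart rather than rejected.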

2. Throttling

Throttling is another common way to practically implement rate-limiting. It lets API developers control how their API is used by setting up a temporary state, allowing the API to assess each request. When the throttle is triggered, a user may either be disconnected or simply have their bandwidth reduced.

Possible at the application, API, or user level, throttling is a popular method of rate-limiting APIs, and there are several commercial products on the market ready-made for developers. Progress’s Hybrid Data Pipeline, for example, offers throttled API access for:

  • IBM DB2
  • Oracle
  • SQL Server
  • MySQL
  • PostgreSQL
  • SAP Sybase
  • Hadoop Hive
  • Salesforce
  • Google Analytics

The utility also features built-in functions to filter the query results returned to the client, such as $count, $top, and $skip.

They also offer the OpenAccess SDK for proprietary APIs. OpenAccess SDK provides standard SQL interfaces such as ODBC, JDBC, ADO.NET, and OLE DB, and it integrates easily with most security and authorization systems, making it a useful firewall between APIs and back-end systems.

3. Rate-limiting Algorithms

Algorithms are another way to create scalable rate-limited APIs. As with request queue libraries and throttling services, there are many rate-limiting algorithms already available.

Leaky Bucket

The leaky bucket algorithm is a simple, easy-to-implement rate-limiting solution. It places incoming requests into a First In, First Out (FIFO) queue and processes the items in the queue at a regular rate.

The leaky bucket smooths out bursts of traffic and is easy to implement on a single server or load balancer. It’s also small and memory-efficient, thanks to the limited queue size.
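A minimal leaky bucket might look like the following sketch, where the bucket is just a bounded counter that drains at a constant rate (function names are illustrative, and timestamps are injected for testability):

```javascript
// Leaky bucket sketch: requests join a bounded queue (the bucket) and
// "leak" out for processing at a constant rate. Overflow is discarded.
// `now` is a millisecond timestamp.
function createLeakyBucket(capacity, leakPerSecond) {
  let queueLen = 0;  // requests currently waiting in the bucket
  let lastLeak = 0;  // last time we drained the bucket (ms)

  return function offer(now) {
    // Drain whatever should have leaked out since the last drain.
    const leaked = Math.floor(((now - lastLeak) / 1000) * leakPerSecond);
    if (leaked > 0) {
      queueLen = Math.max(0, queueLen - leaked);
      lastLeak = now;
    }
    if (queueLen < capacity) {
      queueLen += 1;  // accepted: the request waits its turn in FIFO order
      return true;
    }
    return false;     // bucket is full: the request is dropped
  };
}
```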

Fixed Window

Fixed window algorithms track the rate of requests with a simple incremental counter over a fixed window, defined as a set number of seconds (3,600 for one hour, for example). If the counter exceeds the limit within that window, the additional requests are discarded.

The fixed window algorithm is a simple way to ensure your API doesn’t get bogged down with old requests. Your API can still be overloaded using this method, however. If a slew of requests is made when the window refreshes, your API could still be stampeded.
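In code, a fixed window counter can be as simple as this sketch (names illustrative, timestamps injected):

```javascript
// Fixed window counter sketch: one counter per window; once it passes
// the limit, the remaining requests in that window are discarded.
function createFixedWindow(limit, windowSeconds) {
  const windowMs = windowSeconds * 1000;
  let windowId = -1; // which fixed window the counter belongs to
  let count = 0;

  return function allow(now) {
    const id = Math.floor(now / windowMs);
    if (id !== windowId) {
      windowId = id; // a new window has started: the counter starts over
      count = 0;
    }
    count += 1;
    return count <= limit;
  };
}
```

Note that the counter resets completely at each window boundary, which is exactly the stampede risk described above: a burst at the end of one window and the start of the next can double the effective rate.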

Sliding Log

A sliding log algorithm tracks each request in a time-stamped log. Entries with timestamps older than the current window are discarded. When a new request comes in, the remaining entries are summed to determine the request rate; if the rate exceeds the threshold, the request is simply queued.

Sliding log algorithms don’t suffer from the stampeding issues of fixed windows. However, storing an unbounded log for every client can get quite expensive, and calculating the number of requests across multiple servers is also costly. As a result, sliding logs aren’t the best choice for scalable APIs, for preventing overload, or for stopping DoS attacks.
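A bare-bones sliding log, keeping an in-memory array of timestamps, might look like this sketch:

```javascript
// Sliding log sketch: every accepted request is logged with its timestamp;
// entries older than the window are discarded on each new request.
function createSlidingLog(limit, windowMs) {
  const log = []; // timestamps of accepted requests, oldest first

  return function allow(now) {
    // Drop entries that have aged out of the sliding window.
    while (log.length > 0 && now - log[0] >= windowMs) log.shift();
    if (log.length < limit) {
      log.push(now);
      return true;
    }
    return false; // over the threshold: queue or reject the request
  };
}
```

The per-request cost of pruning and storing the log is what makes this approach expensive at scale.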

Sliding Window

Sliding window algorithms combine the best of fixed window and sliding log algorithms. A cumulative counter for a set period is used, similar to the fixed window algorithm, while the previous window is also assessed to help smooth out bursts of traffic.

The small number of data points needed to assess each request makes the sliding window algorithm an ideal choice for processing large amounts of requests while still being light and fast to run.
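One common way to implement this, sketched below, keeps only two counters and weights the previous window by how much of it still overlaps the sliding window. The result is an estimate of the true rate, not an exact count:

```javascript
// Sliding window sketch: counters for the current and previous fixed
// windows, with the previous one weighted by its remaining overlap.
function createSlidingWindow(limit, windowMs) {
  let curId = -1;    // index of the current fixed window
  let curCount = 0;  // requests accepted in the current window
  let prevCount = 0; // requests accepted in the previous window

  return function allow(now) {
    const id = Math.floor(now / windowMs);
    if (id !== curId) {
      // Roll forward; if a whole window passed with no traffic, start fresh.
      prevCount = id === curId + 1 ? curCount : 0;
      curId = id;
      curCount = 0;
    }
    const elapsed = (now % windowMs) / windowMs; // fraction of window elapsed
    const estimate = prevCount * (1 - elapsed) + curCount;
    if (estimate < limit) {
      curCount += 1;
      return true;
    }
    return false;
  };
}
```

Only two counters per client are stored, which is why this approach stays light and fast under heavy load.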

Final Thoughts: The Effects of Rate Limiting

Today, we’ve looked at some of the best ways to limit API requests, but what are the effects of rate limiting APIs?

It’s never been more essential to make digital creations as efficient as possible. The research firm Dimensional Research conducted a study on mobile performance and user experience in 2015. It found that 80% of app users will attempt to use a problematic app only three times before uninstalling it, and that 36% of app users develop an unfavorable opinion of a brand due to app performance issues.

Unregulated API requests can also lead to slow page load times for websites. Not only can this leave customers with an unfavorable impression, but it can also tank your SEO rankings. With the prevalence of mobile Internet traffic, Google is factoring page speed into its rankings more and more.

Ensuring digital content is as fast and efficient as possible is vital in today’s global economy. Mobile app speed and page load times can fluctuate wildly from country to country. You don’t want to alienate your international customers with bloated, slow-to-load apps and websites.

As we’ve seen, rate limiting is an essential skill for developers of all stripes. If you’re using API requests in any regard, consider these rate-limiting techniques to improve security, business impact, and efficiency across the board.

J Simpson

About J. Simpson

J. Simpson lives at the crossroads of logic and creativity. He writes and researches tech-related topics extensively for a wide variety of publications, including Forbes Finds. He is also a graphic designer, journalist, and academic writer, writing on the ways that technology is shaping our society while using the most cutting-edge tools and techniques to aid his path. He lives in Portland, OR.