What’s the Difference Between an API Gateway and a Load Balancer?

Terminology matters, especially when discussing overlapping technologies. A simple misunderstanding of a term can shape how developers and users think about a system and cause problems that persist for a long time. Accordingly, it is worth taking a closer look at terms that are often conflated.

Two such terms are API gateways and load balancers. These two technologies might sound quite similar, and in practical application, they seem to operate similarly. But ultimately, these technologies are built for very different things. Accordingly, it’s important to differentiate their purposes, intents, and use cases.

In this piece, we’ll compare the API gateway and load balancer concepts. We’ll define them, provide some real-world examples of each, and present some compelling use cases for both technologies.

What is an API Gateway?

To understand how an API Gateway functions, we must first understand what an API actually is. In their simplest form, APIs are a layer of communication between a requester and a provider to facilitate the transfer of information across networked devices. In other words, when you request the weather for a location, your device connects with a weather API to generate a response to the specific request you have made.

In modern web development, APIs are often built as microservices. The idea behind a microservice is that each piece of functionality stands alone as its own service and connects to other services to form the complete application.

One of the easiest ways to understand web APIs is with a metaphor — in this case, a librarian. Imagine you enter a library looking for a specific book — you may not know the author or even the title, but you know what it’s about and the general year it was released. You make this request to the librarian, and using this information, the librarian checks their catalog to find the title you are looking for. In the web API space, your book is the item you need, and your request is handled by the librarian, or the API, to retrieve resources.

Now imagine this same scenario, but instead of just one request, there is a crowd of 300 people all requesting information. This might seem dramatized, but think of how many people are probably accessing Spotify or YouTube right now — all of those requests are facilitated using an API. As such, APIs need to ensure that the requests are handled properly.

Enter the API gateway. API gateways are the facilitation layer for these abundant requests, allowing clients to make calls and then routing them to the appropriate microservice. Notably, API gateways also typically handle protocol and specification compliance, and as such, are best seen as the “enforcers” of an API’s ruleset and framework. No matter the request, it passes through the API gateway and is transformed and routed in the way the gateway sees as appropriate. Many pre-built API gateways are available on the market, and some organizations choose to develop their own in-house.
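As a rough illustration of the routing role described above, the sketch below maps a request path to a backing microservice. The route table, the service names, and the `route` function are all hypothetical and exist only for this example; production gateways are usually configured rather than hand-written like this.

```python
# Hypothetical route table: path prefix -> microservice name (illustrative only).
ROUTES = {
    "/weather": "weather-service",
    "/users": "user-service",
    "/playlists": "playlist-service",
}

ALLOWED_METHODS = {"GET", "POST", "PUT", "DELETE"}


def route(method: str, path: str) -> str:
    """Return the name of the microservice that should handle the request."""
    # Enforce a simple rule before routing: only known HTTP methods pass.
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    for prefix, service in ROUTES.items():
        # Match the prefix exactly or as a leading path segment.
        if path == prefix or path.startswith(prefix + "/"):
            return service
    raise LookupError(f"no route for {path}")


print(route("GET", "/weather/london"))
```

The point of the sketch is that the decision is made by looking at the request itself (its method and path), not at the state of any server.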

What is a Load Balancer?

One might look at the term “load balancer” and assume it shares overlapping functions with a gateway. After all, what is an API gateway but a request-balancing system? In one way, this is true — an API gateway does indeed balance the load of requests. There are some major differences, however, that set the technologies apart.

First and foremost, load balancing is principally a network balancing system. When we refer to “load balancing,” we are referring to the distribution of network traffic, requests, and processing across a group of servers, whether those servers are physical or virtual. For example, when a request hits the servers at YouTube, it is not handled by a single massive server. Instead, it is balanced across multiple servers and nodes so that the video is delivered to the user from a server capable of serving it. Load balancers are principally designed to allow cost-effective scaling, efficiency, and reliability by ensuring that no single server is overloaded.

If we keep with our earlier example about a library, this would be more akin to having a group of deputy librarians all waiting behind the principal librarian to carry out the task in question. Requests are sent to the specific deputy librarian who can perform the task, which is then handled via the internal system invisibly from the user’s perspective. Load balancers are essentially the “traffic cops” of the server world, ensuring that speed, efficiency, and cost are balanced with serving client requests in the most effective way.
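To make the "deputy librarians" idea concrete, here is a minimal sketch of one common balancing strategy, picking the server with the fewest active requests. The `Server` class and `pick_least_loaded` function are hypothetical names chosen for this example, and real load balancers track far more state than a single counter.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    active_requests: int = 0


def pick_least_loaded(servers: list[Server]) -> Server:
    """Choose the server currently handling the fewest requests."""
    return min(servers, key=lambda s: s.active_requests)


pool = [Server("a"), Server("b"), Server("c")]
for _ in range(6):
    chosen = pick_least_loaded(pool)
    chosen.active_requests += 1  # the request is now in flight on that server

print([s.active_requests for s in pool])  # [2, 2, 2]
```

Notice that the function never inspects what a request asks for; it only looks at how busy each server is.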

What’s the Difference?

Let’s look at this from the perspective of the systems themselves. For both the load balancer and the API gateway, what matters most is what, specifically, is being balanced.

From the perspective of the API gateway, the requests themselves are being balanced. The gateway’s priority is to take each request and ensure that it is properly formatted, that it reaches the correct microservice, and that requests are evened out so resources can serve them. However, API gateways can do much more than simple routing. They often offer authentication, compliance, and other verification systems to ensure that the request has the best chance of being served.
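The verification step can be sketched as a small check that runs before any routing happens. Everything here is an assumption made for illustration: the `VALID_API_KEYS` set stands in for a real credential store, and the request is modeled as a plain dictionary.

```python
VALID_API_KEYS = {"demo-key"}  # hypothetical credential store


def check_request(request: dict) -> tuple[int, str]:
    """Return an HTTP-style (status, reason) pair for an incoming request."""
    # Authentication: reject callers without a known API key.
    if request.get("api_key") not in VALID_API_KEYS:
        return 401, "unauthorized"
    # Format compliance: the body must be a JSON-like object.
    if not isinstance(request.get("body"), dict):
        return 400, "malformed body"
    return 200, "ok"


print(check_request({"api_key": "demo-key", "body": {"city": "London"}}))
print(check_request({"api_key": "wrong", "body": {}}))
```

Only requests that pass checks like these would go on to be routed to a microservice.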

Load balancers, on the other hand, are less concerned with the requests and much more concerned with the network traffic. In other words, load balancers balance the network, not the requests. A load balancer doesn’t really care whether the request is well-formed or which microservice it is ultimately destined for. Instead, it cares whether the resource that can serve the request is free to do so or is already overly taxed with network demands.

Practical Examples

API Gateway

Let’s look at a practical example of an API gateway. One great example is the work done by Netflix. Netflix faces many challenges specific to its core offering, but perhaps the biggest is the sheer number and variety of devices that can access the service. Televisions, phones, laptops, tablets, and other devices of all different sizes and resolutions all use different apps to request video content in various sizes and formats.

To resolve these requests, Netflix utilizes an approach known as the Federated Gateway. In this solution, an API gateway takes in each request and then routes it to one of three core entities:

  • The Movie Entity: The shows, movies, shorts, etc., that make up the Netflix offerings.
  • The Production Entity: A studio production that contains locations, vendors, and more.
  • The Talent Entity: All of the associated talent who work on the movie.

In this federated architecture, a client request enters the system and is met by a GraphQL gateway. From there, the request is distributed to the microservices through a federated system that uses the Schema Registry as its source of truth, and the coalesced result is returned through the gateway as a single response.
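The fan-out-and-coalesce flow can be sketched in a few lines. This is a plain-Python stand-in, not GraphQL and not Netflix's implementation: the three service functions, their return values, and the `SCHEMA_REGISTRY` dictionary are hypothetical placeholders for the entity services and the registry named above.

```python
# Hypothetical entity services; each owns one part of the overall data.
def movie_service(movie_id: str) -> dict:
    return {"title": f"Movie {movie_id}"}


def production_service(movie_id: str) -> dict:
    return {"locations": ["Studio A"]}


def talent_service(movie_id: str) -> dict:
    return {"cast": ["Actor One"]}


# Stand-in for a schema registry: which service owns which top-level field.
SCHEMA_REGISTRY = {
    "movie": movie_service,
    "production": production_service,
    "talent": talent_service,
}


def resolve(movie_id: str, fields: list[str]) -> dict:
    """Fan a query out to the owning services and merge the results."""
    response = {}
    for field in fields:
        service = SCHEMA_REGISTRY[field]  # look up the owner of this field
        response[field] = service(movie_id)
    return response  # one coalesced response for the client


print(resolve("42", ["movie", "talent"]))
```

The client sees a single response even though two separate services contributed to it.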

Load Balancer

Interestingly enough, one of the best examples of a load balancer can also be found in Netflix! Netflix utilizes a load balancing mechanism they call Edge Load Balancing. In essence, when a request enters the network at the “edge” of its map (that is, it enters from outside the Netflix network into the network itself), it goes through several mechanisms to deliver the most effective load balancing.

First, the request is put through a combined algorithm that pairs Join-the-Shortest-Queue (JSQ) with a measure of server-reported utilization. In essence, this weighs how many requests the load balancer already has in flight to each server against how busy and healthy each server reports itself to be, in order to find the best server to respond to the request. By adopting this combination, Netflix can balance the health of its internal servers with the end-user experience.
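One way to picture combining the two signals is a weighted score, sketched below. The weighting scheme, the `max_in_flight` cap, and the `Backend` class are assumptions made for this example; the article does not describe how Netflix actually combines the values.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    in_flight: int      # requests this load balancer has open to the server
    utilization: float  # latest value the server reported about itself, 0.0 to 1.0


def combined_score(b: Backend, queue_weight: float = 0.5,
                   max_in_flight: int = 100) -> float:
    """Lower is better. Blends queue length with server-reported utilization."""
    queue_part = min(b.in_flight / max_in_flight, 1.0)  # normalize to 0.0..1.0
    return queue_weight * queue_part + (1 - queue_weight) * b.utilization


def pick(backends: list[Backend]) -> Backend:
    return min(backends, key=combined_score)


backends = [
    Backend("a", in_flight=10, utilization=0.9),
    Backend("b", in_flight=30, utilization=0.2),
]
print(pick(backends).name)  # b
```

Shortest-queue alone would choose server "a" here, since it has fewer requests in flight from this load balancer. Adding the server's own report reveals that "a" is busy with other work, so "b" wins.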

Additionally, Netflix uses a choice-of-2 algorithm, in which two candidate servers are picked at random and the one with the better statistics is chosen. This allows finer tuning of which server is best suited to respond to the request, with the load balancer comparing candidates on Client Health, Server Utilization (the most recent score reported by the server itself), and Client Utilization (the current number of requests from the load balancer to the server).
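A minimal sketch of the general choice-of-2 technique follows: sample two servers at random and keep the better one. The `Candidate` fields mirror the three statistics named above, but comparing them in strict priority order (errors, then utilization, then in-flight count) is an assumption for this example rather than Netflix's documented rule.

```python
import random
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    error_rate: float   # client health: share of recent requests that failed
    utilization: float  # server utilization: latest score from the server
    in_flight: int      # client utilization: open requests from this balancer


def is_better(a: Candidate, b: Candidate) -> bool:
    """Assumed ordering: compare the three statistics in priority order."""
    return (a.error_rate, a.utilization, a.in_flight) <= (
        b.error_rate, b.utilization, b.in_flight)


def choose_of_two(servers: list[Candidate], rng: random.Random) -> Candidate:
    first, second = rng.sample(servers, 2)  # two distinct random candidates
    return first if is_better(first, second) else second


servers = [
    Candidate("good", 0.00, 0.30, 4),
    Candidate("ok", 0.01, 0.50, 9),
    Candidate("worst", 0.20, 0.95, 40),
]
rng = random.Random(0)
picks = {choose_of_two(servers, rng).name for _ in range(100)}
print("worst" in picks)  # False
```

Because every pick compares two distinct servers, the single worst server always loses its comparison and is never selected, yet the balancer never has to rank the whole pool.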

Additional systems allow for even more granular control. These include filtering (which removes servers that consistently exceed preconfigured health and utilization thresholds), probation (which prevents new servers from facing a deluge of requests before they can even establish their own metrics), and server-age-based warmup (which gradually increases traffic to a server over its first 90 seconds of uptime).
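Two of these controls, filtering and warmup, can be sketched as follows. The 90-second window comes from the description above; the linear ramp, the threshold values, and the dictionary layout are assumptions made for illustration.

```python
WARMUP_SECONDS = 90.0  # warmup window described above


def warmup_weight(age_seconds: float) -> float:
    """Fraction of normal traffic a server should receive (assumed linear ramp)."""
    if age_seconds <= 0:
        return 0.0
    return min(age_seconds / WARMUP_SECONDS, 1.0)


def filter_unhealthy(servers: list[dict], max_utilization: float = 0.9,
                     max_error_rate: float = 0.2) -> list[dict]:
    """Drop servers whose metrics exceed the (hypothetical) thresholds."""
    return [
        s for s in servers
        if s["utilization"] <= max_utilization and s["error_rate"] <= max_error_rate
    ]


fleet = [
    {"name": "a", "utilization": 0.50, "error_rate": 0.01},
    {"name": "b", "utilization": 0.97, "error_rate": 0.02},
]
print([s["name"] for s in filter_unhealthy(fleet)])  # ['a']
print(warmup_weight(45))  # 0.5
```

A server that is 45 seconds old would receive half of its normal share under this ramp, and full traffic from 90 seconds onward.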

This kind of system is highly effective at ensuring that internal resources serve requests as efficiently as possible. According to Netflix’s documentation, this has dramatically reduced load-time issues and error rates. Netflix specifically notes that it has resulted in dramatic improvements to the load-related error rate, which is basically a complex way of saying “it balances the load effectively.”


What’s important to remember about API gateways and load balancers is that although they sound very similar, they function on opposite sides of the system. API gateways are much more concerned with the requests themselves, whereas load balancers are far more concerned with the servers that will answer those requests. Netflix is a good use case in which those lines are much more clearly defined, and is a key example to keep in mind when differentiating these technologies.

Are there any caveats we missed? Let us know in the comments below!