Everything You Need to Know About API Pagination

As consumer expectations ramp up, API performance has never been more important than it is today. An often-cited Google statistic holds that 53% of mobile site visits are abandoned if a page takes more than 3 seconds to load.

These expectations don’t necessarily line up with the technical requirements of an API. In the era of big data and analytics, APIs are dealing with larger amounts of data in their backend than ever before. To truly stand their ground in today’s digital economy, APIs must be optimized for peak efficiency. API pagination is a key strategy for making sure your APIs run smoothly and effectively.

But what is API pagination? How can API pagination help your APIs function at peak performance? We’re going to tell you, in our complete guide to API pagination.

To make sure we’re on the same page, let’s start by looking at what pagination is. Then we’ll delve deeper into API pagination with example code implementations.

What Is Pagination?

Have you ever clicked through an image gallery? Or read through an extensive web tutorial broken up into multiple segments? Do you know the numbers on the bottom of the gallery or webpage?

That’s pagination.

Sitechecker.pro, a technical SEO website, defines pagination as “an ordinal numbering of pages, which is usually located at the top or bottom of the site pages.” API pagination just applies that principle to the realm of API design.

API queries against large databases could potentially return millions, if not billions, of results. Returning them all in a single response would put a heavy load on the database, the network, and the client. Pagination limits the number of results per response to keep that load in check.

Let’s look at some of the most common API pagination methods before we look at coding examples.

Offset Pagination

Offset pagination is one of the simplest methods to implement. It’s achieved using limit and offset parameters. Offset pagination is popular with apps powered by SQL databases, as most SQL dialects already support LIMIT and OFFSET clauses on the SELECT statement.

An API request using limit and offset looks like:

GET /items?limit=20&offset=100

Offset pagination requires almost no programming. It’s also stateless on the server side and works regardless of custom sort_by parameters.

The downside of offset pagination is that it stumbles when dealing with large offset values. If you were to set an offset of 1000000, for example, the API would have to scan through a million database entries and then discard them.

The other downside of offset pagination is that adding new entries to the table while a client is paging can shift results between pages, which is known as page drift. Consider this scenario, with results sorted newest first:

Start with the query GET /items?offset=0&limit=15
Add 10 new items to the database
Perform the same query again. Only 5 of the 15 results are items from the original page, as adding 10 items pushed every existing item back by 10 positions. Requesting the next page with offset=15 would likewise repeat 10 items the client has already seen. This can cause a lot of confusion on the client-side.
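The drift scenario above can be reproduced with a small sketch. It assumes an in-memory SQLite table named items and newest-first ordering by id; the table, the helper names, and the data are made up for illustration and are not part of any real API.

```python
import sqlite3


def make_db(count):
    """Create an in-memory table holding `count` items (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
    db.executemany(
        "INSERT INTO items (name) VALUES (?)",
        [(f"item{i}",) for i in range(1, count + 1)],
    )
    return db


def fetch_page(db, limit, offset):
    """Return one page of item ids, newest first, using LIMIT and OFFSET."""
    rows = db.execute(
        "SELECT id FROM items ORDER BY id DESC LIMIT ? OFFSET ?",
        (limit, offset),
    )
    return [row[0] for row in rows]


db = make_db(30)
page1 = fetch_page(db, limit=15, offset=0)

# Ten new items arrive while the client is still looking at page 1.
db.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"new{i}",) for i in range(10)],
)

# Re-running the first query keeps only part of the original page.
rerun = fetch_page(db, limit=15, offset=0)
survivors = sorted(set(page1) & set(rerun))

# Asking for the next page repeats items the client has already seen.
page2 = fetch_page(db, limit=15, offset=15)
repeated = sorted(set(page1) & set(page2))
```

Here survivors holds the 5 original items that are still on the first page, and repeated holds the 10 items that show up on both page 1 and page 2.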

Keyset Pagination

Keyset pagination uses the filter values of the previous page to determine the next set of items. The columns used for those filters should be indexed so the database can jump straight to the start of the next page.

Consider this example:

Client requests most recent items GET /items?limit=20
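Upon clicking the next page, the query finds the minimum created date on the current page, 2019-01-20T00:00:00. This is then used to create a query filter for the next page. GET /items?limit=20&created:lte:2019-01-20T00:00:00 Because lte is inclusive, the boundary item appears again at the top of the next page and the client should drop it.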
And so on…

The benefits of this approach are that it doesn’t require additional backend logic and only needs one limit URL parameter. It also features consistent ordering, even when new items are added to the database. As long as the filter column is indexed, it also stays fast deep into a large result set, since there are no skipped rows to scan and discard.
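As a rough illustration of keyset pagination, the sketch below pages through an in-memory SQLite table by created timestamp. The schema, the created_before parameter name, and the data are assumptions made for the example. It uses a strict less-than comparison and assumes timestamps are unique, so the boundary item is not repeated.

```python
import sqlite3
from datetime import datetime, timedelta


def make_db(count):
    """Create `count` items, one per hour starting 2019-01-01 (made-up data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, created TEXT)")
    # The keyset column is indexed so the WHERE clause can seek instead of scan.
    db.execute("CREATE INDEX idx_items_created ON items (created)")
    start = datetime(2019, 1, 1)
    rows = [((start + timedelta(hours=i)).isoformat(),) for i in range(count)]
    db.executemany("INSERT INTO items (created) VALUES (?)", rows)
    return db


def fetch_page(db, limit, created_before=None):
    """Return (id, created) rows, newest first.

    `created_before` is the oldest timestamp seen on the previous page.
    """
    if created_before is None:
        cursor = db.execute(
            "SELECT id, created FROM items ORDER BY created DESC LIMIT ?",
            (limit,),
        )
    else:
        cursor = db.execute(
            "SELECT id, created FROM items WHERE created < ? "
            "ORDER BY created DESC LIMIT ?",
            (created_before, limit),
        )
    return cursor.fetchall()


db = make_db(50)
page1 = fetch_page(db, 20)

# A new item arrives between the two requests; page 2 is unaffected.
db.execute("INSERT INTO items (created) VALUES ('2019-03-01T00:00:00')")
page2 = fetch_page(db, 20, created_before=page1[-1][1])
```

Because page 2 is defined relative to a value rather than a position, the insert between the two requests does not shift its contents.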

Seek Pagination

Seek pagination is the next step beyond keyset pagination. By adding the query parameters after_id and before_id, you decouple pagination from filters and sorting. Unique identifiers are more stable and static than lower cardinality fields such as state enums or category names.

The only downside to seek pagination is it can be challenging to create custom sort orders.

Consider this example:

Client requests a list of the most recent items GET /items?limit=20
Client requests a list of the next 20 items, using the results of the first query GET /items?limit=20&after_id=20
Client requests the next/scroll page, using the final entry from the second page as the starting point GET /items?limit=20&after_id=40

Seek pagination can be expressed as a WHERE clause. For example:

SELECT
	*
FROM
	Items
WHERE
	Id > 20
ORDER BY
	Id
LIMIT 20

This is fine if you’re ordering by Id. What happens when you want to sort by email address, for example? The backend would first have to look up the email value of the row identified by after_id. A second query then has to be conducted using that email value in the WHERE clause.

Example:

Query 1

SELECT
	Email AS AFTER_EMAIL
FROM
	Items
WHERE
	Id = 20

Then…

Query 2

SELECT
	*
FROM
	Items
WHERE
	Email >= [AFTER_EMAIL]
ORDER BY
	Email
LIMIT 20

Benefits of Seek Pagination:

Uncouples filter logic from pagination logic
Consistent ordering, even when new items are added to the database. Sorts newest added items well.
Works smoothly, even with large offsets.

Disadvantages of Seek Pagination:

More demanding of backend resources than offset or keyset pagination
When items are deleted from the database, after_id may no longer be a valid id.
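The two-query approach and the deleted-id problem can be sketched as follows. This is a minimal example under stated assumptions: an in-memory SQLite table, hypothetical helper names, and a strict comparison with id as a tie-breaker so the row identified by after_id is not returned again (the queries above use >= instead).

```python
import sqlite3


def make_db():
    """60 items whose email order is the reverse of their id order (made-up data)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, email TEXT)")
    db.executemany(
        "INSERT INTO items (email) VALUES (?)",
        [(f"user{61 - i:03d}@example.com",) for i in range(1, 61)],
    )
    return db


def page_by_id(db, limit, after_id=0):
    """Default sort order: a single WHERE clause on the id is enough."""
    rows = db.execute(
        "SELECT id FROM items WHERE id > ? ORDER BY id LIMIT ?",
        (after_id, limit),
    )
    return [row[0] for row in rows]


def page_by_email(db, limit, after_id=None):
    """Custom sort order: translate after_id into an email, then seek past it."""
    if after_id is None:
        rows = db.execute(
            "SELECT id FROM items ORDER BY email, id LIMIT ?", (limit,)
        )
        return [row[0] for row in rows]
    # Query 1: look up the sort value of the row the client stopped at.
    row = db.execute(
        "SELECT email FROM items WHERE id = ?", (after_id,)
    ).fetchone()
    if row is None:
        # The item was deleted, so after_id no longer points anywhere.
        raise LookupError(f"after_id {after_id} does not exist")
    after_email = row[0]
    # Query 2: seek past that value; id breaks ties between equal emails.
    rows = db.execute(
        "SELECT id FROM items "
        "WHERE email > ? OR (email = ? AND id > ?) "
        "ORDER BY email, id LIMIT ?",
        (after_email, after_email, after_id, limit),
    )
    return [row[0] for row in rows]


db = make_db()
first_by_id = page_by_id(db, 20)
second_by_id = page_by_id(db, 20, after_id=first_by_id[-1])
first_by_email = page_by_email(db, 20)
second_by_email = page_by_email(db, 20, after_id=first_by_email[-1])
```

A real API would need to decide what to return when the lookup fails, for example an error response asking the client to restart from the first page.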

How To Implement API Pagination Into Your Own API Design

We’ve learned a lot about how API pagination works and why it’s important. Let’s conclude with a real-world example to help you work pagination into your own API design.

We’re going to work with a REST API, as they’re very common. You should be able to translate these practices into other programming languages and environments easily enough. For this example, we’re going to use the HAL Browser, which is an API browser that makes linking easy and intuitive. In case you’re not familiar with HAL, it stands for Hypertext Application Language. Adding HAL to your API will make it explorable and discoverable. It also allows your API to be served and consumed using open source libraries available for most programming languages.

Once you’ve got HAL and HAL Browser set up, let’s start with some code.

{
    "_links": {
        "self": {
            "href": "https://example.org/api/user?page=3"
        },
        "first": {
            "href": "https://example.org/api/user"
        },
        "prev": {
            "href": "https://example.org/api/user?page=2"
        },
        "next": {
            "href": "https://example.org/api/user?page=4"
        },
        "last": {
            "href": "https://example.org/api/user?page=133"
        }
    },
    "count": 3,
    "total": 498,
    "_embedded": {
        "users": [
            {
                "_links": {
                    "self": {
                        "href": "https://example.org/api/user/mwop"
                    }
                },
                "id": "mwop",
                "name": "Matthew Weier O'Phinney"
            },
            {
                "_links": {
                    "self": {
                        "href": "https://example.org/api/user/mac_nibblet"
                    }
                },
                "id": "mac_nibblet",
                "name": "Antoine Hedgecock"
            },
            {
                "_links": {
                    "self": {
                        "href": "https://example.org/api/user/spiffyjr"
                    }
                },
                "id": "spiffyjr",
                "name": "Kyle Spraggs"
            }
        ]
    }
}

As you can see, links are highly useful in creating pagination in a REST API.

Take note of the link names, as well. They’re not accidental. Relation names such as self, first, prev, next, and last are widely used by API developers, so your API will be consistent with a lot of other APIs if you use them.
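One way to generate such a _links object is sketched below. The function name and its parameters are hypothetical. It assumes page-number pagination where page 1 is served from the bare collection URL, as in the example above, and it leaves out prev and next when there is no such page.

```python
from math import ceil
from urllib.parse import urlencode


def pagination_links(base_url, page, per_page, total):
    """Build a HAL-style _links object for one page of a collection."""
    last_page = max(1, ceil(total / per_page))

    def href(number):
        # Page 1 is the bare collection URL; later pages add ?page=N.
        if number == 1:
            return base_url
        return f"{base_url}?{urlencode({'page': number})}"

    links = {
        "self": {"href": href(page)},
        "first": {"href": href(1)},
        "last": {"href": href(last_page)},
    }
    if page > 1:
        links["prev"] = {"href": href(page - 1)}
    if page < last_page:
        links["next"] = {"href": href(page + 1)}
    return links


# 12 items at 5 per page, currently on page 2 of 3.
links = pagination_links("https://example.org/api/user", 2, 5, 12)
```

Omitting next on the last page gives clients a simple stopping rule: keep following next until it is no longer present.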

You can use many different approaches to create pagination in REST. Query parameters just happen to be the easiest to implement as well as consistent, so we’re going to stick with that approach.

Let’s create an example scenario. We’re going to add a scenario to the Behat feature file features/api/programmer.feature and then list 5 programmers at a time using API pagination.

features/api/programmer.feature
  # we will do 5 per page
  Scenario: Paginate through the collection of programmers
    Given the following programmers exist:
      | nickname     |
      | Programmer1  |
      | Programmer2  |
      | Programmer3  |
      | Programmer4  |
      | Programmer5  |
      | Programmer6  |
      | Programmer7  |
      | Programmer8  |
      | Programmer9  |
      | Programmer10 |
      | Programmer11 |
      | Programmer12 |

After you make the first GET request, you’re going to search the response for the next link and use that to request the next page of results.

features/api/programmer.feature
  Scenario: Paginate through the collection of programmers
    Given the following programmers exist:
      # ... the same table of 12 programmers as above
    When I request "GET /api/programmers"
    And I follow the "next" link

This example uses the HATEOAS library, which renders links under _links in the format shown earlier. The test code relies on that convention, so it knows to look for _links, next, and href and to use that URL for the next GET request.
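The same link-following idea works on the client side. The sketch below walks a paginated collection by reading _links.next.href from each response until no next link remains. To stay self-contained it replaces HTTP with a hypothetical in-memory dictionary of responses; the URLs, the embedded key programmers, and the helper names are assumptions made for the example.

```python
def iter_items(fetch, start_url, embedded_key):
    """Yield every embedded item, following _links.next.href page by page."""
    url = start_url
    while url is not None:
        document = fetch(url)
        yield from document.get("_embedded", {}).get(embedded_key, [])
        # Stop when the response carries no "next" link.
        url = document.get("_links", {}).get("next", {}).get("href")


def build_fake_api(total, per_page):
    """Stand-in for a real API: maps each page URL to a HAL-style document."""
    names = [f"Programmer{i}" for i in range(1, total + 1)]
    last_page = -(-total // per_page)  # ceiling division
    pages = {}
    for page in range(1, last_page + 1):
        url = "/api/programmers" if page == 1 else f"/api/programmers?page={page}"
        chunk = names[(page - 1) * per_page : page * per_page]
        links = {"self": {"href": url}}
        if page < last_page:
            links["next"] = {"href": f"/api/programmers?page={page + 1}"}
        pages[url] = {
            "_links": links,
            "_embedded": {"programmers": [{"nickname": n} for n in chunk]},
        }
    return pages


fake_api = build_fake_api(total=12, per_page=5)
nicknames = [
    item["nickname"]
    for item in iter_items(fake_api.__getitem__, "/api/programmers", "programmers")
]
```

With a real API, fetch would be a function that performs the GET request and returns the decoded JSON body.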

Following these criteria, you know that Programmer7 would be contained on Page 2 of results. You won’t find Programmer2 or Programmer11. Those would be on Page 1 and Page 3 of results, respectively. So try running the scenario that starts on a particular line of the feature file.

php vendor/bin/behat features/api/programmer.feature:96

This doesn’t work yet, however, as the test can’t find the next link: the pagination links haven’t been added to the response.

Adding Pagination Links Using HATEOAS

The HATEOAS documentation specifically covers adding pagination links. To do this, you’re going to use a resource called PaginatedRepresentation. Let’s start from the listAction method in ProgrammerController.

src/KnpU/CodeBattle/Controller/Api/ProgrammerController.php

    public function listAction()
    {
        $programmers = $this->getProgrammerRepository()->findAll();
        $collection = new CollectionRepresentation(
            $programmers,
            'programmers',
            'programmers'
        );
    }

Now let’s add pagination by creating a new PaginatedRepresentation and assigning it to $paginated. Its arguments are filled in during the next step.

use Hateoas\Representation\PaginatedRepresentation;

// ...

    public function listAction()
    {
        // ... $programmers is loaded as before
        $collection = new CollectionRepresentation(
            $programmers,
            'programmers',
            'programmers'
        );
        $paginated = new PaginatedRepresentation(
            // arguments are filled in during the next step
        );
    }

Normally, you’d use a library to create pagination. Since we’re just creating a mock-up, we’re hard-coding some of the variables for the sake of argument. We’ll set the page to always remain on 1 and the limit to 5. This allows you to calculate the total number of pages as well. In this example, we’ve got 12 programmers, divided by 5, and then rounded up using the ceil function.

    public function listAction()
    {
        // ... $programmers and $collection are created as before
        $limit = 5;
        $page = 1;
        $numberOfPages = (int) ceil(count($programmers) / $limit);
        $paginated = new PaginatedRepresentation(
            $collection,
            'api_programmers_list',
            array(),
            $page,
            $limit,
            $numberOfPages
        );
        // ... the response is created in the next step
    }

Now that we’ve created the $paginated resource, we can pass it to createApiResponse.

    public function listAction()
    {
        $paginated = new PaginatedRepresentation(
            $collection,
            'api_programmers_list',
            array(),
            $page,
            $limit,
            $numberOfPages
        );
        $response = $this->createApiResponse($paginated, 200, 'json');
        return $response;
    }

Now try the test again.

php vendor/bin/behat features/api/programmer.feature:96

This still returns an error, but we’re getting somewhere. It detects Programmer7 but Programmer2 and Programmer11 are still present, as all of the programmers are still included in the list.

Now let’s turn this into actual pagination. Since page and limit are already being used as query parameters, let’s continue using those. To read information from the request, simply add a Request $request argument to the controller action.

    public function listAction(Request $request)
    {
    }

Next, read the query parameters. Call $request->query->get('page', 1): the second argument is the default value, used if no page parameter is sent for any reason. The same goes for limit. It can be set by the end user, but we’re going to default to 5.

    public function listAction(Request $request)
    {
        $limit = $request->query->get('limit', 5);
        $page = $request->query->get('page', 1);
    }
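Since limit can be set by the end user, it is worth thinking about values that are missing, non-numeric, or unreasonably large. The sketch below shows one way to parse the two parameters with the same defaults of page 1 and limit 5. The MAX_LIMIT cap and the function name are assumptions added for the example, not something the tutorial code does.

```python
from urllib.parse import parse_qs

DEFAULT_PAGE = 1
DEFAULT_LIMIT = 5
MAX_LIMIT = 100  # hypothetical upper bound, chosen for illustration


def parse_paging(query_string):
    """Return (page, limit) from a query string, falling back to defaults."""
    params = parse_qs(query_string)

    def as_int(name, default):
        # Missing or non-numeric values fall back to the default.
        try:
            return int(params[name][0])
        except (KeyError, ValueError):
            return default

    page = max(1, as_int("page", DEFAULT_PAGE))
    limit = min(MAX_LIMIT, max(1, as_int("limit", DEFAULT_LIMIT)))
    return page, limit


defaults = parse_paging("")
explicit = parse_paging("page=3&limit=10")
clamped = parse_paging("page=-2&limit=100000")
```

Clamping is only one possible policy; an API could equally reject out-of-range values with an error response.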

Most API pagination queries will use some combination of limits and offsets to narrow down the search results. Since this example only has 12 entries, we’re just going to return all of the entries and use PHP arrays to deliver the desired results.

If you were to implement this in an actual API, you might use a library like Pagerfanta, which facilitates pagination as well as featuring adapters for search engines.

For this example, we’re still going to use manual logic and then use the array_slice function to subdivide the results.

    public function listAction(Request $request)
    {
        $programmers = $this->getProgrammerRepository()->findAll();
        // query parameters arrive as strings, so cast them to integers
        $limit = (int) $request->query->get('limit', 5);
        $page = (int) $request->query->get('page', 1);
        // my manual, silly pagination logic. Use a real library
        $offset = ($page - 1) * $limit;
        $numberOfPages = (int) ceil(count($programmers) / $limit);
        $collection = new CollectionRepresentation(
            array_slice($programmers, $offset, $limit),
            'programmers',
            'programmers'
        );
        // ... build $paginated and return the response as before
    }

This loads all of the results and then uses the array_slice function to return only the requested page. Go ahead and try the test again:

php vendor/bin/behat features/api/programmer.feature:96

Success! You should have a working paginated API at this point.
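The arithmetic behind the manual logic is easy to check in isolation. The sketch below mirrors the same offset and page-count calculation using list slicing; the paginate function is a hypothetical helper, not part of the tutorial code.

```python
from math import ceil


def paginate(items, page, limit):
    """Return (items on the requested page, total number of pages)."""
    offset = (page - 1) * limit
    number_of_pages = ceil(len(items) / limit)
    return items[offset:offset + limit], number_of_pages


programmers = [f"Programmer{i}" for i in range(1, 13)]
page2, number_of_pages = paginate(programmers, page=2, limit=5)
page3, _ = paginate(programmers, page=3, limit=5)
```

With 12 programmers and a limit of 5 there are 3 pages, page 2 holds Programmer6 through Programmer10, and the last page holds the remaining two.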

Example of Pagination in Practice: Spotify API

Pagination is commonly used in popular public web APIs that must deal with a lot of data on the backend. It’s also adopted alongside filtering capabilities to help return more relevant information.

For example, consider a developer who wants to send a request to the Spotify API to retrieve catalog information about a specific artist’s albums. Using pagination and filtering could be helpful, especially if the artist has a large discography.

The following request could be used to return a list of albums from a particular artist that are available in the Swedish market. It uses the limit parameter to limit the response to ten albums.

https://api.spotify.com/v1/artists/{id}/albums?include_groups=album&market=SE&limit=10

The limit parameter can also be used with offset to retrieve further results. For example, this would respond with the next ten albums.

https://api.spotify.com/v1/artists/{id}/albums?include_groups=album&market=SE&limit=10&offset=10
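To page through a whole discography, a client can generate one such URL per page. The sketch below only builds the URLs and makes no network calls. The helper names and the ARTIST_ID placeholder are assumptions, and the total number of albums would in practice come from the API response.

```python
from urllib.parse import urlencode


def albums_url(artist_id, limit, offset=0, market="SE"):
    """Build the album listing URL for one page of results."""
    query = {"include_groups": "album", "market": market, "limit": limit}
    if offset:
        # offset is omitted for the first page, as in the first request above.
        query["offset"] = offset
    return f"https://api.spotify.com/v1/artists/{artist_id}/albums?{urlencode(query)}"


def page_urls(artist_id, total, limit):
    """Build one URL per page for a collection of `total` albums."""
    return [albums_url(artist_id, limit, offset) for offset in range(0, total, limit)]


# "ARTIST_ID" is a placeholder, not a real Spotify id.
urls = page_urls("ARTIST_ID", total=25, limit=10)
```

Actual requests to the Spotify API also need an access token, which is outside the scope of this sketch.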

The Spotify API documentation is a model example of how pagination is often adopted within web APIs. By modeling this and other popular public APIs, you can build pagination best practices into your services that enable users to seamlessly interact with complex databases.

API Pagination: Summary and Best Practices

As APIs continue to get more involved and elaborate, API pagination is only going to become more essential. As we’ve shown, some APIs can return millions of search results, if not more. This can slow the response time of an API call down to a crawl.

To summarize, we’ve looked at what API pagination is, as well as some of the most common methods of implementation. These include offset pagination, keyset pagination, and seek pagination. We also discussed some of the merits and shortcomings of each approach to help you decide which approach is best for your API designs.

Offset pagination is the easiest to implement. It has its limitations, however, such as slow queries at large offset values and inaccuracies that come from page drift. Keyset pagination is a bit more robust, but it ties pagination to the filter and sort fields of the results. Seek pagination is even more robust, as it decouples pagination from filtering and returns consistent ordering even when new items are added to the table. Seek pagination can be more complicated for the backend to implement, however. It can also get thrown off when items are removed from the database.

API pagination is essential if you’re dealing with a lot of data and endpoints. Pagination automatically implies adding order to the query result. The object ID is a common default sort key, but results can be ordered in other ways as well.

Finally, we concluded with some code examples to give you some practical insights on writing your own API pagination code.

API pagination is a vast topic. There’s a lot that can be said about it. You can learn more about it by reading this in-depth article from Moesif or this one from Dzone.