How to Design a Scalable GraphQL Schema

GraphQL is still the shiny new tech in the API space. Offering powerful functionality with a gentle learning curve, GraphQL seems to actually be delivering on its promises. Maybe this is why, according to the 2023 Postman State of the API Report, GraphQL eclipsed SOAP in usage for the first time.

The problem is that any issues that might arise with GraphQL typically come after some time. Certain problems can start at the first step but only rear their heads after a year or more. Accordingly, understanding how to build a GraphQL schema that is not only effective but intrinsically scalable is a significant part of practical implementation. To be truly scalable is to be built for the future rather than the now, and given the flexibility of GraphQL, anything other than forethought and wise development is a waste of potential.

Today, we’re going to dive into what makes a GraphQL schema scalable. We took inspiration from a recent blog post by Nord Security on Medium, and added our own insights and suggestions. After you finish this piece, you should have a strong sense of what scalability actually is, what it means to be scalable in the GraphQL world, and how best to ensure your product is scalable from step one.

What is Scalability?

We have discussed scalability at length in previous pieces on Nordic APIs, but it bears repeating here, as scalability is part and parcel of a successful GraphQL implementation.

Scalability is, in essence, about scaling the technical implementation to the needs of the use case, whether that use case is driven by the consumer or by the business. For example, an API initially built to serve 1,000 customers might soon need to serve 20,000, half of whom want more complex functions. Scalability ensures that an ever-increasing number of concurrent requests is supported without negatively impacting the service. For the end user, there should be no noticeable difference between the service answering one request and the service answering one million requests.

So what makes something scalable? Being stateless helps: each request can be serviced on its own, which makes request handling easier to optimize and more efficient. Asynchronous processing is also beneficial, as it reduces the overhead built into synchronous, direct connections while still allowing users to interact with the underlying data structure. Allowing for various formats and modes, which is something GraphQL does in spades, also helps: it removes the need to pre-process data for the end user based on assumptions and instead shapes each response around the specific user need.


Methods to Design a Scalable GraphQL Schema

Design Around Business Needs

Scalability necessarily means designing for business needs. Ultimately, an API is going to function only based upon the context it’s given — APIs are not sentient AI-powered systems (yet), and, especially within the context of GraphQL, form is meant to help provide some structure to user requests and to the way the API itself functions. Accordingly, ensuring that the underlying schema of the GraphQL instance matches the business functionality and the needs of said business will go a long way towards ensuring that it is scalable.

Imagine, for a moment, that you build a GraphQL instance instead based around data structures. That might make sense for the here and now, but what happens when your business grows, adding new functions, data formats, and needs? The API that you built around audio provisioning and purchasing might not be able to grow to service video provisioning, merchandise, and equipment sales, in such a case. If you first develop around the business case, you provide a future-proof modality by which scalability is enhanced instead of hindered.
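As a rough sketch of what designing around the business case could look like, the schema below models the business concept ("something we sell") rather than a single data format. All type and field names here (Product, AudioTrack, Video, Merchandise, priceCents, and so on) are hypothetical and chosen only for illustration.

```graphql
# Hypothetical sketch: the business concept is "a product we sell",
# not "an audio file". New product lines implement the same interface.
interface Product {
  id: ID!
  title: String!
  priceCents: Int!
}

type AudioTrack implements Product {
  id: ID!
  title: String!
  priceCents: Int!
  durationSeconds: Int!
}

type Video implements Product {
  id: ID!
  title: String!
  priceCents: Int!
  durationSeconds: Int!
  resolution: String
}

type Merchandise implements Product {
  id: ID!
  title: String!
  priceCents: Int!
  sizes: [String!]!
}

type Query {
  # Clients that only care about "things for sale" never need to change
  # when a new product type is added.
  products(first: Int = 20): [Product!]!
}
```

Under these assumptions, adding equipment sales later would mean adding one more type that implements Product, rather than reworking queries that were written around audio files.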

Be Clear and Standardize

Naming is perhaps the most crucial step when first developing a schema. Names aren’t just handy labels — they’re systems of contextualization and information. Proper name labeling can help a developer or a user understand the purposes, form, and function of something with very little lift. Conversely, poor clarity and standardization can do the inverse, blocking scalability and worsening the end product.

Let’s get back to that audio API we discussed above. When you built your API, you created a type that denoted Author, allowing for a single string to define the Author in question.

type Author {
  credit: String!
}

Now let’s assume that the business has boomed, and you now provide audiobooks in addition to digital webcomics, ebooks, digital art, and audio novels. In such a case, the Author type no longer represents the wide range of artists or voiceover talent that may work on a single piece of media. More importantly, if the schema requires an Author for every entry, you may quickly find yourself with a piece that does not have an “author,” so to speak, and therefore cannot be represented by your schema.

To fix this, we can denote the specific kind of author we’re discussing:

type Author {
  creditType: String!
}

And from here, we can expand this to include a type and a credit:

type Author {
  creditType: String!
  creditInfo: String!
}

This is to say nothing, of course, of the lack of standardization that may come into play. If items are named poorly, you may not even know what each one is as you dive into the code. Adopting clear, standardized naming helps negate many of these issues.
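One possible way to push the credit example further is sketched below. It keeps the creditType and creditInfo field names from above, but the Credit and Work types, the CreditType enum, and its values are assumptions made for illustration. It follows the common GraphQL naming convention of PascalCase for types, camelCase for fields, and upper-case names for enum values.

```graphql
# Hypothetical sketch: an enum standardizes the allowed credit kinds
# instead of leaving creditType as a free-form string.
enum CreditType {
  AUTHOR
  NARRATOR
  ILLUSTRATOR
  MUSICIAN
}

type Credit {
  creditType: CreditType!
  creditInfo: String!
}

type Work {
  id: ID!
  title: String!
  # The list itself is required but may be empty, so a piece with no
  # "author" can still be represented.
  credits: [Credit!]!
}
```

The trade-off in this sketch is that every new kind of credit requires a schema change to the enum, which may or may not be acceptable depending on how often the business adds new roles.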

Design for Simplicity and Functional Purposes

When looking at how things function, ensure they are designed around the simplest unit possible that makes the most sense and is the most functional.

There are two core points to consider here. The first is the nesting of object types. Consider how your schema is built: while it might make sense to you to bundle everything together into a single type or, conversely, to make each field its own type, either extreme results in something that is not fit for purpose. Oversimplifying and overcomplicating will both quickly make a mess of the codebase, confusing even your best users.

Instead, adopt the motto that things should be simplified to their sensible balancing point. Instead of bundling all the country data into the Address type, create a Country type! Instead of creating an Author type and pushing everything into this piece, create a Creator type that then allows Author, Musician, or Artist to be referenced.
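The two suggestions above might look something like the following sketch. The field names (isoCode, displayName, penName, instruments, medium) are hypothetical, and modeling Creator as an interface is one option among several; a union would be another.

```graphql
# Country is its own type, so it can be reused and extended
# without touching Address.
type Country {
  isoCode: String!
  name: String!
}

type Address {
  street: String!
  city: String!
  postalCode: String
  country: Country!
}

# Creator holds what every kind of creator shares; each concrete
# type adds only what is specific to it.
interface Creator {
  id: ID!
  displayName: String!
}

type Author implements Creator {
  id: ID!
  displayName: String!
  penName: String
}

type Musician implements Creator {
  id: ID!
  displayName: String!
  instruments: [String!]!
}

type Artist implements Creator {
  id: ID!
  displayName: String!
  medium: String
}
```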

The second, related point is keeping interfaces small and purpose-built. Yes, an omni-interface sounds good in theory, but what are your end users actually going to use? What does it matter if they have 1,000 potential interface features if 90% of your user base only uses a small subset?
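As a minimal sketch of small, purpose-built interfaces, the example below splits shared behavior into three narrow interfaces that types opt into as needed. The names Node, Timestamped, Purchasable, Audiobook, and Review are illustrative assumptions, not part of any particular schema.

```graphql
# Each interface covers exactly one concern.
interface Node {
  id: ID!
}

interface Timestamped {
  createdAt: String!
  updatedAt: String!
}

interface Purchasable {
  priceCents: Int!
}

# An audiobook is identifiable, timestamped, and for sale.
type Audiobook implements Node & Timestamped & Purchasable {
  id: ID!
  createdAt: String!
  updatedAt: String!
  priceCents: Int!
  title: String!
}

# A review is identifiable and timestamped, but not for sale,
# so it is not forced to carry a price field.
type Review implements Node & Timestamped {
  id: ID!
  createdAt: String!
  updatedAt: String!
  body: String!
}
```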

Plan for the User

Ultimately, much of this advice falls upon one core conceit — the end user will make your schema do what they want it to do, and the difference between being scalable and not scalable is the level of difficulty they take on in doing so. The end user will ultimately be the one who uses this system, and their needs and desires should drive your schema through the business logic.

We’ve seen failures in this domain time and time again. Car manufacturers have designed cars for drivers they assumed wanted them, only to find that those drivers don’t exist and the market does not want the product. In the same vein, API developers shouldn’t design a system without considering what the market actually desires.

The schema should reflect the end-user experience. Does your userbase utilize the highest-resolution content you can host? Be prepared to default to that setting and provide upscaling systems. Conversely, do your users often use the API in countries with less well-formed internet networks with average lower speeds? Serve a schema that adopts low default payload sizes and only serves the information explicitly requested.
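For the low-bandwidth case, one way to express conservative defaults directly in the schema is through default argument values. This is only a sketch: the Track and CoverImage types, the ImageQuality enum, and the specific defaults (10 items, LOW quality) are assumptions for illustration.

```graphql
enum ImageQuality {
  LOW
  MEDIUM
  HIGH
}

type CoverImage {
  url: String!
  width: Int!
  height: Int!
}

type Track {
  id: ID!
  title: String!
  # Defaults to the smallest image unless the client asks for more.
  cover(quality: ImageQuality = LOW): CoverImage
}

type Query {
  # Small page size by default; clients opt in to larger pages.
  tracks(first: Int = 10, after: String): [Track!]!
}
```

A client on a slow connection could then send a query like the one below and receive only ten titles, with no image data at all, because nothing else was requested:

```graphql
query {
  tracks {
    title
  }
}
```

If your audience instead favors the highest-resolution content, the same pattern applies with the default flipped to HIGH.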

Understanding these needs and adapting the schema and its default responses to them will go a long way toward ensuring you have a scalable schema that fits your current purpose.

Plan for the Future

Let’s be real, though. The current process isn’t going to be current for too long. Accordingly, adopting a planning mindset for the future is also very important. You have an incredible wealth of data at your fingertips: user interactions, desires, and flows can be surveyed and understood with little effort. With a little development effort early in the life of the schema, the instrumentation needed to collect this data can be built into the implementation itself.

Look forward, not backward, and gather as much information as you can about the desires of your userbase. This information will inform your efforts to make the schema that much better and that much more scalable.
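One schema-level way to act on that information is GraphQL's built-in @deprecated directive, which lets a schema evolve without breaking existing clients. The sketch below assumes, hypothetically, that usage data shows clients have moved from a single author string to a richer credits list; the Track and Credit types are illustrative only.

```graphql
type Credit {
  creditType: String!
  creditInfo: String!
}

type Track {
  id: ID!
  title: String!
  credits: [Credit!]!
  # Kept so existing clients keep working; tooling can surface the
  # deprecation reason to developers who still request this field.
  author: String @deprecated(reason: "Use `credits` instead.")
}
```

Once usage of the deprecated field drops far enough, it can be removed with much less risk than removing it blind.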

Conclusion

Ultimately, scalability is a question of user experience. Providing a system that scales to the needs of the userbase while adhering to the business logic is a delicate balancing act that, if done correctly, can reap massive benefits for all involved.

Are there any major concepts around GraphQL schema scaling that we missed? Let us know in the comments below!