GraphQL is a very powerful query language that does a great many things right. When implemented properly, GraphQL offers an extremely elegant methodology for data retrieval, more backend stability, and increased query efficiency.
The key here though is that simple phrase — when implemented properly. GraphQL has had somewhat of a gold rush adoption, with smaller developers responding to the media hype and big name early adopters with their own implementations. The problem is, many people aren’t considering what adopting GraphQL actually means for their system, and what security implications come with this adoption.
GraphQL is a paradigm shift in many ways — and with that, security concerns have changed. While some security concerns have gone away, replaced by architectural differences and nuances, other concerns have been amplified.
In this piece, we’re going to talk about those issues, highlighting security concerns that an API system supporting GraphQL should acknowledge. While GraphQL itself is not the primary driver of all these concerns, these issues should be addressed within the greater frame of a GraphQL system.
GraphQL — A Summary
Quickly, let’s summarize what GraphQL is, and how it does what it does. Simply put, GraphQL is an application layer query language designed to interpret a query string from a server or client, and return the requested data in the form the requester asked for.
GraphQL was developed by Facebook as a means to transition away from HTML5 applications on mobile towards robust, native applications. This was facilitated by allowing easier backend queries through the unification of multiple interior endpoints to a single forward facing endpoint.
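To make the summary concrete, here is a minimal sketch of the core idea: the response mirrors the shape of the request. It does not use a real GraphQL library. The `USER` record, the nested-dict `Selection` format, and the `select` function are all hypothetical stand-ins for illustration.

```python
from typing import Any, Dict, Optional

# Hypothetical backend record; in a real system this data might come from
# several internal services behind the single GraphQL endpoint.
USER = {
    "id": 1,
    "name": "Ada",
    "email": "ada@example.com",
    "friends": [
        {"id": 2, "name": "Grace", "email": "grace@example.com", "friends": []},
    ],
}

# A selection maps a field name to a sub-selection, or to None for a leaf.
Selection = Dict[str, Any]


def select(value: Any, selection: Optional[Selection]) -> Any:
    """Return only the fields named in `selection`, mirroring its shape."""
    if selection is None:
        return value
    if isinstance(value, list):
        return [select(item, selection) for item in value]
    return {field: select(value[field], sub) for field, sub in selection.items()}


# Roughly the shape of the GraphQL query: { name friends { name } }
result = select(USER, {"name": None, "friends": {"name": None}})
print(result)
```

The client names the fields it wants, and nothing else comes back. That same flexibility is what drives most of the concerns discussed in the rest of this piece.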
With that in mind, what are some concerns in regards to security and best practices in terms of GraphQL?
Implied Documentation vs. Actual Documentation
A serious concern when implementing GraphQL is how documentation is handled between versions. GraphQL does not have versioning support in the same way other systems do. That’s not to say that you can’t version in GraphQL, but simply that GraphQL is designed for API evolution without explicit versioning being required.
While this is certainly a fair approach, there are a good number of developers who depend on versioning to communicate changes in field values, endpoints, and declarations. While this is not good practice, that doesn’t stop it from being relatively common — and unfortunately, when GraphQL is brought into the mix, the issue only becomes worse.
Let’s say an API changes a fundamental endpoint or internal construct. In a traditional API, this change would be documented as a version change, with built-in documentation as such. In GraphQL implementations, this isn’t the case, and so many developers might learn about this deprecated endpoint by attempting to use it, and being turned away.
So what’s the security issue here? The problem is that, without proper documentation, you can very quickly run into a situation where an endpoint is no longer valid, but data is still being sent and requested from that endpoint. This can result in collisions and unintended functionality. More to the point, if your application is still attempting to poll a non-existent GraphQL endpoint, an emulated endpoint from a man-in-the-middle attack could theoretically step almost seamlessly into the data stream.
There are many ways to mitigate this, and adopting a continuous versioning strategy is absolutely paramount as a solution. Though this issue is dramatically magnified by GraphQL, this is much more a problem with developer practices than GraphQL itself.
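One way to communicate changes without a version bump is to mark fields as deprecated and tell consumers before removing them. GraphQL's schema language has a `@deprecated` directive for this purpose. The sketch below only imitates the idea in plain Python; the registry, the field names, and `check_deprecations` are hypothetical.

```python
# Hypothetical registry of deprecated fields: name -> (reason, replacement).
DEPRECATED_FIELDS = {
    "User.fullName": ("split into separate name fields", "User.firstName"),
}


def check_deprecations(requested_fields):
    """Return a human-readable notice for each deprecated field requested."""
    notices = []
    for field in requested_fields:
        if field in DEPRECATED_FIELDS:
            reason, replacement = DEPRECATED_FIELDS[field]
            notices.append(
                f"{field} is deprecated ({reason}); use {replacement} instead"
            )
    return notices


for notice in check_deprecations(["User.fullName", "User.email"]):
    print(notice)
```

Returning notices alongside a working response, rather than silently removing the field, gives consumers a chance to migrate before the old path disappears.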
Perhaps the biggest issue with a GraphQL implementation is inherent in the approach itself. GraphQL has the extremely powerful ability to unify multiple endpoints into a single queryable point — this is the entire crux of what makes GraphQL powerful. The problem here, though, is in the fact that a single endpoint, even if it connects to multiple internal endpoints, functions as a single point to the consumer.
We often view APIs as working from the interior to the edge — in other words, from the code base out to the consumer experience. Typically, this is fine, as the consumer is a single point accessing multiple endpoints, which then call multiple functions, which might relay to even more databases or resources. When we’re talking about GraphQL, however, we’re actually creating an interior to edge to interior system, in which the codebase narrows to a single point or collection of single points, which then expands to the user.
In a poorly defined GraphQL request from the consumer (or, for that matter, a poorly formed GraphQL endpoint defined by the developer), a single failure means a total failure to access internal resources. If improperly abstracted, documented, and specified, you’re essentially putting all your eggs in one basket.
Functionally speaking, this entire issue could be summed up as follows: call failures equate to insecurity for the developer consumer. Minimizing these failures is an entire industry in and of itself, and poorly adopting GraphQL negates much of the effort developers have made to this end.
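A single endpoint does not have to mean all-or-nothing failure. The GraphQL response format allows a `data` entry and an `errors` entry to be returned together, so one failing field need not take the rest of the response down with it. The following is a rough sketch of that behaviour in plain Python, assuming hypothetical resolver functions. It is not how any particular GraphQL server implements execution.

```python
def execute(resolvers, requested_fields):
    """Resolve each field independently, collecting errors instead of aborting."""
    data, errors = {}, []
    for field in requested_fields:
        try:
            data[field] = resolvers[field]()
        except Exception as exc:
            # The failed field becomes null and the error records its path.
            data[field] = None
            errors.append({"message": str(exc), "path": [field]})
    response = {"data": data}
    if errors:
        response["errors"] = errors
    return response


def resolve_profile():
    return {"name": "Ada"}


def resolve_orders():
    # Hypothetical internal service that happens to be down.
    raise RuntimeError("orders service unavailable")


response = execute(
    {"profile": resolve_profile, "orders": resolve_orders},
    ["profile", "orders"],
)
print(response)
```

Here the profile data still reaches the consumer even though the orders backend failed, which narrows the blast radius of any one internal failure.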
Data and Server Transaction Volumes
GraphQL has another fundamental problem related to its foundational aspects — queries are often larger and more complex than in traditional REST APIs. In GraphQL, a single request might combine multiple requests to multiple endpoints, resulting in an unpredictable amount of data for each request over time.
Consumers are in control of their requests — and in some ways, this is dangerous. Not every provider is going to have huge scalable infrastructure or cloud servers on retainer, and thus not every provider is going to be comfortable with the idea of allowing for variable content requests based on the whim of a user.
There’s also the concern of limiting the actual request in a reasonable way. Consumers aren’t always the most judicious with their requests. Sometimes a consumer may request more data than they really need, and over a long time and a large number of requests, this adds up to significant overhead that did not exist in a non-GraphQL solution.
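A common way to bound request size is to reject queries that nest too deeply before executing them. The sketch below approximates depth by counting braces in the raw query string. `MAX_DEPTH`, `query_depth`, and `within_limit` are hypothetical names. A production implementation would more likely walk the parsed query, since this approximation also counts braces in input-object arguments and does not expand fragments.

```python
MAX_DEPTH = 5  # hypothetical limit; tune to what the backend can afford


def query_depth(query: str) -> int:
    """Approximate nesting depth by counting braces outside string literals."""
    depth = max_depth = 0
    in_string = False
    escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth


def within_limit(query: str, limit: int = MAX_DEPTH) -> bool:
    """Return True if the query is shallow enough to be executed."""
    return query_depth(query) <= limit


print(query_depth("{ user { friends { name } } }"))
print(within_limit("{ user { friends { name } } }", limit=2))
```

Depth is only one dimension of cost. The same gatekeeping approach can be applied to the number of fields requested or to an estimated cost per field.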
Information Hiding and Chattiness
Neither GraphQL nor REST enforces any significant amount of data hiding. The problem with GraphQL specifically comes when people rush to adopt it, and instead of logically thinking out the mapping of public endpoints to private models, they simply map a 1:1 relationship.
While this is functional, it’s incredibly dangerous. The entire point of GraphQL is to allow customization options for requesting data — but this also means that GraphQL enables data access in a way that may not be intended, and when 1:1 relationships are established, you lose some very important control of internal assets.
This is, just as with any REST API, entirely manageable with proper GraphQL schema design. Third party clients and APIs should be able to interact with your GraphQL endpoint without knowing the internal workings of the API.
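As an illustration of avoiding a 1:1 mapping, the sketch below exposes a small public field set that is explicitly mapped onto a private model. The internal record, the field names, and `resolve_public` are all hypothetical. The point is only that anything not listed in the public mapping cannot be requested.

```python
# Hypothetical internal model, including fields that must never be exposed.
INTERNAL_USER = {
    "user_id": 42,
    "display_name": "Ada",
    "email_address": "ada@example.com",
    "password_hash": "not-a-real-hash",
    "internal_flags": ["beta_tester"],
}

# Explicit public schema: public field name -> how to derive it internally.
PUBLIC_FIELDS = {
    "id": lambda row: row["user_id"],
    "name": lambda row: row["display_name"],
}


def resolve_public(row, requested):
    """Resolve only publicly mapped fields; reject anything else outright."""
    unknown = [field for field in requested if field not in PUBLIC_FIELDS]
    if unknown:
        raise KeyError(f"unknown public fields: {unknown}")
    return {field: PUBLIC_FIELDS[field](row) for field in requested}


print(resolve_public(INTERNAL_USER, ["id", "name"]))
```

Because the public names differ from the internal column names, the consumer learns nothing about the underlying model, and adding a private column never widens the public surface by accident.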
GraphQL can be designed so that this separation is enforced at the endpoint, but a rushed adoption threatens to expose much more than was intended. Depending on how GraphQL is set up and how queries are handled, the actual endpoints defined can insist on “chattiness”, rather than allow it.
This isn’t just a matter of schema revelation, either. When an endpoint unifies multiple internal endpoints and functionalities, this endpoint can be improperly implemented to retrieve far more data than was ever intended. This has some serious implications.
First, there is the server implication. If the data is being retrieved first and then stripped of irrelevant data before being transferred to the client, we’re being extremely wasteful. Imagine if every single time you want to drive, you had to burn an entire tank of petrol, regardless of distance travelled.
In some cases, this would make sense — a two hundred mile trip would easily eat through your tank, and possibly more. For a short trip, this would be absolutely wasteful, and over time, would lead to excessive wear and tear. Limiting this chattiness is incredibly important.
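One way to avoid burning the whole tank on every trip is to fetch only the columns the query actually asks for, rather than retrieving everything and stripping it afterwards. The sketch below uses Python's built-in `sqlite3` module with an in-memory database. The `users` table, the `ALLOWED_COLUMNS` allowlist, and `fetch_user` are hypothetical.

```python
import sqlite3

# Hypothetical allowlist of columns a query is permitted to request.
ALLOWED_COLUMNS = {"id", "name", "email"}


def fetch_user(conn, user_id, requested):
    """Select only the requested, allowlisted columns for one user."""
    columns = [col for col in requested if col in ALLOWED_COLUMNS]
    if not columns:
        raise ValueError("no permitted columns requested")
    # Column names are interpolated only after passing the allowlist check;
    # the user-supplied id is always bound as a parameter.
    sql = f"SELECT {', '.join(columns)} FROM users WHERE id = ?"
    row = conn.execute(sql, (user_id,)).fetchone()
    return dict(zip(columns, row)) if row else None


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, email TEXT, bio TEXT)"
)
conn.execute("INSERT INTO users VALUES (1, 'Ada', 'ada@example.com', 'long text')")

print(fetch_user(conn, 1, ["name"]))
```

A request for `name` alone never touches the `bio` column, so the cost of a query scales with what was asked for rather than with the size of the table row.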
Then there is the aforementioned issue of internal exposure. This has already been somewhat discussed, but it bears repeating: allowing for unlimited access to a variety of endpoints in a manipulatable way is the perfect storm for penetration testing, and makes identifying the structure and schema of the backend trivial in many cases.
Authorization and GraphQL
Another point of potential issue is improper implementation of authorization in GraphQL-dependent APIs. Again, this is much less an issue with GraphQL itself, and more an issue with adoption.
In GraphQL, authorization can be handled using the query context. The basic idea here is that the server attaches arbitrary data, such as the authenticated user, to each request, and that data is then passed through to every resolver, which allows very fine, granular control of authorization by the developer proper. This is a very secure system; it’s arguably much stronger than other solutions when it comes to SQL injection and Langsec issues.
The problem comes with developers not actually understanding that this is a possibility. Developers might look at their current authorization logic, decide they don’t want to change anything fundamentally, and instead insert this logic into the GraphQL layer itself rather than the business logic layer.
This is fundamentally flawed, and is advised against in the official GraphQL documentation, which recommends delegating authorization to the business logic layer. Placing this logic in the GraphQL layer opens the security system up to code injection, sniffing, and other attacks that could easily expose the internal authorization structure, thereby rendering it null.
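To illustrate the split between the two layers, here is a minimal sketch in which the resolver only forwards the viewer from the request context, and the authorization rule lives in the business logic. The invoice data, the `Forbidden` exception, and both function names are hypothetical, and the context is modelled as a plain dictionary.

```python
class Forbidden(Exception):
    """Raised when the viewer is not allowed to perform an action."""


# Hypothetical data store.
INVOICES = {10: {"id": 10, "owner_id": 1, "total": 99}}


# Business logic layer: the single place where the authorization rule lives.
def get_invoice(viewer, invoice_id):
    invoice = INVOICES[invoice_id]
    if viewer is None or viewer["id"] != invoice["owner_id"]:
        raise Forbidden("viewer may not read this invoice")
    return invoice


# GraphQL layer: a thin resolver that passes the context through and makes
# no authorization decisions of its own.
def resolve_invoice(context, invoice_id):
    return get_invoice(context.get("viewer"), invoice_id)


print(resolve_invoice({"viewer": {"id": 1}}, 10))
```

Because the rule sits in `get_invoice`, every entry point into the system, GraphQL or otherwise, gets the same check, and the GraphQL layer has no authorization logic of its own to expose.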
This isn’t to say that GraphQL is a poor choice, or that you should be wary of implementing it. Quite the opposite, in fact — proper implementation of GraphQL is possibly one of the best things that can be done for a large API when it has a variety of data on the backend that must be delivered in a sensible, consumer-controlled way.
What this is to say, though, is that our optimism should be tempered. As with any new technology or implementation, we need to confirm that the theoretical benefits hold up in the actual implementation. With so many potential issues sprouting from improper implementation, it’s incredibly important for developers to look at GraphQL in the frame of “how do I make sure I do this right” rather than “move fast and break things”.
We’re not alone in this advice for measured optimism, either.
ThoughtWorks, a software company which delivers verdicts to the industry on new technologies, advised that developers should “assess” whether or not GraphQL is the correct solution given their use case. API Evangelist Kin Lane was likewise cautious, stating that during a conversation in the API Evangelist Slack channel:
“the consensus seemed to be that GraphQL is a way to avoid the hard work involved with properly getting to know your API resources, and it is just opening up a technical window to the often messy backend of our database-driven worlds.”
The simple fact is that, as with any solution, we need to be cautious and ensure the following:
A. We are using GraphQL for an appropriate use case, and not adopting it out of a flavor-of-the-week mentality;
B. Our endpoints are well documented, with secondary paths in the case of failure;
C. We are properly hiding the internal schema and structure of our server;
D. Finally, we are limiting the amount of interactions allowed to a reasonable measure.
If and when these situations are validated, GraphQL makes for an amazing implementation.