Review of GROQ, A New JSON Query Language

Query languages are fundamental to the modern web. As the volume of data grows and the internet becomes more complex, applications have to work with ever larger datasets. To act on that data, we need systems that allow for filtering and transformation.

While there are many tools for this purpose, one interesting option is GROQ. Part and parcel of the Sanity platform from Sanity.io, GROQ was recently open-sourced. It offers a powerful solution for filtering massive data sets.

Today, we’re going to dive into GROQ, and see what makes it unique. We’ll look at a simple invocation, as well as some more complex ones. We’ll also look at where GROQ may make more sense compared to other offerings, and present a use case for GROQ inclusion.

What is GROQ?

GROQ is a JSON-based query language that was initially developed for Sanity.io as a query and execution solution. While work on the initial offering started in 2015, Sanity released GROQ as an open standard in 2019. You can view the current GROQ working draft here.

GROQ, short for Graph-Relational Object Queries, is principally a method of filtration – it carries on the work that has become so fashionable with implementations like GraphQL, allowing for robust, transformed, and filtered data sets from large bodies of data. In simple terms, GROQ aims to describe exactly what information an application wants, and then deliver that info through a series of transformations, joins, and processing to present a specific desired response.

GROQ is strongly typed and boasts a variety of client libraries for extensibility. It supports complex parameters and operators, as well as functions, piping, and advanced joins. It allows for slicing, ordering, projections, variables, and conditionals, which makes it a compelling solution indeed. Let’s look at an example usage from the documentation, and discuss a few key attributes of GROQ that may make it a strong choice.

Example

Core Invocation

GROQ statements are typically seen as a data flow from left to right – everything first flows through the filter element, then through additional functions (such as ordering) and into the projection. As such, each element of the core invocation flows into the other in a rather sensible way.

Let’s take a look at an example query we might want to run. Consider a hypothetical music API. We’re trying to pull albums from a list of albums according to their release year. To do this, we can run the following query:

*[_type == 'album' && releaseYear >= 1994]

As every document in Sanity has to have a type, the _type comparison in this query restricts the results to documents of the “album” type. From here, we use && to state “and”, and append the releaseYear condition. By using releaseYear, we are assuming that such a field exists in the data, and that it contains numbers that we can apply a “greater than or equal to” filter to.

When we run this query, we get a clean output – all albums that have a release year equal to or greater than 1994. This is all well and good, but we can use projections to make it more useful.
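To make the filter semantics concrete, here is a minimal Python sketch that mimics the query over an in-memory list. The sample documents and the filter_albums helper are hypothetical and exist only for illustration; this is not how Sanity evaluates GROQ internally.

```python
# Hypothetical sample documents; field names follow the query above.
documents = [
    {"_id": "a1", "_type": "album", "title": "First", "releaseYear": 1991},
    {"_id": "a2", "_type": "album", "title": "Second", "releaseYear": 1994},
    {"_id": "a3", "_type": "album", "title": "Third", "releaseYear": 2001},
    {"_id": "p1", "_type": "artist", "name": "Someone"},
]


def filter_albums(docs, min_year):
    """Rough analogue of *[_type == 'album' && releaseYear >= min_year]."""
    return [
        doc
        for doc in docs
        if doc.get("_type") == "album"
        # Documents without a numeric releaseYear simply do not match.
        and isinstance(doc.get("releaseYear"), (int, float))
        and doc["releaseYear"] >= min_year
    ]


albums = filter_albums(documents, 1994)
print([doc["_id"] for doc in albums])  # ['a2', 'a3']
```

Note that each matching document comes back whole, with every field it contains.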

Projections

A projection can be used in GROQ to describe the data we want from each query, and deliver that data in a cleaner and more understandable form. If we take our initial query and add a projection, we get something like this:

*[_type == 'album' && releaseYear >= 1994]{ _id, title, releaseYear }

In this example, the same albums match as in the first example. However, instead of returning every field of each matching document, the projection returns only the fields we name. _id returns the document ID, title returns the album title, and releaseYear returns the year itself.

In this way, we have made our return much leaner and more predictable, and have kept only the information we need for further transformation and contextualization.
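As a rough illustration, a projection behaves like selecting a fixed set of keys from each matching document. The Python sketch below assumes hypothetical sample data and a hypothetical project helper; how it treats missing fields is a simplification of this sketch, not a statement about GROQ's behavior.

```python
# Hypothetical documents as a filter might return them, including extra fields.
albums = [
    {"_id": "a2", "_type": "album", "title": "Second", "releaseYear": 1994, "genre": "rock"},
    {"_id": "a3", "_type": "album", "title": "Third", "releaseYear": 2001, "genre": "jazz"},
]


def project(docs, fields):
    """Keep only the named fields, in the spirit of { _id, title, releaseYear }."""
    # Simplification of this sketch: fields missing from a document are skipped.
    return [{field: doc[field] for field in fields if field in doc} for doc in docs]


projected = project(albums, ["_id", "title", "releaseYear"])
for row in projected:
    print(row)
```

Each printed row contains only the three requested keys; _type and genre are dropped.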

Ordering

Our query now returns the right fields, but the results are not yet in a useful order. The albums we are pulling from this list will return in the order they were entered. In our case, where we want to see a chronological listing of albums by year, this makes for a suboptimal experience. Thus, we can sort our output using the order function.

Let’s transform our request:

*[_type == 'album' && releaseYear >= 1994] | order(releaseYear) {_id, title, releaseYear}

With this request, we get our initial data, but we sort by the release year. This allows us to see a chronological list of items – Sanity will default to ascending order, which means we will see the oldest first.
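The ordering step can be pictured as a plain sort on one field. The following Python sketch uses hypothetical sample data and a hypothetical order_by helper to show ascending order, the default described above, alongside the reverse.

```python
# Hypothetical projected albums, deliberately out of chronological order.
albums = [
    {"_id": "a3", "title": "Third", "releaseYear": 2001},
    {"_id": "a2", "title": "Second", "releaseYear": 1994},
    {"_id": "a4", "title": "Fourth", "releaseYear": 1998},
]


def order_by(docs, field, descending=False):
    """Sort documents by a single field; ascending unless descending=True."""
    return sorted(docs, key=lambda doc: doc[field], reverse=descending)


oldest_first = order_by(albums, "releaseYear")
newest_first = order_by(albums, "releaseYear", descending=True)

print([doc["releaseYear"] for doc in oldest_first])  # [1994, 1998, 2001]
print([doc["releaseYear"] for doc in newest_first])  # [2001, 1998, 1994]
```

In GROQ itself, the descending variant is written by adding desc inside the order call, as in order(releaseYear desc).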

Use Case

GROQ is just one of many query language options – what makes it a good choice versus something like GraphQL? First (and most obvious), GROQ is a great option for a Sanity native application. Because Sanity has GROQ baked in as a query language, it makes perfect sense to adopt GROQ versus other alternatives. That being said, Sanity also supports GraphQL.

Perhaps the strongest argument for GROQ is the fact that it’s relatively lean and expressive. GROQ allows for more free-form querying than some GraphQL mapping techniques allow. For example, we can look at this query in GraphQL (mapped via Sanity’s GraphQL mapper):

{
  authors(where: {
    debutedBefore_lt: "1900-01-01T00:00:00Z",
    name_matches: "Edga*r"
  }) {
    name,
    debutYear,
  }
}

When expressed in GROQ, we derive the following:


*[_type == "author" && name match "Edgar" && debutYear < 1900]{
  name,
  debutYear
}

While the GROQ version is arguably denser to read than the mapped GraphQL query, it is nonetheless more compact, and it retains much of the same clear syntax. As requests grow ever more complex, that compactness becomes increasingly valuable in the final code.

Conclusion

GROQ is a strong query language indeed – and for those coding natively with Sanity, it’s a great option. It should be remembered, however, that it is simply that – an option. Whether or not it works in your codeflow or your chosen implementation will largely come down to the technology in use, the solutions employed, and the overall demands of the data query.

GROQ is best employed where Sanity is part of the project at large. As the built-in querying solution, it is seamless and easy to integrate there. Where Sanity is not part of the project, however, the value proposition changes significantly, as using GROQ requires greater effort for implementation.

All told, GROQ is still a great tool, and one to consider when looking to leverage query languages.