A Guide to JSON Schema

APIs and JSON are closely intertwined. JSON is the most common format for exchanging data over the internet, and many developers recommend it because it is lightweight, human-readable, and supported by nearly every programming language, all of which suit APIs particularly well.

However, if you’ve ever worked with a JSON instance from the command line, you’ll know that, left to its own devices, JSON can get pretty unwieldy. And even though most APIs work with JSON, there’s no guarantee that the consumer will be able to interact gracefully with data in this format. A schema file aims to fix all of that.

Developers have stepped in to fill that void. Below, we’ll take a look at JSON Schema, a specification language for JSON. We’ll cover what it is, whether you need it, and briefly review getting up and running so you can try it out for yourself!

Introduction to JSON Schema

For those who are new to the subject, JSON stands for JavaScript Object Notation. JSON is a convenient way to transmit a lot of data pertaining to one query, typically returned as a series of key/value pairs. The root of an instance is often an object (a dictionary), but it may also be an array, a boolean, a number, a string, or null.

For example, imagine you are requesting all of the data pertaining to an employee:

{
   "employee":{
      "name":"sonoo",
      "salary":56000,
      "married":true
   }
}

Pretty simple, right? Just query for employees, and you’re given values for name, salary, and marital status as a response.

Now consider this example instead.

{
   "medications":[
      {
         "aceInhibitors":[
            {
               "name":"lisinopril",
               "strength":"10 mg Tab",
               "dose":"1 tab",
               "route":"PO",
               "sig":"daily",
               "pillCount":"#90",
               "refills":"Refill 3"
            }
         ],
         "antianginal":[
            {
               "name":"nitroglycerin",
               "strength":"0.4 mg Sublingual Tab",
               "dose":"1 tab",
               "route":"SL",
               "sig":"q15min PRN",
               "pillCount":"#30",
               "refills":"Refill 1"
            }
         ],
         "anticoagulants":[
            {
               "name":"warfarin sodium",
               "strength":"3 mg Tab",
               "dose":"1 tab",
               "route":"PO",
               "sig":"daily",
               "pillCount":"#90",
               "refills":"Refill 3"
            }
         ],
         "betaBlocker":[
            {
               "name":"metoprolol tartrate",
               "strength":"25 mg Tab",
               "dose":"1 tab",
               "route":"PO",
               "sig":"daily",
               "pillCount":"#90",
               "refills":"Refill 3"
            }
         ],
         "diuretic":[
            {
               "name":"furosemide",
               "strength":"40 mg Tab",
               "dose":"1 tab",
               "route":"PO",
               "sig":"daily",
               "pillCount":"#90",
               "refills":"Refill 3"
            }
         ],
         "mineral":[
            {
               "name":"potassium chloride ER",
               "strength":"10 mEq Tab",
               "dose":"1 tab",
               "route":"PO",
               "sig":"daily",
               "pillCount":"#90",
               "refills":"Refill 3"
            }
         ]
      }
   ],
   "labs":[
      {
         "name":"Arterial Blood Gas",
         "time":"Today",
         "location":"Main Hospital Lab"
      },
      {
         "name":"BMP",
         "time":"Today",
         "location":"Primary Care Clinic"
      },
      {
         "name":"BNP",
         "time":"3 Weeks",
         "location":"Primary Care Clinic"
      },
      {
         "name":"BUN",
         "time":"1 Year",
         "location":"Primary Care Clinic"
      },
      {
         "name":"Cardiac Enzymes",
         "time":"Today",
         "location":"Primary Care Clinic"
      },
      {
         "name":"CBC",
         "time":"1 Year",
         "location":"Primary Care Clinic"
      },
      {
         "name":"Creatinine",
         "time":"1 Year",
         "location":"Main Hospital Lab"
      },
      {
         "name":"Electrolyte Panel",
         "time":"1 Year",
         "location":"Primary Care Clinic"
      },
      {
         "name":"Glucose",
         "time":"1 Year",
         "location":"Main Hospital Lab"
      },
      {
         "name":"PT/INR",
         "time":"3 Weeks",
         "location":"Primary Care Clinic"
      },
      {
         "name":"PTT",
         "time":"3 Weeks",
         "location":"Coumadin Clinic"
      },
      {
         "name":"TSH",
         "time":"1 Year",
         "location":"Primary Care Clinic"
      }
   ],
   "imaging":[
      {
         "name":"Chest X-Ray",
         "time":"Today",
         "location":"Main Hospital Radiology"
      },
      {
         "name":"Chest X-Ray",
         "time":"Today",
         "location":"Main Hospital Radiology"
      },
      {
         "name":"Chest X-Ray",
         "time":"Today",
         "location":"Main Hospital Radiology"
      }
   ]
}

This JSON could be an excerpt from a hospital’s database, represented as a series of nested JSON objects and arrays. It covers several categories of data, from information on medications to the labs and imaging that have been ordered and where each takes place.

As you can see from this example, JSON can quickly become supremely complicated. It’s possible to nest JSON objects as well as arrays, which can quickly become rather overwhelming. It’s also an example of how much is possible when using JSON.

Complex, lengthy instances aren’t the only issue you might run into using JSON. Applications often need to validate a JSON object to make sure that certain criteria are met. Without a schema, an application can only tell that a JSON object is well-formed. It can’t tell whether the contents are what it expects.
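To make that limitation concrete, here is a minimal sketch using only Python's standard json module. The helper name is_well_formed and the sample payloads are made up for illustration.

```python
import json


def is_well_formed(text: str) -> bool:
    """Return True if text parses as JSON. Says nothing about its contents."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Hypothetical payloads: the first two are both syntactically valid JSON,
# even though the second carries a nonsensical salary.
good = '{"employee": {"name": "sonoo", "salary": 56000, "married": true}}'
odd = '{"employee": {"name": "sonoo", "salary": "lots", "married": true}}'
broken = '{"employee": {"name": "sonoo", '

print(is_well_formed(good), is_well_formed(odd), is_well_formed(broken))
```

A bare parser accepts both of the first two payloads. Catching the odd salary takes knowledge of what the data is supposed to look like.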

This is where JSON Schema comes into play.

JSON Schema is a language for describing the content, structure, and semantics of a JSON instance. It lets you attach metadata about an object’s properties, and the schema is itself written in JSON. A schema dictates what fields exist, whether each field is optional, and what data type the consumer can expect.

Here’s an example of a JSON Schema:

{
   "$schema":"https://json-schema.org/draft/2020-12/schema",
   "title":"Person",
   "description":"A person",
   "type":"object",
   "properties":{
      "name":{
         "description":"A person's name",
         "type":"string"
      },
      "age":{
         "description":"A person's age",
         "type":"number",
         "minimum":18,
         "maximum":64
      }
   },
   "required":[
      "name",
      "age"
   ]
}

Here’s a JSON instance that validates against the schema above:

{
   "name":"John Doe",
   "age":35
}

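With the JSON Schema, the API consumer knows that name is a string, that age is a number between 18 and 64, and that both are required. This instance, with the name “John Doe” and the age 35, satisfies all of those constraints.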
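As a rough sketch of what a validator does with the Person schema, the snippet below hand-rolls checks for only the keywords that schema uses (type, properties, required, minimum, maximum). The function names are invented for this example, and this is not how any particular library works. In practice you would reach for a real validator.

```python
# Trimmed copy of the Person schema (descriptions omitted).
PERSON_SCHEMA = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number", "minimum": 18, "maximum": 64},
    },
    "required": ["name", "age"],
}

PYTHON_TYPES = {"object": dict, "string": str, "number": (int, float)}


def matches_type(value, type_name):
    # bool is a subclass of int in Python, but JSON booleans are not numbers.
    if isinstance(value, bool):
        return False
    return isinstance(value, PYTHON_TYPES[type_name])


def check(instance, schema):
    """Return a list of problems; an empty list means the instance passed."""
    if not matches_type(instance, schema["type"]):
        return [f"expected {schema['type']}"]
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing required property '{key}'")
    for key, rules in schema.get("properties", {}).items():
        if key not in instance:
            continue
        value = instance[key]
        if not matches_type(value, rules["type"]):
            errors.append(f"'{key}' should be a {rules['type']}")
            continue
        if "minimum" in rules and value < rules["minimum"]:
            errors.append(f"'{key}' is below the minimum of {rules['minimum']}")
        if "maximum" in rules and value > rules["maximum"]:
            errors.append(f"'{key}' is above the maximum of {rules['maximum']}")
    return errors


print(check({"name": "John Doe", "age": 35}, PERSON_SCHEMA))
print(check({"name": "John Doe", "age": 12}, PERSON_SCHEMA))
print(check({"age": "35"}, PERSON_SCHEMA))
```

The first call reports no problems, while the other two (hypothetical instances) fail on the age range and on the missing name and string-typed age respectively.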

An application that understands what’s inside a JSON object makes it far easier to return data within particular constraints. You could query to only return employees in their 30s, for example, or only employees who are married.

Benefits of JSON Schema

JSON is intended to return data in a format that is readable by humans and machines alike. Without a bit of fine-tuning, however, it’s sometimes neither. One advantage of using JSON Schema is it makes JSON more intelligible for computers and users alike.

One of the most widely cited benefits of JSON Schema is its usefulness for API testing and validation. For example, say your API can only handle names that are fewer than 20 characters long. Or imagine you only want to send emails from a particular date range to your database.
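The name-length rule could be expressed with the maxLength keyword. The sketch below assumes "fewer than 20 characters" translates to a maxLength of 19, and it hand-checks that single rule rather than using a validator library. NAME_SCHEMA and name_is_valid are illustrative names.

```python
# Schema fragment for a name that must be fewer than 20 characters long.
NAME_SCHEMA = {"type": "string", "maxLength": 19}


def name_is_valid(value, schema=NAME_SCHEMA) -> bool:
    """Check only the two rules in NAME_SCHEMA: string type and maxLength."""
    # maxLength counts characters (Unicode code points), as Python's len() does.
    return isinstance(value, str) and len(value) <= schema["maxLength"]


print(name_is_valid("John Doe"))
print(name_is_valid("A name that runs far too long"))
```

A date-range rule is harder to express with core keywords alone and usually needs additional logic in the application or validator.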

As we pointed out before, implementing this isn’t easy when the API consumer doesn’t know what’s inside the JSON instance. Without a schema in place, an application might be able to tell that there’s something there, but the contents themselves will be a mystery.

Using JSON Schema also saves you from having to make too many changes on the client side. One common but misguided approach for creating client-side API applications is to make a list of the common HTTP status codes and then handle each of them on the client side. This isn’t the best approach, though, as some features can break if something changes on the server side.

“The primary benefit to using JSON Schema is its interoperability across many programming languages; the consistency and reliability of validation. A good implementation will use the official test suite, ensuring interoperability and correct validation. Organizations which publish APIs often provide JSON Schemas for their data payloads and expectations, allowing developers to know what is required, and computers to reliably validate.” – Ben Hutton, JSON Schema Specification Lead at Postman

One final benefit of using JSON Schema that we’ll mention today (there are many) is that there are already a number of pre-written assets available. JSON Schema has grown quite popular, so validator libraries exist for most of the popular programming languages.

Libraries for JSON Schema Validation include:

Language     Library
.NET         Json.NET Schema
C            WJElement
Clojure      jinx
Go           gojsonschema
JavaScript   ajv
Python       jschon
Ruby         JSONSchemer
PHP          Opis Json Schema
Kotlin       Medeia-validator

Getting Started With JSON Schema

Now let’s take a quick look at how you can use JSON Schema for yourself so you can try it out within your own project.

Let’s create a JSON object for a pet store catalog. An individual item in the catalog includes:

  • An Identifier: productId
  • Product Name: productName
  • Cost: price
  • Additional Information: tags

A JSON object for that catalog might look like:

{
   "productId":1,
   "productName":"Kibble",
   "price":12.50,
   "tags":[
      "food",
      "dogs"
   ]
}

A basic JSON Schema starts with a handful of keywords.

  • $schema declares which draft of JSON Schema you’re using.
  • $id declares a URI for the schema, which also serves as the base URI that relative URI references inside the schema are resolved against.
  • title and description are merely descriptive; they don’t constrain the data.
  • type tells the consumer what type of value it can expect, such as an object.

An example of a basic JSON Schema might be:

{
   "$schema":"https://json-schema.org/draft/2020-12/schema",
   "$id":"https://example.com/product.schema.json",
   "title":"Product",
   "description":"A product in the catalog",
   "type":"object"
}
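To illustrate what it means for $id to act as a base URI, the sketch below uses urljoin from Python's standard library, which performs standard relative-reference resolution. The relative references "dimensions.schema.json" and "#/properties/productId" are hypothetical and appear only to show how resolution works. A real validator handles this step internally.

```python
from urllib.parse import urljoin

# The $id from the example schema above.
SCHEMA_ID = "https://example.com/product.schema.json"

# A hypothetical sibling schema, referenced by a relative path.
sibling = urljoin(SCHEMA_ID, "dimensions.schema.json")

# A hypothetical fragment-only reference, pointing inside the same schema.
fragment = urljoin(SCHEMA_ID, "#/properties/productId")

print(sibling)   # https://example.com/dimensions.schema.json
print(fragment)  # https://example.com/product.schema.json#/properties/productId
```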

Defining The Properties

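Now let’s start describing the product itself. The properties keyword lists the fields an object may contain, and required names the ones that must be present. In the schema below, productId must be an integer and can’t be omitted.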

{
   "$schema":"https://json-schema.org/draft/2020-12/schema",
   "$id":"https://example.com/product.schema.json",
   "title":"Product",
   "description":"A product from pet store's catalog",
   "type":"object",
   "properties":{
      "productId":{
         "description":"The unique identifier for a product",
         "type":"integer"
      }
   },
   "required":[
      "productId"
   ]
}

So far, so good, right? This is all fairly straightforward. But what happens when a JSON Schema gets more involved?
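Before moving on, here is a small sketch of the two rules that schema expresses: productId must be present, and it must be an integer. The checks are hard-coded by hand for illustration and the function names are invented. Note that in current drafts of JSON Schema, "integer" matches any number with a zero fractional part, so 1.0 counts as well.

```python
def is_json_integer(value) -> bool:
    """Mirror JSON Schema's "integer": a number with a zero fractional part."""
    if isinstance(value, bool):  # True/False are ints in Python, not in JSON
        return False
    if isinstance(value, int):
        return True
    return isinstance(value, float) and value.is_integer()


def product_id_errors(instance) -> list:
    """Check only what the schema above states about productId."""
    if not isinstance(instance, dict):
        return ["expected an object"]
    if "productId" not in instance:
        return ["missing required property 'productId'"]
    if not is_json_integer(instance["productId"]):
        return ["'productId' should be an integer"]
    return []


print(product_id_errors({"productId": 1, "productName": "Kibble"}))
print(product_id_errors({"productName": "Kibble"}))
print(product_id_errors({"productId": "1"}))
```

The first instance passes even though productName is not described yet: listing a field under properties does not forbid other fields from appearing.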

Nesting Data Structures

Now let’s take a look at a slightly more detailed JSON Schema.

{
   "$schema":"https://json-schema.org/draft/2020-12/schema",
   "$id":"https://example.com/product.schema.json",
   "title":"Product",
   "description":"A product from pet store's catalog",
   "type":"object",
   "properties":{
      "productId":{
         "description":"The unique identifier for a product",
         "type":"integer"
      },
      "productName":{
         "description":"Name of the product",
         "type":"string"
      },
      "price":{
         "description":"The price of the product",
         "type":"number",
         "exclusiveMinimum":0
      },
      "tags":{
         "description":"Tags for the product",
         "type":"array",
         "items":{
            "type":"string"
         },
         "minItems":1,
         "uniqueItems":true
      },
      "dimensions":{
         "type":"object",
         "properties":{
            "length":{
               "type":"number"
            },
            "width":{
               "type":"number"
            },
            "height":{
               "type":"number"
            }
         },
         "required":[
            "length",
            "width",
            "height"
         ]
      }
   },
   "required":[
      "productId",
      "productName",
      "price"
   ]
}

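Here you can see that a dimensions entry has been added and that nested JSON objects are described under the properties keyword. The dimensions object carries its own properties and required list, so length, width, and height are all mandatory whenever dimensions is present. The dimensions entry itself remains optional, because it isn’t listed in the top-level required array. When you follow this format, you can describe as many properties as needed, nested as deeply as needed, and any consumer with a JSON Schema validator can check them.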
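The nesting is easiest to appreciate by walking the schema recursively. The following is a simplified, hand-written sketch that supports only the keywords used in the product schema. It is not a substitute for one of the libraries listed earlier, and the dimension values and the "bad" instance are invented for the example.

```python
import json

# Trimmed copy of the product schema (descriptions omitted).
PRODUCT_SCHEMA = {
    "type": "object",
    "properties": {
        "productId": {"type": "integer"},
        "productName": {"type": "string"},
        "price": {"type": "number", "exclusiveMinimum": 0},
        "tags": {"type": "array", "items": {"type": "string"},
                 "minItems": 1, "uniqueItems": True},
        "dimensions": {
            "type": "object",
            "properties": {"length": {"type": "number"},
                           "width": {"type": "number"},
                           "height": {"type": "number"}},
            "required": ["length", "width", "height"],
        },
    },
    "required": ["productId", "productName", "price"],
}


def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def type_ok(value, name):
    if name in ("number", "integer"):
        return is_number(value) and (name == "number" or float(value).is_integer())
    return isinstance(value, {"object": dict, "array": list, "string": str}[name])


def validate(instance, schema, path="$"):
    """Recursively collect problems; an empty list means the instance passed."""
    if "type" in schema and not type_ok(instance, schema["type"]):
        return [f"{path}: expected {schema['type']}"]
    errors = []
    if isinstance(instance, dict):
        for key in schema.get("required", []):
            if key not in instance:
                errors.append(f"{path}: missing required property '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in instance:  # nested schemas are handled by recursion
                errors += validate(instance[key], sub, f"{path}.{key}")
    elif isinstance(instance, list):
        if len(instance) < schema.get("minItems", 0):
            errors.append(f"{path}: needs at least {schema['minItems']} item(s)")
        # Simplified uniqueness test: compare the JSON text of each item.
        seen = {json.dumps(item, sort_keys=True) for item in instance}
        if schema.get("uniqueItems") and len(seen) < len(instance):
            errors.append(f"{path}: items must be unique")
        for index, item in enumerate(instance):
            errors += validate(item, schema.get("items", {}), f"{path}[{index}]")
    elif is_number(instance) and "exclusiveMinimum" in schema:
        if instance <= schema["exclusiveMinimum"]:
            errors.append(f"{path}: must be greater than {schema['exclusiveMinimum']}")
    return errors


kibble = {"productId": 1, "productName": "Kibble", "price": 12.50,
          "tags": ["food", "dogs"]}
bad = {"productId": 1, "productName": "Kibble", "price": 0,
       "tags": ["food", "food"], "dimensions": {"length": 30, "width": 20}}

print(validate(kibble, PRODUCT_SCHEMA))
for problem in validate(bad, PRODUCT_SCHEMA):
    print(problem)
```

The Kibble instance passes without a dimensions entry, while the second instance trips the price, tags, and nested dimensions rules, each reported with the path where it occurred.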

JSON Schema: Final Thoughts

It’s safe to say that API validation and testing will continue to become more critical as APIs continue to become more prevalent. It’s also safe to assume that JSON isn’t likely to become less complicated as time moves forward.

This means that having a schema for your data is going to keep growing in importance, as well. That makes JSON Schema a natural choice for those working with APIs, since JSON is the format most APIs already use.

Looking to the future of JSON Schema, Ben Hutton had this to say:

“With the addition of Dialects and Vocabularies, refined in our latest publication (draft 2020-12), JSON Schema now has a foundation on which standardized extensions can be built for other use cases. We’re already working to form two industry lead Special Interest Groups (SIGs) for code generation and databases. Some extensions exist already, but they are non-standard.”

If you’d like more examples of JSON Schema in action and some ready-made code to try out for yourself, follow the JSON Schema GitHub repository. If you’d like to become involved in the specification itself, Hutton recommends joining the JSON Schema Slack channel. From there, you can see activity on GitHub and Stack Overflow, ask questions and help others, or just keep an eye out for any announcements. JSON Schema is also on Twitter.