There are few things API developers like to discuss more than the choices made during development. One of the most common questions in this space is the choice of data format — the nature of serialization, transfer, and storage can drastically impact how an API functions and how one understands the underlying system.

Below, we will discuss two solutions in this space: JSON and YAML. We’ll look at their strengths and weaknesses and identify their similarities and differences. This article covers only basic syntax; for more in-depth overviews of their syntax, use cases, and functions, we recommend “A Guide to JSON Schema” and “What Data Formats Should My API Support?”

What is JSON?

JSON, or JavaScript Object Notation, is an open standard for data interchange and an open file format. It was first developed to address the need for a stateless, real-time server-to-browser communication paradigm that did not heavily depend on the plugins and extensions of the time, notably Flash and Java. While those systems did allow for some amount of stateless exchange, they carried with them a good deal of overhead and insecurity, and as such, a new solution was needed.

JSON is based on a subset of JavaScript as defined in Standard ECMA-262, 3rd Edition (December 1999), and was later standardized under RFC 8259 and Internet Standard STD 90 as well as ISO/IEC 21778:2017. The core concept of JSON was to create a human-readable format to store and transmit data in the form of attribute-value pairs. In other words, you say what something is, and then state a quality of that thing. This leads to a compact syntax, an example of which can be seen below.

{
  "companyName": "NordicAPIs"
}

This syntax allows for the definition of an attribute (“companyName”) and a value (“NordicAPIs”). Values can take one of a small set of data types: numbers, strings, booleans, arrays, objects, and null. Together, these types cover a wide variety of attribute-value pairings. Because of this relative simplicity in data typing, JSON is used at a large scale as a solution for lightweight data interchange. JSON-RPC, a remote procedure call protocol based on JSON, is one example: it relies on this lean design to exchange calls and notifications between entities.
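As a minimal sketch of how these data types surface in practice, the Python snippet below parses a small document with the standard-library json module and reports the type of each value. Every field other than companyName is invented purely for illustration, and the sample also includes null, which JSON supports alongside the types listed above.

```python
import json

# Hypothetical sample document; only "companyName" comes from the article.
document = """
{
  "companyName": "NordicAPIs",
  "score": 9.5,
  "active": true,
  "tags": ["json", "yaml"],
  "links": {"home": "https://example.com"},
  "nickname": null
}
"""

data = json.loads(document)

# Map each attribute to the name of the Python type the parser produced.
types = {key: type(value).__name__ for key, value in data.items()}
for key, type_name in types.items():
    print(f"{key}: {type_name}")
```

Each JSON type lands on a built-in Python type (str, float, bool, list, dict, and None), which is a large part of why JSON is so easy to consume.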

A more complex JSON structure may incorporate many of these different data types. For example, this response from the National Weather Service provides a list of weather alerts for the state of Texas:

{
    "@context": [
        "https://geojson.org/geojson-ld/geojson-context.jsonld",
        {
            "@version": "1.1",
            "wx": "https://api.weather.gov/ontology#",
            "@vocab": "https://api.weather.gov/ontology#"
        }
    ],
    "type": "FeatureCollection",
    "features": [
        {
            "id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.8556b5f144b57c3b8b9f5f08f36431bbd4de1b3e.001.1",
            "type": "Feature",
            "geometry": null,
            "properties": {
                "@id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.8556b5f144b57c3b8b9f5f08f36431bbd4de1b3e.001.1",
                "@type": "wx:Alert",
                "id": "urn:oid:2.49.0.1.840.0.8556b5f144b57c3b8b9f5f08f36431bbd4de1b3e.001.1",
                "areaDesc": "Coastal Willacy; Coastal Cameron; Coastal Kenedy",
                "geocode": {
                    "SAME": [
                        "048489",
                        "048061",
                        "048261"
                    ],
                    "UGC": [
                        "TXZ256",
                        "TXZ257",
                        "TXZ351"
                    ]
                },
                "affectedZones": [
                    "https://api.weather.gov/zones/forecast/TXZ256",
                    "https://api.weather.gov/zones/forecast/TXZ257",
                    "https://api.weather.gov/zones/forecast/TXZ351"
                ],
                "references": [
                    {
                        "@id": "https://api.weather.gov/alerts/urn:oid:2.49.0.1.840.0.0bc7e358229e3139787c9fe41b0229e8b96a628e.001.1",
                        "identifier": "urn:oid:2.49.0.1.840.0.0bc7e358229e3139787c9fe41b0229e8b96a628e.001.1",
                        "sender": "w-nws.webmaster@noaa.gov",
                        "sent": "2021-11-08T03:52:00-06:00"
                    }
                ],
                "sent": "2021-11-08T11:34:00-06:00",
                "effective": "2021-11-08T11:34:00-06:00",
                "onset": "2021-11-08T18:00:00-06:00",
                "expires": "2021-11-09T00:00:00-06:00",
                "ends": "2021-11-09T00:00:00-06:00",
                "status": "Actual",
                "messageType": "Update",
                "category": "Met",
                "severity": "Minor",
                "certainty": "Likely",
                "urgency": "Expected",
                "event": "Coastal Flood Statement",
                "sender": "w-nws.webmaster@noaa.gov",
                "senderName": "NWS Brownsville TX",
                "headline": "Coastal Flood Statement issued November 8 at 11:34AM CST until November 9 at 12:00AM CST by NWS Brownsville TX",
                "description": "* WHAT...Isolated minor coastal flooding expected.\n\n* WHERE...Coastal Kenedy, Coastal Cameron and Coastal Willacy\nCounties.\n\n* WHEN...This evening.\n\n* IMPACTS...Wave run-up may approach the dunes along narrow\nbeaches. Beach equipment, such as umbrellas and chairs, could\nbe moved by waves. Vehicles driving along narrow beaches may\nexperience higher water levels. Elevated water levels may also\noccur across the Laguna Madre and South Bay, and along State\nHighway 4 west of Boca Chica State Park.\n\n* ADDITIONAL DETAILS...High tide occurs at 9:49 PM CST.",
                "instruction": "Do not drive through flooded roadways.",
                "response": "Monitor",
                "parameters": {
                    "PIL": [
                        "BROCFWBRO"
                    ],
                    "NWSheadline": [
                        "COASTAL FLOOD STATEMENT REMAINS IN EFFECT FROM 6 PM CST THIS EVENING THROUGH MIDNIGHT CST TONIGHT"
                    ],
                    "BLOCKCHANNEL": [
                        "EAS",
                        "NWEM",
                        "CMAS"
                    ],
                    "VTEC": [
                        "/O.CON.KBRO.CF.S.0027.211109T0000Z-211109T0600Z/"
                    ],
                    "eventEndingTime": [
                        "2021-11-09T06:00:00+00:00"
                    ]
                }
            }
        }
    ],
    "title": "current watches, warnings, and advisories for Texas",
    "updated": "2021-11-08T17:35:27+00:00"
}
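To show how a client might walk a nested response like this one, here is a hedged Python sketch that works on a trimmed excerpt of it. The summarize_alerts helper is a hypothetical name chosen for this example and is not part of any National Weather Service tooling.

```python
import json

# Trimmed excerpt of the response above; most fields are omitted for brevity.
response_text = """
{
  "type": "FeatureCollection",
  "features": [
    {
      "type": "Feature",
      "geometry": null,
      "properties": {
        "event": "Coastal Flood Statement",
        "severity": "Minor",
        "geocode": {"UGC": ["TXZ256", "TXZ257", "TXZ351"]},
        "expires": "2021-11-09T00:00:00-06:00"
      }
    }
  ],
  "title": "current watches, warnings, and advisories for Texas"
}
"""

response = json.loads(response_text)


def summarize_alerts(collection):
    """Return one short summary string per alert in the collection."""
    summaries = []
    for feature in collection["features"]:
        props = feature["properties"]
        zones = ", ".join(props["geocode"]["UGC"])
        summaries.append(f'{props["event"]} ({props["severity"]}) for zones {zones}')
    return summaries


for line in summarize_alerts(response):
    print(line)
```

Objects become dictionaries and arrays become lists, so reaching a deeply nested value is a chain of key and index lookups.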

JSON has some major benefits going for it. As stated already, it is very efficient: each attribute maps to exactly one value drawn from a small set of data types. It’s lightweight, and for this reason, it’s commonly used in everything from web browsers to the Internet of Things.

JSON’s efficiency comes from simplicity. That is its own kind of benefit: its simple data types and structure make for easy parsing and faster generation. It also makes JSON easier for humans to work with once the syntax is understood. There’s not a lot of complexity in nesting or other systems in JSON, and as such, it’s often true that what you see is what you get. This can also make for easier debugging and error tracking.
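As a small illustration of the debugging point, the sketch below feeds a malformed document to Python’s standard-library json module and reports where parsing failed. The describe_error helper is a hypothetical name used only for this example, and the exact wording and position of the message vary between Python versions.

```python
import json

# The trailing comma after the last pair is not allowed by the JSON grammar.
broken = '{\n  "companyName": "NordicAPIs",\n}'


def describe_error(text):
    """Return None if text is valid JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # JSONDecodeError carries the message plus the line and column.
        return f"{err.msg} (line {err.lineno}, column {err.colno})"
    return None


print(describe_error(broken))
print(describe_error('{"companyName": "NordicAPIs"}'))
```

Because the grammar is so small, a parser can usually point straight at the offending character rather than leaving you to guess.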

It should be noted that JSON is used quite widely, which has led to a robust, active community that develops libraries, implementations, frameworks, and more. It has been around long enough, and has been iterated upon enough, that the install base is large and the number of experts is ever-increasing.

Simply put, JSON has somewhat more limited data type support and native features versus its contemporaries (such as YAML), but this also means that it’s typically a bit more straightforward, lightweight, and comprehensible in the abstract.

What is YAML?

YAML is a human-readable data serialization language typically used for configuration files and also commonly used for data interchange. Originally, YAML stood for “Yet Another Markup Language”, but this was later changed to the tongue-in-cheek recursive backronym “YAML Ain’t Markup Language”. First proposed by Clark Evans in 2001, and then designed by Evans, Ingy döt Net, and Oren Ben-Kiki, YAML was designed to be natively human-readable, including support for comments and nested structures.

Because YAML is essentially a superset of JSON as of version 1.2, most JSON documents are also valid YAML, and YAML boasts relatively robust compatibility with JSON systems. Earlier versions were not entirely compatible, but this has been iterated upon for several years now, resulting in a much more compatible experience.

YAML natively supports a wide range of scalar data types, including strings, integers, and floats, as well as lists and associative arrays (also known as maps, dictionaries, or hashes). YAML is very recognizable as a configuration format, as many open source projects have adopted it due to its ability to nest configuration elements within core grouping objects. This ability to nest objects has also made it useful in projects focused on serializing highly complex data sets, as nesting allows complex data schemes to be represented more accurately. This is especially true when relating objects across different domains of interest or quality.

nordicAPIs:
  name: Nordic APIs

A more complex, nested YAML output example can be found on the YAML Wikipedia page, and looks something like this:

---
receipt:     Oz-Ware Purchase Invoice
date:        2012-08-06
customer:
    first_name:   Dorothy
    family_name:  Gale

items:
    - part_no:   A4786
      descrip:   Water Bucket (Filled)
      price:     1.47
      quantity:  4

    - part_no:   E1628
      descrip:   High Heeled "Ruby" Slippers
      size:      8
      price:     133.7
      quantity:  1

bill-to:  &id001
    street: |
            123 Tornado Alley
            Suite 16
    city:   East Centerville
    state:  KS

ship-to:  *id001

specialDelivery:  >
    Follow the Yellow Brick
    Road to the Emerald City.
    Pay no attention to the
    man behind the curtain.
...
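To make the anchor and alias in this example concrete, here is a hedged Python sketch of the data a YAML parser would typically produce for the address portion: &id001 names the bill-to mapping, and *id001 reuses it for ship-to. The structure is written out by hand rather than parsed, since the Python standard library has no YAML parser, so treat it as an illustration of the semantics and not as the output of any particular library.

```python
import json

# Hand-written equivalent of the bill-to / ship-to part of the YAML above.
# The "|" literal block keeps its line breaks.
address = {
    "street": "123 Tornado Alley\nSuite 16\n",
    "city": "East Centerville",
    "state": "KS",
}

invoice = {
    "bill-to": address,  # &id001 anchors this mapping
    "ship-to": address,  # *id001 refers to the same mapping
    # The ">" folded block joins its lines with single spaces.
    "specialDelivery": (
        "Follow the Yellow Brick Road to the Emerald City. "
        "Pay no attention to the man behind the curtain.\n"
    ),
}

# Both keys point at one shared object, as the alias implies.
print(invoice["bill-to"] is invoice["ship-to"])

# JSON has no alias syntax, so serializing writes the address twice.
as_json = json.dumps(invoice, indent=2)
print(as_json.count("East Centerville"))
```

The shared reference is the feature JSON lacks: once serialized to JSON, the address appears twice and the link between the two copies is lost.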

This support for complexity is perhaps YAML’s biggest strength. For instance, let’s assume a developer wanted to describe a media company that owns multiple subsidiaries, each of which has co-produced several movies with a revolving cast of talent, some of whom have appeared in the same movies. In such a data set, you would need a solution that allows for complex data nesting, as each entity in the system relates to other entities in complex ways. YAML is an effective solution here.

Of course, complexity can also become a downside: deeply nested structures can make YAML harder to read. This is ironic, considering that YAML was designed to be human-readable in the first place. This complexity can also slow down parsing, making YAML less efficient the more complex your systems become.

Notably, YAML also has a much smaller community than JSON. While this is not a barrier to implementation, it does mean that you are somewhat more limited in what you can expect in terms of support, spin-off forked projects, etc., compared to something more standard like JSON.

What’s the Difference Between JSON and YAML?

The difference between JSON and YAML is ultimately a tradeoff between speed and complexity. JSON is generally faster to parse than YAML, and its small syntax is often easier to understand and implement once you get used to its punctuation-heavy presentation. This speed comes with a cost, however: the simplicity of JSON is a double-edged sword. Simplicity means it’s faster, but you may have to do more work to get around its limitations, which decreases ease of use in the long run.
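One way to picture this extra work is the common workaround of linking records by identifier, since JSON has no built-in way for one value to reference another. The sketch below is illustrative only: the people and movies layout, the field names, and the resolve_cast helper are all hypothetical.

```python
import json

# Hypothetical catalog: cast members are stored once and referenced by ID.
document = """
{
  "people": {
    "p1": {"name": "Actor One"},
    "p2": {"name": "Actor Two"}
  },
  "movies": [
    {"title": "First Film", "cast": ["p1", "p2"]},
    {"title": "Second Film", "cast": ["p2"]}
  ]
}
"""


def resolve_cast(data):
    """Map each movie title to the names of its cast, looked up by ID."""
    people = data["people"]
    return {
        movie["title"]: [people[person_id]["name"] for person_id in movie["cast"]]
        for movie in data["movies"]
    }


catalog = json.loads(document)
print(resolve_cast(catalog))
```

The lookup step lives in application code, not in the format, which is the kind of added effort the tradeoff describes.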

On the other hand, YAML may not be as fast or easy to parse, but it supports near-endless nested complexity. This also means that, while it’s easy to read at first blush, it can also become more challenging to read the more you nest.

JSON has a wide install base, meaning that you will typically have more options for support and forked projects. YAML does not have this, but it also means that what documentation does exist is largely straightforward and to the point.

YAML may be attractive to the end-user for complexity reasons. If the data set itself is complex, then it’s fine if the underlying data format is complex as well. Of course, the inverse is also true: JSON may be more attractive for more straightforward data sets, as you don’t need to worry about forcing a square peg into a round hole.

  • JSON: Simple, powerful, best meant for simple interactions and data structures
  • YAML: Complex, robust, and best meant for complex data sets requiring nesting

Choosing the Right Toolset

Ultimately, JSON and YAML are designed for specific things, and it’s unhelpful to point to one being better than the other. One would not use a hammer to place a screw in a doorframe, nor would one use a screwdriver to drive a nail. JSON and YAML may be related, but they are designed for two specific use cases.

Accordingly, your choice must be between complexity and speed. Yes, it is possible to solve a complex data relationship problem with JSON, but why would one do that? This would only add complexity to the system, and YAML is already a robust solution for such nested approaches. The opposite is also true.

Which do you find more readable? Which do you prefer? Let us know in the comments below!