How to Validate OpenAPI Definitions With Open Policy Agent and Rego

A design-first approach to APIs gives developers a fast track to delivering secure, high-quality APIs, along with the ability to run API design tests early in the API lifecycle. What’s in our technology portfolio that can help us get there quickly? The OpenAPI Specification is a format for describing the shape of our APIs, and it has become the most prevalent standard for RESTful APIs. So how do we start testing these documents? Do we have an existing skill set we can use? Open Policy Agent (OPA) might be a solution that’s already in our toolbox! Let’s take a look at how we can use OPA to drive design-time governance for OpenAPI definitions.

What Got Us Here?

The number of web APIs has expanded wildly over the last decade. The availability of cloud infrastructure and the growing needs of connected businesses set off a race to digital transformation. This rush has left many organizations looking to gain deeper insight and take greater responsibility as stewards of their internal innovation rollercoaster. It leads them to ask questions such as, “How do we build great API experiences consistently?” and “How can we offer guard rails that ensure both safety and speed?”

Leverage an Existing Policy Engine

Infrastructure and security teams have been adopting Open Policy Agent (OPA) as a set of technologies to help centralize policy management while keeping enforcement distributed among existing access control points, such as applications, API servers, and gateways. Within these teams, a trend is emerging around using OPA for configuration testing. Conftest, one tool in the OPA ecosystem, does just that with built-in support for validating Kubernetes, Terraform, and Docker files. Let’s take a look at an example policy.

package main
deny[msg] {
  input.kind == "Deployment"
  not input.spec.template.spec.securityContext.runAsNonRoot
  msg := "Containers must not run as root"
}

Policies are expressed in a language called Rego. This policy defines a rule named deny, which produces a set: every value of msg for which the rule body holds becomes a member of that set. Rego can express assertions about the data it receives. In this case, when receiving a Kubernetes manifest as input data, the rule body does the following:

  1. Checks that the JSON input data has a top-level name, kind, whose value is “Deployment.”
  2. Checks that runAsNonRoot on the pod’s security context is either missing or set to false.
  3. If both checks pass, assigns a validation error to msg, which is what we get back when the deny rule is queried.
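
The behavior described above can be pinned down with a unit test. The following is a minimal sketch, assuming the file is exercised with opa test; the test names and sample manifests are made up for illustration, and the rule is copied in so the file runs on its own. It uses the same pre-1.0 Rego syntax as the policy above, so OPA 1.0 or later would expect the if and contains keywords unless it is run in its v0 compatibility mode.

```rego
package main

# Rule copied from the policy above so this file is self-contained.
deny[msg] {
  input.kind == "Deployment"
  not input.spec.template.spec.securityContext.runAsNonRoot
  msg := "Containers must not run as root"
}

# A missing securityContext leaves the path undefined, so `not` succeeds
# and the message is added to the deny set.
test_missing_run_as_non_root_is_denied {
  deny["Containers must not run as root"] with input as {
    "kind": "Deployment",
    "spec": {"template": {"spec": {}}}
  }
}

# An explicit false is treated the same way as a missing value.
test_false_run_as_non_root_is_denied {
  deny["Containers must not run as root"] with input as {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"securityContext": {"runAsNonRoot": false}}}}
  }
}

# With runAsNonRoot set to true the rule body fails and the set stays empty.
test_run_as_non_root_is_allowed {
  count(deny) == 0 with input as {
    "kind": "Deployment",
    "spec": {"template": {"spec": {"securityContext": {"runAsNonRoot": true}}}}
  }
}
```

Running opa test against the directory that holds this file evaluates every rule whose name starts with test_.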

While the primary use case for OPA is for applications to query authorization decisions, it turns out that OPA’s policy engine is flexible enough for other jobs. This rule might run in a CI/CD pipeline whenever Kubernetes manifests are updated, and the pipeline can fail whenever the Rego query data.main.deny returns one or more validation errors.

Define Governance Rules

When we think of the word “policy,” we often see it in the context of authorization, a rule describing access control. We may even have horror stories of managing several layers of Access Control Lists (ACLs). Inevitably, these policies are combined into collections — often expressed as a mathematical set — and then we have decisions to make on what to do when applying multiple layers. Which set operations should we use? Should we take an intersection, a union, or perhaps even the difference? A few innovations have been born over the years to tackle this complexity. With Open Policy Agent (OPA), Rego allows us to model complex policies declaratively. Here’s an example policy that checks an OpenAPI definition. It’s used in an OPA bundle (a shareable collection of policies) named Spego.

package openapi.policies["tag-description"]
results[msg] {
    tags := input.tags
    tags[i]
    not tags[i].description
    msg := {
        "code": "tag-description",
        "path": ["tags", sprintf("%d", [i])],
        "message": "Tag object must have \"description\"."
    }
}

In this rule, we’re looking through each tag in an OpenAPI definition. If a tag does not have a description, we return a message with details about the validation error and where to find it in the document. The i variable in this policy rule helps us iterate across all tags in the array. Each tag object will have assertions applied from the rule body, and the msg will be added to the results set should all the assertions be true. We can use the opa CLI to run a query using an input document. In this example, the OpenAPI input definition has an array of tags, but the first tag in the array is missing a description. We see that reflected in the result.

$ opa eval \
    --bundle ./src \
    --format pretty \
    --input ./example/inputs/openapi.json \
    'data.openapi.policies["tag-description"].results'
[
  {
    "code": "tag-description",
    "message": "Tag object must have \"description\".",
    "path": [
      "tags",
      "0"
    ]
  }
]
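
The same check can be made repeatable without a separate input file. The sketch below assumes it is run with opa test; the test names and the sample tags are invented for illustration, and the rule is copied in so the file runs on its own. It follows the pre-1.0 Rego syntax used in this article.

```rego
package openapi.policies["tag-description"]

# Rule copied from the policy above so this file is self-contained.
results[msg] {
    tags := input.tags
    tags[i]
    not tags[i].description
    msg := {
        "code": "tag-description",
        "path": ["tags", sprintf("%d", [i])],
        "message": "Tag object must have \"description\"."
    }
}

# The first tag has no description, so exactly one result is expected
# and its path should point at index 0.
test_tag_without_description_is_reported {
    found := results with input as {"tags": [
        {"name": "pets"},
        {"name": "store", "description": "Store operations"}
    ]}
    count(found) == 1
    found[_].path == ["tags", "0"]
}

# When every tag carries a description, the result set is empty.
test_described_tags_pass {
    found := results with input as {"tags": [
        {"name": "pets", "description": "Pet operations"}
    ]}
    count(found) == 0
}
```

The with keyword substitutes the input document for a single expression, which is what lets each test supply its own miniature OpenAPI definition.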

Note that OPA does not have a record of the original input document format and won’t point to lines and columns when problems are found. It only understands JSON data and even alphabetically sorts name-value pairs. Also, with documents that take advantage of JSON References to other files, OPA will not resolve those references beforehand. These features do exist in another linting tool named Spectral, and there are even ways to connect that to OPA and get the best of both worlds.
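
Although OPA cannot report lines and columns, the path array in each result gives a downstream tool enough to locate the offending node. One possible approach, offered here as an assumption and not as something Spego is known to ship, is to turn that path into an RFC 6901 JSON Pointer. The package name openapi.lib and both function names are hypothetical, the syntax is the pre-1.0 Rego used in this article, and the sketch assumes a non-empty path.

```rego
package openapi.lib

# Escape one path segment as RFC 6901 requires: "~" becomes "~0" first,
# then "/" becomes "~1". This matters for OpenAPI path keys such as "/pets".
escape_segment(segment) = escaped {
    tilde_escaped := replace(segment, "~", "~0")
    escaped := replace(tilde_escaped, "/", "~1")
}

# Join the escaped segments with "/" and add the leading "/".
# Hypothetical helper; assumes every element of path is a string.
json_pointer(path) = pointer {
    segments := [escape_segment(s) | s := path[_]]
    pointer := concat("", ["/", concat("/", segments)])
}

test_json_pointer_for_tag_path {
    json_pointer(["tags", "0"]) == "/tags/0"
}

test_json_pointer_escapes_slashes {
    json_pointer(["paths", "/pets", "get"]) == "/paths/~1pets/get"
}
```

A policy could include such a pointer next to path in each message, so that a tool holding the original file can map the pointer back to a line and column.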

Conclusion

Open Policy Agent (OPA) and its expressive Rego syntax are powerful tools for defining policies and calculating decisions. Adopting these technologies can be time-consuming, but for many organizations the investment is already a necessary part of their infrastructure and security strategy. Using Rego, we’ve seen how to leverage that investment in OPA to start taking advantage of design-time governance in the API lifecycle. What’s next?

  • Check out more examples of OpenAPI policies using Spego.
  • Investigate other linting solutions such as Spectral.
  • Learn how the Spectral CLI makes a useful and convenient frontend for configuration testing by exploring the spego-spectral-example.

OPA, Rego, and the associated project tooling will be around for a long time. How we leverage these tools for OpenAPI will continue to evolve. Get involved in the conversation by joining the OpenAPI Community and the Open Policy Agent Community!