Benefits of Protobufs For Internal Microservices

APIs are loud: they generate an incredible amount of data. More important than how this data is generated, however, is how it is stored and shared. APIs live and die by data exchange, so the structure and format of that data are often the most important part of any API system.

Below, we’ll look at a format called Protocol Buffers (Protobufs), an open-source data serialization format initially developed by Google. We’ll consider why a developer might use them and explore a specific use case where Protobufs make a lot of sense — internal microservices.

What Are Protobufs?

Protobufs are a solution to serializing structured data. They’re language and platform-agnostic with wide support for an array of environments. Users define how they want their data to be structured, then use generated source code to write and read that structured data in their language of choice.

To do this, Protobufs rely on two elements. The first is the definition language: you describe your messages and their fields in a .proto file, and the protoc compiler generates code from that file for each language you target. The second is the serialization format itself, a compact binary encoding that the generated code writes and reads according to the definition.
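
To make the relationship between a definition and its encoded bytes concrete, here is a minimal hand-rolled sketch in Python. It assumes a hypothetical definition with a single integer field numbered 1 (something like `int32 a = 1;`) and encodes the value 150 the way the binary format does. The function names are illustrative only; in real projects the code generated by protoc does this work for you.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # high bit set: more bytes follow
        else:
            out.append(low7)         # high bit clear: last byte
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Encode one integer field as a tag followed by the value."""
    tag = (field_number << 3) | 0    # wire type 0 means "varint"
    return encode_varint(tag) + encode_varint(value)


# Hypothetical field: number 1, value 150.
payload = encode_int_field(1, 150)
print(payload.hex())  # prints 089601
```

Note that the field name never appears in the output. Only the field number from the definition travels with the data.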

The Pros of Protobufs

Protobufs offer quite a few benefits for adopters. First and foremost, the format is compact, and encoding it is fast. The result is rapid parsing and small payloads, a combination that many formats force you to trade off against each other. Together with the broad language support, this means you can quickly write a small payload that many languages and environments can read.

A big Protobuf benefit is that the output can be read in any supported language, regardless of where it was generated. Since the structure of the data is declared in a .proto definition, you could serialize data in Java and read it in Ruby, or write it from C# and consume it in Dart. This removes language as an iteration blocker.

These definitions are living definitions as well. As long as you evolve the definition according to best practices (for example, never reusing or renumbering existing field numbers), you can update it and retain backward and forward compatibility. This allows for rapid iteration without breaking changes, which matters most where changes requiring version updates could be a showstopper (for instance, in enclosed IoT devices).
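
As a rough illustration of the size difference, the sketch below hand-encodes a small record in the Protobuf binary layout and compares it with the same record as compact JSON. The record, its field numbers (1 for `id`, 2 for `name`), and the helper names are assumptions made up for this example; a real service would use protoc-generated classes rather than encoding bytes by hand.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)
        else:
            out.append(low7)
            return bytes(out)


def encode_int_field(field_number: int, value: int) -> bytes:
    """Tag with wire type 0 (varint), then the value."""
    return encode_varint((field_number << 3) | 0) + encode_varint(value)


def encode_string_field(field_number: int, text: str) -> bytes:
    """Tag with wire type 2 (length-delimited), then length, then UTF-8 bytes."""
    data = text.encode("utf-8")
    return encode_varint((field_number << 3) | 2) + encode_varint(len(data)) + data


# Hypothetical record and field numbering, for illustration only.
record = {"id": 1234, "name": "checkout-service"}
binary = encode_int_field(1, record["id"]) + encode_string_field(2, record["name"])
as_json = json.dumps(record, separators=(",", ":")).encode("utf-8")

print(len(binary), len(as_json))
```

For this record, the hand-encoded bytes come to 21 against 37 for compact JSON, mostly because field names are replaced by one-byte tags. The gap will vary with the shape of your data.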
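
The compatibility story rests on readers ignoring field numbers they don’t recognize. The sketch below is a simplified, hand-written decoder (not the real Protobuf library) showing that idea: an "old" reader that only knows field 1 can still parse bytes written by a "new" service that added field 3. The schemas, field names, and `decode_known_fields` helper are hypothetical, and unlike this sketch, real Protobuf runtimes typically preserve unknown fields rather than dropping them.

```python
def read_varint(buf: bytes, pos: int):
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_known_fields(buf: bytes, known: dict) -> dict:
    """Decode varint and length-delimited fields, skipping unknown numbers."""
    out = {}
    pos = 0
    while pos < len(buf):
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:                    # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        if field_number in known:               # unknown fields are skipped
            out[known[field_number]] = value
    return out


old_schema = {1: "a"}                 # hypothetical version 1 of the definition
new_schema = {1: "a", 3: "region"}    # version 2 adds field 3

old_message = bytes([0x08, 0x96, 0x01])               # a = 150
new_message = old_message + bytes([0x1A, 0x02]) + b"eu"  # plus region = "eu"

print(decode_known_fields(new_message, old_schema))  # old reader, new data
print(decode_known_fields(old_message, new_schema))  # new reader, old data
```

Both directions work because each field carries its own number and wire type, so a reader always knows how many bytes to skip.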

The Cons of Protobufs

Protobufs are great for various use cases but are not perfect for everything. There are a few specific scenarios in which Protobufs are a bad solution.

Firstly, Protobufs are not compressed. While the output is small for what it carries, there are situations where it might not be small enough, and additional compression may be necessary. In such a situation, the added compression step costs processing time on both ends and might erode many of the benefits of adopting Protobufs.

Protobufs are also not self-describing. Some data formats self-describe the data so that you can understand exactly what each data element means in the context of itself. Protobufs do not provide this, meaning both ends of the exchange must rely on the .proto file for interpretation. For large-scale contextual processing, this might introduce more complexity and processing weight.

The Protobuf documentation also notes two limitations for scientific and engineering use cases:
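
If you do need to shrink payloads further, compression is layered on top of the serialized bytes. Here is a small sketch using Python’s standard `zlib` module. The record is a hand-written example payload, and repeating it 1,000 times is a contrived way to make the effect visible; real batches would also need framing between messages, and real data will compress less dramatically.

```python
import zlib

# A hand-written example payload: field 1 = 150, field 2 = "checkout-service".
record = bytes([0x08, 0x96, 0x01, 0x12, 0x10]) + b"checkout-service"

# Contrived batch: the same bytes repeated many times.
batch = record * 1000

compressed = zlib.compress(batch)
restored = zlib.decompress(compressed)

print(len(batch), len(compressed))               # large, repetitive input shrinks
print(len(record), len(zlib.compress(record)))   # a tiny input gains little, if anything
```

The trade-off is the one described above: every sender and receiver now pays for a compress and decompress step, which eats into the speed advantage of the format.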
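
A short sketch shows what "not self-describing" means in practice. Given only the bytes, a reader can recover a field number and a wire type, but not the field’s name or its declared type. Below, the same three bytes are interpreted two ways, depending on whether a hypothetical .proto file declared the field as `int32` or as `sint32` (which uses zigzag encoding). The helper names are illustrative only.

```python
def read_varint(buf: bytes, pos: int = 0):
    """Return (value, next_position) for the varint starting at pos."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def zigzag_decode(n: int) -> int:
    """Undo the zigzag mapping used by sint32 and sint64 fields."""
    return (n >> 1) ^ -(n & 1)


wire_bytes = bytes([0x08, 0x96, 0x01])

tag, pos = read_varint(wire_bytes)
raw, _ = read_varint(wire_bytes, pos)
field_number, wire_type = tag >> 3, tag & 0x07

# All the bytes themselves reveal: a number, a wire type, and a raw integer.
print(field_number, wire_type, raw)

# What the value means depends on the definition the reader holds.
interpretations = {
    "int32": raw,
    "sint32": zigzag_decode(raw),
}
print(interpretations)
```

Without the definition, there is no way to tell which interpretation the sender intended, which is why both sides need the same .proto file.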

  • Protocol buffer messages are less than maximally efficient in both size and speed for many scientific and engineering uses that involve large, multi-dimensional arrays of floating point numbers. For these applications, FITS and similar formats have less overhead.
  • Protocol buffers are not well supported in non-object-oriented languages popular in scientific computing, such as Fortran and IDL.

Finally, there is the fact that Protobufs come from Google. For some, the use of a Google format, open-source or not, might raise concerns. It’s something to consider when weighing long-term support for a free and open-source product maintained by a for-profit corporation.

The Benefit of Protobufs for Internal Microservices

Protobufs make a lot of sense with internal microservices. When any technical solution is evaluated for internal use, the priorities shift: the desire for external compatibility and similar considerations weakens considerably, and the core argument becomes one of interchangeability.

Built for Internal

When building for external microservices, you assume that someone external to the organization will be using your code, systems, and integration methods. Therefore, you start to make decisions based on that eventuality. With internal microservices, however, you are developing for an audience of one — the internal developer.

In such a use case, a standard exchange and data format becomes the most important factor, and this is something that Protobufs deliver in spades. Protobufs are designed for efficient and rapid data serialization in a readily understood and exchangeable format across a set of languages. Assuming that you support those languages internally, Protobufs become a solution for merging these disparate systems into a cohesive collective.

Performance and Standardization

One significant benefit is how Protobufs balance standardization against performance. Standardizing on a single language or stack makes systems more uniform and portable, but every service then inherits that stack’s performance profile. If that performance is middling at best, or only suitable in a single scenario, other services suffer in their own environments. Target only performance, and each team can use the specific language it wants, but you lose out on the benefits of standardization.

Prefer to use Python for a quick internal function? Prefer to use C# for a more complex work function? You can use both with Protobufs, and as long as you have an agreed-upon definition, consumers would never even know the services were written in different languages.

Improved Discovery

This also introduces the benefit of improved discovery. Half of discovery is ensuring that you have an actual view of what services do and how they go about doing it. Since Protobufs require a definition and an agreed-upon format, you can utilize this system to define the core functions of each component and catalog it according to what it does rather than where it lives.

This benefit is significant. It is akin to adopting a standard house addressing format. When you know how to format an address, you can pinpoint exactly where something is and, from that, find out whether the location is a house, an apartment, or a mailbox. Protobufs give you an idea of what each service is outputting, and from this, you can build an understanding of how the entire ecosystem uses that data.

Conclusion

The simple fact is that it’s never been easier to use Protobufs. It’s the interface definition language of choice for many solutions, and with all the benefits noted here, that’s not a huge surprise. Protobufs are highly efficient and unlock significant benefits in most environments. Assuming you are not working in one of the few environments that can’t benefit from their adoption, Protobufs might be your solution for internal microservice data portability.
