What's The Difference Between Upstream and Downstream?

The concept of upstream and downstream is often confusing to novice developers. Knowing exactly when a component is downstream can be complicated, especially when topics like dependency and inheritance enter the equation. A firm grasp of these concepts is perhaps one of the most important things a software engineer can have for developing a successful implementation for the long term.

Today, we’ll look at these topics. We’ll discuss what “stream” refers to and isolate what downstream and upstream mean within API-centric development. We’ll look at what it means to have an API dependency, and how inheritance plays a significant long-term role in the success of a given API implementation.

Understanding the Stream

APIs are fundamentally interconnected systems — no matter how powerful an API may be, it depends on other systems to do the work it does. These APIs function as entity nodes in a greater network of functional nodes that exchange data and information. Entities receive data in the form of an input and then send data outwards in the form of output. The sum total of these data flow points is often referred to as a stream.

To understand how this works and why it’s important, it’s helpful to envision the process in the form of a river. Imagine a product manufacturer’s factory sits on a river. In order to produce their products, they require raw materials from point A at the start of the river. The materials flow down the river to the factory, point B, where a manufacturing process converts the raw materials into sellable widgets. Once the widgets are done, they are sent further down the river to a store, point C, where the end consumer buys the widget.

This results in a relationship that starts at A and concludes at C (A -> B -> C). APIs are much like this — in order to carry out a function for the end-user, we often require the raw data or materials to be sent from another location before we can interact with it. Once we interact with or transform the data, we pass it to a point further along the stream, typically an end-user or additional API requesting the data.
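To make the A -> B -> C relationship concrete, here is a minimal sketch of the river analogy as three functions. The function names and sample values are hypothetical and chosen only for illustration. Each step consumes the output of the step before it.

```python
def fetch_raw_materials() -> list[str]:
    """Point A: the upstream source that supplies raw input."""
    return ["ore", "timber"]


def manufacture(materials: list[str]) -> list[str]:
    """Point B: the middle node transforms whatever it receives."""
    return [f"widget made from {material}" for material in materials]


def sell(widgets: list[str]) -> str:
    """Point C: the downstream consumer of the finished output."""
    return f"store stocked {len(widgets)} widgets"


# Data flows A -> B -> C: each call depends on the one before it.
receipt = sell(manufacture(fetch_raw_materials()))
print(receipt)
```

If the function at point A fails or changes the shape of what it returns, points B and C are affected, which is the core reason upstream changes matter to everything further along the stream.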

Upstream Versus Downstream

So if an API is like a manufacturer on a river, what does it mean to be upstream or downstream? Much like our manufacturing example, APIs require input and output from other systems. In the same way that a raw material provider is “upstream” or “up river” from a factory, a raw data provider is “upstream” from the API that processes it. Similarly, the API which takes the output of another component’s processing is considered “downstream” or “down river”.

Let’s look at a practical example. Let’s say we’ve created an API that helps users find discounts on train tickets and then serves this data through text message to a specified number. In order to serve this information, the API acts as a kind of middle processor — it needs to pull data from an external source and then push that data to the user in a specific format. In this particular case, we have a very similar technical specification to our manufacturing example mentioned above.

Our API is in the middle of this stream of information, with the user being at the ultimate downstream position, waiting for the data to be processed. When the API grabs pricing data, it reaches out to local provider APIs to pull the most current ticket prices. This is an upstream data source, as it feeds into the API for processing. Once the API processes this data, it then needs to send it to the end-user utilizing a text message provider service with its own API. In relation to our API, this API is considered downstream, as it serves as a transit channel for our data on the way to the ultimate end user.

What is notable in our case is that whether something counts as downstream or upstream depends entirely on where you are in the flow. For example, to the ticket data API that provides the data being acted upon, our processing API and the SMS provision API that delivers the final text message are both downstream. For that API, nothing in this flow is upstream. For the SMS provision API, however, the opposite is true. Everything before it, including our processing API and the ticket data API, is upstream, and the only downstream entity is the end-user who views the data and the device they use to render it.
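The relative nature of these terms can be sketched in a few lines. The following example assumes the stream is a simple ordered list, which is a simplification, since real systems often branch. The node names are labels for the components in our ticketing example, not real services.

```python
# The ticketing example as an ordered stream, from source to consumer.
STREAM = ["ticket data API", "processing API", "SMS API", "end user"]


def upstream_of(node: str) -> list[str]:
    """Everything that sits before the given node in the flow."""
    return STREAM[:STREAM.index(node)]


def downstream_of(node: str) -> list[str]:
    """Everything that sits after the given node in the flow."""
    return STREAM[STREAM.index(node) + 1:]


for node in STREAM:
    print(f"{node}: upstream={upstream_of(node)}, downstream={downstream_of(node)}")
```

The same component appears in the upstream list of one node and the downstream list of another, so "upstream" and "downstream" only make sense relative to a chosen position.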

Dependencies and Upstream/Downstream Considerations

Where this relationship becomes more complex is when you introduce dependencies. A dependency is exactly what it sounds like — something required by other entities in the system to function. These requirements could be any variety of APIs, systems, entities, or libraries, but the important consideration is whether or not it is required to input into the system or to output from the system.

Let’s take a look at our ticketing API once more. Suppose the API itself has some core dependencies. It includes several libraries and communications protocols that have been created by external parties that allow it to function. On the upstream side, the API requires additional APIs and libraries to request the data and receive it in the form that is required. For instance, the API that we pull data from utilizes a GraphQL system to feed this data to an endpoint that processes external commands for compression and delivery. That API and the parts that make it up are considered upstream dependencies, as no flow of data can occur without them.

Once the data is processed, however, it has to go somewhere. The endpoints utilized, including the SMS transit API, the data presentation API, metadata APIs, etc., are all composed of their own dependencies and systems that ultimately form the transit pathway for the processed data. In this sense, these are all downstream dependencies, as they are all required parts that make up a complete pathway.

Dependency Inheritance

What makes this even more complex is that not all dependencies are clear. In many cases, they can be inherited simply by choosing specific technical solutions. Every choice made in the construction of an API comes with its own set of dependencies that affect not only the API in question but also all of the downstream APIs that it interacts with.

In our ticketing example, let’s assume that we have chosen to incorporate a security protocol that requires the use of two specific libraries. This choice has some direct implications, of course — any time the API is accessed, it will require this authentication system to allow users to interact with the internal data set, and as such, it has become a core dependency.

What is interesting, however, is how this affects downstream dependencies. Any application, API, system, or other interconnective entity that wishes to utilize the API we’ve designed will have to conform either to our security protocol or a compatible protocol. That means that any downstream entity will have to view our dependency as its own dependency, thereby creating an upstream dependency for that API. Of course, in doing so, it is also creating a downstream dependency for any system that interacts with it.

As those downstream APIs update, change, integrate, or even offer functionalities as part of a library, this dependency chain does not simply disappear. Suppose another API evolves or offers its services to other systems. Then, it’s likely that those dependencies will propagate. Even if the API in question offers this data without requiring our security protocol, it is likely that it also has its own dependencies that it will pass to users, thereby creating another upstream/downstream chain of dependency.
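One way to picture how dependencies propagate is as a graph walk. The sketch below is illustrative only: the component names, including the two security libraries and the partner systems, are hypothetical stand-ins for the ticketing example. It collects everything a component relies on, directly or through the things it integrates with.

```python
# Hypothetical map of each component to the things it depends on directly.
DIRECT_DEPENDENCIES: dict[str, set[str]] = {
    "ticketing API": {"security protocol"},
    "security protocol": {"crypto library", "token library"},
    "partner app": {"ticketing API"},
    "partner plugin": {"partner app"},
}


def inherited_dependencies(component: str) -> set[str]:
    """Collect every direct and indirect dependency of a component."""
    seen: set[str] = set()
    stack = [component]
    while stack:
        current = stack.pop()
        for dependency in DIRECT_DEPENDENCIES.get(current, set()):
            if dependency not in seen:
                seen.add(dependency)
                stack.append(dependency)
    return seen


print(sorted(inherited_dependencies("partner plugin")))
```

In this sketch, the partner plugin never chose the two security libraries, yet they appear in its dependency set because it builds on something that builds on our API.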

Example Authentication Stream

To illustrate upstream and downstream services with a real-life example, let’s consider a modern authentication flow. Nowadays, username and password pairs are rarely sufficient to secure digital service logins safely. Current authentication patterns often apply additional temporal and geographic checks, which may trigger additional authentication factors, like one-time passwords, additional biometric identification, or the use of the Google Authenticator app. Since upstream services trigger additional downstream APIs, this could be considered part of a stream or flow.

This example is evident within a Hypermedia Authentication API (HAAPI). A constraint of REST, hypermedia gives the client hyperlinks for further actions based on the current state. When used in an authentication flow, hypermedia can be a helpful way for a system to negotiate further actions throughout the login journey.

For example, consider a user attempting to log into an application using a social login API. The system notices the user’s IP address corresponds to a geographical region far away from their last login, which may even represent an impossible journey. The application then triggers a downstream service to apply additional verification, such as sending a one-time password with an SMS API. Another system inherits the value the user inputs and validates it with another server, all before authenticating the user and granting access to the end application.
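A rough sketch of this step-up flow might look like the following. It is not the actual HAAPI message format: the class, the state names, the link relations, and the paths are all assumptions made for illustration, and the region comparison stands in for a real risk check. The idea it shows is that each response tells the client which action comes next.

```python
import hmac
from dataclasses import dataclass


@dataclass
class LoginAttempt:
    username: str
    current_region: str
    last_known_region: str


def next_step(attempt: LoginAttempt) -> dict:
    """Decide the next action and advertise it as a hypermedia-style link."""
    if attempt.current_region != attempt.last_known_region:
        # Unusual location: hand off to the downstream one-time password step.
        return {"state": "otp_required",
                "links": [{"rel": "submit-otp", "href": "/login/otp"}]}
    return {"state": "authenticated",
            "links": [{"rel": "continue", "href": "/app"}]}


def verify_otp(submitted: str, expected: str) -> dict:
    """Validate the code the user typed against the one that was sent."""
    if hmac.compare_digest(submitted, expected):
        return {"state": "authenticated",
                "links": [{"rel": "continue", "href": "/app"}]}
    return {"state": "denied", "links": [{"rel": "retry", "href": "/login"}]}


print(next_step(LoginAttempt("ana", "AU", "SE")))
```

Because the client follows the links it is given rather than hard-coding the sequence, the server can add or remove downstream verification steps without the client needing to know the whole flow in advance.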

Understanding Dependencies and Informed Development

All of this is not meant to suggest that dependencies are necessarily a bad thing. A dependency is simply a truth in modern development. However, it creates the need to fully understand both the dependencies of the systems that a developer chooses to integrate and the likely impact on future development and end-users.

Developers should make informed decisions as to what systems, libraries, and APIs they are building into their APIs. They should endeavor to make sure that these dependencies are well-documented and well-understood. This is not only sound advice for long-term internal development. It’s also good advice for long-term customer and user satisfaction, as these dependencies will likely become central aspects for forking and development in the future.

A firm understanding of what is downstream and upstream of a given API implementation is key to long-term success, user and customer satisfaction, and the health of the industry.