Kubernetes is an often talked about, but seldom fully understood, system in the API industry. This is largely because containerization, the principal approach that Kubernetes is built upon, is still not nearly as ubiquitous as the classical approach to API design and resource management. However, Kubernetes is possibly one of the most powerful systems that API providers can leverage for efficiency, scalability, and modularity.
To that end, in this piece, we’re going to take a dive into what Kubernetes actually is. We’ll define some key terms, contextualize what containers actually are, and hopefully demonstrate the inherent value that containerization, and especially containerization through Kubernetes, offers to API microservices.
Building a Background
In order to understand Kubernetes, we first need to understand containers in the context of APIs. Containers are essentially an entirely different method of handling applications and resources when providing a service to the end user. The container method runs contrary to the classical approach, and as such, it bears some discussion.
In the traditional method, applications, libraries, and the main operational kernel all exist independently of one another on one or more servers. When a function is called, the application communicates throughout the server stack, calling resources in their typically hardcoded locations. These applications have been deployed onto the host using some sort of application manager, often built into the operating system kernel itself. Those applications don’t always tie to just a single resource location, and in many cases can call many disparate sources and incorporate a massive tangle of libraries.
While this worked for smaller functions and applications, when resources were relatively limited and shared amongst massive monolithic applications, the rise of the microservice industry has made this approach unmanageable. Having multiple versions of executables and libraries, each with its own versioned codebase, results in ever-increasing complexity and inefficiency within the classic approach.
The answer to this is the container. In a container approach, each application is packaged into an often ephemeral container, which holds only the minimal set of libraries required for its function. This avoids massive bloat, duplicated executables, and tangled libraries, and makes the management of each container that much easier. Each container is functionally a self-contained application package, operating independently of every other container but still reporting to the general kernel or another orchestration source for its base orders.
Containers thus utilize OS-level virtualization rather than a hypervisor, offering the best of the classical approach while reducing bloat and making for more efficient management, delivering the major benefits of microservices on a local level.
What is Kubernetes?
Now that we understand containers, we should look at how Kubernetes specifically implements the concept. While the old methodology was overly complex, it offered centralized control. Now, if containers are meant to segment content, how can we establish control? Under Kubernetes specifically, we need to orchestrate functions across many containers, automate deployment, automatically handle replication, scale our containers, and more. So, with these requirements – how exactly does Kubernetes do all of this?
To figure this out, let’s define some key terms.
Clusters are groups of physical servers or virtual machines that run under Kubernetes as their chief manager. Each cluster is made of several key parts, which we will define in short order. Simply put, the Cluster is essentially a combination of the Nodes, which run the actual containerized workloads, and the Kubernetes Master, which includes both an API Server to process commands and a Replication Controller, which will be detailed below.
Pods are the smallest deployable functional unit in Kubernetes parlance, and are essentially the Kubernetes “unit”. A Pod contains a group of Containers – each Container in the same Pod shares the same network namespace, and they communicate with one another over localhost.
Pods are ephemeral, meaning that they typically don’t persist in terms of data storage. Instead, Pods are replicated or destroyed on demand by the aforementioned Replication Controller, adapting dynamically in scale to the required resource count.
Pods can be tagged with Labels. A Label is a key/value pair that defines the attributes of the Pod itself. Labels allow users to tag Pods with custom descriptors, describing exactly what they are and determining how they are treated by the policies laid out under the Replication Controllers.
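As a quick illustration, a minimal Pod definition with custom Labels might look something like the following sketch. The names, label values, and container image here are all hypothetical:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: api-pod
  labels:
    app: my-api        # custom key/value Labels describing this Pod
    tier: backend
spec:
  containers:
  - name: api-container
    image: example/my-api:1.0    # hypothetical image for an API microservice
    ports:
    - containerPort: 8080
```

Any controller or Service that selects on `app: my-api` would then treat this Pod according to its own policies, which is exactly how Labels tie Pods into the larger orchestration picture.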
Replication Controllers allow for the monitoring of Pods within the cluster, and act as the chief orchestrator of the Kubernetes system at large. When a Replication Controller is set up, it’s given a set of policies by which its functions are dictated, and through these strictures, the Pods are managed in concert with one another.
If the Replication Controller has a specific number of Pods set for creation, and a Pod is destroyed for whatever reason, the Replication Controller will automatically spin up a replacement and balance the traffic appropriately using its internal load balancing services. If the Replication Controller notes that more Pods are demanded but a lower number are currently in service, it can likewise spin up additional Pods to meet the demand; conversely, excess Pods beyond the set count are destroyed.
Essentially, the Replication Controller makes sure that, at any given time, the policies it’s been given are being enforced. It does this through two simple elements – the Pod Template, which works from a given state and specification and is used to spin new Pods up or down, and Labels, which are used to determine which Pods the Replication Controller will actually monitor and handle.
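The relationship between the desired Pod count, the Label selector, and the Pod Template can be sketched in a minimal Replication Controller definition like this one. The names, labels, and image are hypothetical:

```yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: api-rc
spec:
  replicas: 3            # the policy: keep three Pods running at all times
  selector:
    app: my-api          # the Labels determining which Pods this controller manages
  template:              # the Pod Template used to spin up new or replacement Pods
    metadata:
      labels:
        app: my-api      # Pods created from this template carry the matching Label
    spec:
      containers:
      - name: api-container
        image: example/my-api:1.0    # hypothetical image
        ports:
        - containerPort: 8080
```

If a Pod matching `app: my-api` dies, the controller stamps out a replacement from the template; if extra matching Pods appear, it removes them until exactly three remain.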
Services are used by Kubernetes to create an abstraction layer between the Pods and the requesting resource. Because Pods are ephemeral, a static, stable IP address scheme cannot always be expected. Even when using Volumes, which provide static storage, you’re likely to have Pods that tie to those Volumes, and as such, you need some methodology for accessing the Pods.
The Kubernetes Service defines a logical set of Pods, and the policy by which those Pods are accessed. The easiest way to think of a Service is to consider it like you would a DNS server, where a Pod is logically labelled and organized, and access is resolved between the requestor of that resource and the resource itself. Utilizing a continuous method of identification while not demanding a static IP address scheme allows for ephemeral Pods without requiring an actual stable, constant system of addressing.
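A minimal Service definition showing this abstraction might look like the sketch below; the Service name, label selector, and ports are hypothetical:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: api-service      # requestors address this stable name, not individual Pods
spec:
  selector:
    app: my-api          # routes traffic to any Pod carrying this Label
  ports:
  - protocol: TCP
    port: 80             # the stable port the Service exposes
    targetPort: 8080     # the port the underlying Pods actually listen on
```

Pods matching the selector can come and go freely; clients keep addressing `api-service`, and Kubernetes resolves that name to whichever Pods currently match, much like the DNS analogy above.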
A Node, as previously stated, is a physical or virtual machine. In Kubernetes terms, it’s often referred to as a “Kubernetes Worker”. A worker is made of three main components.
The first of these components is the method by which the container is actually formed, separating the content. More often than not, this is something like Docker or Rocket, and is actually a separate conversation apart from Kubernetes – for more information on Docker, you can find a previous piece here.
The second of those three main components is the kube-proxy, the method by which the Services abstraction establishes proxy connections. The Kubernetes network proxy handles TCP/UDP forwarding for linked containers, and can also be tied into an optional add-on for Cluster DNS management.
Finally, the Kubelet is the primary node agent, and is functionally the primary method by which containers are managed. When we discuss container management, we’re really talking about the process that’s started by the Kubelet and the Kubernetes Master.
The Kubernetes Master is where the Kubernetes API Server actually resides, and is also where the Replication Controller technically resides. In many ways, the Kubernetes Master is the “main piece” of the Kubernetes puzzle, with the Node hosting the containers and the Kubernetes Master acting as the principal point of management and processing.
Now that we roughly understand how Kubernetes works, we should identify any potential issues in its adoption. Kubernetes is extremely popular, and quite mature, but that doesn’t mean it’s perfect. Kubernetes is quite hard to set up and configure, especially when compared to relatively more limited implementations such as Docker.
While Docker is limited by its API, it makes for easier setup, and for that reason, many would prefer it. That being said, Kubernetes does boast internal logging and monitoring methods, something that Docker doesn’t actually have natively and instead needs to leverage third-party tools to achieve. This of course means that while Docker setup might seem easier, there are often additional hurdles that need to be overcome if one is to use Docker alone, negating much of the ease that seems so apparent.
Kubernetes does have some cons though that should be considered. Perhaps the biggest of these is the fact that Kubernetes is often processor heavy in larger implementations, meaning that the more it scales up, the more processing is required compared to traditional solutions (and in some cases, even alternate methods of containerization). It is still highly efficient and effective in terms of scaling, it just might lose some of these efficiencies at certain scales.
It should also be noted that Docker and Kubernetes can, of course, be used in concert. That being said, at a certain point, you’re going to face diminishing returns – Docker’s own orchestration doesn’t necessarily perform any better than the Kubernetes orchestration it aims to replace, and as such, many providers may face a question as to why they’re even trying to use Docker’s methodology in the first place.
Utilizing Kubernetes with APIs
Kubernetes, and containers in general, are perfect for the API microservice approach. Since containers allow for the packaging of functionality and only the dependencies that are needed for that functionality, microservices can be made more portable and modular utilizing Kubernetes than depending on other methodologies.
This also means that microservices can be made leaner, only spinning up additional instances and resources when needed, rather than keeping them deployed on hand and load balancing amongst existing properties when their use is required. This results in a reduction of redundancy and, ultimately, a more lightweight system in collective terms.
There’s also a huge benefit to security that should be discussed. Most services don’t call it “sandboxing”, but that’s exactly what containerization is – a segmenting of resources and applications from other resources and applications. This means that they each run in their own separate system, but add to one another to become greater than the sum of their parts.
Finally, and perhaps most importantly, Kubernetes offers fast development and easier iteration. Since each container is modular and portable, and each container can be augmented and replicated independently of every other container, new APIs can be tested modularly, in variable stacks, and with resources spun up only when demanded.
Ultimately, utilizing containerization is a good idea for API microservices – and of the choices on hand, Kubernetes is certainly a good choice.
Kubernetes can be hard to set up, but ultimately, it’s worth the effort. Containerization is an important element of the modern microservice API format, and due to its more modern, ubiquitous presence in the industry, Kubernetes is hard to argue against. It offers scalability and modularity that’s extremely valuable to API providers, and despite its relatively intensive processor requirements, it ultimately boasts greater benefits than drawbacks.