DevOps is the set of tools, processes and activities that modern software development teams put in place in order to ensure that they can convert code into live applications in a stable and repeatable way. It’s a bit of a catch-all term, and can be broken down into different areas. One of these is continuous integration, the process of building and testing code at regular intervals to avoid bugs and accelerate integration. Another is configuration management (CM) — the set of practices aiming to manage the runtime state of the applications.
What is Configuration Management?
Every application consists of one or more databases, web servers, application servers, reverse proxies, load balancers, and other moving parts that need to work together at runtime to make a working system.
Software configuration management is the process of describing the ways in which all these inputs — called artifacts or configuration items — interact and how they leverage the underlying infrastructure. Specifically, it addresses the installation, configuration and execution of configuration items on servers. Effective configuration management ensures consistency across environments, avoids outages, and eases maintenance.
Traditional configuration management required system administrators to write scripts and hand-maintain files listing technical endpoints, port numbers, and namespaces. As a system grows more complex, so does the work of managing its configuration. Artifact versioning, environment changes and tensions between developers and system engineers have led to increased infrastructure automation.
CM in the Cloud
The advent of the cloud meant that servers were moved out of on-premises data centers and into those of cloud hosting vendors. While the inherent complexities of running an on-premises infrastructure disappeared, new problems arose.
Cloud technologies have enabled teams to deploy software to hundreds if not thousands of servers concurrently to satisfy the demands of software usage in the internet age. Managing that many servers requires automation on a different scale, and a more systematic approach. This is where an API-driven approach to CM can help — rather than installing and launching scripts on each node, a centralized CM server could control all of the nodes programmatically and drastically reduce the workload of the team’s sysadmins.
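To make the idea of a centralized, programmatic CM server a little more concrete, here is a minimal Python sketch. The `Node` and `ConfigServer` classes are hypothetical and purely illustrative; they are not the API of any real CM product, and the "nodes" are just in-memory objects rather than remote machines.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A managed server; `settings` stands in for its live configuration."""
    name: str
    settings: dict = field(default_factory=dict)


class ConfigServer:
    """Hypothetical central CM server that tracks every node it manages."""

    def __init__(self):
        self.nodes = {}

    def register(self, node):
        self.nodes[node.name] = node

    def apply(self, desired):
        """Bring every registered node to the desired settings.

        Returns the names of the nodes that actually had to change.
        """
        changed = []
        for node in self.nodes.values():
            if any(node.settings.get(k) != v for k, v in desired.items()):
                node.settings.update(desired)
                changed.append(node.name)
        return changed


server = ConfigServer()
for i in range(1, 4):
    server.register(Node(f"web-{i}"))
server.nodes["web-2"].settings["http_port"] = 8080  # already up to date

changed = server.apply({"http_port": 8080})
print(changed)  # ['web-1', 'web-3']
```

One call updates the whole fleet, and nodes that already match are left alone, which is roughly what saves sysadmins from logging into each machine in turn.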
New CM software tools have gradually been introduced in the last decade to address these growing needs. In the next two sections we’ll look at some of the most popular among them.
The Leaders: Puppet and Chef
Puppet and Chef are the most mature and the most popular CM tools at the moment. The packaging and deploying of applications used to be the sole province of system engineers. By enabling developers to take part in this process, Puppet and Chef have together defined a new category of CM solutions — infrastructure as code.
Both are open source projects and based on Ruby (although significant portions of the Chef architecture have been rewritten in Erlang for performance reasons). Both have an ecosystem of plugin developers as well as a supporting company offering enterprise solutions. Each features a client-server architecture, in which an agent running on each node retrieves its configuration items from a central server.
Puppet was created in 2005 by Luke Kanies, a system administrator and the founder of Puppet Labs. It is built in Ruby but offers its own JSON-like declarative language to create ‘manifests’, the modules in which configuration items are described using high-level concepts like users, services and packages. Puppet is model-driven, which means that not much programming is usually required. This makes Puppet a hit with system engineers with little experience in programming.
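The declarative, model-driven idea can be illustrated without any Puppet syntax at all. The Python sketch below is an assumption-laden toy, not Puppet's language or engine: the `plan` function and the `"type:name"` resource keys are invented for illustration. It describes the desired state of a few packages, services and users, then works out which resources still need attention.

```python
def plan(desired, current):
    """Return the actions needed to move `current` to `desired`.

    Both arguments map a resource name (e.g. "package:nginx") to a state
    string. Resources already in the desired state produce no action.
    """
    actions = []
    for resource, state in sorted(desired.items()):
        if current.get(resource) != state:
            actions.append((resource, state))
    return actions


desired = {
    "package:nginx": "installed",
    "service:nginx": "running",
    "user:deploy": "present",
}
current = {"package:nginx": "installed", "service:nginx": "stopped"}

actions = plan(desired, current)
print(actions)
# [('service:nginx', 'running'), ('user:deploy', 'present')]

# Applying the plan and planning again yields nothing left to do.
current.update(dict(actions))
assert plan(desired, current) == []
```

The author of the description only states what should be true; working out the steps is left to the tool, which is why little programming is needed.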
Puppet compiles manifests into a catalog for each target system, and distributes the configuration items via a REST API, so the dev teams don’t need to worry about installing and running their stuff — all configuration items will automatically be deployed and run on each node as foreseen in the manifests. Instructions are stored in their own database called PuppetDB, and a key-value store called Hiera. Puppet is used at Google, the Wikimedia Foundation, Reddit, CERN, Zynga, Twitter, PayPal, Spotify, Oracle and Stanford University.
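As a rough illustration of how a per-node catalog might be resolved from layered key-value data, here is a small Python sketch. The layer names, keys and values are all made up, and the code only mimics the general "most specific value wins" idea; it does not reproduce Puppet's compiler or Hiera's actual behavior.

```python
# Hypothetical data layers, ordered from most specific to most general.
LAYERS = {
    "node/db-1": {"max_connections": 500},
    "role/database": {"max_connections": 200, "port": 5432},
    "common": {"ntp_server": "ntp.example.com", "port": 80},
}


def lookup(key, hierarchy):
    """Return the first value found for `key`, walking the hierarchy in order."""
    for layer in hierarchy:
        data = LAYERS.get(layer, {})
        if key in data:
            return data[key]
    raise KeyError(key)


def compile_catalog(node, role, keys):
    """Resolve each key for one node into a flat, node-specific catalog."""
    hierarchy = [f"node/{node}", f"role/{role}", "common"]
    return {key: lookup(key, hierarchy) for key in keys}


keys = ["max_connections", "port", "ntp_server"]
print(compile_catalog("db-1", "database", keys))
# {'max_connections': 500, 'port': 5432, 'ntp_server': 'ntp.example.com'}
print(compile_catalog("db-2", "database", keys))
# {'max_connections': 200, 'port': 5432, 'ntp_server': 'ntp.example.com'}
```

Two nodes with the same role end up with different catalogs only where a more specific layer overrides a general one.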
Chef was created by Adam Jacob in 2008, as an internal tool at his company Opscode. Chef is generally developer-friendly and very popular with teams already using Ruby. It lets developers write ‘cookbooks’ in Ruby and stores the resulting instructions in PostgreSQL. Chef is used at Airbnb, Mozilla, Expedia, Facebook, Bonobos and Disney.
In both cases, a secure API is available to access any object within Puppet or Chef rather than going through the command line interface. For example, a developer can query the API of each of these to find out how many active nodes are present, or build a plugin for their favorite Continuous Integration system to trigger a deployment every time a new version of the code is built.
A healthy ecosystem of developers has contributed numerous plugins and extensions to both Puppet and Chef. Likewise, Puppet and Chef plugins are also available for popular Continuous Integration servers like Jenkins, enabling direct integration between the CI and CM processes. That way code builds that pass all the required tests can be automatically delivered to target environments without any manual intervention.
The Contenders: Salt and Ansible
While Puppet and Chef dominate the CM landscape, several contenders have emerged to cater for perceived weaknesses in their architecture.
SaltStack is a relatively new CM tool built in Python and open sourced in 2011. Used by PayPal, Verizon, HP and Rackspace, Salt focuses on low-latency architecture and fault tolerance. It features a decentralized setup, small messaging payloads, no single point of failure, and parallel execution of commands for optimal performance.
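Parallel execution is the easiest of these ideas to sketch. The Python example below fans a command out to several nodes at once using a thread pool from the standard library. It is only an illustration of the concept: `run_on_node` is a stand-in that contacts nothing, and none of this reflects how Salt's own messaging layer is implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def run_on_node(node, command):
    """Stand-in for remote execution; a real tool would contact the node."""
    return f"{node}: ran '{command}'"


def run_everywhere(nodes, command, max_workers=8):
    """Run `command` on all nodes concurrently, returning results by node."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `nodes` in its results.
        results = pool.map(lambda n: run_on_node(n, command), nodes)
        return dict(zip(nodes, results))


nodes = [f"web-{i}" for i in range(1, 5)]
results = run_everywhere(nodes, "uptime")
print(results["web-3"])  # web-3: ran 'uptime'
```

With real network calls in place of the stand-in, the total time approaches that of the slowest node rather than the sum of all nodes.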
Ansible, also open source and built in Python, was created by Michael DeHaan in 2012. It was created in reaction to the relative complexity of Puppet and Chef, and attempts to offer a simpler, more elegant alternative, with a shorter learning curve.
Unlike Puppet, Chef and Salt, Ansible is based on an agentless architecture, meaning that no agent is required to run on each infrastructure node, which leads to less complexity and less load on the network. Ansible modules, referred to as units of work, can be written with a variety of scripting languages like Python, Perl or Ruby. Ansible lets users define playbooks in YAML for often-used system descriptions.
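The shape of a playbook, an ordered list of tasks applied to a group of hosts, can be sketched in a few lines of Python. This is a loose analogy rather than Ansible itself: the playbook is a plain dictionary instead of YAML, and `fake_ssh` is a hypothetical stand-in for the remote transport that an agentless tool would use in place of a resident agent.

```python
# A hypothetical playbook: an ordered list of tasks for a group of hosts.
playbook = {
    "hosts": ["web-1", "web-2"],
    "tasks": [
        {"name": "install nginx", "command": "install nginx"},
        {"name": "start nginx", "command": "start nginx"},
    ],
}


def run_playbook(playbook, execute):
    """Run every task, in order, on every host through `execute`.

    `execute(host, command)` is the transport (SSH in an agentless tool);
    it returns True on success. A host that fails a task skips the rest.
    """
    report = {}
    for host in playbook["hosts"]:
        done = []
        for task in playbook["tasks"]:
            if not execute(host, task["command"]):
                break
            done.append(task["name"])
        report[host] = done
    return report


def fake_ssh(host, command):
    """Stand-in transport: pretend starting nginx fails on web-2."""
    return not (host == "web-2" and command == "start nginx")


report = run_playbook(playbook, fake_ssh)
print(report)
# {'web-1': ['install nginx', 'start nginx'], 'web-2': ['install nginx']}
```

Because the transport is passed in as a function, nothing has to be installed on the hosts themselves, which is the essence of the agentless approach.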
Aside from Puppet, Chef, Salt and Ansible, there are many other CM options, such as Capistrano, SmartFrog and Otter. Each differentiates itself in some way. For example, Otter has a web-based user interface which lets you switch between a drag-and-drop editor and text mode, along with first-class support for Windows.
Cloud Vendor Solutions
Infrastructure-as-a-Service vendors like Amazon Web Services come with their own, highly specific concepts and terminology, and CM tools need to speak that language to let DevOps people navigate their product stack.
All of the above-mentioned CM products have extensions for Amazon EC2, and some of them have native support for Google Compute Engine. Amazon has its own native configuration management product called OpsWorks, which competes with Puppet and Chef for applications entirely hosted on Amazon Web Services (although OpsWorks itself is based on Chef internally).
Vagrant by HashiCorp is an open source tool that manages virtual development environments (such as VMs and Docker containers) and wraps around CM tools. With Vagrant, teams can create portable development environments that can be moved between hosts without any changes to the configuration.
In-House CM Tools
Some companies have very specific requirements around CM that are hard to meet by using one of the available products on the market. Building a custom, fully-fledged configuration management solution represents a great deal of work, but it can be worth it for companies that have the means to see it through.
Netflix created their own configuration management API called Archaius. Named after a species of chameleon, this Java and Scala-based in-house tool lets Netflix perform dynamic changes to the configuration of their Amazon EC2-based applications at run time. It was open sourced in 2012.
Netflix had a variety of reasons for building an alternative to Puppet or Chef. High availability is paramount to their business model, so they can’t afford any downtime related to server deployments. All of their applications are designed in such a way that configuration can be reloaded at run time.
In addition, Netflix servers span multiple environments, AWS regions and technology stacks, which are collectively called ‘context’. Thanks to Archaius, Netflix are able to enable/disable features dynamically depending on their context.
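Here is a minimal Python sketch of context-dependent feature flags that can change while an application is running. It is only meant to convey the idea; the `DynamicConfig` class, its methods and the `new_player` flag are hypothetical and do not reflect the actual Archaius API, which targets the JVM.

```python
class DynamicConfig:
    """Holds properties that can be changed while the application runs."""

    def __init__(self, context):
        self.context = context  # e.g. {"env": "prod", "region": "us-east-1"}
        self._rules = {}

    def set_rule(self, flag, **required):
        """Enable `flag` only where the context matches every given value."""
        self._rules[flag] = required

    def clear_rule(self, flag):
        self._rules.pop(flag, None)

    def is_enabled(self, flag):
        """Evaluate the flag on every call, so updates apply immediately."""
        required = self._rules.get(flag)
        if required is None:
            return False
        return all(self.context.get(k) == v for k, v in required.items())


east = DynamicConfig({"env": "prod", "region": "us-east-1"})
west = DynamicConfig({"env": "prod", "region": "us-west-2"})

for cfg in (east, west):
    cfg.set_rule("new_player", env="prod", region="us-east-1")

print(east.is_enabled("new_player"), west.is_enabled("new_player"))  # True False

# Change the rule at run time: no restart or redeployment involved.
west.set_rule("new_player", env="prod")
print(west.is_enabled("new_player"))  # True
```

The key design choice is that the flag is looked up on each call instead of being read once at startup, so a changed rule takes effect without any downtime.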
The Road Ahead
While the configuration management technological landscape has greatly matured over the last few years, there is general consensus that things will keep evolving in the future. The current mainstream solutions are often viewed as too complicated, unforgiving and a hassle to maintain. One alternative to the classic CM approach is the emerging immutable infrastructure championed by Docker.
Whereas regular configuration management aims to define and manage state at run time, containerized applications require that all the configuration be defined at build time. The resulting portable containers can then be moved from one host to the next without any changes in state.
These states can then be saved in Docker images and used later to re-spawn a new instance of the complete environment. Images therefore offer an alternative means of configuration management, arguably superior to runtime configuration management.
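The build-time approach can be illustrated with a small Python sketch in which configuration is frozen when an "image" is built and any change produces a new image instead of modifying a running one. The `build_image` function and its hash-based ID are assumptions made for illustration; this is not Docker's image format or tooling.

```python
import hashlib
import json
from types import MappingProxyType


def build_image(app_version, config):
    """Freeze the configuration at build time and derive an image ID from it."""
    payload = json.dumps({"app": app_version, "config": config}, sort_keys=True)
    image_id = hashlib.sha256(payload.encode("utf-8")).hexdigest()[:12]
    # MappingProxyType gives a read-only view, so the config cannot drift.
    return {"id": image_id, "config": MappingProxyType(dict(config))}


v1 = build_image("1.4.0", {"port": 8080, "workers": 4})
same = build_image("1.4.0", {"workers": 4, "port": 8080})
v2 = build_image("1.4.0", {"port": 8080, "workers": 8})

print(v1["id"] == same["id"])  # True: same inputs, same image
print(v1["id"] == v2["id"])    # False: a config change means a new image

try:
    v1["config"]["workers"] = 16
except TypeError:
    print("running image is read-only; rebuild instead")
```

Since identical inputs always give the same ID, any environment can be re-created from the image alone, with no runtime convergence step.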
Another alternative is to hide the complexities of configuration management with PaaS solutions like Heroku that deal with CM under the hood and package everything needed to run an application in buildpacks. While less flexible than IaaS, this approach offers the luxury of ignoring the CM process entirely.
It’s unclear where configuration management is headed, but one thing is certain — it will remain one of the chief concerns of all DevOps teams for the foreseeable future.