The Benefits of a Serverless API Backend

Imagine if your backend had no infrastructure. No permanent server, nowhere for your API to call home. Sounds a bit bleak, doesn’t it? As it turns out, serverless backends could be the next big thing for implementing truly scalable cloud architecture.

How do you make an Application Programming Interface lightweight on the client side, yet scalable to heightened traffic demands? SaaS vendors have been migrating to serverless architecture to solve this dilemma, as well as many other operational issues found in hosting their web applications.

In this article, we’ll identify what the serverless craze is, and why some providers may want to consider having a serverless API backend. Led by Rich Jones of Gun.io, we’ll define what we mean by serverless architecture, provide an example of one practice today, and aim to outline some potential benefits and pitfalls of adopting this approach.

What Does “Serverless” Mean?

Traditional cloud hosting is permanent. As in, you pick a server provider, and they run your software on multiple servers worldwide. Your data is stored, and your functionality processed, in precise, persistent physical locations.

Serverless computing is a strategic deviation from this model — it is an event-driven setup without permanent infrastructure. This doesn’t mean servers are no longer involved; rather, it means that servers are created automatically, on a per-need basis, to scale to the demands of your app.

But for developers, what serverless really means is less time spent on operations, since they no longer have to worry about traditional server maintenance. The benefits of a serverless infrastructure really add up:

  • No more over-capacity issues
  • Servers are autoscaling
  • You don’t pay for idle time
  • Consistent reliability and availability
  • No load balancing, no security patches

In general, serverless simply equates to peace of mind (but perhaps not for some, as Operations may need to find another job altogether).

This post was inspired by a talk given by Rich Jones of Gun.io at the 2016 Platform Summit.

From Traditional to Serverless Environments

To understand the subtleties between traditional and serverless approaches, let’s walk through a basic step-by-step sample implementation of each.

Traditional Web Request

An interaction with a traditional web server will often occur in a format similar to this:

  1. An Apache or NGINX web server listens for events as they come in.
  2. The server then converts this to a Web Server Gateway Interface (WSGI) environment.
  3. This is sent to an application to process the request.
  4. Then the web server sends the response back to the client.
  5. The web server resumes listening.
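The listening loop above can be sketched as a minimal WSGI application. This is an illustration only; a real deployment would sit behind Apache with mod_wsgi, or NGINX with uWSGI:

```python
# Minimal WSGI application: the resident web server translates each raw
# HTTP request into an `environ` dict and calls this function with it.
def application(environ, start_response):
    body = "Hello from {}".format(environ.get("PATH_INFO", "/")).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]

# To serve it with the stdlib's reference server (which then sits idle
# between requests -- the exact behaviour the serverless model avoids):
#   from wsgiref.simple_server import make_server
#   make_server("", 8000, application).serve_forever()
```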

There are a few drawbacks to this approach. For one, if you encounter a huge spike in traffic, the system deals with requests strictly in the order they come in. If an end user isn’t near the front of the queue, they will likely experience a timeout, and the page will look like it’s down; medium-to-late visitors to the queue face very slow speeds. Secondly, when it’s not processing a request, the web server sits idle, polling and wasting valuable resources that could be used elsewhere.

Serverless Web Request

Within a serverless infrastructure, each request corresponds to its own server. After the server processes the function, it is immediately destroyed. For example, here’s how Jones’s Zappa handles a web request:

  1. The request comes in through an API Gateway.
  2. The API request is mapped to a dictionary using Velocity Template Language (VTL).
  3. A server is created.
  4. The server then converts the dictionary into a standard Python WSGI environment and feeds it into the application.
  5. The application returns a response, which is passed back out through the API Gateway.
  6. The server is destroyed.

Astoundingly, all of this occurs in under 30 milliseconds, so that “by the time the user actually sees the [content appear on the] page, the server has disappeared… which is actually a pretty zen thing if you think about it,” says Jones.
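The steps above can be sketched in miniature. This is a simplified illustration, not Zappa’s actual code: the event shape is a hypothetical subset of what API Gateway really sends, and Zappa’s translation layer is far more thorough:

```python
import io

def app(environ, start_response):
    # Stand-in WSGI application (in practice: your Flask or Django app).
    start_response("200 OK", [("Content-Type", "text/plain")])
    return ["echo {}".format(environ["PATH_INFO"]).encode("utf-8")]

def event_to_environ(event):
    # Step 4: turn the Gateway's dictionary into a standard WSGI environ.
    body = event.get("body", "").encode("utf-8")
    return {
        "REQUEST_METHOD": event.get("httpMethod", "GET"),
        "PATH_INFO": event.get("path", "/"),
        "QUERY_STRING": event.get("queryString", ""),
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    }

def lambda_handler(event, context):
    # Collect the WSGI response so it can be mapped back out (step 5).
    collected = {}
    def start_response(status, headers):
        collected["status"] = status
    chunks = app(event_to_environ(event), start_response)
    return {"statusCode": int(collected["status"].split()[0]),
            "body": b"".join(chunks).decode("utf-8")}
```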

So what are the advantages of spawning servers at a moment’s notice? To Jones, the top reason is scalability. Since each request maps to its own server creation, this relationship can be scaled indefinitely, on the order of literally trillions of events per year.

Second is cost savings. Paying by the millisecond means you only spend money on actual server processing. AWS Lambda charges around $0.0000002 per request, and since Lambda’s free tier offers one million free requests per month, it could remain free for small projects or young startups.
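As a back-of-the-envelope sketch using the figures above (real Lambda bills also include a per-GB-second compute charge, omitted here):

```python
PRICE_PER_REQUEST = 0.0000002    # USD, the per-request figure quoted above
FREE_REQUESTS_PER_MONTH = 1_000_000

def monthly_request_cost(requests):
    # Only requests beyond the free tier are billed.
    billable = max(0, requests - FREE_REQUESTS_PER_MONTH)
    return billable * PRICE_PER_REQUEST
```

A project handling 500,000 requests a month pays nothing; even 10 million requests comes to about $1.80 in request charges.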

This near-infinite scalability makes serverless infrastructure a boon not only for small-breadth projects like microservices, APIs, IoT projects, and chatbots, but also for larger, traditional applications built on enterprise web frameworks like Django.

How to Get Started: Understanding the Serverless Vendors

Sound interesting? An easy way to get started is with a serverless framework like Zappa, Serverless Framework, or Apex (more here). With some frameworks, like Zappa, you can adopt serverless computing pretty easily for your existing APIs. All three are built around AWS Lambda, Amazon’s cloud computing service, but other significant serverless offerings include Microsoft Azure Functions, Google Cloud Functions, and IBM Bluemix OpenWhisk. However, according to Jones:

“AWS Lambda is by far the leader in the space… it’s just far more capable in pretty much every regard. The others are still playing catchup.”

Designing Event-Driven Serverless Applications

Within a serverless environment, the main design element that will be novel to newcomers is that code executes only in response to events. Since building a robust, event-driven application means designing around responses to events, what can we define as our event sources?

An event may be related to file operations — for example, say a user uploads an image and the application needs to resize a large picture into a small avatar. Using a serverless architecture, you could have a thumbnail service execute a response in an asynchronous, non-blocking way. Instead of setting up an entire queuing system yourself, a native cloud-hosted queue can handle this.
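A thumbnail service of this kind can be sketched as a Lambda-style handler reacting to an S3 upload notification. The download, resize (e.g. with Pillow), and re-upload calls are stubbed out as comments, and the key-naming scheme is purely illustrative:

```python
import os

def thumbnail_key(key, size=128):
    # e.g. "uploads/avatar.png" -> "uploads/avatar_thumb128.png"
    name, ext = os.path.splitext(key)
    return "{}_thumb{}{}".format(name, size, ext)

def handle_upload(event):
    # S3 upload notifications nest the object key like this.
    key = event["Records"][0]["s3"]["object"]["key"]
    # image = s3.get_object(...)               # download the original
    # small = resize(image, 128, 128)          # e.g. Pillow's Image.thumbnail
    # s3.put_object(Key=thumbnail_key(key))    # store the small avatar
    return thumbnail_key(key)
```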

Notifications, like receiving an email, text, or Facebook message, could also be interpreted as events. Rather than polling for new emails to come in, an action can be executed specifically in response to each one. Where it gets really interesting is that you can treat HTTP requests themselves as events. This, paired with other event trigger types, is usually called a hybrid architecture.

Database activity can also be used as an event trigger: a change to a table row, for example, could trigger an action. However, Jones reminds us to treat the “API as the primary source of truth in your application” — don’t make SQL calls inside your event functions; rather, funnel everything through your API.
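A sketch of that advice, with a hypothetical endpoint: the database-triggered function builds a call to the application’s own API rather than issuing SQL itself (the actual network call is left commented out):

```python
import json
import urllib.request

API_BASE = "https://api.example.com"  # hypothetical endpoint

def build_api_request(record):
    # Translate the change-event record into an API call...
    payload = json.dumps({"user_id": record["id"], "status": "active"})
    return urllib.request.Request(
        "{}/users/{}".format(API_BASE, record["id"]),
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )

def handle_row_change(record):
    # ...instead of e.g. cursor.execute("UPDATE users SET ..."),
    # keeping the API as the single source of truth.
    req = build_api_request(record)
    # urllib.request.urlopen(req)  # network call omitted in this sketch
    return req.full_url
```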

Jones reminds us that time is also an important event source, needed to initiate regularly occurring tasks or updates. Across these varying event sources, instead of creating machines that constantly poll your resources for changes, you are essentially setting up triggers within your application that execute a response.
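In Zappa, for instance, a time-based event source is a few lines of configuration. The snippet below is a sketch of a scheduled event in `zappa_settings.json`; the function names are illustrative, and exact keys may vary between Zappa versions:

```json
{
    "production": {
        "app_function": "my_app.app",
        "events": [{
            "function": "my_app.send_daily_digest",
            "expression": "rate(1 day)"
        }]
    }
}
```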

5 Serverless Pro Tips

All this sounds awesome, but what are the downsides of building applications with serverless backends? In his presentation, Jones covers some ground on potential downsides, how to avoid them, and some general tips for getting the most out of a serverless arrangement:

  • Avoid vendor lock-in: This can be a big issue when adopting any new technology. Jones recommends integrating software that provides open source-compatible offerings, and decoupling vendor interactions from your application. Rather than hardcoding those interactions, decouple the logic — creating a dispatcher inside a function that adds an item to the queue is one way of doing so.
  • Mock your vendor calls for testing: When writing a mock or sample app that behaves as if it were synced to the cloud, you will want to test your cloud functions without hitting the real services. Placebo is an interesting package that records your interactions with AWS and replays them as if you were talking to the live server.
  • Think “server-lessly” and avoid infrastructure: It can take a while to develop the serverless mindset. When developing, consider if you actually need a database, or if a queue can be adopted instead.
  • Stage different environments: When testing and staging, Jones recommends using CI for multiple production environments (Blue/Green Deployment).
  • Deploy globally: Using a geographically distributed server arrangement can increase speed and security. AWS Lambda is available in 11 regions, so users almost anywhere on the planet can reach a server with roughly a 20 millisecond ping.
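The dispatcher idea from the lock-in tip can be sketched like this (all names illustrative): the application enqueues work through one internal function, so swapping one vendor’s queue for another means changing a single adapter:

```python
import json

def sqs_backend(message):
    # In production: boto3.client("sqs").send_message(QueueUrl=...,
    #                                                 MessageBody=message)
    return ("sqs", message)

def local_backend(message):
    # In tests or local development, just keep messages in memory.
    local_backend.queue.append(message)
    return ("local", message)
local_backend.queue = []

# The single point where the vendor choice lives.
QUEUE_BACKEND = local_backend

def dispatch(task_name, **kwargs):
    # Application code only ever calls dispatch(), never the vendor SDK.
    message = json.dumps({"task": task_name, "args": kwargs})
    return QUEUE_BACKEND(message)
```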
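On the mocking tip: Placebo records and replays real boto3 sessions. As a simpler, dependency-free sketch of the same idea, Python’s built-in unittest.mock can stand in for the cloud client so tests never touch AWS (the function and names here are illustrative):

```python
from unittest.mock import MagicMock

def upload_report(s3_client, bucket, key, body):
    # In production, s3_client would be boto3.client("s3").
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return "s3://{}/{}".format(bucket, key)

# Swap in a mock and exercise the function without any network access.
fake_s3 = MagicMock()
uri = upload_report(fake_s3, "reports", "2016/summary.txt", b"ok")

# The test asserts on the recorded call instead of hitting AWS.
fake_s3.put_object.assert_called_once_with(
    Bucket="reports", Key="2016/summary.txt", Body=b"ok")
```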

Example: Kickflip SDK

Kickflip is an example of a serverless SDK that provides live video streaming.

So how do we build an authenticated, low-latency, low-cost API that is infinitely scalable, without having to worry about server operations at all? Let’s turn to an example serverless implementation in action.

Kickflip is an SDK that brings live streaming video to mobile applications. A “live stream” is essentially just a combination of separate MP4 files, along with a manifest that determines the order of the videos. Since a real-time video stream service wouldn’t need to keep large amounts of video data around for later use, it is an ideal application for a serverless environment.

Kickflip uses a hybrid architecture of HTTP and non-HTTP event sources to trigger server creation from a mobile phone upload, which updates the manifest file so that end users view the latest video chunks. To do all this, Kickflip uses a combination of services: API Gateway for authentication, an API constructed with Lambda, Zappa, and Flask, file storage using S3, and CloudFront for global content delivery. The simplified flow is as follows:

  1. The client authenticates with the API. Kickflip uses Amazon’s authentication API key generation service, but a custom identity access management handler could work here as well.
  2. The API returns a short-lived federation access token which can only be used to upload a file into a specific S3 bucket.
  3. The client receives the token, and uses it to upload the video.
  4. An AWS Lambda server is executed in response to the new video upload, and the stream manifest is updated. This upload acts as the event-source.
  5. Content is served over the CloudFront delivery network for low latency.
  6. Users see the latest video stream on their device.
  7. The server is destroyed and the temporary access token is revoked.

Jones demonstrates that with the strategic pairing of technologies, a serverless video streaming service can be developed in only 42 lines of Python.

Building Serverless API Backends

The serverless movement represents a profound paradigm shift in our ability to create impressively scalable web services. Rethinking how events can spawn temporary server instances can be an extremely cost-effective solution for microservices and large projects alike.

With all the small connected services being deployed in this manner, the serverless arrangement also reiterates the rise of composable enterprises that depend on many different services to thrive; cementing the web API’s position as an important cog in modern and future web communication.



About Bill Doerrfeld

Bill Conrad Doerrfeld is an API specialist, focusing on API economy research and marketing strategy for developer programs. He is the Editor in Chief for Nordic APIs, and formerly Directory Manager & Associate Editor at ProgrammableWeb. Follow him on Twitter, visit his personal website, or reach out via email.