Should You Design Natural Language First APIs?

Bill Doerrfeld
February 7, 2019

Humans are getting lazier, but APIs tend to stay pretty complicated. Wouldn't it be nice if we could instead speak in plain language to automatically initiate any change in the backend? For some scenarios, a natural language facade may be just where web APIs are heading.

The classic CRUD (Create, Read, Update, Delete) schematic for acting upon data is at odds with the way humans naturally communicate, raising some interesting API design questions as voice-driven and AI-embedded machines become ubiquitous. As we've previously described, Natural Language Processing (NLP) is a growing trend that could affect how machine-to-machine communication is structured.

The problem comes down to how developers communicate with a database. When interfacing with voice assistants and NLP-driven commands, tools like Alexa and Google Home demonstrate that CRUD is no longer applicable to all design situations. James Higginbotham has described how API design is changing in the age of bots, IoT, and voice. We also saw evidence at the Nordic APIs 2018 Platform Summit, where Pavel Veller provided an example of embedded natural language understanding in practice. Pavel's API actually accepts natural English commands within JSON requests.

In this article, we'll dig into how and why APIs may use natural language in their designs. We'll consider how a web API designed with natural language understanding might behave, citing a sample natural language-driven travel application in the process.

This post was inspired by a session given by Pavel Veller at the Nordic APIs 2018 Platform Summit: Slides

What is Natural Language Understanding?

First off, what exactly is natural language understanding? Presenting at Nordic APIs, Pavel Veller describes how statistical models focus on two main objectives to understand language: intent detection and named entity recognition.
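As a toy illustration of these two objectives, consider a rule-based sketch. This is not Pavel's implementation, and real systems use statistical models rather than keyword matching; every function and pattern here is invented for illustration:

```python
import re

# Map of trigger keywords to intents (toy illustration only).
INTENTS = {"schedule": "schedule_meeting", "cancel": "cancel_meeting"}

def understand(utterance: str) -> dict:
    """Rough sketch of intent detection + named entity recognition."""
    text = utterance.lower()
    # Intent detection: look for a known trigger keyword.
    intent = next((v for k, v in INTENTS.items() if k in text), "unknown")
    # Named entity recognition: pull out a person and a time, if present.
    person = re.search(r"\bwith (\w+)", utterance)
    time = re.search(r"\bat (\d{1,2}(?::\d{2})?\s?(?:am|pm))", text)
    return {
        "intent": intent,
        "person": person.group(1) if person else None,
        "time": time.group(1) if time else None,
    }
```

A statistical model does the same job far more robustly, but the output shape is similar: an intent label plus a set of extracted entities.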
Let's see how this works in practice. For example, if a user makes a request to a meeting scheduling app, a typical voice command might be something like:

"Please schedule a meeting with Robert for tomorrow at 2pm"

The algorithm on the other end will likely first attempt to understand the intent of the request. A parser will highlight the keyword "schedule" to trigger scheduling functionality. Next, entities such as who and when will be discovered; in this case, those entities are Robert, tomorrow, and 2pm. Using this information, the scheduling app can easily fulfill something like a Google Calendar API call to satisfy the user's request.

Related: What Does the Rise of Bots Mean for APIs?

Why Build an API With NLP Design?

Natural Language Processing is now relied upon for chatbots, voice assistants, and other burgeoning IoT scenarios. In these transactions, NLP doesn't quite fit the CRUD model, prompting some developers to construct more natural-looking external interfaces as well.

Beyond the CRUD reasoning, including natural language within the API design may bring other benefits. First, it puts more reliance on the backend servers to provide functionality, satisfying the philosophy of placing more burden on the server side. It may also improve developer experience, allowing novices and non-coders to get the most out of APIs.

Also read: 7 Growing API Design Trends

What Does An API Using NLP Look Like?

There is no hard and fast rule for how NLP-infused APIs should be designed. A natural language API may receive only a string of plain English text as input; what travels on the wire could be a natural English sentence.
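To make that concrete, here is a hypothetical sketch of what such a request could look like on the wire. The endpoint path and field name are invented for illustration, not taken from any real API:

```python
import json

# Hypothetical request body for a natural-language-first API:
# the payload is valid JSON, but the "command" field carries plain English.
request_body = json.dumps({
    "command": "Please schedule a meeting with Robert for tomorrow at 2pm"
})

# A client would POST this to something like /api/commands and let the
# server run intent detection and entity recognition on the sentence.
print(request_body)
```

The structure of the payload stays trivial; all of the complexity moves into how the server interprets the sentence.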
Interestingly, if we look at the JSON definition, we see there is nothing wrong with using natural language, as JSON is language-agnostic:

"JSON is a text format that is completely language independent … These properties make JSON an ideal data-interchange language"

Pavel notes that JSON is not a language but a data-interchange format, so transporting plain English is certainly fair game. You could argue that a pure natural language API should respond in plain language too. However, Pavel foresees roadblocks here, such as communicating errors, as well as the hassle of running natural language understanding on the client side for incoming requests.

Case Study: Travelwhelm

So, how would we design a natural language API? Let's look at one such case: the Travelwhelm API. Travelwhelm is a side project developed by Pavel to help calculate his time spent traveling. This user-facing tool accepts travel information and generates a percentage of time spent traveling. Employees in marketing, like developer advocates, often agree to spend a certain percentage of their working hours on the road. Thus, plane-bound users can input their travel time into Travelwhelm to calculate a percentage each month.

Related: Virtual Assistants Harness Third Party Developer Power

3 Natural Language Input Options

You can interface with Travelwhelm in any of three ways: a website entry form, SMS text, or an Alexa skill. At the Platform Summit, Pavel gave a live demo of the process in action; he spoke to an Amazon Echo device to initiate a calendar adjustment on the Travelwhelm interface. Take this sample Travelwhelm command:

"I was in New York last Thursday"

Thankfully, the natural language system doesn't need to discover much to fulfill this command. It must simply mark each date as either traveled or untraveled. Pavel describes how designing a binary system is an easy first step.
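A minimal sketch of such a binary classifier might look like this, using a negative-sentiment regular expression in the spirit of Pavel's parser (the function name and labels are invented for illustration):

```python
import re

# Negative-sentiment expression in the spirit of Pavel's parser:
# matches "cancel"/"cancelled", "not", "won't", "weren't", "wasn't".
NEGATIVE = re.compile(r"\bcance|\bnot\b|\b(wo|were|was)n't")

def travel_intent(command: str) -> str:
    """Binary classification: did the user travel or not?"""
    return "not_traveled" if NEGATIVE.search(command.lower()) else "traveled"
```

Everything that doesn't match a negative pattern defaults to "traveled", which is exactly the kind of easiest-thing-that-works design the talk describes.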
"When I solve problems with technology, the first thing I try is the easiest thing that works"

Pavel applies this theory to Travelwhelm by first detecting a positive or negative intent: either traveling or not traveling. His home-brewed algorithm uses the following regular expression to parse for negative sentiment:

/\bcance|\bnot\b|\b(wo|were|was)n't/

Next, the application must find entities, which in this case are dates. As you may infer, there are many libraries available for recognizing dates in English text, such as Stanford NLP and Google NLP. However, in Pavel's case, the nuances of his user commands didn't work well with any previously built library. Take, for instance, this command:

"I will be in Stockholm on October 22nd, 23rd, and 24th"

Unfortunately, existing libraries could only detect the pattern of month + date, meaning only October 22nd would be recognized from the statement above. This leaves one date and two ordinary numbers that aren't tied to the month. Similarly, there are issues with commands such as:

"I was in SFC this past Tues and Wed"

Pavel found that NLP libraries have trouble connecting information that depends on previous words. So, he got to work constructing his own library, Orolo, which he describes as an "aggressive parser" designed to understand dates and date ranges in a sentence.

Also read: Using OAuth Device Flow For UI-Incapable Devices

Takeaways on Designing Natural Language First

Named entity recognition is not easy, and the evolution of natural language may always be one step ahead of our digital machines. However, Pavel is optimistic that one day systems will truly be able to talk to one another using natural language. For niche APIs that come into contact with voice-driven scenarios, such a design could make sense, relieving processing pressure on the client side.

"Maybe in the future we will be able to do a lot better with natural language understanding …
And maybe systems will be able to talk with one another"

For right now, however, Pavel's experiments act as an informative reminder: building natural language understanding from scratch is extremely time-intensive. Pavel notes that his Orolo library took far more time to construct than the actual app framework itself. Natural language adds another abstraction layer to an API, increasing complexity and the need for added maintenance.

So, should developers be creating APIs that solely speak English (or any natural language, for that matter)? The simple answer is: probably not at this point. As Pavel has noted, subtle nuances in commands still require custom-made libraries, and parsing English on the client side is a burden. Still, it is a novel concept, and one we will continue to consider as user interfaces become more voice-driven in the future of personal AI.