How to Design APIs for LLMs

Software development has always grown by leaps and bounds at the intersection of disparate technologies. Mobile devices required a sea change in how web development was approached, resulting in responsive and efficient design. The availability of massive data storage, combined with powerful data processing capabilities, unlocked big data and cloud computing. The world has witnessed the transformative power of application programming interfaces (APIs) that allow different software entities to communicate, share data, and extend functionalities.

Figuring out how to make technology work in tandem with other solutions has always fundamentally reshaped our work. And now, a new player has entered the innovation stage: large language models (LLMs). These systems, colloquially referred to simply as “artificial intelligence” or “ChatGPT,” promise to reinvigorate and revolutionize how we perceive and utilize APIs at the industry level.

Understanding what an LLM actually is, and adopting a set of strategies for working with one, will matter more and more as LLMs become more common and are built into our everyday lives. Accordingly, we must determine how to make the most of these technologies. Today, we’re going to dive into that exact topic.

What Even Is a Large Language Model?

Simply put, a large language model (LLM) is an advanced type of machine learning model designed to understand and generate human-like text. These models are trained on vast amounts of data from a variety of sources. These sources are not necessarily the most accurate, and in many cases, ethical questions have been raised around the privacy or copyright of the content within the training set.

Nonetheless, the models have been heavily trained specifically on human-generated content, meaning that, for the first time, LLMs have created processes, content, and behavior that often feel more human-like than not. This has been a revolutionary change in the field of natural language processing (NLP) — tasks that were once considered challenging or even impossible for machines suddenly seem achievable, and with a result that feels lifelike.

At its core, a large language model is a type of neural network, specifically a variant of a transformer architecture, that has been trained on massive datasets. LLMs, such as OpenAI’s GPT (Generative Pre-trained Transformer) series, can generate coherent and contextually relevant sentences, answer questions, provide summaries, translate languages, and even code in various programming languages. They achieve this by learning patterns and relationships between words, phrases, and ideas within the data they are trained on.

Marching Forward with LLMs

LLMs are here, and they are starting to pop up in various places, both obvious and not-so-obvious. Developers are integrating LLMs into their integrated development environments (IDEs) to assist with code generation, bug detection, and documentation, boosting productivity and efficiency. Since LLMs can generate human-like content, they have quickly become invaluable tools for documentation generation, especially when the differences between endpoints need to be succinctly described in a human-consumable fashion.

LLMs have, in fact, become so powerful that they are starting to be used to generate code, both for internal use and for public API calls. Calling public APIs in various languages using requests generated by an LLM is no longer a distant dream — it’s something that developers are already attempting (with better or worse outcomes).

Interestingly, because LLMs are both human-like and machine-based, they could benefit systems that need both human-readable and machine-readable specifications and documentation. When the two are written separately, one side often ends up a bit weaker than the other; an LLM translating from a single source of truth could keep both at the same standard.

As the industry evolves, we can expect LLMs to become more ubiquitous and embedded into even more applications, fast becoming a standard tool in various domains. Their potential, combined with the iterative nature of the industry’s continued research and development, will likely lead to more advanced models that are fine-tuned for specific tasks, thereby further embedding them into our digital landscape in purpose-built applications.

9 Ways to Design APIs for LLMs

With that in mind, integrating large language models in the software ecosystem will herald a transformative approach to API construction, consumption, and interaction. Making APIs more “LLM-friendly” will be imperative for sustainable success. To get there, however, we need to make some considerations and adopt some strategies. Below, we’ll look at specific ways to design your APIs for LLMs.

1. Natural Language Descriptions: Making APIs Understandable

Traditional approaches to API documentation have often been highly technical, especially in the context of in-code comments and reference links. Because LLMs look at systems in a way that is closer to natural language, adopting a more human-oriented approach, ironically, is paramount. Developer portals should explain code throughout documentation points as if they were explaining to another developer or a layperson, using plain language to denote what a piece of code does, why it was chosen, and what the implications are.

Adopting this conversational style, as opposed to the terse, reference-oriented style that is more common, could result in API developer portals that are more digestible for LLMs. This digestibility can help LLMs find errors in code construction and create new ways of instructing human users. Conversational documentation, for instance, can be converted into AI-driven video content using tools such as Synthesia to provide additional avenues of user communication.
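
As a minimal sketch of what this could look like in practice, the snippet below stores a plain-language description alongside an endpoint and renders it as conversational text. The EndpointDoc class, its field names, and the sample wording are all hypothetical; the /retrieveUserDetails path is simply borrowed from this article for illustration.

```python
from dataclasses import dataclass


@dataclass
class EndpointDoc:
    """Plain-language description of one endpoint (hypothetical structure)."""
    method: str
    path: str
    what: str          # what the endpoint does
    why: str           # why it exists or was designed this way
    implications: str  # what the caller should expect as a consequence


def render(doc: EndpointDoc) -> str:
    """Render the description as readable prose instead of a terse reference row."""
    return (
        f"{doc.method} {doc.path}\n"
        f"What it does: {doc.what}\n"
        f"Why it exists: {doc.why}\n"
        f"What to keep in mind: {doc.implications}"
    )


# Illustrative content only; not taken from any real API.
doc = EndpointDoc(
    method="GET",
    path="/retrieveUserDetails",
    what="Returns the profile of a single user, looked up by ID.",
    why="Clients need one call that gathers everything shown on a profile page.",
    implications="The response contains personal data, so callers must be authenticated.",
)
print(render(doc))
```

Keeping the what, why, and implications as separate fields is one way to make sure none of the three is silently skipped when a new endpoint is documented.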

2. Semantic Structuring: Enhancing Data Interpretation

Data interpretation is vital for any API interaction. Structuring API responses with semantic tags ensures LLMs can discern the intent behind the data and the relationship between disparate data points.

From an API context, rather than merely returning data, APIs could be designed to tag data points with their context. For instance, indicating that a returned number represents a “price” or a “date” seems like a minor step. However, this fundamentally helps create context-aware data processing and provides a holistic view of the data.

This obviously has large implications for the LLM’s general understanding, and it also makes iterative development with AI tooling much easier. Simply put, making the underlying system easier for the LLM to understand can only be a positive.
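
To make the idea concrete, here is a small sketch that contrasts a bare response with one whose values carry semantic tags. The tag helper, the field names, and the sample values are assumptions made for illustration, not part of any particular API.

```python
import json


def tag(value, semantic_type, **context):
    """Wrap a raw value with a label describing what it represents."""
    return {"value": value, "type": semantic_type, **context}


# A bare response: a consumer has to guess what each value means.
bare = {"a": 19.99, "b": "2024-03-01"}

# The same data with its context attached (hypothetical field names).
tagged = {
    "price": tag(19.99, "price", currency="USD"),
    "ship_date": tag("2024-03-01", "date", format="ISO 8601"),
}

print(json.dumps(tagged, indent=2))
```

The tagged version is more verbose, but a model reading it no longer has to infer that 19.99 is a price in a specific currency or that the string is a date.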

3. Ensuring Accurate LLM Interactions: The Art of Precision

One of the weak points of current LLM systems is the accuracy and precision of LLM-generated output. In the realm of APIs, this is doubly important — it’s one thing for an LLM to suggest that William Shakespeare was a famous guitarist, but it’s something else entirely for an LLM to represent an API endpoint that does not exist or function as stated. Accordingly, building precision into API descriptions is key to providing accurate LLM results.

This can be done in a variety of ways. By providing ample context and code samples for the LLM to ingest and model through the API documentation, LLMs can generate code or data interpretations more in line with the API’s intent. The better your documentation, the better the output. This output can then be fed through validation layers – for instance, for each endpoint referenced in documentation, a simple check for validity can verify that the information provided is indeed accurate.

Taking it a step further, implementing a continual feedback loop for learning and generation can be effective. Feeding LLM outputs back to the system and asking it to validate them can be a simple but powerful way to reduce errors and improve efficiency. All of this requires ample pre-planning, structured data, and a very precise data flow at both the micro and macro levels.
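
The validation layer described in this section could be sketched roughly as follows: scan LLM-generated text for endpoint references and flag any that are not in the list of endpoints you actually expose. The KNOWN_ENDPOINTS set, the regular expression, and the sample output are all assumptions; a real check would more likely read the endpoint list from your API specification.

```python
import re

# Hypothetical list of endpoints the API really exposes.
KNOWN_ENDPOINTS = {
    ("GET", "/retrieveUserDetails"),
    ("POST", "/createOrder"),
}

# Matches references such as "GET /some/path" in free text.
REFERENCE = re.compile(r"\b(GET|POST|PUT|PATCH|DELETE)\s+(/[\w/{}-]*)")


def find_unknown_endpoints(text: str) -> list[tuple[str, str]]:
    """Return every (method, path) mentioned in the text that the API does not expose."""
    found = REFERENCE.findall(text)
    return [ref for ref in found if ref not in KNOWN_ENDPOINTS]


# Sample LLM output containing one real and one invented endpoint.
llm_output = (
    "Call GET /retrieveUserDetails to load the profile, "
    "then POST /archiveUser to remove it."
)
print(find_unknown_endpoints(llm_output))
```

Anything this check returns can be sent back to the model with a request to correct itself, which is one simple form of the feedback loop mentioned above.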

4. Self-Descriptive Endpoints: Intuitive API Interactions

One way APIs can be designed to take advantage of LLMs is by making them self-descriptive. LLMs work best when each component has a settled meaning that is reinforced through continual usage. By stating “Endpoint A does this thing,” you are giving the LLM information. By defining the endpoint as “Endpoint A, or dataProcessing/endpoint,” you tie that information to the name itself and create a self-descriptive endpoint.

In doing this, you create a reinforcement loop. Because the name carries the endpoint’s purpose, the LLM can tell when the endpoint is being used incorrectly or described inaccurately, which it could not do if the endpoint were not self-descriptive.

To simplify this point, design endpoints with names that reflect their actual and practical core functions. For instance, an endpoint named /retrieveUserDetails is far more intuitive than /endpointA45, and will result in a more complete understanding for both human and machine consumption.
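
One way to keep opaque names from creeping in is a small naming check during review. The sketch below uses a deliberately crude, hypothetical heuristic: it flags path segments that are a generic word followed by an optional letter and some digits, as in /endpointA45. The word list and pattern are assumptions and would need tuning for a real codebase.

```python
import re

# Hypothetical heuristic: a generic word, an optional letter, then digits.
OPAQUE_SEGMENT = re.compile(r"^(endpoint|handler|route)[a-z]?\d+$", re.IGNORECASE)


def opaque_segments(path: str) -> list[str]:
    """Return the path segments whose names say nothing about their function."""
    segments = [s for s in path.strip("/").split("/") if s]
    return [s for s in segments if OPAQUE_SEGMENT.match(s)]


for path in ["/retrieveUserDetails", "/endpointA45"]:
    flagged = opaque_segments(path)
    print(path, "->", "rename " + ", ".join(flagged) if flagged else "ok")
```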

5. Feedback Mechanisms: Building a Two-way Street

We often think of LLMs as something that we work upon. The reality is that LLMs can actually interface with humans similarly to how we interface with them. Therefore, when we feed an API description to an LLM, we can specifically ask it to interpret and utilize an API as a standard user would, and as an output, generate questions that may result from use.

This is a different way to use an LLM, and it requires creating and embedding mechanisms through which the LLM can seek clarification, verify interpretations, and even make requests of developers. This inverts the usual arrangement, in which the developer asks something of the LLM; in essence, we’re just making clear that the LLM is allowed to ask questions as well. This can take various forms, but ultimately it results in a system that is self-testing instead of self-assured.

From a practical point of view, this could be something as simple as building a section in your code that says something like this:

# For the LLM: please verify that this endpoint indeed returns a status code if it fails. Please additionally review this code as if you were an external user and forward any questions you may have about functionality to help@api.com.
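
The same request can also be assembled programmatically and sent along with the API description. The sketch below only builds the prompt text; it does not call any model, and the function name, the wording of the instructions, and the sample description are assumptions.

```python
def build_review_prompt(api_description: str, contact: str) -> str:
    """Build a prompt asking an LLM to review an API as an external user would."""
    return "\n".join([
        "You are reviewing the API described below as an external user would.",
        "1. Work out how you would call each endpoint.",
        "2. List every point where the description is ambiguous or incomplete.",
        "3. Phrase each point as a question for the API's developers.",
        f"Your questions will be forwarded to {contact}.",
        "",
        "API description:",
        api_description,
    ])


# Hypothetical one-line description used as input.
prompt = build_review_prompt(
    "GET /retrieveUserDetails returns a user's profile.",
    "help@api.com",
)
print(prompt)
```

The questions that come back are the useful output here: each one points at a gap in the description that a human user would probably hit as well.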

6. Standardized Error Responses: Predictable Troubleshooting

LLMs are not just good at using well-formed code. They are also very good at working with poorly formed code and fixing its problems. To support this, APIs should have standardized and understandable error messages, with human-readable explanations that facilitate greater understanding.

API developers should endeavor to make these responses as clear and usable as possible because LLMs are wonderful at troubleshooting given a clear set of expectations and criteria. With a consistent catalog of error messages in place, you can hand an LLM the data along with an instruction such as, “If there’s a failure, please use the error code to identify the likely cause of the error for further testing.”

To get there, however, error codes must be consistent and detailed. Allowing LLMs to understand the issue and rectify it with actionable feedback for developers requires creating the right tools to get the job done.
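
A consistent error format might be sketched like this: every error comes from one catalog and always carries the same four fields. The ApiError structure, the error codes, and the messages are hypothetical examples rather than a prescribed standard.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ApiError:
    code: str     # stable, machine-readable identifier
    status: int   # HTTP status code returned with the error
    message: str  # human-readable explanation of what went wrong
    hint: str     # suggested next step for the caller


# Hypothetical catalog: every error the API can return is defined in one place.
ERRORS = {
    "USER_NOT_FOUND": ApiError(
        "USER_NOT_FOUND", 404,
        "No user exists with the given ID.",
        "Check the ID, or list users to find a valid one.",
    ),
    "RATE_LIMITED": ApiError(
        "RATE_LIMITED", 429,
        "Too many requests were sent in a short period.",
        "Wait before retrying.",
    ),
}


def error_response(code: str) -> tuple[int, str]:
    """Return the HTTP status and JSON body for a known error code."""
    err = ERRORS[code]
    return err.status, json.dumps({"error": asdict(err)})


status, body = error_response("USER_NOT_FOUND")
print(status, body)
```

Because the shape never changes, an LLM (or a human) can be told once how to read an error and then apply that to every failure the API produces.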

7. Verify Third-Party Code For Inaccuracies

Perhaps the most important consideration for LLM accuracy in the context of API integrations is verifying that you can trust every source that talks about your API. In many cases, the code that the LLM ingests comes from outside the official documentation. This code might reside on Stack Overflow, in blog posts, in YouTube tutorials, or in other sources. Unfortunately, this means the LLM may ingest bad code just as readily as good code.

To solve this, API providers should seek to ensure all data sources the LLM is ingesting are actually valid. As part of this, API advocates could contact external sources to fix any errors. Acting as a partner to external parties and continually improving the overall data set being ingested by public LLMs will lead to greater accuracy overall. Of course, this might take a lot of legwork.

On that note, make sure your internal resources (documentation, blog posts, tutorials, eBooks) are also accurate!

8. Generative Schema Definitions: Flexibility in Data Structures

When it comes to LLMs, rigidity could be a blocker. Rigid, predefined prompts could limit the output, as an LLM will often treat restrictions as hard lines rather than soft limitations.

Accordingly, generative schema definitions could provide greater flexibility, allowing LLMs to tailor data structures based on the use case at hand. Instead of using overly strict data structures and types as a limit on LLM outputs, developers could clarify a range of acceptable responses using guidelines and templates. This empowers LLMs to provide output that is most appropriate based on the interaction context. This can result in powerful iteration and development that would otherwise not exist due to over-strict limitations.

In essence, this is a warning regarding over-specificity. While LLMs need very good directions, an overly defined approach can also stop LLMs from coming up with new approaches. You could get much greater innovation and experimentation by showing LLMs what you want the output to look like rather than dictating a format it must mirror exactly.
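
One possible reading of this advice, sketched below, is a validator that enforces only a small required core, accepts several types for optional fields, and lets unknown extra fields through. The field names and the split between required and optional are assumptions chosen for illustration.

```python
# Hypothetical guideline: a small required core plus loosely typed optional fields.
REQUIRED = {"summary": (str,), "items": (list,)}
OPTIONAL = {"notes": (str, list), "confidence": (int, float)}


def check(output: dict) -> list[str]:
    """Return a list of problems; an empty list means the output is acceptable."""
    problems = []
    for name, types in REQUIRED.items():
        if name not in output:
            problems.append(f"missing required field: {name}")
        elif not isinstance(output[name], types):
            problems.append(f"wrong type for {name}: {type(output[name]).__name__}")
    for name, types in OPTIONAL.items():
        if name in output and not isinstance(output[name], types):
            problems.append(f"wrong type for {name}: {type(output[name]).__name__}")
    # Fields not listed above are allowed on purpose, leaving the model room
    # to add structure that the guideline did not anticipate.
    return problems


print(check({"summary": "Two orders found", "items": [1, 2], "grouped_by": "date"}))
print(check({"items": "not a list"}))
```

The required core keeps the output usable by downstream code, while everything beyond it is treated as guidance rather than a hard limit.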

9. Bake LLMs Into Your Developer Center

Another potential option for LLM integration is to build the AI experience into a product’s native developer center. Instead of relying on an external third-party solution or LLM-powered chatbot, running an LLM using your own developer resources and codebase can help surface data and make the support infrastructure much more useful. This has a handful of benefits, including more useful data, more accurate support efforts, and long-term contextual support based on previous questions and issues.

The largest benefit of this approach is avoiding context switching, which can result in a disrupted developer experience that lacks continuity. For example, shifting back and forth between web-based documentation and an IDE might seem like a small barrier, but it can be a significant hangup in the long run.

“Context switching and a fragmented developer experience are two of the biggest hurdles that can impact developers and organizations ability to meet deadlines, increase user satisfaction, and drive ROI,” said Isaac Nassimi, SVP of Product at Nylas. Nylas has sought to fix this problem by deploying the Nylas AI chatbot, Nylas Assist, which is powered by generative AI and is connected to its entire knowledge base. This is one way to help developers quickly find solutions in a natural language format.
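
To illustrate the general idea of connecting an assistant to your own knowledge base, here is a deliberately simple sketch that picks the documentation snippet sharing the most words with a question and places it in front of that question. This is not a description of how Nylas Assist works; the DOCS content, the function names, and the word-overlap scoring are all assumptions, and a production system would likely use a proper search or embedding index.

```python
import re

# Hypothetical knowledge base: topic name -> documentation snippet.
DOCS = {
    "auth": "Every request needs an API key sent in the Authorization header.",
    "errors": "Failed requests return an error code and a readable message.",
    "users": "Use /retrieveUserDetails to fetch the profile of one user.",
}


def words(text: str) -> set[str]:
    """Lowercase the text and split it into a set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def most_relevant(question: str) -> str:
    """Return the topic whose snippet shares the most words with the question."""
    q = words(question)
    return max(DOCS, key=lambda name: len(q & words(DOCS[name])))


def build_context(question: str) -> str:
    """Combine the best-matching snippet with the question for the LLM to answer."""
    name = most_relevant(question)
    return f"Documentation ({name}): {DOCS[name]}\nQuestion: {question}"


print(build_context("How do I send my API key?"))
```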

In Conclusion

The fusion of LLMs and APIs is set to redefine the developer landscape. As AI-driven integrations become the norm, mastering the art of crafting LLM-optimized APIs will be a game-changer, providing unparalleled efficiency and innovation in software development. Soon, strategic alignment with large language models won’t merely be an option or a cool feature. Instead, it will be a requirement for success within the competitive AI industry.

That said, it’s good to note that LLMs, as advanced as they may be, are only tools in our vast technological arsenal. They are not silver bullets that autonomously bring about perfection. Instead, they are instruments that amplify our capabilities. As with any tool, the expertise of the people using it and the specifics of the environment in which it is used are almost more important than the tool itself. LLMs excel at understanding, generating, and manipulating human language, but they require the guiding hand of human expertise to navigate the complexities of real-world applications.

Marrying the capabilities of LLMs with APIs offers the promise of streamlined development, enhanced efficiency, and innovative solutions. This will necessitate a dance between machine efficiency and human creativity. We stand on the brink of a new era, but this era must be one of tailor-built and expertise-driven environments that make the most of LLMs. Embracing LLMs as powerful allies while cherishing and leveraging the unparalleled value of human skills will be the true key to long-term success in the ever-evolving digital landscape.

It should be noted here that this advice, while appropriate and useful, is still largely theoretical and untested, given how new this technology is. LLMs and AI are changing rapidly, sometimes literally by the hour, and accordingly, the best piece of advice we could give is to stay aware. Changes to platforms and new technological innovations are the promise of this industry. The way AI will impact APIs is yet to be fully understood or appreciated, so this should all be taken as a start to this wild journey rather than an entirely authoritative statement.