
How APIs Can Reduce AI Resource Consumption


AI is a powerful emerging technology that promises to put incredible computation at our fingertips. The problem is that this benefit comes at a cost: AI is resource-hungry, both in terms of raw computation and storage and in terms of natural resources.

Thankfully, API developers can play a huge role in mitigating this consumption by designing more efficient queries, using smarter caching and data storage, and defaulting to local processing techniques when possible.

With some basic awareness and consideration, APIs can help deliver a future that is AI-driven and responsible. Below, we’re going to look at the nature of this problem and offer some solutions to reducing AI resource consumption at scale and in production.

AI Is Hungry for Energy and Data

Before we dig into the role that APIs might play in resolving this issue, we should first put the concerns in context. Just how high is the resource cost of modern AI?

According to one pre-publication study from the University of California, Riverside, OpenAI’s ChatGPT consumes roughly two liters of water for every 10-50 queries performed. That cost would be significant in any region, but it is especially notable coming from California, a state that spent the better part of the last decade in severe drought. Training a single instance of such a model uses almost 1,300 megawatt-hours of energy, roughly the annual energy consumption of 130 US homes.

This is not a problem unique to OpenAI but one endemic to generative AI. According to one study, artificial intelligence already uses around 33 times more energy per task than purpose-built software, and that cost is projected to grow dramatically through the rest of the decade. Some estimates project a ten-fold increase in AI energy consumption by 2026, which would eclipse the total electricity consumption of Belgium.

Many of the costs behind this shift toward generative AI are less obvious. AI models carry evident computing and electricity costs, but they also rely on data and computation housed in centralized data centers, which are driving dramatic increases in the technology sector’s energy consumption.

According to the International Energy Agency, global energy consumption by data centers is slated to rise to 1,000 terawatt-hours (TWh) by 2026. These costs are already hitting organizations: Google recently declared itself no longer carbon neutral, citing increased supply chain consumption and the end of its carbon offset purchasing. Microsoft followed with a similar announcement, disclosing a nearly 30% rise in CO2 emissions since 2020 driven by data center demands.

All this adds up to a single conclusion: AI is hungry. It’s hungry for training data, and it’s hungry for energy to run. That hunger has placed significant new demands on energy systems that are largely not ready to meet them. Case in point: over one-fifth of Ireland’s entire electricity supply now goes to data centers, and demand has grown so extreme that the country has begun refusing planning permission for new ones.

Can Tech Fix Tech? APIs as an Efficiency Resolution

There is no getting around the fact that AI is expensive. The technology is as revolutionary as it is hungry for data and power, but that hunger can be managed, and in some cases mitigated entirely. APIs will play a huge role in ensuring the technology can develop while using energy and resources responsibly.

Thankfully, the mitigation methods below can be implemented today. With proper planning and application, these approaches can drastically reduce the cost of AI at scale and, by doing so, its impact on the environment.

API Efficiency Reduces Computational Cost

First and foremost, API efficiency will be a chief driving force behind energy efficiency gains in this space. As organizations develop their own AI-as-a-service solutions across web design, graphics, text generation, and more, the APIs that drive these interactions will have to reckon with how efficiently they function. Any single gain may seem small in isolation. At scale, however, with some AI providers handling upwards of 10 million queries a day, small improvements in API efficiency can have a drastic effect on overall cost.

Consider how an API might pass a query to an AI model. When a user queries your API, you might forward a JSON payload to your model of choice. How you structure that JSON, and what you wrap the payload with, determines a great deal about how it’s parsed, computed, and answered. Reducing the overall payload, structuring data as simple strings, rejecting overly complex or long queries, and similar strategies can make your API call, and the resulting AI query, more efficient, as in the sketch below.
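
As a rough illustration, here is a minimal Python sketch of trimming a payload before a model call. The `build_model_payload` helper and the `MAX_PROMPT_CHARS` budget are hypothetical; the right limits depend entirely on your model and use case:

```python
import json

# Hypothetical length budget -- tune this to your own model and use case.
MAX_PROMPT_CHARS = 2000

def build_model_payload(user_prompt: str, metadata: dict) -> str:
    """Build a lean JSON payload for an AI model call.

    Collapses whitespace, drops empty metadata fields, and rejects
    prompts that exceed a sane length budget instead of passing
    oversized queries through to the model.
    """
    prompt = " ".join(user_prompt.split())  # collapse repeated whitespace
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError("Prompt exceeds length budget; ask the user to shorten it.")

    payload = {
        "prompt": prompt,
        # Keep only metadata fields that actually carry a value.
        "metadata": {k: v for k, v in metadata.items() if v},
    }
    # separators=(",", ":") omits the padding spaces json.dumps adds by default.
    return json.dumps(payload, separators=(",", ":"))
```

Every byte shaved here is a byte the model never has to receive, parse, or attend over, which is where the savings multiply at query volume.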

Adopting a hybrid approach to data servicing can also make a huge difference. AI is often thrown at a problem even when it isn’t the right tool. Instead of letting users request the status of a package through an AI tied into your shipment inventory, why not handle such tasks with GraphQL or another query language? By reserving AI for the systems where it makes sense, and avoiding it where it doesn’t, you create a more optimal environment that reduces overall computational cost without affecting the function of the service.
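
A minimal routing sketch follows. Here, `lookup_package_status` and `run_ai_query` are hypothetical stand-ins for a deterministic query (GraphQL, SQL, or similar) and an expensive AI model call:

```python
def lookup_package_status(tracking_id: str) -> dict:
    # Placeholder for a deterministic GraphQL/SQL lookup.
    return {"tracking_id": tracking_id, "status": "in_transit"}

def run_ai_query(intent: str, params: dict) -> dict:
    # Placeholder for the (expensive) call out to an AI model.
    return {"answer": f"model response for {intent}"}

def handle_request(intent: str, params: dict) -> dict:
    """Route each request to the cheapest system that can answer it."""
    if intent == "package_status":
        # A fixed-shape question: answer it with a plain data query,
        # not a round of model inference.
        return lookup_package_status(params["tracking_id"])
    # Reserve the model for genuinely open-ended requests.
    return run_ai_query(intent, params)

print(handle_request("package_status", {"tracking_id": "PKG-123"}))
```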

Utilizing Effective API Caching and Services to Reduce Wasted Computation

One big efficiency gain for APIs comes from effective caching and the serving of that cached information. This is especially true when services use AI models to generate repeatable content. For instance, product information and descriptions can stay the same for months at a time; if thousands of people express interest in the same book or sweater, there’s no reason for a system to regenerate the same listing repeatedly.

What’s important here is balancing the freshness of data against the need for that data to actually be updated. Some content cannot be cached, and there is little efficiency to be gained there. For content that can be cached, however, the gains compound, yielding significant reductions in computation over time.
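
As a sketch of this freshness balance, the in-memory cache below keys responses on a hash of the prompt and applies a per-content-type TTL. The `cached_model_call` helper and the `TTL_SECONDS` values are illustrative assumptions, not a prescription:

```python
import hashlib
import time

# Hypothetical TTLs: long for stable content, zero for volatile content.
TTL_SECONDS = {"product_description": 86_400, "chat": 0}  # 0 = never cache

_cache: dict[str, tuple[float, str]] = {}

def cached_model_call(kind: str, prompt: str, model_call) -> str:
    """Serve repeat prompts from cache instead of re-running the model."""
    ttl = TTL_SECONDS.get(kind, 0)
    key = hashlib.sha256(f"{kind}:{prompt}".encode()).hexdigest()
    hit = _cache.get(key)
    if ttl and hit and time.time() - hit[0] < ttl:
        return hit[1]  # fresh enough: no inference spent
    result = model_call(prompt)
    if ttl:
        _cache[key] = (time.time(), result)
    return result

# Usage: cached_model_call("product_description", prompt, my_model_fn)
```

In production, the same pattern typically sits in a shared store such as Redis rather than process memory, but the principle is identical: the thousandth request for the same sweater should cost a dictionary lookup, not a model inference.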

Notably, this caching can cover more than the data served to the end user. Endpoint information, routing pathways, internal infrastructure details, and other structural elements are often served by the API as well, and there’s no reason to recompute how to perform a given function or retrieve a given data set on every request. By caching this information, you abstract common requests away from common service at every level, drastically increasing efficiency at scale.

API Data Processing and Storage Choices Change Data Costs

One of the main costs underpinning AI is the data storage and processing required to train models and iterative systems. These systems constantly record, ingest, and train upon data for accuracy and contextualization. Much of the data center cost has come from ever-larger storage needs and the systems that power this collection and ingestion.

When it comes down to it, the most efficient data storage choice for APIs connected to AI would be to collect no data at all. Of course, such a solution flies in the face of the current generation of AI. The opposite extreme, collecting all data for potential future use, over-corrects in the other direction.

Judging what data should be collected, ingested, and stored is already a big question for API providers outside the AI industry, and adding a model that demands more training data only muddies the water. For this reason, the right amount of data to collect should be considered the minimum required for ongoing operations. If your API and AI combination requires vast amounts of data for ongoing training, efficiencies such as deduplication and versioning that records changes rather than full snapshots can go a long way toward limiting how much data is stored.
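
As a small illustration of change-based versioning, the hypothetical `record_delta` helper below persists only the fields that differ between two versions of a record, rather than a full snapshot of each:

```python
def record_delta(previous: dict, current: dict) -> dict:
    """Return only the fields that changed between two record versions.

    Writing deltas instead of full snapshots keeps a training corpus
    from accumulating near-duplicate copies of the same record.
    """
    return {
        key: value
        for key, value in current.items()
        if previous.get(key) != value
    }

# Example: only the changed field is persisted for version 2.
v1 = {"sku": "B-1021", "title": "Wool Sweater", "price": 59.00}
v2 = {"sku": "B-1021", "title": "Wool Sweater", "price": 49.00}
print(record_delta(v1, v2))  # {'price': 49.0}
```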

There is also the fact that some data simply isn’t worth collecting, even if there’s an edge case where it might aid ongoing training. Collecting every scrap of information might pad a training set, but if a data source adds only a sliver of signal at the cost of entire hard drives, you’re wasting storage and computation on something that will never really move the needle.

Leveraging Local AI vs. Centralized AI

Finally, providers should consider where their models are actually located. It’s ironic that, in a world where APIs are largely decentralized and microservices-oriented, AI currently remains centralized in the hands of a few large organizations. Many API developers are content to pay for a license and access AI functions through an external API.

The problem is that these models often run in massive data centers built on server clusters with heavy data and resource consumption. Many of these data centers are powered by on-demand energy sources such as oil or natural gas, and scaling them up on renewables still has a long way to go.

While providers can’t change the nature of resource consumption, they can change where a model is located and, in doing so, influence the power sources behind it. For instance, API providers can build and train their own models, limiting the data collected, the resources required, and, ultimately, the servers used to serve that data. This can be especially impactful for providers with a specific geographic service area, as locating the functionality closer to the end user can dramatically reduce resource consumption and server costs.
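
As a rough sketch of this idea, an API might prefer a locally hosted model and fall back to a centralized provider only when necessary. Both endpoint URLs and the response shape below are assumptions to adapt to your own serving stack:

```python
import requests  # third-party: pip install requests

# Hypothetical endpoints: a small model served on local hardware and
# a centralized provider used only as a fallback.
LOCAL_MODEL_URL = "http://localhost:8080/v1/completions"
REMOTE_MODEL_URL = "https://api.example-ai-provider.com/v1/completions"

def complete(prompt: str) -> str:
    """Prefer the locally hosted model; fall back to the remote API."""
    for url in (LOCAL_MODEL_URL, REMOTE_MODEL_URL):
        try:
            resp = requests.post(url, json={"prompt": prompt}, timeout=10)
            resp.raise_for_status()
            # Response shape is assumed; adapt it to your serving stack.
            return resp.json()["text"]
        except requests.RequestException:
            continue  # endpoint unavailable: try the next one
    raise RuntimeError("No model endpoint reachable.")
```

A smaller, task-specific model running near the user often answers the same narrow questions at a fraction of the energy of a frontier model in a distant data center.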

Conclusion

Ultimately, providers should see AI, and the APIs connecting to and powering these systems, as a huge opportunity for efficiency: one that can reduce costs, improve operations, and even deliver climate benefits. Getting this right will take thought and careful planning, but it can be hugely impactful.