DALL-E And The Future of AI And APIs

Three extremely popular APIs on the market in 2021 — Alexa Skill Management API, Google Assistant API, and IBM Watson Discovery API — all relate to Artificial Intelligence (AI). It also seems like the future of the AI space will be closely linked with that of APIs.

OpenAI’s DALL-E, stylized as DALL·E, is one example of this. Non-technical observers have described the program as being like magic and, even knowing that machine learning is behind DALL-E, it’s hard to disagree. Access to DALL-E, as with some of OpenAI’s other offerings, will likely remain available only via API.

But, as fascinating as the tool might be, DALL-E offers a troubling glimpse into one potential future of AI and APIs. Below we’ll be taking a closer look at DALL-E, its relationship with APIs, and how it could represent a problematic model for API development.

Introducing DALL-E

Named for Salvador Dalí and Disney Pixar’s WALL-E (which is ironic given that WALL-E, while adorable, is rusty and outdated), DALL-E was introduced at the beginning of 2021.

To quote OpenAI, “DALL-E is a 12-billion parameter version of GPT–3 trained to generate images from text descriptions, using a dataset of text–image pairs. We’ve found that it has a diverse set of capabilities, including creating anthropomorphized versions of animals and objects, combining unrelated concepts in plausible ways, rendering text, and applying transformations to existing images.”

In other words? DALL-E creates a computer-generated image from whatever text description is entered.

Use cases demonstrated on the OpenAI website range from technical drawings to a high-quality emoji of a lovestruck boba tea. In short, this is an incredibly powerful tool with myriad potential applications.

Unfortunately, we might have to wait a while to see what some of those applications actually are. OpenAI itself is still in private beta, with a “Join the Waitlist” button on its site since launch, and its APIs are also waitlist-only. Current OpenAI documentation doesn’t contain anything about DALL-E.

OpenAI and APIs

DALL-E is powered by GPT–3, around which much of OpenAI’s work revolves. To date, GPT–3 has been released in SaaS (Software-as-a-Service) form only. Ostensibly, that’s to limit misuse of the technology and because GPT–3 requires an enormous amount of computing power*.

*ZDNet explains that this is why access to GPT–3 is via a cloud-based API endpoint; Lambda Computing estimates that a single GPU could take hundreds of years to do what GPT–3 does.

It’s also, as OpenAI admits, to make money. Not a tremendous amount of money, mind you, since 750 words generated by the Davinci model will only cost you 6 cents. What does the API’s output look like? Well, it could look like all sorts of things.
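As a quick aside on that pricing, here is a minimal sketch of what it implies at larger volumes. It assumes a flat rate of 6 cents per 750 generated words, as quoted above; the function name `estimated_cost_usd` is purely illustrative, and actual billing may be metered differently (for example, per token rather than per word).

```python
# Assumption: a flat rate of 6 cents per 750 generated words, taken from
# the figure quoted in the article. Real billing may be metered differently.
CENTS_PER_750_WORDS = 6


def estimated_cost_usd(words: int) -> float:
    """Return a rough cost in US dollars for generating `words` words."""
    if words < 0:
        raise ValueError("words must be non-negative")
    cents = words * CENTS_PER_750_WORDS / 750
    return cents / 100


if __name__ == "__main__":
    # Print a rough estimate for a few hypothetical volumes.
    for words in (750, 10_000, 1_000_000):
        print(f"{words:>9,} words ~ ${estimated_cost_usd(words):,.2f}")
```

At that assumed rate, a million generated words would come to roughly $80: cheap per call, but a cost that scales linearly with usage.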

While DALL-E generates images from text input, GPT–3 generates contextually relevant text output. Check out this example from OpenAI’s developer portal:

“Topic: Breakfast

Two-Sentence Horror Story: He always stops crying when I pour the milk on his cereal. I just have to remember not to let him see his face on the carton.”

Some of the content generated by OpenAI’s APIs is, forgive the pun, scarily good. What’s even scarier is that, with much of the information around them redacted, we might never know if something has been created by a human or by an API.

A Bleak Future for AI?

Now, let’s get back to DALL-E. There’s no doubt that it’s a very sleek tool, but the extent to which visitors can use the app is limited. Rather than being able to enter their own text, DALL-E users are currently limited to a range of drop-down menus provided by OpenAI. Hardly a true sandbox.

There are various reasons why that might be the case, from minimizing workload through to concerns about illicit activity like deep fakes. But it probably also has something to do with the fact that Microsoft licensed exclusive use of GPT–3 in 2020. A public API can still be used to receive output, but Microsoft currently holds the exclusive license to GPT–3’s underlying code.

OpenAI was initially positioned as a non-profit with the aim of “democratizing AI” but later switched to a capped-profit model. It’s (very) easy to make the argument that limiting access to GPT–3 and tools like DALL-E doesn’t fit with the aim of AI democratization.

Not to mention that, because DALL-E applies “transformations to existing images,” there are some profound implications when it comes to copyright infringement and trademark violations.

Still, as Mark Riedl puts it, perhaps the biggest concern here is that “if we believe that the road to better AI in fact is a function of larger models, then OpenAI becomes a gatekeeper of who can have good AI and who cannot. They will be in a position to exert influence (explicitly or implicitly) over wide swaths of the economy.”

The Danger of Gated AI (and APIs)

There are implications from the above for APIs too. Non-profit APIs exist and most APIs offer a limited number of monthly calls for free, but that’s where the charity ends.

Fortunately, ludicrously expensive API calls (the subject of a fun Twitter thread we found) are the exception rather than the norm. But there’s no guarantee it will stay that way forever.

The more deeply embedded an API becomes in an organization’s processes, the more difficult it is to find an alternative to replace it if the associated cost goes up. Many marketplaces are attempting to standardize the monetization of APIs, but doing so is tricky, and there’s usually variance. If a provider doesn’t want to conform to guidelines, they could leave the marketplace.

APIs have long been associated with the free — that’s free in spirit, not necessarily from cost — exchange of information and better enabling services to work together. In the recent case of Google v. Oracle, it was found that Google’s copying of the Java SE API to build a “new and transformative program” legally qualified as a “fair use of that material as a matter of law.”

APIs have come to represent big business in the tech space, and that means lots of money changing hands. It might also come to mean API providers locking out potential users they deem to be too risky or unable to keep up with the business’s financial aspirations.

While OpenAI’s API certainly isn’t the first private API, its restrictive access, coupled with the controversy surrounding AI as a subject, could come to set a concerning precedent (even though DALL-E is much more fun and fancy-free than Skynet ever was).

Nordic APIs signed up to the OpenAI beta waitlist in July 2020 and has yet to be granted access.