#3: Natural Language APIs Are Coming
“The hottest new programming language is English” — Andrej Karpathy
The Internet, as imagined in the 1960s
When the Internet was first conceived in the 1960s as a network of computers, it wasn’t entirely clear how they would communicate with each other. Programs were written in different programming languages and there were no protocols to stitch them together.
For Joseph Licklider, author of the initial ARPAnet proposal, plain English was the language for humans to communicate with computers. In fact, he called ARPAnet a “system based upon [...] computer appreciation of natural language”. He imagined humans stating goals to computers in natural language, and the “problem-solving, hill-climbing, self-organizing programs [would] be able to devise and simplify their own procedures for achieving stated goals”.
How would these programs communicate, if they were written in different languages and had no knowledge of each other? Inventor Bret Victor famously argued that the computers were supposed to get the goals from humans and then “dynamically figure out a common language through which to exchange information and fulfill the goals that the human programmer gave to them”. In Licklider’s proposal this was done through “message processors”. A program requests something from another computer on the network, and “the request is translated by one or more of the message processors into the precise language required by the remote [computer]”.
To summarize: humans were supposed to set goals for a computer in natural language, computers would then query other computers on the network, and the network would ‘translate’ between the computers.
What we got instead: APIs
What we ended up with as the main way for programs to communicate with each other was APIs. Instead of programs figuring out how to talk to each other, humans now have to do it for them. Whenever a programmer wants to use a service, they have to use its API. They open the documentation for that service, figure out how it works and what language it speaks, and then speak to the program in the program’s own language.
For example, if I wanted to upload a file to Amazon S3 storage, I would write this code:
import boto3

# Create an S3 client using boto3
s3 = boto3.client("s3")
# Use the upload_file method of the S3 client to upload the file to S3
s3.upload_file(local_file, s3_bucket, s3_file)
This is what programmers actually do all day: they learn the languages of different programs and translate between them. A web developer translates the language of the database into the language of the backend service, then the language of the backend service into the frontend language, and so on, for every single detail. Y Combinator founder Paul Graham observed that “99.5% of programming consists of gluing together calls to library functions”.
There are several problems with using APIs to connect programs to each other:
They slow down the development process: you have to learn a new language every time you want to use another program.
After you’ve written a program against a particular API, you can’t easily switch to another. Moving to a different service might take a full rewrite: even if its functionality is the same, its language isn’t.
If the API changes — and they always do, maybe not in 1 year, but definitely in 10 — your program might stop working entirely.
We ended up with APIs not because they were better, but because they were easy. Licklider himself predicted that things would go that way. To him, the approach in which “real-time concatenation of preprogrammed segments and closed subroutines which the human operator can designate and call into action simply by name [is] simpler and apparently capable of earlier realization”.
There were no real implementations of these “negotiating”, “probing” programs. No one really knew how the programs were supposed to chat.
Enter AI
One thing that ChatGPT demonstrates is its ability to write code based on natural language queries. Given a query, it readily produces a fully functioning program. In fact, the earlier example of uploading a file to Amazon S3 was written by ChatGPT.
Why can’t natural language be the API? Rather than writing code, we could directly ask the service for what we need. The service would then translate that request into code and produce the required result.
In fact, we can already simulate this with ChatGPT:
This is exactly what I wanted. The only thing missing is running the code.
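Conceptually, the loop is simple: forward the natural-language request to a model, get code back, and run it. Here is a minimal sketch of that round-trip; the model call is stubbed with a canned response so the example is self-contained, and all names in it are illustrative, not a real service’s API:

```python
# Minimal sketch of an NLAPI round-trip. The model is stubbed with a canned
# answer; a real service would forward the request to an LLM and then
# execute the generated code.

def stub_model(request: str) -> str:
    # Hypothetical model output; a real LLM would generate this.
    if "upload" in request.lower() and "s3" in request.lower():
        return (
            "import boto3\n"
            's3 = boto3.client("s3")\n'
            "s3.upload_file(local_file, s3_bucket, s3_file)"
        )
    raise ValueError("request not understood")

def nlapi(request: str, model=stub_model) -> str:
    """Translate a natural-language request into executable code."""
    return model(request)

code = nlapi("Upload local_file to my S3 bucket")
print(code)
```

The missing piece the text mentions, actually executing the returned code, is exactly what a real NLAPI service would add on top of this translation step.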
Let’s call these natural language APIs or NLAPIs. They are better than traditional APIs because:
They speed up the development process: you don’t have to learn a new language every time you want to use another program (though you still need to understand what it does).
Switching to another implementation is easy and doesn’t require a full rewrite.
As long as there are no dramatic changes to the API, it will work as expected forever.
The implementation-switching part is especially important. Imagine you realize that if you switch from Amazon to Azure, you save $1000 a month. Re-writing all code in a large application to use Azure instead of Amazon will cost tens of thousands of dollars in developer hours. So you swallow the $1000 a month in lost profits, because the upfront payment is just too large.
However, with NLAPIs you might not even need a rewrite. Let’s reconfigure ChatGPT to use Azure. We will go even a step further and keep the Amazon terminology (buckets) in the prompt while asking for an Azure implementation (blobs):
Not only did ChatGPT produce correct code, it also correctly understood what I meant despite my use of non-Azure terminology.
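The economics of that switch can be sketched as a router: the same natural-language request is dispatched to whichever provider backend is currently configured. Everything below is stubbed, hypothetical code; the point is that switching providers is a one-word configuration change rather than a rewrite:

```python
# Hypothetical NLAPI "router": the same natural-language request is served by
# whichever provider is configured. The backends are stubs standing in for
# real LLM-backed services.

PROVIDERS = {
    "aws": lambda req: "s3.upload_file(local_file, s3_bucket, s3_file)",
    "azure": lambda req: "blob_client.upload_blob(data)",
}

def nlapi(request: str, provider: str) -> str:
    # Switching clouds changes one configuration value, not the request.
    return PROVIDERS[provider](request)

request = "Upload local_file to my storage bucket"
aws_code = nlapi(request, "aws")
azure_code = nlapi(request, "azure")
print(aws_code)
print(azure_code)
```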
Unless there is a serious technical obstacle that I’m not seeing, the demand for services with NLAPIs will be huge. This is a 10x productivity improvement for any developer.
Switching costs removed
The biggest strategic implication of NLAPIs is the removal of switching costs. Cloud computing providers, payment processors, database providers, and e-commerce platforms are able to price their subscriptions and additional services at a premium. Switching to a different service is so costly that it’s almost always more economically sound to pay the premium.
However, with NLAPIs, switching costs are reduced to essentially zero, except when a service provides functionality that others don’t. The disruption that will follow is immense.
Incumbents are stuck in a lose-lose situation: either they implement an NLAPI and erode their switching costs, or they lose the business to an NLAPI entrant.
The entrants who provide the same services as incumbents, with the added benefit of an NLAPI, are not much better off. Providing an NLAPI to a commodity service essentially means forgoing the switching-cost moat, so in the long term these entrants will see their margins erode. However, this can be a quick path to an acquisition by a major incumbent, especially since incumbents will initially be reluctant to implement NLAPIs that would decrease their defensibility, opening the door to a counter-positioning power play.
A first-mover opportunity is up for grabs for a platform that can specialize in combining different services through NLAPIs: a sort of Zapier of NLAPIs. The complexity of gluing together dozens of services through natural language is likely to grow non-linearly with the number of glued services. Solving this problem might provide a switching-cost moat of its own: you won’t be able to easily switch away from an NLAPI platform without losing the stability it offers.
Finally, the value creation unlocked by NLAPIs will be supported by companies providing consulting services and developer tools. These will be extremely lucrative during the transition period from APIs to NLAPIs.
Looking out for early signs
How can we tell that NLAPIs are coming?
First of all, we’d need to see proof-of-concepts of NLAPIs and communities forming around them. Look out for Discord chats on this topic.
The next step will be specialized developer tools for NLAPIs. The output of an NLAPI cannot be debugged with the usual tools, so debuggers are likely to be the first of these.
The final step will be the implementation of an NLAPI by a major platform. However, anyone getting into NLAPIs at that point will be playing catch-up.
EDIT (24/03/23): Welp, that didn’t take long
OpenAI just announced support for plugins for ChatGPT. These plugins will use a natural language API.
Plugin developers expose one or more API endpoints, accompanied by a standardized manifest file and an OpenAPI specification. These define the plugin's functionality, allowing ChatGPT to consume the files and make calls to the developer-defined APIs.
The AI model acts as an intelligent API caller. Given an API spec and a natural-language description of when to use the API, the model proactively calls the API to perform actions. For instance, if a user asks, "Where should I stay in Paris for a couple nights?", the model may choose to call a hotel reservation plugin API, receive the API response, and generate a user-facing answer combining the API data and its natural language capabilities.
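The announced plugin interface centers on that manifest plus an OpenAPI spec. As a rough illustration, a minimal manifest might look like the sketch below; the field names follow the announced ai-plugin.json shape, while the plugin name, descriptions, and URL are invented for this example:

```json
{
  "schema_version": "v1",
  "name_for_human": "Hotel Search",
  "name_for_model": "hotel_search",
  "description_for_human": "Find and book hotel rooms.",
  "description_for_model": "Use this when the user asks about accommodation, e.g. finding or booking a hotel room in a city.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://example.com/openapi.yaml"
  }
}
```

Note that `description_for_model` is itself natural language: it is the “documentation” the model reads to decide when to call the API.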
So it seems that Licklider’s vision is finally coming true — as usual, faster than we thought, but slower than he imagined.