This article was first published on https://shannonlow.substack.com/
OpenAI unveiled some exciting plans this week at their first DevDay, but what most caught my attention was the advent of GPT as a platform for third-party AI agents - a paradigm shift in how we think about software development that could unleash a wave of creativity and innovation.
In a nutshell, OpenAI is enabling anyone to create custom AI agents for specific tasks and use cases using their no-code GPT platform. These AI agents, or GPTs, will be made available to other users through OpenAI’s GPT store, and GPT creators will be paid a share of OpenAI’s revenue based on the usage of their GPTs.
The comparison to mobile apps and app stores is what I find most exciting. Until now, we mostly saw ChatGPT as a general-purpose tool: something to answer questions and perform tasks like summarising documents, writing articles and creating images, or to take on more complex jobs like playing the Dungeon Master in a Dungeons & Dragons role-playing adventure. But every session had to start with a detailed prompt telling ChatGPT what we wanted it to do, and to redo the task another time with a new context and different parameters, we had to open a new session and manually copy, paste and edit that prompt. This was bad for reusability, and it made it difficult for other users to discover that ChatGPT could be used in these specific ways.
With custom GPTs, a creator (many of whom, I suspect, are ChatGPT power users with their own banks of detailed, reusable, copy-and-paste-able prompts) can build a reusable AI agent that performs a specific task, set of tasks, or job to solve a problem, and other users can reach for the same GPT whenever they need to solve the same problem. Framed as a use case that encapsulates a common problem and a reusable solution, that sounds a lot like an app.
So what can you build as a GPT to solve a problem for users? What is the “shape” of such a “GPT-based solution”? One obvious example is a custom chatbot that takes on a particular persona or role in order to interact with human users and deliver some form of utility (e.g. entertainment, companionship or advice). That’s the direction character.ai and Meta have taken. These might be amusing as toys and perhaps enlightening as social and psychological experiments, but do we really need a store full of Elon Musk and Snoop Dogg chatbots?
Fortunately, with OpenAI’s GPT platform, you can add custom knowledge, API interactions, software execution, and external actions, in order to return a result, not just an answer, and this is where the potential for what you can build extends beyond what we might traditionally see as a chatbot.
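To make this concrete, here is a minimal sketch of the kind of configuration a custom GPT bundles together: persona instructions, built-in tools, and custom knowledge. The GPT name, instructions and file name below are hypothetical, and the plain dict merely mirrors the idea - it is not OpenAI’s actual builder format.

```python
# A hypothetical "recipe assistant" GPT configuration: persona
# instructions plus tools for software execution and custom knowledge.
recipe_gpt = {
    "name": "Recipe Assistant",  # hypothetical example GPT
    "instructions": "Help users adapt recipes to their pantry and dietary needs.",
    "tools": [
        {"type": "code_interpreter"},  # software execution
        {"type": "retrieval"},         # custom knowledge from uploaded files
    ],
    "knowledge_files": ["family_recipes.pdf"],  # hypothetical upload
}

def enabled_tools(gpt: dict) -> set:
    """Return the set of tool types a GPT configuration enables."""
    return {tool["type"] for tool in gpt.get("tools", [])}
```

The point of the sketch is that the creator supplies declarative configuration, not application code - the LLM supplies the behaviour.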
Imagine a fitness coaching GPT (à la the Freeletics app) that guides you through workouts, or a cooking GPT (à la the Cookpad app) that provides real-time recipe assistance in the kitchen. These AI agents, akin to apps, could harness data from multiple sources, take input from various interfaces, apply computations, and return results or deliver user experiences.
Now we start to see the close analogy to mobile apps, the obvious difference (today) being the primary interface through which we interact with the GPT or app - text/voice chat in the first case versus a touch screen in the second. But a more impactful difference is that OpenAI’s LLM powers the guts of the GPT instead of custom app code written by software engineers. And what this means is that far more people - non-engineers included - can create these kinds of reusable solutions or GPT “apps” than before.
I also don’t expect GPTs to remain static in terms of their capabilities and input/output interfaces, so the initial wave of Snoop-Dogg-style chatbot “toys”1 isn’t likely to be indicative of the full potential of user-created custom GPTs. After all, the first user-created iPhone apps in 2007 were essentially websites, but as Apple added more functionality to its SDK, iOS developers were able to create apps that accessed the camera, GPS, and other native elements of the iPhone, and mobile apps became much more sophisticated, useful and differentiated from websites. This is where I’d expect GPTs to go, beyond text and voice input, to accessing custom data, external services and even hardware components for both input and output.
“Each GPT can be granted access to web browsing, DALL-E, and OpenAI’s Code Interpreter tool for writing and executing software. There’s also a “Knowledge” section in the builder interface for uploading custom data, like the DevDay event schedule. With another feature called Actions, OpenAI is letting GPTs hook into external services for accessing data like emails, databases, and more, starting with Canva and Zapier.”2
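An Action, conceptually, is just a declared external endpoint the GPT is allowed to call on the user’s behalf. The sketch below captures that idea in miniature; the service, endpoint URL and parameter names are hypothetical, and real GPT Actions are declared with a fuller API schema rather than a bare dict like this.

```python
# A hypothetical Action for a fitness-coaching GPT: a declared endpoint
# the agent may call to fetch external data. Everything here is
# illustrative - the URL and service do not exist.
get_workout_action = {
    "name": "get_todays_workout",
    "description": "Fetch the user's scheduled workout from a fitness service.",
    "method": "GET",
    "url": "https://api.example-fitness.com/v1/workouts/today",  # hypothetical
    "parameters": {
        "user_id": {"type": "string", "required": True},
        "units": {"type": "string", "required": False},  # e.g. metric/imperial
    },
}

def required_params(action: dict) -> list:
    """List the parameters a caller must supply for this action."""
    return [name for name, spec in action["parameters"].items()
            if spec.get("required")]
```

Because the Action is declarative, the GPT platform (not the creator’s code) decides when to call it and how to fold the response into the conversation.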
What happens to software development if GPT “apps” like this become so easy to create because an LLM is doing the technical heavy lifting in the background? I believe “software development” will become more accessible to a wider range of non-technical creators, and will swing towards a process (and skill) of discovering valuable use cases, leaving more of the implementation and execution (of the solution) to AI.
This brings us to the shift from GPT as a general purpose tool, to GPT as a platform for building “apps”, GPT as an “app store”, and GPT as an “app” ecosystem - possibly the next major platform and ecosystem after mobile apps, but potentially much easier for ordinary people to access and build due to the natural language interface for no-code building, and the LLM guts doing the technical heavy lifting of “app development”.
Framed in this way, the GPT platform has many of the characteristics of mobile app platforms like iOS and Android/Google Play:
These are the familiar building blocks of a large and successful platform and ecosystem.3
At the same time, the GPT platform differs from the previous mobile app paradigm in the following key ways:
In this new paradigm, what can we imagine building, especially if GPTs have access to custom knowledge, API interactions, software execution, external actions and new user interfaces? And how far away from the chatbot format can we get?
The short answer, I think, is that it’s still too early to tell. Many things still need to be figured out - extending the input and output interfaces for GPTs, envisioning user experiences for AI agents beyond chat, and so on. But that doesn’t mean we should discount the potential for a seismic shift in what we consider apps to be, how we build them, and who can build them. Once the potential shape and scope of GPT “apps” becomes clearer through experimentation (which the GPT store will encourage through revenue sharing), more of the “developer ecosystem” - third-party APIs for AI agents to connect to, and developer tools that enable the building of more sophisticated GPTs, including tools for testing, user acquisition and analytics - will emerge. And as we progress along this path, interesting early investment opportunities will also likely emerge across both the content/apps (GPTs) and infrastructure/tools spaces.
The advent of GPT as a platform has the potential to transform the landscape of AI agents and redefine software development. OpenAI’s recent announcement marks the beginning of what could be rapid innovation on these fronts, and as with other instances of massive technological change, I believe it will happen gradually, then suddenly.