On building the modern science of AI agents
What is an AI agent? Well, generally speaking, it is a piece of software that interacts with the world, perceiving it through sensors and influencing it by executing actions. In modern tech lingo, an AI agent is an entity based on Large Language Models that interacts with a computer. You might be excited about them, and so am I: the dream of giving agency to the versatility and steerability of LLMs is too vivid, too ambitious, to be ignored.
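To make this concrete, here is a minimal sketch of that perceive-act loop with an LLM as the policy. Everything in it (`call_llm`, `environment`, the prompt format) is a hypothetical stand-in for illustration, not any particular library's API:

```python
# A minimal sketch of the perceive-act loop behind most LLM-based agents.
# `call_llm` and `environment` are hypothetical placeholders, not a real API.

def format_prompt(history, observation):
    past = "\n".join(f"Action: {a}\nResult: {o}" for a, o in history)
    return f"{past}\nCurrent observation: {observation}\nNext action:"

def run_agent(environment, call_llm, max_steps=20):
    history = []                       # the agent's memory of the episode
    observation = environment.reset()  # perceive the initial state
    for _ in range(max_steps):
        # The LLM acts as the policy: it maps what the agent has seen
        # so far to the next action to execute.
        action = call_llm(format_prompt(history, observation))
        observation, done = environment.step(action)  # act on the world
        history.append((action, observation))
        if done:
            break
    return history
```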
There is often a distinctive tinkering attitude around work on this modern form of AI agents. As a community, we are building these systems with passion, stitching pieces together and trying, one way or another, to get good performance on the tasks we care about. That's exciting and natural: promising new tools make us want to try them, to explore, and to make things work. You can see this kind of pure excitement in GitHub projects, in startups, and even in papers submitted to scientific conferences. Tinkering is great. It's at the heart of good software and innovation. But, for AI agents, tinkering is not going to be enough.
AI agents are not just any kind of software. They are autonomous: through learning and interaction, they make decisions with potentially no human intervention. Their actions can have consequences in the digital and physical world. And if they work, they are powerful. This means that society can benefit greatly from this technology, but also that its development cannot proceed without concern for the alignment and safety of these systems. Only careful science can guarantee that AI agents will be as useful and as safe as we need them to be.
But what science? The science we need is a new, modern science of LLM-based AI agents. It should establish, with as little doubt as possible, whether they work, when they work, how they work, and why they work. What are good testbeds for understanding AI agents? Why should LLM pretraining give them decision-making capabilities? How can we better align a decision-making agent with human intentions, or create guardrails for it? A modern science of AI agents should be able to answer these questions.
I am afraid. I am afraid that letting the hype blind us as we develop these modern AI agents could lead us to miss the opportunities they create. I see two ways this could happen:
1. By keeping an unreasonable promise of general AI agents alive for too long, without pursuing a real understanding of its feasibility under current paradigms.
2. By deploying unsafe AI agents in the wild and witnessing an accident.
In both cases, there is a risk of sweeping away most of the passion (and the funding) we are seeing right now: in the first case, due to unmet expectations; in the second, due to the fear that the technology might never be robust enough for deployment. We don't want to create disillusionment about AI agents: they can be a transformative tool and improve the well-being of our society. We don't want to deploy unsafe AI agents: even when not fully autonomous, they can already cause harm, leading both to human suffering and to pushback against the technology. Both failures can be avoided by conducting deep and rigorous scientific investigation.
Why do we need a new science? Neither evaluating nor understanding LLM-based sequential decision-making agents is easy; one might argue it is far harder than evaluating or understanding static predictors. Nowadays, we believe even static predictors based on LLMs are only shallowly evaluated and understood: imagine how shallow our grasp of LLM-based AI agents must be! We need new tools, new knowledge, new energy.
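A toy calculation hints at why. In a sequential setting, per-step errors compound over the episode: below is a back-of-the-envelope sketch (assuming, unrealistically, independent errors at each step; real agents' errors correlate, but the qualitative point stands):

```python
# Toy illustration: per-step errors compound over an episode.
# Assumes independent errors at each step, which real agents violate,
# but the qualitative point survives.
per_step_accuracy = 0.95  # a static predictor we would happily call "good"
for horizon in (1, 10, 20, 50):
    episode_success = per_step_accuracy ** horizon
    print(f"{horizon:>2}-step episode succeeds {episode_success:.0%} of the time")
# Prints roughly: 95%, 60%, 36%, 8%
```

A benchmark score that looks excellent for a static predictor can translate into an agent that fails most long tasks, which is one reason agent evaluation needs its own methodology.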
Now, listen. This is a call to arms. If you are a researcher interested in sequential decision-making or LLMs, don't dismiss modern AI agents, but also don't let the hype distract you from building safe systems and true understanding. Be hyped and curious, first of all, but also be rigorous, skeptical, and cautious. Our community needs this. Let's build the modern science of AI agents together.
Thanks to Nate Rahn, Martin Klissarov and Jesse Farebrother for the nice discussions on AI agents and their feedback on the blog post.