AI agents are being hailed as the next big leap in artificial intelligence, but there’s no universally accepted definition of what they are—or what they should do. Even within the tech community, there’s debate about what constitutes an AI agent.
At its core, an AI agent can be described as software powered by artificial intelligence that performs tasks once handled by humans in roles such as customer service agents, HR representatives, or IT help desk staff. However, their potential extends much further. These agents don’t just answer questions—they take action, often working across multiple systems. For example, Perplexity recently launched an AI agent to assist with holiday shopping, while Google introduced Project Mariner, an agent that helps users book flights, find recipes, and shop for household items.
While the idea seems straightforward, it’s muddied by inconsistent definitions. For Google, AI agents are task-based assistants tailored to specific roles, like coding help for developers or troubleshooting issues for IT professionals. In contrast, Asana views agents as digital co-workers that take on assigned tasks, and Sierra—a startup led by former Salesforce co-CEO Bret Taylor—envisions agents as sophisticated customer experience tools that surpass traditional chatbots by tackling complex problems.
This lack of consensus adds to the uncertainty around what AI agents can truly achieve. Rudina Seseri, founder and managing partner at Glasswing Ventures, explains that the ambiguity stems from the technology’s infancy. She describes AI agents as intelligent systems capable of perceiving their environment, reasoning, making decisions, and taking actions to achieve specific goals autonomously. These agents rely on a mix of AI technologies, including natural language processing, machine learning, and computer vision, to operate in dynamic environments.
Optimists, like Box CEO Aaron Levie, believe AI agents will improve rapidly as advancements in GPU performance, model efficiency, and AI frameworks create a self-reinforcing cycle of innovation. However, skeptics like MIT robotics pioneer Rodney Brooks caution against overestimating progress, noting that solving real-world problems—especially those involving legacy systems with limited API access—can be far more challenging than anticipated.
David Cushman of HFS Research likens current AI agents to assistants rather than fully autonomous entities, with their capabilities limited to helping users complete specific tasks within pre-defined boundaries. True autonomy, where AI agents handle contingencies and perform at scale without human oversight, remains a distant goal.
Jon Turow, a partner at Madrona Ventures, emphasizes the need for dedicated infrastructure to support the development of AI agents. He envisions a tech stack that allows developers to focus on product differentiation while leaving scalability and reliability to the platform. This infrastructure would likely involve multiple specialized models working together under a routing layer, rather than relying on a single large language model (LLM).
Fred Havemeyer of Macquarie US Equity Research agrees, noting that the most effective AI agents will combine various models to handle complex tasks. He imagines a future where agents act like autonomous supervisors, delegating tasks and reasoning through multi-step processes to achieve abstract goals.
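The routing-layer idea Turow and Havemeyer describe can be sketched in a few lines of Python. This is a hypothetical illustration, not any vendor's actual stack: the `classify` function stands in for a small router model, and the handler names are invented placeholders for specialized models.

```python
# Sketch of a routing layer: a lightweight classifier inspects each
# request and dispatches it to a specialized handler, rather than
# sending every request to a single large model.

def classify(request: str) -> str:
    """Toy keyword-based intent router; in practice this would be a small model."""
    text = request.lower()
    if any(word in text for word in ("bug", "error", "stack trace")):
        return "coding"
    if any(word in text for word in ("refund", "order", "shipping")):
        return "support"
    return "general"

# Each handler is a stand-in for a specialized model behind the router.
HANDLERS = {
    "coding": lambda req: f"[code-model] analyzing: {req}",
    "support": lambda req: f"[support-model] resolving: {req}",
    "general": lambda req: f"[general-model] answering: {req}",
}

def route_request(request: str) -> str:
    """Dispatch a request to the handler chosen by the routing layer."""
    return HANDLERS[classify(request)](request)

print(route_request("I hit an error in my build"))
# dispatched to the coding handler
```

Swapping the keyword rules for a small classifier model, and the lambdas for real model calls, gives the shape of the multi-model stack described above without changing the routing logic itself.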
While this vision is compelling, the current state of AI agents suggests we’re still in a transitional phase. The progress so far is promising, but several breakthroughs are needed before agents can operate as envisioned—truly autonomous, multi-functional, and capable of seamless collaboration across diverse systems.
This story, originally published on July 13, 2024, has been updated to reflect new developments from Perplexity and Google.