OpenAI has introduced a new general-purpose AI agent within ChatGPT, designed to execute a wide range of computer-based tasks for users. The company states this agent can automatically manage calendars, generate editable presentations, and run code.
The tool, named ChatGPT agent, merges functionalities from OpenAI’s earlier agentic tools. These include Operator's web-clicking ability and Deep Research's capacity to synthesize information from numerous websites into concise reports. OpenAI reports users will interact with the agent through natural language prompts in ChatGPT.
ChatGPT agent is available today for subscribers to OpenAI’s Pro, Plus, and Team plans. Users can activate the tool by selecting "agent mode" in ChatGPT’s dropdown menu.
This launch represents OpenAI’s attempt to evolve ChatGPT into an agentic product capable of taking direct actions and offloading tasks, rather than only answering questions. Previous AI agents from various Silicon Valley companies have struggled with complex tasks, often falling short of initial product visions. However, OpenAI indicates ChatGPT agent offers enhanced capabilities over its prior offerings.
The new agent can use ChatGPT connectors, which let users link applications such as Gmail and GitHub so the agent can retrieve relevant information from them. OpenAI states ChatGPT agent also has terminal access and can use APIs to access specific applications.
OpenAI suggests the agent can be used to "plan and buy ingredients to make Japanese breakfast for four" or "analyze three competitors and create a slide deck." These tasks require the agent to process websites, plan actions, and utilize tools.
The model supporting ChatGPT agent demonstrates strong performance on various benchmarks. OpenAI reports a score of 41.6% on Humanity’s Last Exam (pass@1), a test spanning over a hundred subjects, approximately doubling the scores of OpenAI’s o3 and o4-mini models. On FrontierMath, a challenging math benchmark, ChatGPT agent scored 27.4% with tool access; the previous highest score from o4-mini was 6.3%.
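To put the FrontierMath figures in perspective, the gap can be worked out directly from the two reported scores. The snippet below is only a small arithmetic sketch; the function name `compare_scores` is made up for illustration, and the inputs are the percentages quoted above.

```python
def compare_scores(new_score: float, old_score: float) -> tuple[float, float]:
    """Return (gain in percentage points, ratio of new to old), both rounded."""
    gain_points = round(new_score - old_score, 1)
    ratio = round(new_score / old_score, 2)
    return gain_points, ratio


# FrontierMath scores as reported: 27.4% for ChatGPT agent, 6.3% for o4-mini.
gain, ratio = compare_scores(27.4, 6.3)
print(f"Gain: {gain} percentage points, roughly {ratio}x the earlier score")
```

On these numbers, the reported score is about 21 percentage points higher, or a little more than four times the earlier result.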
OpenAI developed ChatGPT agent with safety protocols, acknowledging the new capabilities could present risks. In a safety report for ChatGPT agent, OpenAI has classified the model as "high capability" in biological and chemical weapon domains, a designation indicating potential to "amplify existing pathways to severe harm." OpenAI says it has no direct evidence of this and describes the classification as precautionary, but it is activating new safeguards as a result.
These safeguards for ChatGPT agent include a real-time monitor that analyzes every prompt for biological relevance. If a prompt is flagged, a second monitor assesses the agent’s response for potential biological threat content. OpenAI has also disabled ChatGPT’s memory feature for this agent to prevent misuse, such as the exfiltration of sensitive data via prompt injection attacks, though the company says it may revisit that decision.
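The two-stage structure of that monitoring can be sketched in a few lines of Python. This is only an illustration of the control flow described above, not OpenAI's implementation: the keyword lists, the function names, and the `Verdict` type are all hypothetical stand-ins, and the real monitors are presumably far more sophisticated than substring matching.

```python
from dataclasses import dataclass

# Hypothetical placeholder term lists; the source does not describe
# how the real monitors decide what counts as relevant or threatening.
BIO_TERMS = ("pathogen", "toxin", "virus")
THREAT_TERMS = ("synthesis route", "culture protocol")


@dataclass
class Verdict:
    prompt_flagged: bool    # stage 1 result
    response_flagged: bool  # stage 2 result (always False if stage 1 passed)


def prompt_is_bio_relevant(prompt: str) -> bool:
    """Stage 1 stand-in: screen the prompt for biological relevance."""
    text = prompt.lower()
    return any(term in text for term in BIO_TERMS)


def response_has_threat_content(response: str) -> bool:
    """Stage 2 stand-in: check the response for threat content."""
    text = response.lower()
    return any(term in text for term in THREAT_TERMS)


def review(prompt: str, response: str) -> Verdict:
    # Stage 1 runs on every prompt.
    if not prompt_is_bio_relevant(prompt):
        return Verdict(prompt_flagged=False, response_flagged=False)
    # Stage 2 runs only when stage 1 flagged the prompt.
    return Verdict(
        prompt_flagged=True,
        response_flagged=response_has_threat_content(response),
    )


print(review("plan a Japanese breakfast for four", "buy rice and miso"))
print(review("how do vaccines target a virus?", "they train the immune system"))
```

The point of the layered design is that the second, response-level check only runs on the subset of conversations the first monitor has already flagged.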
The real-world performance of ChatGPT agent remains to be seen, as agent technology has historically faced challenges in practical application. OpenAI states it has developed a more capable model to deliver on the promise of AI agents.