ChatGPT maker OpenAI has released a preview of Operator, an artificial intelligence agent which can interact with websites and “perform digital tasks” on behalf of its users.

Operator’s claim to fame is its ability to interact with graphical user interfaces (GUIs) through mouse and keyboard inputs, much like a human would use a standard computer.

In a “research preview” released 23 January, OpenAI showed the new agent could perform tasks which directly involved navigating pages on the internet, such as making restaurant reservations, buying clothes, or ordering food.

OpenAI further demonstrated the agent using a Linux operating system to download lecture files, combine PDFs, and compress images using photo editing software GIMP.

“Operator is one of our first agents, which are AIs capable of doing work for you independently – you give them a task and it will execute it,” said OpenAI.

Chief executive Sam Altman and OpenAI staff debuted Operator to a livestream audience on social media platform X, where they showed the AI operating via a prompt and response interface similar to ChatGPT.

Among the pre-written prompts shown on stream were “Find 4 tickets to the Kendrick Lamar concert” and “Suggest a 30-minute meal with chicken that has at least 4.5 stars”, while a more involved example saw the AI purchase groceries by reading a photographed shopping list and browsing US online grocery delivery service Instacart.

OpenAI said Operator would eventually be a “part of ChatGPT”, with the preview currently available to ChatGPT Pro users in the US only.

How does Operator work?

Operator is powered by “Computer-Using Agent” (CUA), a new model which combines GPT-4o’s vision capabilities with “advanced reasoning through reinforcement learning” to enable interactions with GUIs.

OpenAI explained the new model could “break tasks into multi-step plans and adaptively self-correct when challenges arise”, a capability which marked “the next step in AI development” by allowing AI to “use the same tools humans rely on daily”.

Company staff emphasised that while AI has already proven capable of performing such tasks using application programming interfaces (APIs) on compatible websites, Operator differs by using a combination of screenshots and keyboard and mouse inputs.

At a conceptual level, the new agent follows relatively simple logic: CUA runs through a loop in which it receives a visual snapshot of the computer’s current state, reasons through the next appropriate steps using context-informed “chain-of-thought”, and then takes actions (such as clicking, scrolling or typing), repeating until it decides the task is completed or human input is needed.

Most steps are handled automatically, but OpenAI emphasised CUA “seeks user confirmation for sensitive actions” such as entering login details or filling out CAPTCHA forms.
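
The loop described above can be illustrated with a minimal Python sketch. It is not OpenAI's implementation: the `Action` type, the `run_agent_loop` function and the four callbacks (`take_screenshot`, `choose_action`, `perform`, `confirm`) are hypothetical names invented for illustration, and the "model" here is simply a scripted list of actions.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Action:
    """A hypothetical GUI action; not an OpenAI type."""
    kind: str              # e.g. "click", "type", "scroll" or "done"
    detail: str = ""
    sensitive: bool = False  # e.g. entering login details


def run_agent_loop(
    take_screenshot: Callable[[], str],
    choose_action: Callable[[str, List[Action]], Action],
    perform: Callable[[Action], None],
    confirm: Callable[[Action], bool],
    max_steps: int = 20,
) -> Tuple[str, List[Action]]:
    """Observe, reason, act; stop when done or when a human is needed."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()               # 1. observe the screen
        action = choose_action(screenshot, history)  # 2. decide the next step
        if action.kind == "done":
            return "completed", history
        if action.sensitive and not confirm(action):
            return "needs_human", history            # hand back to the user
        perform(action)                              # 3. act on the GUI
        history.append(action)
    return "step_limit", history


def demo() -> Tuple[str, List[str]]:
    """Run the loop with a scripted 'model' and a user who declines."""
    scripted = iter([
        Action("click", "search box"),
        Action("type", "table for two"),
        Action("type", "account password", sensitive=True),
        Action("done"),
    ])
    status, history = run_agent_loop(
        take_screenshot=lambda: "fake-screenshot",
        choose_action=lambda shot, hist: next(scripted),
        perform=lambda action: None,   # a real agent would drive the mouse/keyboard
        confirm=lambda action: False,  # the user declines the sensitive step
    )
    return status, [a.kind for a in history]


if __name__ == "__main__":
    print(demo())
```

In this sketch the agent stops and returns control as soon as the user declines a sensitive step. A real system would presumably let the user complete that step and then resume.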


Operator can complete tasks such as ordering food or booking a hotel, OpenAI says. Image: OpenAI

Progress for agentic AI

While advanced users have already adapted ChatGPT to perform certain agentic tasks, CUA and Operator are being billed as some of OpenAI’s first agentic products.

The company said CUA has established a “new state-of-the-art” in computer and browser use benchmarks – scoring 38.1 per cent for computer use testing benchmark OSWorld (compared to the average 72.4 per cent scored by humans), and 58.1 per cent for web testing benchmark WebArena (compared to 78.2 per cent for humans).

“[There is] still a way to go,” said OpenAI.

Experts expect 2025 to bring major developments in agentic AI, with research and consulting firm Gartner anticipating some five billion connected products with the potential to behave as customers by the end of the year.

A recent report from digital operations firm PagerDuty found 46 per cent of Asia Pacific and Japan respondents viewed agentic AI as “core to the future of IT operations”, while survey findings from enterprise AI company Pega found some 57 per cent of employees were open to using AI agents at work.

OpenAI’s Operator arrives just days after Altman appeared alongside returning US President Donald Trump at the White House to announce a multi-billion dollar private AI investment venture called Stargate.

That announcement coincided with the rollback of former president Joe Biden’s 2023 executive order mandating safety measures for AI development.