OpenAI Releases AI Agent "Operator"
OpenAI has introduced an AI agent called Operator, designed to carry out tasks on the web on the user's behalf. Operator can browse web pages and interact with them by typing text, scrolling, and clicking buttons.
This new tool can handle a range of repetitive basic tasks like filling out forms, ordering groceries, or booking hotels. According to OpenAI's announcement, "The ability to use the same interfaces and tools that people interact with daily expands the scope of AI application, helping to save time on everyday tasks and opening up new opportunities for business interaction."

Features of Operator
Operator is powered by a new AI model called Computer-Using Agent (CUA). The model combines GPT-4o's vision capabilities with advanced reasoning developed through reinforcement learning. The agent perceives the screen through screenshots and acts through mouse and keyboard input, as a person would.
The CUA model is trained to ask the user for confirmation before finalizing tasks such as making a hotel reservation or sending an email. Operator is currently available as a research preview and will be developed further based on user feedback.
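The loop described above (look at a screenshot, choose an action, ask before anything irreversible) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenAI's implementation: `Action`, `run_agent`, `scripted_policy`, and the `sensitive` flag are hypothetical names invented here, and the model is replaced by a scripted stand-in.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """Hypothetical action record; not part of any real Operator API."""
    kind: str                 # "click", "type", "scroll", or "done"
    detail: str = ""          # e.g. the button label or the text to type
    sensitive: bool = False   # True for steps like booking or sending email


def run_agent(propose: Callable[[str], Action],
              take_screenshot: Callable[[], str],
              execute: Callable[[Action], None],
              confirm: Callable[[Action], bool],
              max_steps: int = 20) -> List[Action]:
    """Run a perceive-decide-act loop and return the actions performed."""
    performed: List[Action] = []
    for _ in range(max_steps):
        screen = take_screenshot()      # perceive: capture the current screen
        action = propose(screen)        # decide: the model picks the next step
        if action.kind == "done":
            break
        # Hand control back to the user before finalizing a sensitive step.
        if action.sensitive and not confirm(action):
            break
        execute(action)                 # act: emulate mouse or keyboard input
        performed.append(action)
    return performed


def scripted_policy(actions: List[Action]) -> Callable[[str], Action]:
    """Stand-in for the model: replays a fixed plan, then reports 'done'."""
    remaining = iter(actions)

    def propose(_screen: str) -> Action:
        return next(remaining, Action("done"))

    return propose
```

In this sketch, a plan that ends with a sensitive "Book now" click stops before that click whenever `confirm` returns `False`, which mirrors the confirmation behavior the article describes.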

Availability and Future Plans
Operator is currently available to ChatGPT Pro subscribers in the US (the Pro plan costs $200 per month) through a dedicated website. OpenAI plans to expand access to a wider audience. The agent is not flawless at this stage, but it prompts the user to intervene if it runs into difficulties during a task.

For comparison, in October 2024 the AI startup Anthropic unveiled an updated version of its Claude 3.5 Sonnet model that can interact with a computer much as a person does, moving the cursor, clicking buttons, and typing text.