When OpenAI first shipped Operator in August 2025, the demo was impressive but the reality was underwhelming. The browser agent could book flights and fill out forms, but one wrong click and it was lost. Eight months later, the picture is markedly different.

From Demo Theater to Real Work

The shift started around month three. OpenAI rolled out a series of context-window expansions and a new “confirmation checkpoint” system that let Operator pause and ask before taking irreversible actions. Users went from watching it succeed in controlled demos to trusting it with actual work tasks: drafting outreach emails, compiling research summaries, filling out procurement forms.
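To make the checkpoint idea concrete, here is a minimal sketch of how an agent loop might pause before irreversible steps. It is not OpenAI's implementation: the `Action` class, its `irreversible` flag, and the `run_with_checkpoints` function are hypothetical names invented for illustration, and a real system would need some way to decide which actions count as irreversible.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    description: str
    irreversible: bool  # hypothetical flag; set by the planner in this sketch


def run_with_checkpoints(
    actions: List[Action], confirm: Callable[[Action], bool]
) -> List[str]:
    """Walk the plan in order, asking `confirm` before any irreversible action."""
    log = []
    for action in actions:
        if action.irreversible and not confirm(action):
            # The user declined, so the step is skipped rather than executed.
            log.append(f"skipped: {action.description}")
            continue
        # Placeholder for actually performing the step.
        log.append(f"done: {action.description}")
    return log


if __name__ == "__main__":
    plan = [
        Action("open vendor portal", irreversible=False),
        Action("submit purchase order", irreversible=True),
    ]
    # A confirm callback that always declines, standing in for a cautious user.
    print(run_with_checkpoints(plan, confirm=lambda a: False))
    # prints ['done: open vendor portal', 'skipped: submit purchase order']
```

The design choice worth noting is that the confirmation step is a callback: the same loop can prompt a person, consult a policy, or auto-approve in a test harness.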

Independent benchmarks tell the story. When.agent, a third-party AI testing platform, tracked 500 complex multi-step tasks across Operator, Anthropic’s Claude Computer Use, and Google’s Gemini Ultra in January 2026. Operator completed 78% of them successfully, up from 31% at launch. That’s not perfect, but it arguably crosses the threshold for real adoption.

The Competitor Landscape Sharpens

OpenAI didn’t have the agent space to itself for long. Anthropic’s Computer Use feature, baked directly into Claude’s API, hit general availability in October 2025 with a different approach — instead of controlling a browser, it speaks directly to web APIs. The result is faster execution and fewer visual hallucinations, but narrower use cases.

Google went wider. Gemini Ultra’s “Deep Research” mode, expanded in March 2026, can now autonomously navigate paywalled sites, extract tables from PDFs, and compile findings into structured reports. It’s the closest thing yet to a junior researcher you don’t have to manage.

Microsoft, notably absent from the first wave, entered the race with an Azure AI Agents platform in February 2026. Built on top of GPT-4o and integrated directly into Microsoft 365, it can draft emails, update spreadsheets, and schedule meetings by talking to your existing calendar and files. The enterprise pitch is straightforward: you already pay for Microsoft 365, and this makes it smarter.

What Enterprise Adoption Actually Looks Like

The early enterprise data is mixed in ways that should temper both optimism and cynicism. A March 2026 survey by workplace research firm Fishbowl, covering 340 companies that use AI agents, found that 61% had deployed at least one agentic system to production. Of those, 44% reported measurable productivity gains, but only 19% said the gains were “significant.”

The gap between deployment and impact comes down to trust and workflow integration. Companies seeing the best results aren’t just plugging in agents — they’re redesigning processes around what AI does well: repetitive multi-step tasks, document synthesis, and data entry. Companies seeing the worst results are trying to bolt AI onto existing workflows unchanged and expecting magic.

The Next Frontier: Memory and Continuity

The biggest unlock in 2026 isn’t raw capability; it’s memory. Early agents started each session fresh. Newer systems can maintain context across days and weeks, learning user preferences, remembering prior decisions, and picking up where they left off. OpenAI’s Memory API, rolled out quietly in January 2026, lets any developer give agents persistent context.
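As a rough illustration of what “persistent context” means in practice, the sketch below keeps a small key-value memory in a JSON file so that a later session can read what an earlier one stored. It does not use or imitate OpenAI’s Memory API; the `SessionMemory` class and its `remember` and `recall` methods are hypothetical, and production systems would add access control, expiry, and retrieval logic.

```python
import json
import tempfile
from pathlib import Path


class SessionMemory:
    """Hypothetical file-backed memory shared across agent sessions."""

    def __init__(self, path):
        self.path = Path(path)
        # Load whatever a previous session saved, or start empty.
        if self.path.exists():
            self.data = json.loads(self.path.read_text())
        else:
            self.data = {}

    def remember(self, key, value):
        # Persist on every write so a crash does not lose the preference.
        self.data[key] = value
        self.path.write_text(json.dumps(self.data))

    def recall(self, key, default=None):
        return self.data.get(key, default)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "memory.json"

        # Session one: the agent learns a user preference.
        SessionMemory(store).remember("email_tone", "formal")

        # Session two: a fresh object picks up where the last one left off.
        print(SessionMemory(store).recall("email_tone"))  # prints formal
```

The second `SessionMemory` object shares no in-memory state with the first, which is the point: continuity comes from the stored file, not from a long-running process.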

For individual professionals, this means your AI assistant gradually becomes genuinely tailored to how you work. For enterprises, it means agents can manage long-running projects without constant human handoffs. The terrible-twos phase is ending. The useful-adult phase is beginning: unevenly, imperfectly, but unmistakably.