
From Chatting to Doing: What Actually Shifted in May
In May, OpenAI, Anthropic, Google, and Cursor showed where agents are heading: moving away from chat panels and closer to real work in a managed environment.

Codex, Claude, Gemini, and Cursor delivered a consistent message in May: agents are moving from chat to production. The focus is no longer a better model, but the environment, governance, and integration with work systems. For companies, this isn't a signal to buy another chat window. It's a signal to rebuild their operating model for AI.
The Signal from May: Agents Are Going into Production
The biggest shift in May 2026 didn't come as a single major launch. It arrived in the form of four smaller, yet strikingly aligned announcements from different vendors—all pointing in the same direction.
OpenAI moved Codex into managed remote environments and detailed how to run it securely. Anthropic raised usage limits for Claude Code and highlighted partnerships with PwC and KPMG. Google introduced Managed Agents in the Gemini API at I/O 2026. Cursor added Jira integration, a multi-repo workflow, and Composer 2.5.
Four different vendors moving in the exact same direction at once is no coincidence. It's a symptom of a maturing market: agents are moving past the demo stage and becoming an operational layer.
OpenAI Codex: From the IDE to a Managed Environment
In May, OpenAI pushed Codex beyond the editor. The mobile app now allows users to monitor ongoing work, return to active threads, approve commands, and change direction without having to sit at a computer.
More significant than the mobile app, though, is support for managed remote environments. There, Codex can operate directly alongside the customer's infrastructure, repositories, and internal data, without sending context to a public cloud.
At the same time, OpenAI detailed how it runs Codex safely: isolating high-risk actions, setting clear boundaries, requiring approvals, and collecting telemetry. This is exactly the kind of documentation that convinces enterprise customers. What matters is not the features themselves, but how they are governed.
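To make the governance pattern above more concrete, here is a minimal Python sketch of an approval gate. It is not OpenAI's implementation or any Codex API. The `CommandGate` class, the `HIGH_RISK_PREFIXES` list, and the `approver` callback are all hypothetical names chosen for illustration. The idea is only that high-risk commands wait for a human decision and that every decision is recorded.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical list of command prefixes a team might treat as high risk.
# This is an assumption for illustration, not taken from any vendor documentation.
HIGH_RISK_PREFIXES = ("rm ", "git push", "kubectl delete")


@dataclass
class CommandGate:
    """Lets low-risk commands through and routes high-risk ones to a human."""

    approver: Callable[[str], bool]  # hypothetical human-in-the-loop callback
    telemetry: List[Dict[str, object]] = field(default_factory=list)

    def is_high_risk(self, command: str) -> bool:
        # str.startswith accepts a tuple and matches any of its entries.
        return command.strip().startswith(HIGH_RISK_PREFIXES)

    def request(self, command: str) -> bool:
        """Return True if the command may run; record every decision."""
        high_risk = self.is_high_risk(command)
        allowed = self.approver(command) if high_risk else True
        self.telemetry.append(
            {"command": command, "high_risk": high_risk, "allowed": allowed}
        )
        return allowed
```

In a real setup the `approver` would be backed by something like a mobile notification or a review queue, and the telemetry would go to a proper log store rather than an in-memory list.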
Claude: More Capacity, a Larger Organizational Role
In May, Anthropic raised the limits for Claude Code and the Claude API. In practice, this means fewer interruptions during longer workflows and more room for autonomous work without teams hitting ceilings as quickly as before.
But the partnerships speak louder than the limits themselves. PwC and KPMG are integrating Claude into their corporate processes, internal systems, and roles with direct accountability for results. This isn't a pilot project in a sandbox environment; it's a deployment with real-world consequences.
This marks a significant shift in the risk appetite of large organizations. For technical teams, the message is clear: capacity, access controls, and governance are now far more important conversations than comparing benchmarks.
Gemini and Cursor: Agents Directly in the Workflow
Google introduced Managed Agents in the Gemini API at I/O 2026. A single API call spins up an agent that uses tools, works within an isolated Linux environment, and maintains state across steps. This feels closer to production automation than traditional prompting.
Google is also bringing agentic behavior directly into the Gemini app, promising proactive and continuous assistance. Less clicking, more steps handled entirely by the agent.
Cursor took a more concrete path. The Jira integration, Composer 2.5, multi-repo setup, and cloud agent environments aren't just visionary features. They solve one specific problem: how a task moves from a ticket, through an agent, into a PR, and how that entire process connects to an existing delivery system.
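One way to picture that ticket-to-PR path is as a small state machine. The Python sketch below is purely illustrative: it does not use the Cursor or Jira APIs, and the `Stage` names, the `ALLOWED` transition table, and the `WorkItem` class are assumptions. The human review stage in particular is our addition, not something the announcements specify.

```python
from enum import Enum
from typing import Dict, List, Set


class Stage(Enum):
    TICKET = "ticket"
    AGENT_RUN = "agent_run"
    PULL_REQUEST = "pull_request"
    HUMAN_REVIEW = "human_review"  # assumed stage, not named in the source
    MERGED = "merged"


# Hypothetical transition table: which stage may follow which.
ALLOWED: Dict[Stage, Set[Stage]] = {
    Stage.TICKET: {Stage.AGENT_RUN},
    Stage.AGENT_RUN: {Stage.PULL_REQUEST, Stage.TICKET},   # agent may hand the ticket back
    Stage.PULL_REQUEST: {Stage.HUMAN_REVIEW},
    Stage.HUMAN_REVIEW: {Stage.MERGED, Stage.AGENT_RUN},   # reviewer may request rework
    Stage.MERGED: set(),
}


class WorkItem:
    """Tracks one task and keeps a history of the stages it passed through."""

    def __init__(self, ticket_id: str) -> None:
        self.ticket_id = ticket_id
        self.stage = Stage.TICKET
        self.history: List[Stage] = [Stage.TICKET]

    def advance(self, next_stage: Stage) -> None:
        # Reject any jump the transition table does not allow.
        if next_stage not in ALLOWED[self.stage]:
            raise ValueError(
                f"{self.ticket_id}: cannot go from {self.stage.value} to {next_stage.value}"
            )
        self.stage = next_stage
        self.history.append(next_stage)
```

The `history` list doubles as a simple audit trail: it shows, for each ticket, whether the work went straight through or bounced back for rework.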
What This Means for Your Team
The takeaway from the May releases is clear: agents deliver real value only when they have a well-defined scope, an audit trail, and a human who remains accountable for the outcome.
If a team is introducing coding agents, it makes sense to start small: test coverage, bug fixes, updating documentation, or preparing PRs. For each step, it's worth tracking lead time, the number of reworks, and how often senior intervention is needed. The data will tell a better story than any convincing demo.
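As a starting point, the three measures named above can be computed from a handful of task records. The Python sketch below assumes a hypothetical `AgentTask` record with open and merge timestamps, a rework count, and a flag for senior intervention. The field names and the choice of median for lead time are our assumptions, not a prescribed method.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Dict, List


@dataclass
class AgentTask:
    """Hypothetical record for one agent-assisted task."""

    opened_at: datetime
    merged_at: datetime
    rework_rounds: int
    senior_intervened: bool


def summarize(tasks: List[AgentTask]) -> Dict[str, float]:
    """Compute lead time, rework, and intervention figures for a set of tasks."""
    if not tasks:
        raise ValueError("need at least one task to summarize")
    lead_hours = [
        (t.merged_at - t.opened_at).total_seconds() / 3600 for t in tasks
    ]
    return {
        # Median is less sensitive to a single stuck task than the mean.
        "median_lead_time_hours": median(lead_hours),
        "avg_rework_rounds": sum(t.rework_rounds for t in tasks) / len(tasks),
        "senior_intervention_rate": sum(t.senior_intervened for t in tasks) / len(tasks),
    }
```

Comparing these figures before and after introducing an agent, on the same kind of task, gives a rough but honest picture of whether delivery actually got faster.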
The TameTeq perspective is straightforward: value doesn't come from adding another chat window. It comes when an agent fits seamlessly into existing operations, respects the rules, and genuinely accelerates delivery—without introducing unnecessary risk.
"It's not about a better model. It's about a better operating model for AI agents."