Every piece of software these days has some "agent" feature now, or tries to sell you one. Agentic workflows, agentic automation, agentic AI for the enterprise.
Since Claude 4.6 and (at last) GPT 5.5, we have to acknowledge it: the technology for AI agents in production is mostly ready.
But most organizations aren't. At least not the ones I've seen. The gap between "we're doing agents" and "we actually have agents running in production" is huge.
Here’s why that happens, and how to figure out whether you’re ready for an AI agent.
What's an AI Agent, again?
Calling everything that has an LLM in it "agentic" isn't helpful. Most of these systems should be built as AI workflows – which are 10x easier to bring to production and keep there. But sometimes, aiming for an actual AI agent makes sense.
So what is an AI agent, really? For LLM-era agents in a business context, three things matter:
A goal – By definition, an agent tries to achieve a certain outcome. Not a task, not a step-by-step instruction. A goal. "Resolve this support ticket" is a goal. "Classify this email and route it to team B" is a workflow with two steps. The goal is what gives the agent direction.
Tools – In order to achieve that goal, the agent has to be able to interact with the world around it. Systems it can act on. A database it can query, a CRM it can update, an API it can call. In a business context, an agent without tools can't do anything useful. It's just a chatbot that thinks out loud. Tools are what give the agent reach.
Autonomy – This is probably the most important one. The agent decides HOW to reach the goal. Which tools to use, in what order, how many times. If you've defined the steps, it's a workflow, even if an LLM executes each step. Autonomy is what makes it an agent and not a workflow.
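The split between the three ingredients can be sketched in a few lines. This is a toy sketch, not a framework: `choose_next_action` stands in for the LLM call, and the tool names are hypothetical stubs. The point is where goal, tools, and autonomy each live in the loop.

```python
# Toy agent loop: goal + tools + autonomy. All names are illustrative.

def lookup_order(order_id: str) -> str:
    """Hypothetical stub for a real order-system API."""
    return f"order {order_id}: shipped"

def send_reply(text: str) -> str:
    """Hypothetical stub for a real messaging API."""
    return f"sent: {text}"

# Tools: the agent's "reach" - the only systems it can touch.
TOOLS = {"lookup_order": lookup_order, "send_reply": send_reply}

def choose_next_action(goal: str, history: list) -> tuple:
    # Stand-in for the model. In a real agent, the LLM decides WHICH
    # tool to call next (autonomy); here it's hard-coded so the sketch runs.
    if not history:
        return ("lookup_order", "A-17")
    if len(history) == 1:
        return ("send_reply", f"Good news: {history[0]}")
    return ("done", None)

def run_agent(goal: str, max_steps: int = 10) -> list:
    """Pursue the goal until the model says 'done' or the step budget runs out."""
    history = []
    for _ in range(max_steps):  # bounded autonomy: a hard step limit
        tool, arg = choose_next_action(goal, history)
        if tool == "done":
            break
        history.append(TOOLS[tool](arg))
    return history

result = run_agent("Resolve ticket T-42: customer asks where order A-17 is")
```

The key part is the loop: the code fixes the goal and the toolset, but not the sequence of calls. Hard-code that sequence and you're back to a workflow, even if an LLM executes each step.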
In practice, though, things get murkier. A lot of what's being called an "agent" isn't agentic at all. And even the ones that meet the criteria above are still heavily constrained. A recent study by researchers from Berkeley and Stanford found that ~70% of production agents rely on prompting off-the-shelf models and 68% execute ten or fewer steps before human intervention. These are tightly scoped, heavily supervised systems – closer to smart workflows than autonomous agents.
And that's fine! Workflows are great. Most AI agents probably should have been workflows.
But calling everything an agent creates wrong expectations — especially about what your organization needs to support one.
The 3 Requirements for AI Agents in Production
Now that we've covered what an agent is (and assuming you're sure you need one), let's move to what it takes for your org to actually run one.
Because an AI agent will make wrong calls. That's by design. Autonomy means it can choose badly. The question is whether your organization can handle the failure. That comes down to three things:
1) Traceability – Can you see what it did? When the agent makes a tool call, can you reconstruct which tools it used, what data it pulled, what reasoning drove the decision, and which policy applied? If you can't trace the chain from input to action, you can't debug, you can't improve, and you can't explain to a customer or regulator what happened. Most organizations don't have this for their human processes either, which is exactly why deploying an agent makes the gap visible.
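As a rough sketch of what "trace the chain" can mean in practice (the names and fields here are illustrative, not a real observability API): wrap every tool so each call records its input, output, and the stated reasoning behind it.

```python
import time

TRACE = []  # in practice this would ship to your logging / observability stack

def traced(tool_name, fn):
    """Wrap a tool so every call is reconstructible after the fact."""
    def wrapper(*args, reasoning=None, **kwargs):
        result = fn(*args, **kwargs)
        TRACE.append({
            "ts": time.time(),
            "tool": tool_name,
            "input": {"args": args, "kwargs": kwargs},
            "reasoning": reasoning,   # why the agent chose this call
            "output": result,
        })
        return result
    return wrapper

# Hypothetical CRM tool, wrapped:
update_crm = traced(
    "update_crm",
    lambda customer, field, value: f"{customer}.{field}={value}",
)

update_crm("cust-9", "status", "churn-risk",
           reasoning="3 unanswered tickets in 14 days")
```

With that in place, "what did the agent do and why?" becomes a query over `TRACE` instead of an archaeology project.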
2) Accountability – Who owns the outcome? When the agent makes a wrong call, who catches it? Who fixes it? Who improves the rules? If the answer is "nobody" or "IT, I guess," there's no path to production. Agents don't replace accountability. They require more of it, because decisions happen faster and at higher volume. Someone — a human, with a name and a role — needs to own the results. McKinsey's 2025 State of AI survey found that AI high performers are about 3x more likely to have senior leaders who demonstrate clear ownership of AI initiatives. That's not a coincidence.
3) Recoverability – Can you correct or tolerate the outcome? Some actions are reversible: you can delete a draft, undo a CRM update, re-route a ticket. Some aren't: you can't un-send a contract, un-publish a press release, or un-wire a payment. For irreversible actions, the question becomes: can you catch the action before it fires (a human-in-the-loop gate), or can you absorb the cost if it goes wrong? If neither is true, the agent shouldn't be allowed to take that action at all.
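One minimal way to express that gate in code, with hypothetical tool names: reversible actions pass straight through, while irreversible ones are held unless an accountable human approves.

```python
# Actions the agent can never take unsupervised (illustrative names):
IRREVERSIBLE = {"wire_payment", "send_contract", "publish_release"}

def execute(tool_name, fn, arg, approver=None):
    """Run a tool, but gate irreversible actions behind a human approver.

    `approver` is a callable (tool_name, arg) -> bool representing the
    accountable human's decision; None means nobody reviewed it.
    """
    if tool_name in IRREVERSIBLE:
        if approver is None or not approver(tool_name, arg):
            return ("blocked", f"{tool_name} held for human review")
    return ("done", fn(arg))

# Reversible action passes straight through:
status, _ = execute("update_crm", lambda a: f"updated {a}", "cust-9")

# Irreversible action without approval is held:
held, msg = execute("wire_payment", lambda a: f"wired {a}", "$10,000")
```

The design choice worth noting: the block list lives in code, not in the prompt. An agent can be talked out of an instruction; it can't be talked out of a permission it doesn't have.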
These three stack as dependencies:
You can't recover from what you can't see.
Tracing is pointless if nobody acts on it.
Accountability is empty if the person responsible has no way to fix or contain the outcome.
Miss any one of the three and the whole thing breaks.
This is why we see so few agents in production: it isn't a pure tech issue. The route to stable production is iterative and controlled. Organizations that scale well don't eliminate human involvement – they make human oversight more selective, better orchestrated, and more accountable over time.
Most companies don't think about any of this until after the first bad outcome. By then, trust in the whole system is gone.

Checklist: Is My Organization Ready?
Let's make this concrete. Before deploying an AI agent into production, your organization needs to pass two tests.
The first test checks whether you actually have an agent (and not just a workflow in disguise):
1) Can you define the goal clearly? Bad: "Improve customer service." Good: "Resolve tier-1 support tickets with a resolution rate above 80% and escalate everything else."
2) Can you give it tools – and do you control what they do? Can you grant API access, permissions, and credentials so the agent can read from and write to real systems? Most organizations have never given a non-human actor write access to production systems.
3) Can you articulate the decision rules? Bad: "Here are 100 historic examples, figure it out." Good: "These are the 12 rules to follow when interacting with customers, and here's when to escalate."
The second test checks whether your org can actually run it:
4) Can you trace what it did, name who's responsible, and fix it if it's wrong? If you can't answer yes to all three parts, you're not ready for production — regardless of how good the demo looks.
To pass these tests, you need strong data foundations, accessible systems, integrated tooling, clear human-oversight rules, cross-functional collaboration, and workflows that are redesigned around AI. That's hard. It's why so few companies are there.
Getting to this point isn't a project. It's the final stage of a journey – one that's often best started with a simple AI assistant.
What to do next
First, ask yourself if you really need an AI agent at all. If no, build a workflow — it'll get you 80% of the value at a fraction of the complexity.
If yes, apply the checklist above. Can you define the goal, grant tool access, write the decision rules? Can you trace what it does, name who owns the outcome, and fix it when it's wrong?
Not ready? That's fine. Start with an assistant. Let your team learn how the AI responds. Document the decision rules as you go. Build the traceability and accountability infrastructure you'll need later.
At some point, the agents will come – and they’ll actually work.
See you next Saturday!
Tobias
PS: Before you debate whether you need an AI agent or an AI workflow, make sure you're solving the right problem. The Profitable AI Advantage helps you figure this out.
