AI Update Q2/2026

Earlier this week, I gave my regular Quarterly AI Briefing for clients. This time, we covered 8 topics in ~45 minutes – the things that happened that I found newsworthy enough to matter for your business.

As usual, there was a lot of noise. But this quarter, there was also (unusually) a lot of signal. We saw the first major frontier model get banned. The first real budget blowouts. And clear signs that “agentic AI” is moving from conference slides into core business systems.

Today, I’d like to pull a few highlights from that briefing – the three things I’d pay closest attention to if you run AI in a business.

Let’s dive in!

Q2 in 1 Minute

Here are the eight topics that made it into the briefing:

The Mythos Ban: the US government switched off the world's most capable AI model on a Friday afternoon
The Anthropic Quarter: now past OpenAI, and the industry's new talent magnet
The Agent Quarter: every big tech conference converged on one single message
The Agentic Ecosystem: big names are opening up to agentic AI
The ROI Reality: bills arrived, but value didn’t (shocker)
The Governance Gap: agents run in production, but the cockpit’s still missing
The Layoff Illusion: "We replaced people with AI" stopped impressing the stock market
Outlook: GPT-5.6 is ready, and guess who decides whether it ships

Let’s zoom in on three of these:

If you’d like to watch the full 45-minute version of the Q2/2026 AI Briefing, you can grab the recording as part of my Profitable AI Pass – a system designed to help you find (at least) one profitable AI opportunity in your business with the help of my AI Copilots. Available until Sunday, July 5.
→ More details here

1. The Mythos Ban

Back in April, Anthropic announced Project Glasswing. Their unreleased Mythos model scanned critical software for security flaws, available only to a handful of partners. It found over 10,000 high or critical vulnerabilities in the world's most important software.

So Anthropic decided to hold the model back (“Too dangerous to release”). Keep it private, work with trusted partners, and ship a safer version later.

That safer version eventually became Fable 5, released in June.

And this is where things started to get Kafkaesque.

On June 10, Dario Amodei published an essay, “Policy on the AI Exponential,” in which he literally wrote:

❝

It is time to go beyond transparency to more serious and binding regulation of AI. I believe the best analogy […] is to cars, airplanes, or drugs—powerful technologies essential to the modern economy, but capable of killing large numbers of people if designed or operated poorly.

Dario Amodei, Anthropic CEO

Two days later, on a Friday afternoon, the US Commerce Department issued export controls on Mythos and Fable 5. Anthropic’s best AI model effectively became unavailable worldwide across all platforms within a few hours.

Be careful what you wish for.

The interesting nuance here is that the US government didn’t ban Anthropic’s model per se. It only restricted access for non-US citizens. In practice, that led Anthropic to take the model down entirely, because how would you ever verify who is a US citizen and who isn't?

To be clear, this set a precedent that many had more or less predicted, but no one saw it coming this fast.

A few days ago, the export controls were lifted and Fable 5 came back globally on July 1.

Except it didn't.

The developer platform BridgeMind re-ran their coding benchmark on the restored model. Debugging performance dropped by almost 70%. Refactoring by nearly 50%. The new safety classifiers reroute anything that smells like security work to the older Opus model, and they trigger on routine tasks.

Image source: Wes Roth via X

If this quarter has shown one thing, it’s that Sovereign AI has stopped being a buzzword and become a hard requirement for enterprises running critical workflows with AI.

I believe the consequence is not that every company should now go out and buy their own GPU racks. But you should be able to answer the following question:

❝

Which of my workflows would survive losing their AI model tomorrow?

For chat and productivity tools, switching is annoying but doable. For the workflows embedded deep in your operations, swappable models or version-fixed open source models are now table stakes. I've written about the how in Build AI Fast, Then Own It Smart.
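To make “swappable” a bit more concrete, here is a minimal Python sketch. It assumes every model sits behind the same plain function signature (prompt in, text out). The names `hosted_frontier_model`, `pinned_open_model`, and `run_with_fallback` are hypothetical stand-ins for illustration, not real vendor APIs.

```python
from typing import Callable, Sequence

# A "model" here is just a function: prompt in, text out.
# Real providers would be wrapped behind this same signature.
Model = Callable[[str], str]


class ModelUnavailable(Exception):
    """Raised by a wrapper when its provider cannot serve the request."""


def hosted_frontier_model(prompt: str) -> str:
    # Hypothetical stand-in: simulates a provider that was switched off.
    raise ModelUnavailable("hosted model is not available")


def pinned_open_model(prompt: str) -> str:
    # Hypothetical stand-in for a version-fixed open model you run yourself.
    return f"[pinned-open-model] {prompt}"


def run_with_fallback(prompt: str, models: Sequence[Model]) -> str:
    """Try each model in order and return the first successful answer."""
    errors = []
    for model in models:
        try:
            return model(prompt)
        except ModelUnavailable as exc:
            errors.append(f"{model.__name__}: {exc}")
    raise RuntimeError("no model available: " + "; ".join(errors))


answer = run_with_fallback(
    "Summarize this support ticket",
    [hosted_frontier_model, pinned_open_model],
)
print(answer)
```

The point of the sketch is the shape, not the details: if the workflow only ever calls `run_with_fallback`, losing one provider means a quality change rather than an outage.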

2. The Agent Quarter

Q2 was also conference season. We saw Anthropic’s Code with Claude, Google I/O, Microsoft Build, and NVIDIA GTC Taiwan. Hundreds of announcements that condense into one sentence, and I think Satya Nadella said it most clearly:

❝

AI is no longer about responding to a prompt, it is about running the work.

The receipts:

Google made AI Mode the default search experience worldwide and put Gemini Spark to work as a 24/7 agent inside Gmail.
Microsoft shipped Agent Framework 1.0, a native runtime for agents on Windows, and repositioned Copilot from pair programmer to peer programmer.
Anthropic introduced Managed Agents: you define the outcome, their servers run the agents, even overnight. They call it “decoupling the brain from the hands.”
And NVIDIA expects $1 trillion in compute demand through 2027, already sold out on GPUs.

Microsoft finally embracing the term “Autopilots” as well.

The result: consumer, enterprise, and infrastructure players are all betting on the same thesis. We’re moving full-steam into the era of agentic AI.

Now here's the slide none of the keynotes showed: what happens when AI agents actually run in the real world for a longer period of time.

Two interesting experiments from this quarter:

Researchers put Google's Gemini in charge of a real café in Stockholm. The first days looked great, but then it lost $6,000 by handing out 99% discounts and hosting lavish events.
Another group of researchers set up five virtual societies of ten AI agents each and let them run for 15 days with no governance or oversight. Results: Grok's society went extinct in 4 days. Gemini committed 683 crimes. GPT-5 agents failed to take basic survival actions. And Claude sat down and wrote constitutions, until bad actors convinced it to go rogue. This one hit a nerve. The LinkedIn post quickly accumulated 100,000+ views and many comments from people sharing similar experiences in non-simulated environments:

Researchers had AI models simulate a society and run it for days.

Result:
- In Grok’s world everyone died after ~4 days
- Gemini committed 683 crimes
- Claude spent its time writing constitutions (until it met bad actors that convinced it to go rogue)

LinkedIn - Tobias Zwingmann

What I wanted to show with this post is that AI benchmarks typically evaluate one-off tasks: “Do this job, I'll check the result.” But production is different. Deployed agents make repeated decisions, react to feedback, face manipulation, and pursue a goal over a long horizon. That's where small mistakes compound into drift nobody designed for.

So yes, the direction is set, whether we like it or not. But there's a whole staircase between an assistant and an autonomous agent, and I still prefer the middle step: when the workflow is known, an Autopilot built in n8n beats a fancy AI agent. And where agents do go live, governance belongs in the design from day one, with performance watched all the time. Not just the first week after go-live.

3. The ROI Reality

One story made the rounds this quarter: a company that reportedly spent $500 million in a single month on Claude tokens. Because nobody set usage limits.

I couldn't verify it because there’s no named company, and no source. But it fits what I see in consulting work: companies spend first and measure second.

The verified stories are telling enough:

Microsoft canceled internal Claude Code licenses over costs.
Uber burned through its entire 2026 token budget in four months.
ServiceNow reported the same.

And what’s worse than the spending itself is that none of them can draw a clear line from spend to gains. More tokens do not mean more productivity. (Picture employees using AI to check the weather every day.)

This is nothing new: Automating a task is not the same as creating value. What typically gets automated is the work people dislike, not the work that matters. The agent layer is cheap to buy and expensive to run, and the vendors above are making the buying part easier every quarter. The value part stays your job.

Which brings back the two questions I ask before any engineered AI use case:

What's the minimum value this needs to deliver?
What's the maximum I'll spend to find out?

If a use case can't answer both, it's a use case without a case. Kill it. There are no bonus points for running the most agents.

Surging AI costs are a control problem more than a cost problem. You can switch a workflow off. If you designed it so you can. (See my Cost Cap Model)
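As a rough illustration of “designed so you can switch it off,” here is a small Python sketch of a spend guard built around the two questions above. It is not the Cost Cap Model itself, only a simplified assumption of how such a guard could look. The class `UseCase`, its fields, and the dollar figures are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UseCase:
    """Hypothetical guard: names, fields, and numbers are illustrative only."""
    name: str
    min_value_usd: Optional[float] = None  # question 1: minimum value to deliver
    max_spend_usd: Optional[float] = None  # question 2: maximum spend to find out
    spent_usd: float = 0.0
    enabled: bool = True

    def has_a_case(self) -> bool:
        """True only if both questions have been answered."""
        return self.min_value_usd is not None and self.max_spend_usd is not None

    def record_spend(self, amount_usd: float) -> None:
        """Book a cost and switch the workflow off if there is no case
        or the spend cap has been reached."""
        self.spent_usd += amount_usd
        if not self.has_a_case() or self.spent_usd >= self.max_spend_usd:
            self.enabled = False


triage = UseCase("ticket triage", min_value_usd=5000, max_spend_usd=1000)
for cost in [400, 400, 400, 400]:
    if not triage.enabled:
        break  # the workflow was switched off, so stop calling the model
    triage.record_spend(cost)

print(triage.enabled, triage.spent_usd)
```

In this run the third cost item pushes spend past the cap, so the guard disables the workflow and the fourth call never happens. The cap is checked on every booking, not at month-end, which is the whole idea.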

Looking Ahead

OpenAI announced GPT-5.6 a few days ago – and guess who they asked first before shipping it? Right. We’re still waiting on general approval. Until then, only a “selected group of partners” has access.

I guess the days when frontier AI labs couldn’t ship new models fast enough are over. We’ll see.

I’ll report back in Q3.

Until then…

The Full Briefing

This was 3 of 8 topics. The recording covers the rest: why this was the Anthropic Quarter, what Salesforce and SAP just signaled about the agentic ecosystem, the governance gap in enterprise AI, and how the layoff illusion burst.

The full recording and all slides are included in the Profitable AI Pass, available only through this weekend (July 5).

Get Your Profitable AI Pass Today

See you next Saturday!
Tobias