Earlier this week I gave a Quarterly AI Briefing for my clients.

It’s a new recurring format: at the end of each quarter, I recap the major events in AI, what’s working, and what you should stop doing.

In this article, I want to share three key insights from this session.

Let’s dive in.

Models aren’t the key differentiator anymore

Q1 2026 set a record in the sheer number of new AI models that were released. In total, we saw over 250 new model releases between January and March, roughly 3 per day. The big labs alone shipped over 15 new models. Seems like ages ago, but Claude Opus 4.6 was just released in February (Opus 4.7 launched this week). It’s a breakneck speed that’s hard to keep up with.

But the good news is you don’t have to.

Because frontier-level performance is very similar across the board. Gemini 3.1 and GPT 5.1 lead by a small margin. Maybe they just gamed the benchmarks best. What’s clear is that all of the big labs are competing in a similar category. Some models are slightly ahead of others in specific areas: Gemini thrives on multimodal and unstructured data; OpenAI and Anthropic excel at coding. On average, they’re all very similar.

Open source models are catching up, too. By the end of Q1/26, they deliver about 90% of frontier performance at around 2% of the cost, with Chinese models leading the pack.

From a business perspective, this means the prime factor for success is not which model you pick, but how you integrate it into your workflows – and at what cost you can run it at scale.

Well-engineered prompting, context management, and workflow design are where performance gains live now.

Still, it’s too early to pick a winner. Avoid single-provider lock-in. The opportunity cost of not being able to switch is real because of the huge market dynamics.
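One practical way to keep switching costs low is to put a thin abstraction between your business logic and any one vendor's SDK. Here's a minimal sketch in Python – the provider adapters and names are purely illustrative, not something from the briefing – showing the idea that swapping models should be a config change, not a rewrite:

```python
# Minimal sketch of a provider-agnostic chat interface.
# Business logic calls ask(); only the adapters know about vendor SDKs,
# so switching providers means editing the MODELS table, nothing else.
from dataclasses import dataclass
from typing import Callable

@dataclass
class ChatModel:
    name: str
    complete: Callable[[str], str]  # prompt -> response text

# Hypothetical adapters; in practice each would wrap a real vendor SDK call.
def provider_a(prompt: str) -> str:
    return f"[provider-a] {prompt}"

def provider_b(prompt: str) -> str:
    return f"[provider-b] {prompt}"

MODELS = {
    "default": ChatModel("provider-a", provider_a),
    "fallback": ChatModel("provider-b", provider_b),
}

def ask(prompt: str, route: str = "default") -> str:
    """Route a prompt through whichever provider the config names."""
    return MODELS[route].complete(prompt)

print(ask("Summarize Q1 model releases"))
```

The point isn't this exact interface; it's that none of your workflow code should import a vendor SDK directly.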

Two use cases drive ROI across the board

Everyone is still trying to find the golden goose with AI. The “game-changer”. So far, most value generation is still scattered around the last mile of process optimization. The nitty-gritty details that don’t make headlines, but silently create profits as they run.

Apart from that, two categories stand out.

The first is coding. Make no mistake, this is broader than just software development. Business operators who use tools like Claude Code or Codex to build internal tools, accelerate prototyping, or assemble personal workflows are winning. “Vibe coding” has officially reached escape velocity, allowing non-developers to build simple apps.

This doesn’t mean that software engineering is getting replaced overnight. It’s more that the demand for software will surge. As building software gets cheaper and more accessible, the total demand for software goes up, not down. It’s Jevons paradox at work.

The key for organizations right now is not to have hundreds of employees push their own vibe-coded apps to customers or production (unless you’re Anthropic). It’s that business people can simply describe what they want and AI builds a fast v1. Sometimes this v1 is already useful; sometimes it serves a more illustrative purpose. This is the new standard.

The second big area is customer support: chatbots that deflect tier-1 queries, auto-classify and route new tickets, and perform simple actions to resolve issues. This use case has the clearest path to ROI. An average customer interaction handled by a human costs up to $6.00; AI brings that cost down to around $0.50.
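To make the ROI math concrete, here's a back-of-the-envelope calculation using the per-interaction costs above ($6.00 human vs. $0.50 AI). The ticket volume and deflection rate are illustrative assumptions, not figures from the briefing:

```python
# Deflection savings sketch: per-interaction costs from the text,
# volume and deflection rate are assumed for illustration.
HUMAN_COST = 6.00  # $ per human-handled interaction
AI_COST = 0.50     # $ per AI-handled interaction

def annual_savings(tickets_per_year: int, deflection_rate: float) -> float:
    """Savings when a share of tier-1 tickets is handled by AI instead of humans."""
    deflected = tickets_per_year * deflection_rate
    return deflected * (HUMAN_COST - AI_COST)

# Example: 100,000 tickets/year with 40% deflected to AI
print(annual_savings(100_000, 0.40))  # 40,000 tickets * $5.50 = 220000.0
```

Even at modest deflection rates, the spread between $6.00 and $0.50 per interaction compounds quickly at volume.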

If you're looking for your next “no-brainer” AI use case, you should probably start here. But start simple. The 80% Fallacy still holds true, and running customer support chatbots in production is an ongoing journey.

AI doesn't reduce work, it intensifies it

This is one of the most important research insights from Q1 because it validated something that a lot of us heavy AI users are already feeling.

Harvard Business Review found that heavy use of AI doesn't free up capacity; it adds pressure. Pressure to do more. Pressure to use idle time. Pressure to perform. "Every time you're not using AI, it feels like a waste of time." That’s why people generally don't save 5 hours/week with ChatGPT – they use it to fill those hours with more output.

The problem is that more output does not necessarily lead to better business results. Enter “AI workslop”: AI-generated content that looks good but lacks substance. According to a BetterUp study, workslop costs an average 10,000-person business $9M per year and damages work culture: 30% of recipients viewed their colleagues as less trustworthy after receiving AI workslop from them.

If your team is already under pressure, simply adding AI into the mix won't help. More pressure to use AI leads to more AI slop and therefore worse business results.

These are the limits of productivity AI. To take steam off your employees, you don’t have to give them more tools – you have to re-engineer the way work is done.

For example, a customer support bot that resolves queries automatically reduces work. Simply giving your support agents access to ChatGPT will intensify it.

If you’re rolling out general-purpose assistant tools like ChatGPT and Copilot, you need enablement programs, not just licenses. People using ChatGPT can point you to a more structural solution (“this work shouldn’t even show up here”), but they can’t build it. That's where an Engineered AI solution comes in – and that's a different conversation entirely.

Looking forward to what the next 3 months will bring.

I’ll keep you posted.

See you next Saturday!
Tobias
