The Cost Cap Model

How to stop AI costs from eating your ROI

The day an AI project enters an Excel spreadsheet is usually the day it dies.

Here’s what I see too often: a team builds a scrappy AI prototype. It kinda works. People get excited. Then someone says the word: "Production".

Out come the spreadsheets.

Six tabs. 3-year forecasts. Assumptions nobody really believes — but everyone pretends to agree with. Leadership wants certainty, and the numbers get engineered until nothing means anything anymore. A few months later, the prototype ends up in the graveyard.

You can't skip budgeting altogether — but you also can’t scale AI with fantasy math. What you need instead is a Cost Cap.

Let's dive in!

Why AI Costs Need Guardrails

Today’s article is all about AI solutions that live in the Engineered AI solutions track. They very much feel like "just another IT project". But what I’ve learned from the last 50+ applied AI projects is that the costs of these projects behave nothing like traditional IT costs.

Take a look at this (extremely simplified) chart:

In most classical IT projects, the vast majority of cost is front-loaded. You have an idea of what to build, you scope it out, you bring in developers to build the thing. After that, ongoing costs are mainly "just" flat maintenance or driven by feature requests. Whether 1,000 or 100,000 people use your system typically doesn't have much impact on the running cost. That's why software is eating the world.

You build something, it runs, and costs are pretty much negligible unless you decide to enhance the software.

AI doesn't work that way. With AI solutions, costs grow even when nothing new is being added:

  • Model maintenance: Models degrade over time as data patterns shift. You need to retrain, adjust, and monitor.

  • Data drift: Your production data looks different every day. Fixing this costs money.

  • Usage-driven compute: AI requires more expensive hardware than classical IT systems. The more people use your AI, the more inference costs pile up. Success literally makes it more expensive.

If you're not careful, costs spiral rapidly out of control, erasing any ROI gains you thought you had.

That's why I always use a Cost Cap before bringing any AI solution into production.

How the Cost Cap Model Works

The Cost Cap model is dead simple. It's two numbers. That's it.

1. Value Threshold

This is the minimum value your AI use case must deliver in a given, recurring period.

If you've read my previous post on the $10K Threshold, you know the idea: Define a clear value threshold that makes an AI use case worth pursuing. The period depends on your ambition. $10K per quarter, per year, or per week – whatever fits your business context.

2. AI Cost Cap

This is the maximum running cost your AI solution is allowed to produce in that same period.

If your Value Threshold is $10K/quarter, your Cost Cap might be $3K/quarter. The exact ratio depends on your margins and risk tolerance — but the cap must always sit below the threshold.

The area between your Cost Cap and your Value Threshold is your profit zone. That's where you want to operate.

The Rule

Your AI use case stays alive as long as:

  • Delivered Value > Value Threshold, AND

  • Actual Costs < Cost Cap

If costs exceed the cap, or value drops below the threshold, you pause and reassess the case. It’s a clear line in the sand. No endless budget increases. No "let's just see how it goes".

This creates an economic corridor: minimum required impact on one side, maximum allowed cost on the other. Stay inside the corridor, and you're profitable. It’s that simple.
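
If you like to see rules as code, here is a minimal sketch of the corridor check. The function name in_corridor and the example figures for delivered value and actual cost are my own illustrative assumptions; only the $10K threshold and the $3K cap come from the text above.

```python
def in_corridor(delivered_value: float, actual_cost: float,
                value_threshold: float, cost_cap: float) -> bool:
    """Return True while the use case stays inside the economic corridor."""
    # The cap must always sit below the threshold, otherwise there is no profit zone.
    if cost_cap >= value_threshold:
        raise ValueError("Cost Cap must sit below the Value Threshold")
    # The Rule: Delivered Value > Value Threshold AND Actual Costs < Cost Cap
    return delivered_value > value_threshold and actual_cost < cost_cap


# Hypothetical quarters measured against a $10K threshold and a $3K cap
print(in_corridor(12_000, 2_500, 10_000, 3_000))  # value and cost both fine
print(in_corridor(12_000, 3_400, 10_000, 3_000))  # cost exceeds the cap
print(in_corridor(9_000, 2_500, 10_000, 3_000))   # value below the threshold
```

Only the first quarter passes. The other two would trigger a pause and a reassessment.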

Example

Let's make this concrete.

Imagine your team built an AI-powered document extraction tool for your finance department. It pulls data from invoices and feeds it into your ERP system. The prototype worked great, and now it's time for production.

Step 1: Define the Value Threshold

You estimate the tool saves ~15 hours of manual work per week across the team. At a blended cost of $50/hour, that's $750/week, or roughly $10K per quarter, assuming the freed-up time is put to good use elsewhere.

That's your Value Threshold: $10K/quarter.
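
The arithmetic behind that number, as a quick sketch. The 13 weeks per quarter is my assumption (52 weeks divided by 4); the hours and the hourly rate are the estimates from above.

```python
HOURS_SAVED_PER_WEEK = 15   # estimate from the example
BLENDED_RATE_USD = 50       # blended cost per hour from the example
WEEKS_PER_QUARTER = 13      # assumption: 52 weeks / 4 quarters

weekly_value = HOURS_SAVED_PER_WEEK * BLENDED_RATE_USD
quarterly_value = weekly_value * WEEKS_PER_QUARTER

print(f"Weekly value:    ${weekly_value:,}")     # $750
print(f"Quarterly value: ${quarterly_value:,}")  # $9,750
```

The exact result is $9,750, which rounds to the $10K threshold. Keep in mind that it's an estimate, so re-measure the hours saved once the tool is live.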

Step 2: Set the Cost Cap

You decide this system is allowed to cost at most $5K per quarter to run.

That includes:

  • inference

  • hosting

  • licenses

  • a buffer for surprises

This Cost Cap immediately rules out a lot of bad ideas, like buying a fancy enterprise SaaS that costs $150K/year in licensing fees alone. The Cost Cap is a forcing function: you design not for the flashiest AI solution, but for the one that's profitable. In other words: instead of building the "best" AI system, you build the one that saves you those 15 hours and runs for less than $20K per year.
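
Here is a small sketch of how the cap filters options. The $5K quarterly cap and the $150K/year license come from the example; the "lean in-house build" figure of $14K/year and the helper name fits_cap are hypothetical, purely for illustration.

```python
COST_CAP_PER_QUARTER = 5_000
ANNUAL_CAP = COST_CAP_PER_QUARTER * 4  # $20K per year

# Annual running cost per option in USD
options = {
    "enterprise SaaS (license fees alone)": 150_000,  # from the example
    "lean in-house build": 14_000,                    # hypothetical figure
}


def fits_cap(annual_running_cost: float) -> bool:
    """An option survives only if it runs below the annualized Cost Cap."""
    return annual_running_cost < ANNUAL_CAP


for name, cost in options.items():
    verdict = "fits the cap" if fits_cap(cost) else "ruled out"
    print(f"{name}: ${cost:,}/year -> {verdict}")
```

The point is not the exact numbers but the order of operations: set the cap first, then look at solutions.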

Step 3: Monitor the Corridor

Every quarter, you check two things:

  1. Are we still saving at least $10K worth of time?

  2. Are our running costs still under $5K?

If yes – keep going. If the value drops (maybe the workload shrinks, or processes change) or costs spike (usage explodes, or you need expensive model upgrades) — you pause and decide: optimize, scale back, or shut it down.
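
A rough sketch of what that quarterly check could look like. The QuarterReview class, the review function and all three quarters of data are invented for illustration; only the $10K threshold and the $5K cap come from the example.

```python
from dataclasses import dataclass

VALUE_THRESHOLD = 10_000  # minimum value per quarter
COST_CAP = 5_000          # maximum running cost per quarter


@dataclass
class QuarterReview:
    label: str
    value_delivered: float
    running_cost: float


def review(q: QuarterReview) -> str:
    """Answer the two quarterly questions and return a decision."""
    problems = []
    if q.value_delivered < VALUE_THRESHOLD:   # not saving at least $10K
        problems.append("value below threshold")
    if q.running_cost >= COST_CAP:            # not under $5K anymore
        problems.append("cost at or above cap")
    if not problems:
        return "keep going"
    return "pause and reassess: " + ", ".join(problems)


# Hypothetical history, made up for this sketch
history = [
    QuarterReview("Q1", 10_400, 3_900),
    QuarterReview("Q2", 10_100, 5_600),  # usage explodes
    QuarterReview("Q3", 8_200, 4_100),   # workload shrinks
]

for q in history:
    print(f"{q.label}: {review(q)}")
```

The output tells you which side of the corridor was breached, which is usually enough to decide between optimizing, scaling back, or shutting down.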

No heavy spreadsheet work required. Just two numbers and a simple rule.

What About Development Costs?

You might be wondering: What about all the money spent building the thing in the first place?

One-time development costs sit outside my Cost Cap model. The Cost Cap is about ongoing, running costs.

But here's the thing: if you're doing AI right, your upfront development costs for moving into production shouldn’t be that high anyway.

Why? Because you've already built something during the prototyping and pilot phases: things you can repurpose, like the core logic, the workflow design, the conceptual data pipelines, the AI model, and ideally the infrastructure. You shouldn’t be starting again from scratch.

Recycle what you built in earlier phases. That's how you keep one-time costs manageable and protect your economic corridor.

Check out my post "From Prototype to Production" if you want to go deeper on this.

Conclusion

The Cost Cap model gets you out of spreadsheet theater and back into execution. You track two real numbers, one corridor.

Here's your action item:

For your current AI use case (or the next one you're considering), answer two questions:

  1. What's the minimum value it needs to deliver per period? (Value Threshold)

  2. What's the maximum it's allowed to cost per period? (Cost Cap)

If it stays inside the corridor, you run with it.

If it doesn't, it's time for action. Either increase the delivered value (can you save even more hours or $$$?), or bring the costs down (think cheaper model, on-prem infrastructure, caching layers). Worst case: stop the project altogether.

Don't let complex Excel spreadsheets stop you.

See you next Friday,
Tobias
