Making LLMs (Really) Useful in the Real World

How “tool use” turns a text generator into a powerful business engine

Have you ever tried getting ChatGPT to calculate the total cost of a complex project with dozens of line items and variable discounts? If you have, you probably ended up with a confident-sounding answer that was completely wrong.

This isn't a random glitch - it's a fundamental limitation of how large language models work. But the good news is that these limitations are being solved through an approach called "tool use" - essentially giving LLMs the ability to call on specialized helpers when needed.

Today, I'll show you how this works in practice with a real-world example that could save you hours of frustration – and embarrassing customer interactions.

Let's dive in!


The Limitation of LLMs

Large Language Models are, at their core, just text generators. Despite all their impressive capabilities, they're fundamentally designed to predict what words should come next in a sequence.

This next-word prediction approach works surprisingly well for many tasks - writing emails, summarizing documents, generating creative content. But it has clear limitations, especially when it comes to:

  • Precise calculations: Numbers get treated as just another type of token, with no inherent mathematical meaning attached to them

  • Real-time information: LLMs can only work with the data they were trained on

  • External verification: They have no built-in way to check if what they're saying is actually true

  • Executing code: While LLMs are excellent at writing code, they can’t run it by themselves - they need a runtime for that.

This is why your AI assistant might confidently tell you that 34.1 × 91.5 equals 3,123.15 (it's actually 3,120.15) or claim a product launched in 2022 when it actually came out in 2023.

ChatGPT “doing math” by predicting tokens

Why This Matters For Your Business

These limitations aren't just academic concerns - they directly impact how useful AI can be in business contexts. If you always have to double-check the correctness of LLM responses, many use cases don't make sense economically.

So what can you do?

Rather than abandoning LLMs altogether, the solution is to let LLMs do what they do best (like handling a conversation), while delegating specialized tasks to purpose-built services.

This is where "tool use" comes in.

In essence, tool use enables an LLM to recognize when it needs help and trigger an API call to an external service. Think of it like a smart project manager who knows when to call in a specialist. The LLM handles the general reasoning and communication, while specialized tools handle specific technical tasks.

This "delegation paradigm" is transforming LLMs from a standalone technology into a coordinator of specialized services - making it vastly more useful for real-world applications.

Practical Deep-Dive: Spreadsheet Calculations

Let’s consider a real-world example:

Imagine we're the organizer of the up-and-coming Tech Innovate conference in San Francisco. (Any resemblance to actual events is purely coincidental!)

Image and event are completely made up. The use case isn't.

On our website, we have a chatbot so that people can easily get information about our upcoming events. 90% of the messages might just be basic questions like the date and time:

Our chatbot can answer these questions right away because we have provided all relevant event information in the prompt. (If there were much more, we would have given it access to a knowledge base using a RAG setup – which is a tool as well!)

So far so good.

But then a few (rare, but not unimportant) messages might look like this:

The answer here should be “hell yeah”, and thanks to the connected knowledge, the chatbot can assist with something like that:

Here’s where things break.

“Would you like a pricing quote for a specific booth type or size?” is where the frontier of our LLM’s capabilities ends. B2B pricing (and conference pricing in particular!) can be tricky, and it’s easy for the LLM to spit out a wrong calculation because it overlooked a dependency or just “predicted” the wrong token in an attempt to do basic arithmetic.

So… tool use to the rescue!

In this case our pricing lives in a simple spreadsheet like this:

There are different prices per booth type and also different business rules (like minimum stand size) we need to consider. To avoid errors we always want the calculation to be done by the spreadsheet, not the LLM!

The LLM should just fetch the inputs and communicate the results.
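To picture what the spreadsheet encodes, here’s a rough Python stand-in for its logic. All prices and the minimum size are invented example values - in our actual setup, this logic stays in the spreadsheet, and the LLM never computes any of it:

```python
# Illustrative stand-in for the spreadsheet's pricing logic.
# All prices and the minimum stand size are invented example values.
PRICE_PER_SQM = {"standard": 250, "premium": 400, "corner": 450}
MIN_SIZE_SQM = 12  # business rule: minimum stand size

def booth_quote(booth_type: str, size_sqm: float) -> float:
    """Return the total booth price, enforcing the minimum size rule."""
    if size_sqm < MIN_SIZE_SQM:
        raise ValueError(f"Minimum stand size is {MIN_SIZE_SQM} sqm")
    return PRICE_PER_SQM[booth_type] * size_sqm
```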

How can we do this?

Of course, we could try to upload our Excel file to Google Sheets, enable the Google Sheets API, define a custom schema, and deploy it as a microservice. If you think that sounds like a lot of work, you’re right. It’s possible, but it would be overkill here.

That's where tools like Grid.is come in. Grid.is is a fantastic service from a cool startup in Iceland (no pun intended) that exposes a spreadsheet as an API to an LLM.

Here’s how to set this up using a ChatGPT Custom GPT in 5 steps:

Step 1: Upload the spreadsheet to Grid (or connect it to Google Sheets)

Step 2: Select which cells should be exposed to the LLM

Step 3: Copy the auto-generated instructions to your LLM

Step 4: Create a new custom action

Step 5: Copy and paste your private API key as well as the provided schema

You’re done!
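Under the hood, the custom action is just an HTTP call that sends the input cells to the spreadsheet API and reads back the computed results. Here’s a rough sketch - note that the endpoint URL and payload shape here are hypothetical, not Grid’s actual API (the real schema is auto-generated for you in step 3):

```python
# Hypothetical sketch of the HTTP call behind the custom action.
# Endpoint URL and payload shape are illustrative, not Grid's real API.
import requests

API_KEY = "..."  # your private API key from step 5
WORKBOOK_URL = "https://api.grid.is/v1/workbooks/<workbook-id>/query"  # hypothetical

payload = {
    "apply": {"booth_type": "premium", "size_sqm": 20},  # input cells
    "read": ["total_price"],                             # output cells
}
response = requests.post(
    WORKBOOK_URL,
    json=payload,
    headers={"Authorization": f"Bearer {API_KEY}"},
)
print(response.json())  # the total, as computed by the spreadsheet itself
```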

To make it even easier for the LLM to understand my spreadsheet, I also added the following text to the prompt:

Now, when a customer inquires about the price, our chatbot will query our spreadsheet API to get the right quotation:

No guessing anymore – just 100% correct calculations! Your chatbot could now also handle complex financial models, ROI analyses, pricing strategies - essentially anything you'd normally do in a spreadsheet, but triggered through natural language.

There are also two nice side effects:

1) Thanks to the spreadsheet API, our pricing requests get validated automatically. For example, if a customer asks for less than the minimum space, it won’t calculate a quote:

2) The LLM can also help customers with more open-ended questions. That’s the real power! Instead of acting like a “smart form”, the LLM can help with more advanced questions like this one:

To sign off, our chatbot could now offer to collect the client’s email address, contact the support team, and ask them to send an official quote to the client’s address – sending an email would be another tool.

You can try out the chatbot demo here:

(And of course, this would all work outside of Custom GPTs too - you just need to know how to set it up and iterate quickly.)

You see, the options are endless!

Other Common Tool Use Cases

While calculation assistance is a perfect example of tool use, it's just one application in a growing ecosystem. Here are some other powerful ways LLMs are being extended through tools:

  • Knowledge retrieval: LLMs can query Wikipedia, company knowledge bases, or documentation before responding (typically RAG)

  • Real-time data access: Tools can pull current stock prices, weather forecasts, or other time-sensitive information from real-time APIs

  • Code execution: LLMs can write code, then actually run it and see the results before giving you the final solution

  • Search: Let the LLM do a (web) search to gather the necessary information.

The pattern is consistent across these examples: the LLM handles the conversation and reasoning, while delegating specialized tasks to purpose-built tools that excel at specific functions.
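In code, that pattern is a simple loop. Continuing the earlier sketch (and reusing the illustrative `client`, `tools`, `response`, and `booth_quote` names from above), the second half looks roughly like this:

```python
# Completing the tool-use loop from the earlier sketch: execute the
# requested tool, hand the result back, and let the model phrase the answer.
# Reuses `client`, `tools`, `response`, and `booth_quote` from above.
import json

message = response.choices[0].message
if message.tool_calls:
    call = message.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = booth_quote(**args)  # the specialist does the real work

    final = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "user", "content": "How much is a 20 sqm premium booth?"},
            message,  # the assistant's tool-call message
            {"role": "tool", "tool_call_id": call.id, "content": str(result)},
        ],
        tools=tools,
    )
    print(final.choices[0].message.content)
```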

The Growing Tool Ecosystem

Tool use is becoming a standardized approach in the AI industry. Anthropic recently launched the Model Context Protocol (MCP), which provides a unified interface for exposing tools to LLMs, so you don’t have to wire up every service manually (like we did in the example above).
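As a taste of what that looks like, here’s a minimal sketch of an MCP server exposing our illustrative pricing tool, using the official Python SDK (`pip install mcp`). The prices and minimum size are the same invented example values as above:

```python
# Minimal MCP server sketch using the official Python SDK.
# Prices and the minimum size are invented example values.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("conference-pricing")

@mcp.tool()
def booth_quote(booth_type: str, size_sqm: float) -> float:
    """Calculate a booth price quote (example values only)."""
    prices = {"standard": 250, "premium": 400, "corner": 450}
    if size_sqm < 12:
        raise ValueError("Minimum stand size is 12 sqm")
    return prices[booth_type] * size_sqm

if __name__ == "__main__":
    mcp.run()  # any MCP-capable client can now discover and call this tool
```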

What does this mean for your business? Three big things:

  1. Reliability: Tools that handle specialized tasks dramatically improve the accuracy and reliability of AI solutions

  2. Flexibility: As new tools emerge, they can be integrated without needing to retrain the entire AI model

  3. Extensibility: Your custom business systems can be connected to AI through the same tool-use patterns

The growing standardization also means that tools created for one AI platform will increasingly work across different models – reducing vendor lock-in and making your LLM investments more future-proof.

Tools are also the key building block for deploying AI agents – advanced LLM systems that have more autonomy in deciding when to use which tool for which purpose.

Conclusion

Tool use represents a significant shift in how we interact with and benefit from LLMs. Rather than expecting LLMs to excel at everything, we're building an ecosystem where they can delegate specialized tasks to purpose-built tools – resulting in more reliable, capable AI systems.

As you think about implementing AI in your own business, consider:

  • Where are your friction points where LLMs fall short?

  • What existing systems could serve as tools to address those?

  • How might connecting these create new workflows that weren't possible before?

Figuring this out isn't a straightforward process. It requires lots of iterations and experiments. So why not start today? Think about a recurring workflow involving crunching numbers in a spreadsheet, and try setting it up like the example above.

Who knows what solution you might end up with.

Keep innovating!

See you next Friday,
Tobias
