No Data? No Problem! How To Start With AI Anyway

5 ways to kickstart your AI journey today – even if your data is a mess

"We need to fix our data before we can do AI."

If I had a dollar for every time I've heard this from business leaders, I'd probably be retired on a beach. "Fixing our data first" has become a comfortable excuse. A cozy blanket to hide under instead of taking the plunge into AI.

But here's the cold truth: while you're spending months (or years!) on internal data-cleaning projects, others are building their AI capabilities as they go – messy data or not.

You don't need perfection to start. So how can you begin your AI journey now and lay the foundation for more advanced applications along the way?

Let's dive in!

BONUS RESOURCE

Want an in-depth, real-world example of AI improving data? Check out my latest D2A2 Report on Semantic Agents, co-written with Arun Marar, Ph.D., and Prashanth Southekal.

The Data-Ready Myth

First, let's address the elephant in the room. Yes, data quality matters for certain AI applications. If you're building a custom machine learning model to predict highly specific business outcomes, you'll need relevant, well-structured historical data.

But here's what most businesses miss: modern AI, especially generative AI and foundation models, doesn't always require your perfect, proprietary dataset to start delivering value. Many of today's most powerful AI tools are pre-trained on vast datasets and can be applied to your business problems immediately.

Think of it this way: traditional ML models are like custom-built engines that need your specific fuel (data) to run. By contrast, today's foundation models are like ready-made vehicles that can drive on almost any road – they might not be perfectly optimized for your specific terrain, but they'll get you moving today while you improve the road along the way.

The real strategic advantage comes from "building while doing" rather than "preparing then doing". Ideally, each AI implementation creates data assets and capabilities that enable more sophisticated applications down the line.

5 Ways to Start with AI Without Perfect Data

Here are 5 different ways you can start using AI today with little or no dependency on existing data:

1. Use AI to Generate Data

Instead of waiting for perfect data to appear, use AI to create it:

  • Extract key information from unstructured documents by having tools like ChatGPT or Claude pull structured details from PDFs, contracts, or reports. This turns "dark data" into reusable insights. It's nothing new – I covered this in my book AI-Powered Business Intelligence back in 2022.

  • Transcribe team calls using tools like Otter, Fathom, or Microsoft Teams' built-in transcription. For example, if every department meeting is transcribed, people who were sick or couldn't attend can catch up quickly – while you simultaneously build a valuable dataset for future use.

  • Automate CRM updates with tools like Winn.AI or similar custom-built solutions that suggest field entries after calls, emails, or meetings. You can dramatically improve the quality of your CRM data – and let your salespeople do sales instead of babysitting the CRM.
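The first bullet above can be sketched in a few lines of Python. The `call_llm` helper is a hypothetical stand-in for a real ChatGPT or Claude API call – it's stubbed with a canned reply here so the sketch runs offline, and the field names are illustrative:

```python
import json

# Hypothetical LLM call – in practice this would hit the OpenAI or
# Anthropic API; stubbed with a canned JSON reply so the sketch runs offline.
def call_llm(prompt: str) -> str:
    return json.dumps({
        "vendor": "Acme Corp",
        "contract_value": 120000,
        "renewal_date": "2026-01-31",
    })

def extract_contract_fields(document_text: str) -> dict:
    """Ask the model to pull structured fields out of unstructured text."""
    prompt = (
        "Extract vendor, contract_value (a number) and renewal_date "
        "(ISO 8601) from the contract below. Reply with JSON only.\n\n"
        + document_text
    )
    return json.loads(call_llm(prompt))

fields = extract_contract_fields("...raw contract text from a PDF...")
print(fields["vendor"])  # structured data recovered from "dark data"
```

The key idea is the output contract ("Reply with JSON only"), which turns a free-form model into a data producer whose results you can load straight into a database or spreadsheet.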

2. Use AI to Improve Data

Rather than viewing data cleaning as a pre-AI task, use AI to improve your data:

  • Enforce data governance rules with AI that automatically flags inconsistencies in your records. Imagine an intelligent policy check automatically applied to every new piece of data that enters critical systems.

  • Clean up messy values in spreadsheets and databases by having AI identify and correct errors. This works surprisingly well, even without perfect training data – pre-trained AI can recognize patterns and anomalies based on context.

  • Enhance data quality during collection by implementing real-time assistance that suggests inputs intelligently or dynamically validates field entries – so data gets corrected before entering your system.

  • Standardize inconsistent formats for names, addresses, and product descriptions across systems. I've seen teams use simple prompt engineering to normalize customer records in a fraction of the time it would take to create and implement rigid rules.
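The last bullet – normalizing records with simple prompt engineering – might look like the sketch below. The company names are made up, and `call_llm` is a hypothetical stand-in for a real model call, stubbed so the example runs offline:

```python
import json

MESSY = ["ACME corp.", "Acme Corporation", "acme, inc"]

def build_normalize_prompt(values: list[str]) -> str:
    """Simple prompt engineering: ask the model to map every variant
    of a company name to one canonical form, returned as JSON."""
    return (
        "Map each company-name variant below to a single canonical form. "
        "Reply with a JSON object of the shape {variant: canonical}.\n"
        + "\n".join(f"- {v}" for v in values)
    )

# Stubbed model reply so the sketch runs offline; a real call would send
# the prompt above to ChatGPT or Claude.
def call_llm(prompt: str) -> str:
    return json.dumps({v: "Acme Corp" for v in MESSY})

mapping = json.loads(call_llm(build_normalize_prompt(MESSY)))
print(mapping["acme, inc"])  # -> Acme Corp
```

Compared to writing rigid matching rules, the batch-prompt approach handles typos, abbreviations, and casing variations you never anticipated – exactly the "messy values" case described above.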

3. Use AI for Tasks That Don't Need Your Data

There are lots of powerful AI use cases that actually don't require any of your proprietary data:

  • Draft routine communications using general-purpose LLMs. From emails and meeting agendas to project updates, these tools can generate high-quality first drafts without any training on your specific content.

  • Generate creative content for marketing without extensive brand training. You can provide a few examples in your prompt and get immediately useful results that just need light editing to match your voice.

  • Conduct research using AI services that support web search like Perplexity, Google Gemini, or ChatGPT search. These let you compile insights from across the web, saving hours of manual research.

  • Create basic presentations that can be refined rather than built from scratch. AI tools like Beautiful, Gamma, or Plus not only let you generate outlines and suggest content structure but also create final slide decks that look like they came straight from an agency.

4. Use AI for Tasks That Work Well with Messy Data

Some AI applications are surprisingly resilient to data quality issues:

  • SEO analysis and UX improvement. Even ChatGPT can analyze content and suggest improvements based on general on-page SEO principles just from your site's raw HTML code or even screenshots.

  • Support ticket categorization that lets you triage incoming emails with AI. Modern LLMs can quickly understand the intent behind customer messages and route them appropriately, even when customers express themselves in wildly different ways – across languages!

  • Competitive intelligence gathering from varied public sources. AI can monitor news, social media, and websites to extract meaningful competitive insights without requiring a perfect taxonomy or data structure. In fact, AI can help you build this taxonomy as you go!
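The ticket-triage bullet can be sketched as a classify-then-route loop. The categories, team names, and `call_llm` helper are all hypothetical; the model call is stubbed with a canned reply so the example runs offline:

```python
import json

# Hypothetical routing table – in a real system these would be queues,
# mailboxes, or helpdesk tags.
ROUTING = {"billing": "Finance team", "bug": "Engineering", "other": "Support"}

def classify_prompt(ticket: str) -> str:
    return (
        "Classify the customer message into exactly one of: billing, bug, other. "
        'Reply with JSON like {"category": "billing"}. Message:\n' + ticket
    )

# Offline stand-in for the LLM. Messy, multilingual input is exactly
# what pre-trained models handle well here – no training data needed.
def call_llm(prompt: str) -> str:
    return json.dumps({"category": "billing"})

def route(ticket: str) -> str:
    reply = json.loads(call_llm(classify_prompt(ticket)))
    category = reply.get("category", "other")
    return ROUTING.get(category, ROUTING["other"])

print(route("¡Hola! Me cobraron dos veces este mes."))  # -> Finance team
```

Note the defensive `.get(..., "other")` fallback: because LLM output is probabilistic, a triage pipeline should always degrade gracefully to a human queue rather than fail.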

Which brings me to the last point…

5. Use AI to Build a Knowledge Foundation

Of course, organizations that have a strong data foundation will eventually run circles around others that are just getting started. That's why starting to use AI to build your internal knowledge infrastructure creates not only short-term value, but also long-term competitive advantage. Some standard use cases:

  • Create simple question-answering systems on your existing documents using RAG (Retrieval-Augmented Generation). This works for both internal and external applications.

  • Enhance search for knowledge bases without restructuring your information. Just replacing the keyword-based search component of your company's wiki or knowledge base can help users find relevant information more reliably.

  • Capture institutional knowledge from meetings and conversations using the AI transcription and summarization from above. While this rarely works completely automatically at scale, it has become much easier thanks to AI.

  • Build learning loops. By tracking which answers work for users and which need refinement, you continuously improve your knowledge base without a massive upfront investment.
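To make the RAG idea from the first bullet concrete, here is a minimal retrieval sketch. Production systems use embedding models for the retrieval step; this version uses plain word overlap so it runs offline, and the document names and contents are invented examples:

```python
import math
from collections import Counter

# Toy document store – in practice these would be your handbooks,
# policies, and FAQs.
DOCS = {
    "vacation-policy.md": "Employees accrue 25 vacation days per year",
    "expense-faq.md": "Submit expense reports within 30 days of purchase",
}

def vectorize(text: str) -> Counter:
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str) -> str:
    """The 'R' in RAG: find the most relevant document for a question.
    Real systems use embeddings; word overlap keeps this sketch offline."""
    q = vectorize(question)
    return max(DOCS, key=lambda name: cosine(q, vectorize(DOCS[name])))

def build_prompt(question: str) -> str:
    """The 'AG' part: ground the model's answer in the retrieved text."""
    doc = retrieve(question)
    return f"Answer using only this document:\n{DOCS[doc]}\n\nQuestion: {question}"

print(retrieve("How many vacation days do I get"))  # -> vacation-policy.md
```

The pattern is the whole trick: retrieve first, then hand the model only what it needs – which is why RAG works on existing documents without restructuring your information.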

When You Actually Should Wait for Better Data

In the spirit of balance, there are legitimate cases where improving your data first makes sense:

  • High-stakes decision systems where wrong predictions could have serious consequences (like healthcare diagnostics or financial risk models)

  • Highly regulated environments with strict requirements for data governance and explainability

  • When your existing data is actively misleading (e.g., reflecting biased historical decisions)

In these cases, a thoughtful data strategy is mandatory – but even here, you can often start with simpler AI applications in parallel with your data improvements.

Building Your Progressive AI Roadmap

The beauty of starting with AI now is that early implementations create the capabilities and data foundations for more advanced use cases – aka your AI Roadmap.

For example, a common progression I've seen work well:

  1. Start with content generation and knowledge tools that require minimal proprietary data

  2. Implement basic data collection through AI-assisted transcription and extraction systems

  3. Build improved data governance using AI to enforce consistency

  4. Develop simple analytics on the newly collected, higher-quality data

  5. Create prediction models that leverage the growing historical dataset

Each stage generates data and learnings for the next, creating a compounding return on your investment. The organizations that wait for perfect data often find themselves perpetually stuck at step zero.

Time to Build

The most successful companies I work with aren't waiting for perfect data – they're starting with practical AI applications now and building their data assets along the way.

Here's a simple implementation you can try this week:

Create a knowledge assistant for your team documentation.

  1. Gather 5-10 frequently referenced docs (handbooks, policies, FAQs, etc.)

  2. Add these as knowledge to an AI Assistant using OpenAI or Azure

  3. Deploy the assistant to a simple internal website, Teams or Slack channel

  4. Have team members interact with the assistant 24/7

Time to set up: 1-2 days
Cost: ~$2.75 per 1,000 chats (assuming ~800 words per chat)
Benefit: Immediate time savings on information retrieval + creating a foundation for more sophisticated knowledge management
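As a back-of-the-envelope check on the cost figure above: assuming roughly 1.33 tokens per English word and a hypothetical blended price of $2.60 per million tokens (an illustrative rate, not a quoted vendor price), the arithmetic lands in the same ballpark:

```python
# Back-of-the-envelope check of the per-chat cost estimate.
# The token ratio and per-token price are assumptions, not vendor quotes.
WORDS_PER_CHAT = 800
TOKENS_PER_WORD = 1.33        # rough average for English text
PRICE_PER_M_TOKENS = 2.60     # hypothetical blended $/1M tokens

tokens = 1_000 * WORDS_PER_CHAT * TOKENS_PER_WORD  # ~1.06M tokens per 1,000 chats
cost = tokens / 1_000_000 * PRICE_PER_M_TOKENS
print(f"${cost:.2f} per 1,000 chats")  # -> $2.77 per 1,000 chats
```

The point isn't the exact number – model prices change constantly – but that a busy internal assistant costs dollars, not thousands, per month.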

As your team uses this tool, you'll discover which information is most valuable, where gaps exist, and how people search for information – all valuable data for future AI initiatives.

"We need to fix our data first" might feel responsible, but in reality it's probably costing you time and competitive advantage.

What AI capability could you start building next week?

See you next Friday,
Tobias
