From Ugly Data To Profitable Insights
Why perfect data is the enemy of profitable insights – and how I use AI to help
Your best business insights are buried in ugly data.
Customer complaints buried in support tickets, sales patterns scattered across inconsistent spreadsheets, employee feedback sitting unread in surveys – all the messy, imperfect information that traditional analysis can't handle. Every day you wait for "clean" data is another day opportunities slip by.
AI finally makes working with ugly data worthwhile, and today I'll share the exact 5-step process I use to turn the ugliest data into $50K+ worth of analysis in 2 days.
Let's dive in!
What’s Ugly Data?
To understand ugly data, let's look at its counterpart first: “Pretty Data”. Traditional BI is built around pretty data and treats it like a factory operation: high precision, high volume, regular reporting cycles. Business users can pivot and slice with self-service tools (to the extent that’s allowed). It's built for questions you ask repeatedly: monthly sales, quarterly performance, annual trends. This model works brilliantly when you know exactly what you're looking for and can define it in advance.
Ugly data breaks this. Take employee survey responses about "what would improve collaboration." Or customer feedback. Or incoming purchase orders. There's no "right format" for what this data should look like – it depends on the context. For a survey, you could offer checkboxes instead of free text. But forcing responses into predefined categories might kill the very insights you’re looking for!
The fundamental challenge with analyzing ugly data is that you never know if you’ll get something out of it until you start looking. And this process used to be very expensive. You might spend weeks structuring survey data only to discover it tells you nothing new.
That's what killed ugly data projects before AI.
But that equation has changed.
The 5-Step Ugly Data Process
Thanks to AI, you can now extract reliable insights from ugly data in hours, not weeks of manual work with uncertain payoff. But it’s not as easy as just uploading your ugly spreadsheet to ChatGPT and asking it for “insights”. To get reliable insights, the key is a systematic approach that leverages AI's strengths while building in the reliability checks that business decisions require.
I use a 5-step process every time I approach an Ugly Data project. I'll walk you through each step using a recent corporate example – an employee survey analysis that would have cost $50K+ using traditional methods – and took me 2 days with AI.
Note: All specific details have been modified for confidentiality, but the process and results reflect reality.
The scenario: A large corporation conducted an employee survey to understand how teams interpret high-level company values and identify specific ways to enhance collaboration. They collected thousands of open-text responses using minimal structure. The challenge was that extracting valid, actionable insights from this would have easily cost a month of manual work – something that the sponsor hesitated to do because there was no guarantee this survey would actually deliver fresh insights.
With the Ugly Data Process, I completed this work in less than 2 days.

Here's how it works:
Step 1: Know What You’re Looking For
Don’t ever do a data analysis unless you know what you’re looking for. This is not about fast-tracking the answer. It’s about asking the right question. (Feel free to run these prompts for help.) If you skip this step and just ask ChatGPT for "insights", you’ll waste two days of work and end up with interesting-sounding results that no one will ever use.
Example: The HR team had two specific questions: How do employees actually interpret our company values in practice? And what specific barriers prevent better collaboration? These weren't perfectly defined research questions – they were business problems with enough direction to guide the analysis toward actionable answers.
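The linked prompts aren't reproduced here, but to give a feel for this step, a question-sharpening prompt might look something like this (illustrative wording, not the original):

```
I want to analyze an employee survey with thousands of open-text
responses. My rough goal: "understand how teams interpret our company
values and what blocks collaboration."

Before any analysis, help me sharpen this into 2-3 specific, answerable
questions. For each one, tell me what a useful answer would look like
and which decision it would support.
```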
Step 2: Get Data Into the Right Shape
You don't need perfect data, you need workable data. This means just enough basic structure that AI can easily write code for it. Don't waste time on deep cleaning at this step; focus on making the data easy to work with programmatically.
Example: The raw export had two separate tables – survey responses keyed by question ID, plus a lookup table with the actual question text – along with empty rows and totals left over from the export process. So I spent a few minutes turning this ugly data into tidy data: a flat CSV with question titles as column headers, one row per respondent, and clean text responses.

Tidy data – easy to work with. Source: R 4 Data Science
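The article doesn't show the actual code, but a minimal pandas sketch of this reshaping could look like the following (file names, column names, and the export quirks are assumptions based on the description above):

```python
import pandas as pd

# Hypothetical raw export: answers keyed by question ID,
# plus a lookup table mapping question IDs to question text.
responses = pd.read_csv("survey_responses.csv")  # respondent_id, question_id, answer
questions = pd.read_csv("question_lookup.csv")   # question_id, question_title

# Drop export artifacts: fully empty rows and "Total" summary rows.
responses = responses.dropna(how="all")
responses = responses[responses["respondent_id"].astype(str).str.lower() != "total"]

# Attach readable question titles, then pivot to tidy shape:
# one row per respondent, one column per question.
merged = responses.merge(questions, on="question_id", how="left")
tidy = merged.pivot_table(
    index="respondent_id",
    columns="question_title",
    values="answer",
    aggfunc="first",
)

# Light text cleanup, then export a flat CSV for the AI to work with.
tidy = tidy.apply(lambda col: col.astype("string").str.strip())
tidy.to_csv("survey_tidy.csv")
```

The exact cleanup steps will differ for every export; the point is to get to one row per respondent quickly, not to polish the data.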
Step 3: Data Exploration
Next, you need to understand what you're actually working with. I use AI to create tons of visualizations and counts, and to surface patterns that could otherwise turn into big pitfalls. Professional data analysts call this “Exploratory Data Analysis” – and it's critical for avoiding garbage insights.
Example: I had AI write the code to check response lengths and completion rates and to look for unusual patterns. This revealed that some responses were just one word per answer, while others had extremely long text in the first question only. It turned out that some respondents had unconsciously answered all questions in their first response, leaving the rest blank. Without catching this, any question-by-question analysis would have been completely skewed. In a traditional approach, I’d have spent at least a full day to reach this point. With AI, I was still in the second hour.
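To make this concrete, here is a sketch of what such AI-generated checks might look like (column names and the word-count threshold are made up for illustration):

```python
import pandas as pd

tidy = pd.read_csv("survey_tidy.csv", index_col="respondent_id")

# Response length per question (in words), to spot one-word answers
# and suspiciously long first answers.
word_counts = tidy.apply(lambda col: col.fillna("").str.split().str.len())
print(word_counts.describe())

# Completion rate per question.
print(tidy.notna().mean().sort_values())

# Flag respondents who wrote a long first answer and left the rest blank,
# i.e. the "answered everything in question 1" pattern described above.
first_q = tidy.columns[0]  # assumes columns are in survey order
suspects = tidy[
    (word_counts[first_q] > 100) & (tidy.drop(columns=first_q).isna().all(axis=1))
]
print(f"{len(suspects)} respondents may have answered everything in one field")
```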
Step 4: AI-Powered Analysis
Now comes the fun part – the "actual" analysis. Modern LLMs already know analytical best practices – you just have to ask for them! Direct the AI with your specific question and ask it to suggest appropriate methods. This works much better than just asking for generic "insights".
Here’s how this could look for an area like customer analytics:

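For example, a prompt in this spirit might read as follows (illustrative wording with made-up details, not taken from the original):

```
I have 3,000 free-text responses from churned customers explaining why
they canceled (column "cancel_reason" in the attached CSV).

My question: What are the top churn drivers for customers in their first
90 days, and how do they differ from long-term customers?

Suggest 2-3 appropriate analytical methods for this kind of text data,
explain their trade-offs, and wait for my go-ahead before running anything.
```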
Example: I used AI to quickly categorize the thousands of open-text responses and identify common themes. GPT-4o (back then) automatically suggested clustering algorithms to group similar responses and recommended key phrase extraction. We then used correlation analysis to quantify relationships between different survey questions. This revealed how different employee groups interpret company values – insights that would have taken weeks of manual categorization to uncover.
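The AI-generated analysis code isn't reproduced here, but a minimal sketch of the theme-grouping piece, using TF-IDF and k-means from scikit-learn (the column name and cluster count are assumptions), might look like:

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

tidy = pd.read_csv("survey_tidy.csv", index_col="respondent_id")
answers = tidy["values_in_practice"].dropna()  # hypothetical question column

# Turn free text into TF-IDF vectors, ignoring very rare and very common terms.
vectorizer = TfidfVectorizer(stop_words="english", min_df=5, max_df=0.8)
X = vectorizer.fit_transform(answers)

# Group similar responses; the cluster count is a judgment call to iterate on.
kmeans = KMeans(n_clusters=8, n_init=10, random_state=42)
labels = kmeans.fit_predict(X)

# Characterize each theme by its highest-weighted terms.
terms = vectorizer.get_feature_names_out()
for i, center in enumerate(kmeans.cluster_centers_):
    top_terms = [terms[j] for j in center.argsort()[::-1][:8]]
    print(f"Theme {i} ({(labels == i).sum()} responses): {', '.join(top_terms)}")
```

Key-phrase extraction and the correlation analysis mentioned above would be separate passes; this sketch only covers grouping similar responses into themes.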
Step 5: Validation
AI isn't necessarily a smarter analyst than you, and errors will happen. That's why I manually validate any insight that could drive business decisions, or challenge it with follow-up questions to test alternative explanations.
Example: When AI suggested that "remote workers felt most disconnected", I dug deeper and asked for specific response examples to review them manually. This revealed an overrepresentation from one specific country in the remote worker sample, so applying this insight company-wide would have been invalid. Instead, we had pinpointed a regional issue that needed targeted attention.
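A validation pass like this can be as simple as pulling the raw rows behind a claim and checking the sample's composition. A sketch (column names are assumptions):

```python
import pandas as pd

tidy = pd.read_csv("survey_tidy.csv", index_col="respondent_id")

# Pull raw examples behind a claimed insight for manual review.
remote = tidy[tidy["work_mode"] == "remote"]  # hypothetical segment column
print(remote["collaboration_barriers"].dropna().sample(20, random_state=1).to_string())

# Check whether the segment behind the insight is representative,
# or dominated by one group (here: one country).
print(remote["country"].value_counts(normalize=True))
print(tidy["country"].value_counts(normalize=True))
```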
The Results
Let's break down what this 5-step process actually delivered:
Traditional approach:
Easily 20+ person-days of manual work
$50K+ in analyst costs or internal resources
4-6 weeks from start to actionable insights
High risk of missing nuanced patterns in thousands of responses
AI approach:
2 days total time investment (and a lot less than $50K spent :)
Under $200 in AI tools and platform costs
Same-day turnaround for initial insights, refined analysis by day two
Caught critical validation issues that manual analysis often misses
Right off the bat, the survey project revealed critical insights the HR team never expected. But more importantly, the low time investment meant we could afford to explore follow-up questions. When traditional analysis costs $50K, you get one shot to ask the right questions. When it costs $200 and two days, you can iterate until you find what matters.
Your Ugly Data Opportunity
Right now, you probably have ugly data sitting unused because the traditional analysis economics didn’t work. Customer feedback that never gets read. Sales notes that could reveal why deals stall. Survey responses gathering digital dust.
And it’s not just text. Ugly data comes in any form (which is the beauty of it).
So the question is not whether this data contains valuable insights, but whether you can afford not to extract them in the age of AI.
Your best insights are trapped in Ugly data that traditional analysis can't economically handle. AI changed that equation.
The data is there. The tools exist. You just need the right process.
So get started – happy analyzing!
See you next Friday!
Tobias