When I sit down with a new product team for the first time, I ask to see what they've been building. Almost always, the answer is something beautiful. A polished interface, a coherent flow, screens that look production-ready. And almost always, the same question hasn't been asked yet: does this actually solve the right problem?
I've shipped over 18 products and spent years guiding companies through AI prototyping and AI product design decisions. The tools we have today are remarkable — what used to take a full day of documentation, wireframing, and review now takes under an hour. That speed is a genuine advantage. But it's also where most teams go wrong.
Polish signals effort. It doesn't signal accuracy. And before a team has validated anything with real users, it doesn't tell you whether you've identified the right problem.
When AI generates a visually impressive output, it tells you one thing: it understood the description well enough to produce something plausible. That's genuinely useful. But plausible and correct are not the same thing.
I've watched this happen on projects I've led and on products I've consulted on: a team builds a beautiful screen, looks at it, and thinks they're done. We take that design to stakeholders, run user testing, and find the screen contains redundant information. AI optimizes for completeness. Users want simplicity. The screen gets deleted — not because it looked bad, but because it solved a problem users didn't actually have.
The danger isn't that AI made a mistake. It's that the output was so convincing it nearly bypassed the one step that would have caught the error: showing it to real users and asking direct questions. I call it the illusion of polish. It makes a team feel like they nailed it. That's the trap. That's the moment I always push teams to pause and ask: is this the right problem? Does the user actually need that button, or is it there only because AI generated it?
This pattern — building fast, skipping validation, and paying for it later — is well documented. Some of the most expensive AI project failures in recent years trace back to exactly this: teams that moved from polished output to deployment without asking whether the output solved the right problem. As NineTwoThree's CEO Andrew Amann has noted, too many AI projects arrive as second tries after burning $50k on entirely avoidable mistakes.
The speed AI provides is only valuable when it's used to run more validations faster — not to skip validation entirely.
To understand why polish misleads, you have to understand the two modes of product work — and knowing which one a team is in at any given moment is something I spend a significant part of my time helping them figure out.
Divergence means exploring, generating options, throwing ideas at the wall to find out which ones have traction. Convergence means narrowing down, choosing, and validating — shifting the goal from breadth to accuracy.
Both are necessary, in that order. The failure mode is collapsing them into a single step. That's exactly what happens when a team opens an AI product design tool, generates a polished screen, and treats it as a product decision. They've converged before they've diverged. The tool accelerated the process, but the process skipped a phase.
If you narrow in too fast, you risk solving a problem that isn't the right one. If you stay in divergence too long, you never commit to anything testable, and the plan ends up resting on assumptions that were never validated. Either way, the polish hides the gap.
The good news is that AI-assisted development can accelerate both phases; it just requires treating them as distinct. Here's the workflow I bring to my everyday product work and teach to the teams I work with:

1. NotebookLM to consolidate the documentation: requirements, research, and stakeholder notes.
2. Figma Make to generate the interaction design.
3. Claude Code to build a functional prototype that tests whether the thing can actually work.
Each tool tests a different hypothesis. Figma Make answers: what does this interaction look like? Claude Code answers: can this actually work? Running them as separate prototypes — rather than one combined prompt — keeps each test clean and the results interpretable.
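To make that separation concrete, here is roughly what two separate, narrowly scoped prompts might look like. The wording is illustrative, borrowed from the RFP example later in this piece, not a script to copy:

```
Figma Make:  "Generate a two-screen flow where a sales lead reviews a list
             of extracted RFP requirements and accepts or rejects each one."

Claude Code: "Build a script that takes an RFP text file and outputs a
             numbered list of requirements tagged with source sections."
```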
All of this used to take a full day, sometimes several days depending on complexity. Now it takes under an hour. That compression is real, and it's available to anyone. But the outputs still need to go in front of stakeholders before they become decisions.
Before generating anything, a team needs to name what they're trying to validate. Every AI prototyping session should start with a hypothesis.

You need to know what you're actually experimenting on with every prompt: the hypothesis you're trying to validate, the problem you're trying to get clarity on, or the thing you want to confirm. Are we going in the right direction or not?
Without a hypothesis, the session is still divergence even if it feels like convergence. A polished output without a validated hypothesis is just a more convincing assumption.
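As an illustration, a written hypothesis doesn't need to be elaborate. Something like the three lines below (the specifics are hypothetical, borrowed again from the RFP example) is enough to keep a session honest:

```
Hypothesis: sales leads can review extracted requirements and accept or
reject each one without training.
Test: put the generated screen in front of three salespeople.
Signal: they complete the review without asking what a field means.
```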
The deeper version of the polish trap isn't a team falling in love with a beautiful screen. It's a team spending weeks iterating on a polished solution to the wrong problem entirely. This is the version I see most often, and it's the most expensive one to fix.
Sound product strategy starts further back — before any tools are opened. The question isn't how to build faster. It's: which problem is actually worth solving? That's the question that drives everything we do at NineTwoThree, and it's the question AI, no matter how capable, cannot answer for you.
When clients come to us with a new product they want to build, we start with our Rapid Validation Sprint — four weeks designed to answer one question before any building begins: are we solving the right problem?
Not sure where to start with AI in your organization? Our framework for choosing your first (or next) AI project walks through exactly this prioritization process.

We identify pain points broadly before any filtering begins. Then, rather than pursuing every team in the organization, we focus on those with the highest friction, the clearest revenue opportunity, or the sharpest operational pain.
From a list of twenty potential pain points, a group of seven people can converge on two or three candidates in a single session through power-dotting — structured voting that moves fast without skipping the divergence that came before.
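If the session runs remotely, the tally side of power-dotting is trivial to script. A minimal sketch in Python, with illustrative pain points and ballots (in a live session these are just sticker counts on a wall):

```python
from collections import Counter

# Each participant spreads a fixed budget of dots across candidate pain points.
ballots = [
    {"RFP drafting": 3, "Claims triage": 2},
    {"RFP drafting": 4, "Onboarding docs": 1},
    {"Claims triage": 3, "RFP drafting": 2},
]

tally = Counter()
for ballot in ballots:
    tally.update(ballot)  # Counter adds counts from a mapping

# Converge on the top candidates to run through the qualifying questions below.
for pain_point, dots in tally.most_common(3):
    print(f"{pain_point}: {dots} dots")
```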
Before committing to a problem, three qualifying questions filter the candidates:

- Does solving it have clear revenue potential?
- Is the problem AI-ready?
- Is there human infrastructure to own the solution after it's built?
A problem that clears all three checks has revenue potential, is AI-ready, and has the human infrastructure to sustain the solution after it's built.
No AI tool can walk into a room of stakeholders and help a company decide what problem is worth solving. That's time-tested product management work. By the time we deliver a solution, it's tied to specific ROI targets and KPIs: something that actually makes somebody's life easier on day one.
The three phases above aren't abstract. Here's how they played out on a project we're currently running.
A healthcare solutions provider came to us knowing they needed AI to hit revenue goals, but with no clear starting point. We ran the sprint: discovery sessions with the highest-friction teams, power-dotting to narrow twenty pain points down to two or three, then an AI-readiness check on each candidate.
The result was a specific, validated problem: RFP proposal generation. Sales teams were spending significant time reading 30-page RFP documents, extracting requirements, coordinating responses across team members, and assembling proposals that were both thorough and strategically coherent. AI could get them 70–80% of the way to a first draft. The human team would own the final 20–30%.
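To make the shape of that split concrete, here is a minimal sketch of how such a pipeline might be structured. `call_llm` is a hypothetical stand-in for whatever model client the team uses, and the prompts are illustrative, not the ones we shipped:

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for the team's actual model client."""
    return f"[model output for: {prompt[:48]}...]"


def draft_proposal(rfp_text: str) -> str:
    # Pass 1: pull structured requirements out of the 30-page document.
    requirements = call_llm(
        "List every requirement in this RFP as a numbered item, "
        "noting the section it came from:\n\n" + rfp_text
    )
    # Pass 2: draft a response per requirement, flagging uncertainty so
    # human reviewers know where to look first.
    return call_llm(
        "Draft a proposal section for each requirement below. "
        "Flag anything you are unsure about for human review:\n\n" + requirements
    )
```

The point of the sketch is the boundary, not the prompts: the model produces the 70–80% draft, and everything after `draft_proposal` returns (pricing, strategy, tone) stays with the humans.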
That precision is what the sprint is designed to produce. The team didn't leave with a mandate to "use AI more." They had a defined problem, a scoped solution, and clear criteria for success. That clarity is what made the next phase — AI-assisted software development — productive rather than open-ended. It's also where the polish trap reappeared.
With the right problem confirmed, we ran prompt engineering and design work in parallel. Both tracks faced the same temptation to treat polished output as finished work.
Prompt engineering track: iterating on the prompts that read an RFP, extract its requirements, and draft response sections. Every revision produced a more convincing draft.
Design track: refining the screens of the proposal tool itself. Every generation looked more production-ready than the last.
Both tracks could iterate indefinitely. The outputs kept improving. The designs kept getting more refined. Nothing in the tools told us when to stop.
What stopped it wasn't AI. It was a stakeholder session. We built a quick prototype, put it in front of users, and asked basic product management questions:

- What problem does this screen solve for you?
- Would you actually use it?
- What would change about your day if it disappeared?
One entire screen was removed after that session. Not because it looked bad — it looked fine. It had been generated because it fit the pattern of what a proposal tool might include. Nobody had asked whether users would actually use it. The screen was redundant, and only direct user feedback made that visible.
AI is good at generating content that makes a product look like it solves more problems than it does. That's sometimes exactly the problem: it adds noise that looks like value. The judgment call to delete that screen couldn't come from a tool. It had to come from the team, informed by real users.
This is what the polish trap obscures in every project. Product prototype development that looks production-ready is not the same as a prototype that has been validated.
Those questions, asked with real stakeholders, are what separate validated AI product design from polished output. Product managers have been asking versions of them for decades. What AI changes is the cost of having something visual to show during the session: a prototype that used to take days to build now takes an hour.

They're cheap to ask with a polished prototype in hand. What they surface is something AI cannot generate: a real user's honest reaction to whether something solves their actual problem.
Product strategy conversations keep circling back to tools: which models, which prompts, which workflows. But the skill gap at most organizations isn't tooling. It's knowing when to stop iterating and start validating.
AI can accelerate divergence and convergence. It cannot decide which mode a team should be in. It cannot walk into a room of stakeholders and determine which problem is worth solving. It cannot tell you to delete a screen. It can only produce outputs based on what you described.
The organizations I work with that get real value from AI-assisted development are the ones that treat AI outputs as hypotheses, not conclusions. A generated UI is a starting point for a conversation with a user. A polished proposal draft is a first pass. Done well, an hour with NotebookLM, Figma Make, and Claude Code produces a set of questions a team can answer in a 30-minute stakeholder session. That's the value. Not the polish.
My advice to any team using AI tools today: keep using them, and don't let the polish trick you. Just because AI can make something look production-ready doesn't mean the thinking is done. You still need to ask the most fundamental questions: what problem are we solving? Is it the right problem? How does life look different after we solve this?
At NineTwoThree, we start every new AI engagement with a Rapid Validation Sprint: four weeks to identify the right problem before writing a single line of code. If your organization is using AI tools but struggling to get from polished outputs to validated solutions, talk to our team.