
An enterprise hired an AI agency and lost $80,000. When the project ended, they had nothing to show for it.
Stories like this are becoming common. AI agencies and implementation partners are appearing everywhere, but the market holds far more garbage than legitimate expertise. A recent MIT study found that 95% of enterprise AI pilots fail to deliver measurable returns. That means if you hire randomly, you have roughly a 1 in 20 chance of success.
The problem isn't just incompetence. Many of these agencies are using AI to generate proposals without listening to actual client needs. They're applying "spray and pray" tactics, sending out templated pitches and hoping something sticks. When one company recently gathered proposals for the same AI project, the range was staggering: $20,000 to $280,000 for identical requirements.
That price gap tells you everything. Either someone doesn't understand the work, someone is trying to undercut the market with an unrealistic bid, or someone is dramatically overcharging. For companies trying to implement AI seriously, this creates a minefield. How do you separate legitimate partners from expensive disasters?
This guide walks through the specific red flags to watch for and the questions that reveal whether an agency knows what they're doing.
If you're planning an AI implementation, download our AI Strategy Toolkit to evaluate proposals, calculate ROI, and build a roadmap before you commit to any agency.
You can also watch the full conversation with Andrew Amann, walking you through how to vet AI agencies.
The single most important indicator of a legitimate AI agency is whether they conduct a proper discovery phase before proposing a solution.
Real agencies don't start with technology. They start by listening. They investigate your business, understand your specific challenges, and then propose a solution based on what you actually need. This process has different names at different firms (rapid validation sprint, discovery workshop, business analysis), but the principle is the same: understand the problem before proposing the solution.
Before a legitimate agency sends you a proposal, you should speak with multiple people on their team.
At a minimum, you should talk to:
If an agency sends you a proposal after one 30-minute call, they don't understand your business. They're guessing.
During these conversations, the agency should be asking questions like:
Agencies that skip discovery end up building solutions to problems you don't have. They create flashy demos that don't integrate with your actual workflows. When you try to deploy them, the edge cases they never considered destroy the timeline and budget.
That's not an agency. That's a numbers game.
If your proposal reads like it could have been sent to any company in your industry, walk away. A real proposal should reference specific challenges you mentioned, use your terminology, and demonstrate understanding of your business model.
When you receive proposals ranging from $20,000 to $280,000 for the same project, the gap usually comes down to one distinction: single-player versus multiplayer complexity.
A single-player AI solution is built for one person (typically an executive) to use in isolation. Examples include:
These can often be "vibe coded" quickly because the edge cases are limited. When the single user encounters a problem, they adapt. There's no need to handle multiple users with different permissions, varying levels of technical skill, or competing use patterns.
The $20,000 proposals you see are typically single-player MVPs. And there's nothing inherently wrong with that, as long as you understand what you're buying.
The problem happens when companies treat single-player pricing as if it applies to multiplayer deployment.
When you build a single-player solution without proper planning, every new user type becomes weeks of additional development. The sales team needs different data from the engineering team. Customer-facing staff need different permissions than internal analysts. What seemed like a straightforward tool becomes a sprawling mess of special cases.
When you move from a single-user tool to a system that serves your entire organization, three critical challenges emerge. Each one adds layers of technical complexity that cheap proposals simply don't account for.
Who has access to what data? How do those permissions change based on role, department, or seniority? When Person A asks a question, they should get different results than Person B asking the same question if their access levels differ.
Two years ago, this had to be custom-coded: user permissions were tracked manually through metadata descriptions. Now, the Microsoft and Google ecosystems automatically store user permissions alongside vectorized documents. This helps, but you still need an agency to map those permissions and decide what happens when they change.
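The permission check described above can be sketched in a few lines. This is a minimal illustration, not a vendor API: the `Document` class, its `allowed_groups` field, and `filtered_search` are all hypothetical names, standing in for whatever ACL metadata your retrieval layer actually stores.

```python
# Hypothetical sketch: filter retrieved documents by user permissions
# after vector search but before prompting, so the model never sees
# content the asking user cannot access.
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str
    allowed_groups: set = field(default_factory=set)  # ACL stored alongside the vector


def permitted(doc: Document, user_groups: set) -> bool:
    # Visible if the user shares at least one group with the document's ACL.
    return bool(doc.allowed_groups & user_groups)


def filtered_search(results: list, user_groups: set) -> list:
    # Person A and Person B run the same query but see different results,
    # because the ACL filter is applied per user.
    return [d for d in results if permitted(d, user_groups)]


docs = [
    Document("Q3 revenue forecast", {"finance", "exec"}),
    Document("Employee handbook", {"all-staff"}),
]
visible = [d.text for d in filtered_search(docs, {"all-staff"})]
```

The important design point is where the filter sits: pruning happens between retrieval and the prompt, so a permission change takes effect on the very next query without re-indexing anything.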
Jill has been with your company for 20 years. She knows exactly how to phrase questions to get useful answers. Susie started six months ago and doesn't know the right terminology yet.
This requires building systems that understand user intent, not just literal queries. It means handling recency (newer information should surface first), avoiding bias (everyone should get accurate answers regardless of how they ask), and eliminating redundancy (similar questions shouldn't produce wildly different responses).
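One of those requirements, recency, has a simple shape worth seeing. The sketch below blends a retrieval similarity score with an exponential age decay so newer documents surface first when relevance is comparable. The half-life value is an assumption you would tune per corpus; nothing here is a specific library's API.

```python
# Hypothetical sketch of recency-weighted ranking: an exponential decay
# on document age, multiplied into the similarity score.


def recency_score(similarity: float, age_days: float, half_life_days: float = 90.0) -> float:
    # decay is 1.0 for a brand-new document and halves every half_life_days.
    decay = 0.5 ** (age_days / half_life_days)
    return similarity * decay


# Two equally relevant documents: the newer one ranks higher.
fresh = recency_score(0.9, age_days=10)
stale = recency_score(0.9, age_days=400)
```

The same pattern generalizes: bias and redundancy controls are also post-retrieval scoring and normalization steps, which is exactly the kind of work a $20,000 single-player quote never budgets for.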
When your AI represents your company to customers or helps employees make decisions, it needs to maintain a consistent tone and voice. A support bot that sounds friendly to one customer and robotic to another creates a broken experience.
All three of these challenges require sophisticated prompt engineering, extensive testing across user types, and quality assurance processes that single-player tools never need.
Ask the agency directly whether they will commit to a fixed price. If they say yes, they've built this type of system before. They're confident in their estimate because they've planned for the edge cases. They know what multiplayer complexity costs.
If they say no, or if they hedge, their business model doesn't support fixed pricing. They know the project will balloon once reality hits. The low initial quote is bait.
Look for agencies that provide a top limit and a structured breakdown of what you need to go live. You should be able to see exactly what you're paying for and know that the price won't creep upward as "unforeseen complications" emerge.
Want to see if an agency's proposal makes financial sense? Use our AI ROI framework to evaluate whether their timeline and budget will actually deliver positive returns.
You can download the whole AI Strategy Toolkit right here.
Building AI solutions follows the same principle as building houses: if you plan the architecture properly, you never have to tear anything down and rebuild.
When you build a house, you plan where the bathrooms go, where the plumbing runs, and where the electrical wiring will be installed. You get approvals before breaking ground. This prevents you from building a room and then realizing the plumbing has to run through the middle of the kitchen.
AI projects work the same way.
Some agencies approach AI projects through rapid iteration: build something in one prompt, see what breaks, spend hours fixing edge cases, repeat. This feels fast initially because you get something working quickly.
But every fix creates new problems. The architecture wasn't planned, so each addition requires refactoring. The $20,000 MVP becomes a $100,000 mess of patches and workarounds.
This means:
Only after the architecture is solid do you hit submit and generate code. The result? Projects that come in on time and on budget because you anticipated the complexity instead of discovering it mid-build.
When evaluating agencies, ask them to walk through their planning process. Do they dive straight into coding, or do they start with architecture diagrams and user flow mapping? The answer reveals whether they've done this before.
The agencies that consistently deliver successful AI projects share common characteristics.
Experience matters. Agencies that have delivered hundreds of projects know what works and what doesn't. They've seen the edge cases. They've learned where complexity hides.
By the time a legitimate agency says yes to your project, they should already know:
All of that planning happens before you sign a contract. When you get confirmation that the agency wants to move forward, they should already have answers to these questions. If they're still figuring out the basics after you've committed, they didn't do the work upfront.
When evaluating AI agencies, you'll often hear them discuss which models they use. This matters more than it might seem.
OpenAI's reasoning models are excellent. The APIs are robust and predictable. For enterprise applications requiring complex logic and decision-making, OpenAI's models perform well.
But OpenAI has increasingly focused on consumer-grade products rather than enterprise infrastructure:
These are not enterprise features. They're designed to capture consumer data and enable advertising revenue. ChatGPT Atlas, for example, gives OpenAI access to every site users visit while browsing. That's valuable for ad targeting, but it's a privacy nightmare for enterprises.
Claude (from Anthropic) and Gemini (from Google) have both invested heavily in enterprise infrastructure:
For companies on Microsoft ecosystems, Microsoft Copilot (which uses OpenAI models under the hood) provides enterprise-grade tooling. For Google Workspace users, Gemini offers native integration.
Ask the agency which models they typically use and why. If they default to consumer ChatGPT for enterprise projects, they may not understand the security and permission requirements your business actually needs.
Legitimate agencies should be able to explain:
If they can't answer these questions clearly, they're not thinking about multiplayer complexity.
Companies often lower their hiring standards when it comes to AI implementation. They wouldn't hire a salesperson without checking references, or bring on a product manager without understanding their process. But they'll hand $50,000 to an AI agency and just hope it works out.
Apply the same rigor to vetting AI agencies that you would to hiring any critical team member:
The 95% failure rate isn't inevitable. It exists because companies don't know what questions to ask. Now you do.
If you're looking for an AI implementation partner that actually delivers results, talk to NineTwoThree. We've successfully launched over 160 projects by following exactly the process outlined in this guide. We start with discovery, we listen to your actual needs, and we build solutions that work in production, not just in demos.
Because the best AI strategy is the one that actually ships.