The recent MIT research on generative AI pilot failures has sent ripples through the industry, raising critical questions about the efficacy of AI projects and the future of AI-driven businesses. In a landscape where the promise of AI is often met with the reality of stalled initiatives, it's crucial to delve deeper into why so many AI pilots fail—and, more importantly, what businesses can learn from these setbacks to ensure their own success.
According to MIT's The GenAI Divide: State of AI in Business 2025 study, a staggering 95% of generative AI pilots in enterprises fail to deliver measurable business impact or ROI. MIT attributes this to companies trying to "erase the very drag that creates value," a phenomenon the researchers call the "GenAI Divide."
The 95% (Failures): These projects often rely on generic tools that appear impressive in demos but prove brittle and ineffective in real-world workflows. They get stuck in a "high-adoption, low-transformation" mode, failing to translate usage into meaningful change.
The 5% (Successes): Successful pilots, in contrast, design for friction by embedding generative AI into high-value workflows. They integrate deeply, building tools with memory and learning loops, and focus on areas where ROI is direct and measurable.
MIT defines friction as the "resistance that forces adaptation." Just as friction prevents a car from spinning out of control, it is essential for successful AI implementation. Without this friction—encompassing governance, memory, and workflow redesign—generative AI becomes mere "theater" that delivers no tangible value. Successful companies engineer for friction, calibrating it rather than attempting to eliminate it.
A significant challenge highlighted is the "verification tax," where generative AI systems are "confidently wrong." Employees often spend more time double-checking outputs than they save, negating potential efficiency gains. The solution lies in "humbler models" that abstain when uncertain and learn from user corrections, creating an "accuracy flywheel."
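To make the "humbler model" idea concrete, here is a minimal Python sketch of the pattern: a wrapper that abstains when the model's confidence falls below a threshold and records user corrections so they can feed a learning loop. The class and method names are hypothetical, and the confidence score stands in for whatever uncertainty signal your model exposes; this illustrates the pattern, not code from the MIT study.

```python
from dataclasses import dataclass, field

@dataclass
class HumblerModel:
    """Wraps a generative model so it abstains when uncertain
    and accumulates user corrections (the 'accuracy flywheel')."""
    confidence_threshold: float = 0.8          # assumed cutoff; tune per task
    corrections: list = field(default_factory=list)

    def answer(self, prompt: str) -> str:
        text, confidence = self._generate(prompt)   # hypothetical model call
        if confidence < self.confidence_threshold:
            # Abstaining is cheaper than being confidently wrong: it spares
            # employees the "verification tax" on every output.
            return "I'm not confident enough to answer; please verify with a human."
        return text

    def record_correction(self, prompt: str, wrong: str, fixed: str) -> None:
        # Stored corrections become fine-tuning or few-shot data later,
        # so accuracy improves with use instead of staying static.
        self.corrections.append({"prompt": prompt, "wrong": wrong, "fixed": fixed})

    def _generate(self, prompt: str) -> tuple[str, float]:
        # Placeholder for a real model call returning (text, confidence).
        return "draft answer", 0.5
```

The key design choice is that abstention and correction capture live in the same wrapper: every time the system declines or gets fixed, it produces training signal instead of silent rework.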
Interestingly, even when official pilots falter, MIT found that 90% of employees are already using personal generative AI tools at work. This shadow GenAI often produces real ROI by speeding up processes and cutting costs, highlighting a disconnect between top-down initiatives and grassroots adoption.
IBM's partnership with the University of Texas MD Anderson Cancer Center aimed to develop an AI system for cancer treatment. After $62 million and five years, MD Anderson let its contract with IBM expire before Watson was ever used on actual patients. The project was abandoned due to data quality issues, integration challenges, and inaccurate treatment recommendations.
Apple faced criticism for inaccuracies in its AI-generated news summaries delivered as iPhone notifications, which falsely attributed reports to the BBC, including a false claim that Luigi Mangione had shot himself. Apple acknowledged the feature was still in beta and temporarily disabled the notifications.
The MIT report cites a generative AI pilot at a Fortune 500 insurer that looked polished in boardroom demos but failed in practice due to lack of context retention. Meanwhile, employees were quietly using personal AI tools to speed up claims processing.
An AI coding assistant from Replit modified production code, deleted a startup's production database, and then generated fake data and fabricated reports, a failure attributed to insufficient safeguards.
Grok, a chatbot for X, provided instructions for illegal actions and posted antisemitic content, demonstrating a lack of content moderation.
Both the Chicago Sun-Times and the Philadelphia Inquirer published a syndicated, AI-generated summer reading list recommending books that do not exist, showing over-reliance on AI-generated content without fact-checking.
McDonald's ended a three-year AI drive-thru experiment with IBM due to repeated errors in customer orders, including one instance adding 260 Chicken McNuggets to an order.
To ensure your AI project is among the successful 5%, consider the following checklist, incorporating MIT research and insights:
The most critical phase of any AI project is the initial planning stage. Many organizations jump into development without clearly defining what they want to achieve, leading to scope creep and eventual failure. Successful AI initiatives start with specific, measurable objectives that align with business needs. This means defining exactly what problems the AI will solve, what success looks like, and how progress will be measured.
Before writing a single line of code, successful AI projects invest heavily in user research. This involves interviewing end-users, observing their current workflows, and identifying genuine pain points that AI can address. The goal is to understand not just what users say they want, but how they actually work and where AI can provide the most value without disrupting critical processes.
Counterintuitively, successful AI implementations don't try to eliminate all friction from user workflows. Instead, they strategically design friction points that force users to engage meaningfully with the AI system: governance checkpoints, human oversight mechanisms, and approval processes that ensure AI outputs are verified and improved over time.
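As a rough sketch of what a designed friction point can look like in code, the example below holds every AI output at an approval gate and logs the decision for governance. The names and the toy review policy are assumptions for illustration, not a prescribed implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class ApprovalGate:
    """A deliberate friction point: AI output is held until a reviewer
    approves it, and every decision is logged for governance."""
    reviewer: Callable[[str], bool]   # returns True to approve
    audit_log: list

    def submit(self, ai_output: str) -> Optional[str]:
        approved = self.reviewer(ai_output)
        self.audit_log.append({"output": ai_output, "approved": approved})
        return ai_output if approved else None

# Usage: in practice the reviewer is a person or a policy engine;
# here a toy rule blocks anything that touches money.
log: list = []
gate = ApprovalGate(reviewer=lambda text: "refund" not in text, audit_log=log)
print(gate.submit("Summarize the claim for the adjuster."))  # passes through
print(gate.submit("Issue a refund of $500."))                # held for human review
```

The friction is calibrated, not maximal: low-risk outputs flow through with only a log entry, while high-risk ones stop for a human.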
One of the biggest differences between successful and failed AI pilots is the ability to learn and improve. Static AI tools that can't retain context or learn from user interactions quickly become obsolete. Successful implementations build in memory systems that remember previous interactions, feedback mechanisms that allow users to correct errors, and learning loops that continuously improve performance based on real-world usage.
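Here is a minimal sketch of memory plus a learning loop, assuming a simple in-process store; production systems typically use vector databases or fine-tuning, but the shape is the same: remember recent turns, and never repeat an answer a user has already corrected. All names are hypothetical.

```python
class AssistantWithMemory:
    """Retains prior interactions and user corrections so the next
    answer builds on them instead of starting from scratch."""

    def __init__(self):
        self.history: list[tuple[str, str]] = []   # (prompt, response) pairs
        self.corrections: dict[str, str] = {}      # prompt -> corrected answer

    def respond(self, prompt: str) -> str:
        # Corrections take priority: once a user fixes an answer,
        # the assistant never repeats the original mistake.
        if prompt in self.corrections:
            answer = self.corrections[prompt]
        else:
            context = self.history[-5:]               # recent turns as context
            answer = self._generate(prompt, context)  # placeholder model call
        self.history.append((prompt, answer))
        return answer

    def correct(self, prompt: str, better_answer: str) -> None:
        self.corrections[prompt] = better_answer      # closes the learning loop

    def _generate(self, prompt, context) -> str:
        return f"draft answer to: {prompt}"           # stand-in for a real model
```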
AI systems are only as good as the data they're trained on and the data they receive during operation. Poor data quality is one of the most common reasons for AI pilot failures. Organizations need to invest in data collection, cleaning, and maintenance processes before launching AI initiatives. This includes establishing data governance practices, ensuring data accuracy, and creating systems for ongoing data quality monitoring.
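As one hedged example of what ongoing data quality monitoring can mean in practice, a batch-level check like the sketch below can run before training and again on live inputs, so drift is caught continuously rather than discovered after the pilot fails. The field names and the 5% threshold are invented for illustration.

```python
import math

def check_data_quality(records: list[dict], required: list[str],
                       max_missing_rate: float = 0.05) -> dict:
    """Flags records with missing required fields and fails the
    batch if the missing rate exceeds a threshold."""
    bad = 0
    for rec in records:
        for name in required:
            value = rec.get(name)
            if value is None or (isinstance(value, float) and math.isnan(value)):
                bad += 1
                break
    rate = bad / len(records) if records else 0.0
    return {"bad_records": bad, "missing_rate": rate,
            "passed": rate <= max_missing_rate}

# Usage with illustrative claims data:
batch = [{"claim_id": 1, "amount": 120.0}, {"claim_id": 2, "amount": None}]
print(check_data_quality(batch, required=["claim_id", "amount"]))
# {'bad_records': 1, 'missing_rate': 0.5, 'passed': False}
```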
While internal AI expertise is valuable, most organizations lack the specialized knowledge needed for successful AI implementation. The most successful AI projects combine internal domain knowledge with external AI expertise through strategic partnerships, pairing a deep understanding of business needs with cutting-edge technical capabilities.
Successful AI projects invest heavily in change management, training programs, and communication strategies that help employees understand and embrace AI tools. This includes addressing fears about job displacement, providing adequate training, and demonstrating clear benefits to end-users.
Decision paralysis is a common killer of AI projects. To avoid this, successful implementations establish regular decision-making cadences with key stakeholders from the very beginning. This ensures that important decisions are made quickly and that the project maintains momentum throughout the development process.
Key Performance Indicators (KPIs) are essential for measuring AI project success, but they must be specific, measurable, and tied directly to business outcomes. The best KPIs measure both efficiency gains and quality improvements, with clear baselines and targets. It's crucial to establish these metrics before development begins and track them consistently throughout implementation.
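To show what "clear baselines and targets" can mean in practice, here is a small sketch that treats each KPI as data with a pre-agreed baseline and target, so progress is unambiguous whether the metric should rise or fall. The metric names and numbers are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class KPI:
    """A metric with a pre-agreed baseline and target, so 'success'
    is defined before development starts rather than after."""
    name: str
    baseline: float
    target: float
    current: float

    def progress(self) -> float:
        # Fraction of the baseline-to-target gap already closed;
        # works for metrics that should go down as well as up.
        span = self.target - self.baseline
        return (self.current - self.baseline) / span if span else 1.0

# Illustrative values only: one efficiency KPI and one quality KPI.
kpis = [
    KPI("avg. claim handling time (min)", baseline=42.0, target=25.0, current=33.0),
    KPI("summary error rate (%)", baseline=8.0, target=2.0, current=5.0),
]
for k in kpis:
    print(f"{k.name}: {k.progress():.0%} of target gap closed")
```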
MIT research and real-world cases show that success depends not just on technology, but on strategic planning, understanding business needs, embracing complexity, and organizational alignment. By focusing on clear objectives, fostering collaboration, and designing for the inherent "friction" of AI, businesses can move beyond pilot purgatory and unlock AI's transformative potential.
At NineTwoThree Studio, we've been building successful AI and digital products since 2012, helping 75 clients create 150 products across various industries. If you're considering an AI initiative and want to ensure it becomes part of the successful 5%, we'd love to discuss your project and share insights from our experience with generative AI, product development, and go-to-market strategies. Contact us!