
On September 29, 2025, Anthropic released Claude Sonnet 4.5. I spent that Monday running it through real production tasks from my systems—not demos, not benchmarks, actual engineering work.
I use AI tools daily. Documentation, tests, UI implementation, API scaffolding - it saves me 2+ hours of repetitive work. AI has become an essential part of my workflow.
Nine months ago, several tech CEOs announced that AI would transform engineering roles. Salesforce paused engineering hires. Meta announced AI agent initiatives. Google reported 30% of new code is AI-generated.
I wanted to understand the current state: Where does AI excel? Where does it struggle? What works in production versus what's still experimental?
After 18 months of tracking industry data, daily AI usage, and systematic testing with the latest models, I found some clear patterns.
Sam Altman (OpenAI), March 2025: "At some point, yeah, maybe we do need less software engineers."
Mark Zuckerberg (Meta), January 2025: "In 2025, we're going to have an AI that can effectively be a mid-level engineer."
Marc Benioff (Salesforce), February 2025: "We're not going to hire any new engineers this year." Then he fired 4,000 support staff.
Sundar Pichai (Google), October 2024: "More than a quarter of all new code at Google is generated by AI."
By April 2025, that figure had reached 30%. The pattern: freeze hiring, lay off thousands, hype AI productivity, and justify over $364 billion in AI infrastructure spending.
What they're not telling you is more interesting.
Based on 18 months of tracking AI adoption across teams (survey panels of 18 CTOs and more than 500 engineers across startups and enterprises):
This gap between perception and measurement deserves closer examination.
Notable incident: the Replit case, in which an AI agent deleted a production database containing records on 1,206 executives and 1,196 companies, then generated fake profiles to mask the deletion (detailed analysis).
On the day Claude Sonnet 4.5 launched, I tested it with real production tasks.
Task: Add multi-tenant support to a B2B platform for data isolation between corporate clients.
AI's Proposal:
My Simplification:
Outcome: 5 minutes of architectural thinking prevented weeks of unnecessary complexity.
Pattern: AI optimizes for "theoretically possible," not "actually needed."
Task: Implement the simplified spec from Case #1.
Setup: I spent 60 minutes writing a detailed specification (file list, examples, constraints, and an explicit "what not to do" section).
AI's Output:
But:
The Math:
Task: Mock BullMQ queue in Vitest tests for a Next.js app to avoid Redis connections.
Expectation: 30–60 minutes for an experienced developer.
What Happened:
Human path: after roughly 30 minutes you realize the problem is architectural, pivot to HTTP-level mocking or Testcontainers, and have a working solution in 1–2 hours (a Testcontainers sketch follows below).
Outcome: Complete failure.
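For reference, the Testcontainers route from the human path above is small: rather than faking BullMQ's internals, run a disposable Redis for the test. A minimal sketch, assuming the testcontainers and bullmq packages are installed; queue and job names are placeholders, not the real project's:

// queue.integration.test.ts — a sketch, not the project's actual test file
import { GenericContainer, StartedTestContainer } from "testcontainers";
import { Queue } from "bullmq";
import { afterAll, beforeAll, expect, it } from "vitest";

let redis: StartedTestContainer;
let queue: Queue;

beforeAll(async () => {
  // Disposable Redis for this test run; nothing is mocked.
  redis = await new GenericContainer("redis:7-alpine")
    .withExposedPorts(6379)
    .start();
  queue = new Queue("documents", {
    connection: { host: redis.getHost(), port: redis.getMappedPort(6379) },
  });
}, 60_000);

afterAll(async () => {
  await queue.close();
  await redis.stop();
});

it("accepts a job against a real Redis backend", async () => {
  const job = await queue.add("ingest", { documentId: "doc-1" });
  expect(job.id).toBeDefined();
});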
Synthesis:
These test results show an important difference between AI as an assistant and AI as an independent agent.
Successful daily AI applications in my workflow:
The consistent factor: human oversight and validation at every step. I provide context, catch errors, reject suboptimal patterns, and guide the overall architecture.
Key observation: AI tools, when used with expert supervision, can significantly increase productivity. Without supervision, they multiply problems.
Current hiring trends show significant shifts:
(Full analysis: Talent Pipeline Trends)
This creates a potential issue for future talent development:
Year 3 projection: Shortage of mid-level engineers
Year 5 projection: Senior engineer compensation increases, shorter tenure
Year 10 projection: Large legacy codebases with limited expertise
The pattern suggests a gap between junior training and senior expertise requirements.
Executive teams often measure lines of code generated, while engineering teams measure complete delivery, including debugging, security, and maintenance. This creates different success metrics.
My Case #3 illustrates how AI tools struggle with system integration, which makes up roughly 80% of production software work; the isolated, well-scoped tasks where AI shines are the remaining 20%.
Major technology companies have committed significant capital to AI infrastructure:
This represents 30% of revenue (compared to historical 12.5% for infrastructure), while cloud revenue growth rates have decreased (Infrastructure Analysis).
AI tools amplify existing expertise but don't create it independently. Experienced developers report a 30% productivity gain on routine tasks, while junior developers exhibit increased defect rates (detailed metrics).
I'm not an AI skeptic. I use AI every single day. Here's where it genuinely transforms my productivity:
Time saved: ~2 hours daily on repetitive tasks.
But I review every single line. I know what good looks like. I catch when AI hallucinates an API that doesn't exist or suggests an O(n²) solution.
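A trivial but typical example of that second kind of catch (illustrative code, not taken from any of the cases above):

type User = { id: string; name: string };

declare const users: User[];
declare const activeIds: string[];

// The pattern AI often suggests: Array.includes rescans activeIds for every user — O(n·m).
const activeUsersSlow = users.filter((u) => activeIds.includes(u.id));

// What review turns it into: build a Set once, then constant-time lookups — O(n + m).
const activeIdSet = new Set(activeIds);
const activeUsers = users.filter((u) => activeIdSet.has(u.id));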
No. Not in 2025, not in 2030. Under current constraints, replacement at scale is not feasible.
AI is a fast intern. It amplifies expertise, but doesn't create it.
When assessing AI's role in software development, consider:
Organizations seeing success with AI tools share common approaches:
After 18 months of data collection, production testing, and daily AI tool usage:
AI tools excel as productivity enhancers. I save 2+ hours daily on documentation, test generation, UI scaffolding, and API routes. For experienced developers who can validate output, AI tools provide measurable value.
Industry implications: The combination of reduced junior hiring, increased reliance on AI tools, and significant infrastructure investment creates potential long-term challenges for talent development and code maintainability.
Organizations should view AI as a powerful tool that amplifies existing expertise rather than as a replacement for engineering capability. The most successful teams will be those that effectively combine AI efficiency with human judgment and continue investing in developing engineering talent at all levels.
If you're navigating AI implementation in your organization and want practical guidance based on production experience, NineTwoThree has delivered over 150 AI solutions that focus on measurable ROI. Our approach centers on augmenting engineering teams rather than replacing them—because we've seen firsthand what actually works in production.
P.S. All three case studies are available with full technical details. Email me if you'd like the raw transcripts of AI runs and human corrections.
P.P.S. The spec from Case #1 is included below for transparency.
B2B platform needs proper data isolation between corporate clients:
Current flow:
Web App → Queue → Processor → Vector DB → Search
Right now everything filters by userId. That works fine for the consumer use case, but B2B needs another layer - client-level isolation on top of user-level.
Instead of rebuilding everything with clientId columns and migrations (don't do this), just pass dynamic metadata through the existing flow.
The idea:
That's it. No DB changes.
// Law firm
{
  client: "lawfirm_abc",
  department: "corporate",
  confidentiality: "high"
}

// Hospital
{
  customer: "hospital_xyz",
  ward: "cardiology",
  sensitive: "true"
}

// Finance
{
  tenant: "finance_corp",
  division: "accounting",
  year: "2024"
}
Just Record<string, string>. Call the fields whatever makes sense for the client.
Upload with metadata:
POST /api/shared/v1/data
{
  "data": "document content...",
  "metadata": {
    "client": "lawfirm_abc",
    "department": "corporate"
  }
}
Search with same metadata:
POST /api/shared/v1/ask
{
  "question": "What is the policy?",
  "metadata": {
    "client": "lawfirm_abc",
    "department": "corporate"
  }
}
web-app (5 files)
Add metadata?: Record<string, string> to:
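For illustration, the optional field looks like this on the request and job types (interface names here are made up; the real ones live in the web-app repo):

// Illustrative shapes only — existing consumer traffic omits metadata, so nothing breaks.
interface UploadRequest {
  data: string;
  metadata?: Record<string, string>; // new optional field
}

interface ProcessingJob {
  documentId: string;
  userId: string;
  metadata?: Record<string, string>; // carried through the queue untouched
}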
processor (2 files)
const vectorMetadata = {
  ...(customMetadata || {}), // user-supplied fields first
  documentId,
  userId,
  fileName,
  // ... system stuff last, so caller metadata can never shadow these keys
};
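On the query side, the same metadata becomes a filter on the vector search, layered on top of the mandatory userId scope. A sketch of the shape; the vector-store call is hypothetical, so adapt it to whatever client the processor already wraps:

// Combine the mandatory user scope with whatever metadata the caller sent.
function buildSearchFilter(
  userId: string,
  metadata: Record<string, string> = {}
) {
  return { ...metadata, userId }; // userId last, so it cannot be overridden
}

// Hypothetical usage:
// const results = await vectorStore.search(embedding, {
//   filter: buildSearchFilter(userId, { client: "lawfirm_abc", department: "corporate" }),
//   topK: 10,
// });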
web-app:
processor:
