I've spent the last year watching operators try to deploy AI into their businesses. The results have been worse than the surrounding industry literature would lead you to expect. Most of these deployments fail because the technology is being asked to do something it cannot do. It's being asked to cover for the absence of an underlying process.
This post covers what I've actually seen, why I think the failure pattern is structural, and what I do differently now when an operator asks me to help automate something.
The numbers don't add up
Two sets of data sit awkwardly next to each other.
The technology demonstrably works. Microsoft's 2024 Work Trend Index Annual Report, drawing on a survey of 31,000 workers across 31 countries, found that people using AI-assisted tools were saving an average of 2.5 hours per day on tasks the tools handled (Microsoft & LinkedIn, 2024). McKinsey's published research puts achievable operational cost reductions at 20 to 30 percent in functions where AI is actively deployed (Chui et al., 2023). By the end of 2024, more than 90 percent of executives reported plans to deploy AI-enabled automation within the next year.
The deployments fall apart anyway. Boston Consulting Group's 2024 review of enterprise AI initiatives, surveying 1,000 CxOs and senior executives across 59 countries, found that 74 percent of companies had yet to show tangible value from their use of AI (Boston Consulting Group, 2024). Gartner now projects that more than 40 percent of agentic AI projects will be canceled by the end of 2027 due to escalating costs, unclear business value, or inadequate risk controls (Gartner, 2025).
The technology works. The implementations fail. Both things are true. I want to spend the rest of this post explaining why I think the gap between those two truths exists, and what I've learned to do about it.
The pattern
When I look back at the deployments I've watched fail across professional services, construction, real estate operations, and adjacent verticals, they fail in roughly the same way.
An operator identifies a workflow that consumes significant employee time and decides to automate it. An AI tool is selected, usually an LLM-based assistant or a no-code workflow builder. A controlled pilot shows promising results. The deployment expands. Somewhere between sixty and a hundred and twenty days in, output quality starts degrading, exceptions multiply, and operators quietly route work back to manual processes.
In every case I've reviewed closely, the AI tool itself was performing as specified. The collapse happened around the AI, not inside it. Handoffs that depended on a human's judgment broke. Decisions that had no documented logic produced bad outputs. Exception cases that the original "process" had never actually addressed, because some experienced employee had been silently absorbing them, stopped being handled.
The workflow being automated was a habit performed by a person, with the person serving as the live exception handler. Once the person was removed, the exceptions stopped being handled.
This is the failure mode at its most general: the AI was deployed onto an undocumented process, and the deployment surfaced the absence of process underneath.
Workflow versus process
A distinction helps here: a workflow and a process are not the same thing.
A workflow is what an organization does. A process is what an organization has documented, made repeatable, and decoupled from individual judgment. Most service businesses operate primarily on workflows. They have processes only for things compliance forces them to have processes for, like payroll, billing, and regulatory filings. Everything else runs on tribal knowledge held by experienced employees.
This used to be a defensible operating model. The business grew at the speed at which it could absorb and train new people. Tribal knowledge transferred informally during the absorption period. The model worked.
AI breaks the model in a specific way. AI proposes to scale the work without scaling the people who carry the tribal knowledge. The implicit knowledge required to make the work coherent has nowhere to go. Without that knowledge, the AI's outputs gradually drift from what the business actually needs, and nobody catches the drift until it shows up in customer complaints, revenue churn, or quality incidents.
McKinsey's research on generative AI productivity surfaced this finding indirectly. The reported gains were concentrated in roles where the work was already well-specified: software engineering, structured customer support, document summarization (Chui et al., 2023). Gains were significantly smaller, sometimes negative, in roles where the work was specified primarily through human judgment.
The implication is the thing I want operators to internalize: AI accelerates whatever process you point it at, including the absence of one.
The compounding problem
The pattern compounds, and this is the part most operators don't see coming.
Organizations that deploy AI onto undocumented workflows fail in three layered ways at once.
First, the AI deployment generates output at a scale that exceeds the organization's capacity to inspect it. When the output quality is high, this is fine. When the output quality is variable, the variance ships to customers before anyone notices.
Second, the human exception handlers, the people who were silently absorbing the unmodeled cases, are now responsible for catching errors at machine speed rather than executing the original work at human speed. This is a categorically harder job, and the same employees who were previously valuable workflow operators become bottleneck reviewers. Most of them are bad at the new job, because nobody hired them for it.
Third, the organization's institutional knowledge of how the work actually gets done starts to atrophy. The people who knew the unwritten rules either leave (because their role has shifted to error correction) or stop reinforcing the rules in their day-to-day work (because the AI is now doing the day-to-day work). Within twelve to eighteen months, the organization is dependent on the AI deployment, has lost the ability to operate without it, and has degraded the quality floor of its core deliverable.
The financial picture lags the operational picture by one or two reporting cycles. By the time the cost shows up on the P&L, the institutional ability to fix it is often already gone. This is what I mean when I say AI scales broken processes. The process keeps running. Your ability to remember how it used to work is what disappears.
Why operators keep walking into this
There's a fair question here. If this pattern is well-documented, and it is, why do operators keep walking into it?
I think the information is available. The problem is that the incentive structure around AI deployment systematically rewards deployment over readiness.
Vendor incentives push toward fast deployment. AI tooling vendors have a commercial interest in positioning their products as drop-in replacements for human labor. Gartner has noted that many vendors engage in what it calls "agent washing," rebranding existing chatbots and automation tools as agentic AI without delivering meaningful autonomous capabilities (Gartner, 2025). The actual deployment requires significant process work that the vendor neither performs nor scopes. The vendor's pilot success rate is high. Their long-term implementation success rate is unknown to them, because by the time things degrade, attribution has been diluted.
Buyer incentives push the same direction. Operators under cost pressure are incentivized to pursue automation regardless of process readiness. The cost of doing nothing is visible in the current quarter. The cost of doing it badly only shows up in later quarters. The decision tree at the moment of purchase is asymmetric.
Implementation consultants are paid for deployment, not durability. The standard engagement closes before the eighteen-month window in which most failures manifest. There is no economic actor in the standard implementation chain whose compensation is tied to whether the deployment is still working in year two.
The pattern is incentive-driven, which is why information alone has not been enough to fix it. Knowing this is the first thing that lets you sidestep it.
What I do now
After enough of these post-mortems, I changed how I work. I no longer accept "What should we automate?" as a starting question from a prospective client. I start with a different question: is the work you want to automate actually a process, or is it a habit that an experienced employee performs?
The diagnostic I run is four questions, and the order matters.
One. Is the workflow currently producing a defined output? If the output varies materially based on which employee performs it, the workflow is a craft, not a process. Crafts can be automated, but only after the variation is reconciled to a standard. Automation deployed against an unreconciled craft will scale the variation.
Two. Are the exceptions documented? In most service workflows, the exception cases consume disproportionate operator time. If the people doing the work cannot articulate the full set of exception cases and how each is currently resolved, the workflow is not ready for automation. The exceptions will become silent failure modes after deployment.
Three. Does the workflow have a measurable quality threshold? If you can't specify what "correct" output looks like, you can't detect degradation. Automation against an unspecified quality threshold produces a system that cannot be debugged, because there's no agreed definition of what would constitute a bug.
Four. Is the workflow's output consumed by a downstream process that is itself well-defined? Automation amplifies coupling. If the workflow's output feeds another workflow that's itself undocumented, automating the upstream workflow propagates errors faster. The downstream workflow has to be readiness-evaluated before either can be automated.
A workflow that fails any of these four questions needs process work first, automation second. In my experience, fewer than one in five workflows that operators initially nominate for automation pass all four readiness questions on first review.
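For readers who think in code, the four questions combine into a single go/no-go gate. This is a minimal sketch of my own; the field names and remediation labels are illustrative, not a tool or terminology from any vendor:

```python
from dataclasses import dataclass

@dataclass
class WorkflowReadiness:
    """Answers to the four readiness questions for one candidate workflow."""
    defined_output: bool         # Q1: same output regardless of who performs it?
    exceptions_documented: bool  # Q2: full exception set articulated, with resolutions?
    quality_threshold: bool      # Q3: a measurable definition of "correct" output?
    downstream_defined: bool     # Q4: downstream consumer is itself well-defined?

    def ready_for_automation(self) -> bool:
        # Failing ANY question means process work first, automation second.
        return all([
            self.defined_output,
            self.exceptions_documented,
            self.quality_threshold,
            self.downstream_defined,
        ])

    def gaps(self) -> list[str]:
        # Map each failed question to the process work it implies.
        remediations = {
            "defined_output": "reconcile output variation to a standard",
            "exceptions_documented": "document exception cases and their resolutions",
            "quality_threshold": "specify a measurable quality threshold",
            "downstream_defined": "readiness-evaluate the downstream workflow",
        }
        return [fix for field, fix in remediations.items()
                if not getattr(self, field)]
```

The point of the structure is the `all()`: readiness is conjunctive, so a workflow that is strong on three questions and weak on one still routes to process work, with `gaps()` naming what that work is.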
The reframe this produces is significant. Most operators believe they have a software problem, and they discover, in the mapping, that they have a process problem dressed up as a software problem. The financial case is also better than people expect. Process work, once completed, makes the subsequent automation deployment cheaper, faster, and more durable. The mapping pays for itself before the automation begins.
What I want operators to take from this
The current generation of AI tooling is, on technical merits, capable of substantial operational impact. The published productivity research is real (Microsoft & LinkedIn, 2024). The cost reduction figures are achievable (Chui et al., 2023). I'm not arguing with any of that.
What I am arguing with is the assumption that deployment is the rate-limiting step. In the operations I've reviewed, deployment is not rate-limiting. Process readiness is. Operators who automate without first establishing readiness are introducing a more sophisticated form of organizational fragility, on accelerating timescales, because the technology is improving faster than the operating practice around it.
The fix is to invert the order of operations. Process mapping first, automation second, scale third. Operators who do this over the next twenty-four months will compound their gains. The ones who don't will become the source of the failure statistics that future research will cite.
If you're an operator trying to figure out where you sit on this readiness question, the four-question diagnostic I described above (we call it the Operations Map) is what we run for free in our first conversation with anyone considering working with us. Twenty minutes, no deck, no proposal. You don't need to be in the market to hire us to take it. You'll know more about your own operation at the end of it than you did at the start.
That's the offer. The link is below.
References
- Boston Consulting Group. (2024, October 24). AI adoption in 2024: 74% of companies struggle to achieve and scale value. https://www.bcg.com/press/24october2024-ai-adoption-in-2024-74-of-companies-struggle-to-achieve-and-scale-value
- Chui, M., Hazan, E., Roberts, R., Singla, A., Smaje, K., Sukharevsky, A., Yee, L., & Zemmel, R. (2023). The economic potential of generative AI: The next productivity frontier. McKinsey & Company. https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-economic-potential-of-generative-ai-the-next-productivity-frontier
- Gartner. (2025, June 25). Gartner predicts over 40% of agentic AI projects will be canceled by end of 2027 [Press release]. https://www.gartner.com/en/newsroom/press-releases/2025-06-25-gartner-predicts-over-40-percent-of-agentic-ai-projects-will-be-canceled-by-end-of-2027
- Microsoft & LinkedIn. (2024). 2024 work trend index annual report: AI at work is here. Now comes the hard part. Microsoft. https://www.microsoft.com/en-us/worklab/work-trend-index/ai-at-work-is-here-now-comes-the-hard-part
Operations Map
Map your operations.
Twenty minutes, no deck, no proposal. Walk through what your team does, and find out what software could be doing instead.