The AI Classification Problem

TLDR;-) The dominant AI failure mode is taxonomic, not technological. Organisations misclassify AI initiatives: they treat transformation as optimisation because optimisation is fundable and governable. Everything downstream follows from the classification: timeline, stakeholders, success metrics, mandate. Get the classification wrong and you have set the initiative up to fail before any code is written.

Your AI initiative was approved as an optimisation. It is actually a transformation.

That sentence describes the single most common failure pattern I observe across corporate innovation programmes. The technology performs. The pilot delivers results. And then the initiative stalls, runs over budget, or gets quietly shelved. Not because the AI didn't work. Because the organisation resourced, governed, and evaluated it as the wrong kind of problem.

The AlixPartners 2026 Disruption Index, surveying 3,200 executives across 11 countries, estimates that approximately 80% of AI projects fail. RAND's peer-reviewed 2024 study puts the figure at the same level. MIT's NANDA initiative found that 95% of generative AI pilots deliver zero measurable return. These numbers are by now familiar. What is less familiar is the question nobody is asking: fail at what, exactly?

AlixPartners documents that success is concentrated in operational efficiency. Growth leaders distinguish themselves through discipline, not ambition. The report then urges bolder AI adoption. This is a contradiction. If success clusters in operational discipline and failure clusters in ambitious deployment, the problem is not that organisations lack ambition. The problem is that they cannot tell the difference between an operational improvement and a structural transformation. They are treating these as the same kind of work. They are not.

Innovation is not integration

Before we classify types of AI initiative, we need to address a prior confusion. A significant proportion of what organisations label "AI innovation" is not innovation at all. It is integration: deploying known capabilities into existing processes.

Rolling out Microsoft Copilot across your organisation is not innovation. Connecting an LLM to your customer service workflow using an off-the-shelf API is not innovation. These are integration projects. They deploy mature capabilities into existing structures. They are legitimate work. They can deliver real value. But they require fundamentally different sequencing, ownership, metrics, and timelines from genuine innovation.

The confusion matters because it creates two distinct failure modes. When integration is treated as innovation, the project inherits exploration overhead: discovery workshops, hypothesis testing, open-ended timelines. Deployment work gets wrapped in innovation theatre. The team builds prototypes when it should be building rollout plans.

When innovation is treated as integration, the project inherits specification requirements: fixed scope, fixed timeline, governance designed for predictable execution. Exploratory work gets squeezed into a delivery template that cannot accommodate uncertainty. The team writes requirements documents for problems it has not yet understood.

Both failure modes originate in the same place: classification. And the reason organisations default to the wrong classification is political. Integration is fundable because it improves existing processes. Innovation is governable because it can be siloed in labs and accelerators. Misclassification occurs because organisations default to whichever frame is politically easier, not whichever frame is accurate.

Why the standard frameworks cannot help

The instinct, when facing a classification problem, is to reach for an existing framework. The three most commonly referenced frameworks each fail to classify AI correctly. Not because they are wrong. Because they use the wrong axis.

  • Christensen's disruption theory classifies by market entry pattern. Disruptive innovations enter markets from below: cheaper, simpler, initially inferior. They serve over-served or non-consuming customers. Incumbents dismiss them because they do not threaten the most profitable segments.

AI violates every one of these assumptions simultaneously. It enters the market from above, at a premium. GPT-4 passed the bar exam. Incumbents are investing tens of billions, not dismissing the technology. It outperforms existing solutions from day one across multiple domains. It serves the most demanding customers first. The Christensen Institute's own Michael Horn conceded in 2024 that it does not make much sense to classify generative AI as disruptive in and of itself.

The point is not that Christensen was wrong. His framework accurately describes a specific pattern of market entry. AI is not that pattern. His classification mechanism is unusable here, and applying it produces misleading signals.

  • Nagji and Tuff's innovation ambition matrix classifies by product and market novelty along two axes: how far from existing products and how far from existing markets. Their framework is elegant and widely cited, including by Gartner. It generated the well-known 70/20/10 portfolio split across core, adjacent, and transformational innovation.

The problem is that their axes measure ambition, not organisational impact. An AI API integration registers as "core" on both axes: existing technology, existing market. But it can require transformational organisational change: new data governance, new skills, new accountability structures, new decision-making processes. Nagji and Tuff's framework would tell you this is a core initiative. Your organisation will discover it is not. The 70/20/10 split itself has weak empirical grounding. An Innovation Leader/KPMG survey found actual portfolio allocations closer to 49/28/23, which looks nothing like the canonical ratio.

  • Wardley mapping classifies by evolutionary stage: genesis, custom-built, product, commodity. Its resolution is higher than the other two frameworks, and its insight about co-evolution of practice is the sharpest concept available for understanding AI's organisational impact. But AI is a composite technology spanning the full evolutionary axis simultaneously. GPU compute infrastructure is commodity. Foundation models are late-stage product. Domain-specific applications are genesis. Organisational practices for working with AI are novel and unformed.

Wardley's own resolution for this is correct: decompose. Map each component separately. But organisations do not decompose. They treat "AI" as a monolithic initiative and classify the whole thing at once. This is where Wardley's co-evolution concept becomes diagnostic. AI infrastructure has evolved to commodity. Organisational practices for absorbing AI remain at genesis. That gap between the maturity of the technology and the immaturity of the practice is the misclassification mechanism.

Henderson and Clark identified this dynamic in 1990, in a context that had nothing to do with AI. Their study of semiconductor photolithographic equipment showed that misclassification of innovation type was the primary failure mechanism for established firms. Companies that correctly classified an innovation as architectural (requiring reconfiguration of existing components) performed well. Companies that misclassified architectural innovation as incremental (improving components within an existing architecture) failed. The firms that failed were not less capable. They were less accurate about what kind of problem they were solving.

The same mechanism is operating now, at scale, across every industry.

Three classes of AI initiative

If the standard frameworks cannot classify AI accurately, what can? The answer starts with a different axis entirely. Not market entry pattern, not product/market novelty, not evolutionary stage. Organisational impact.

I use three classes in my own diagnostic work with corporate innovation teams and accelerator programmes.

Optimisation improves an existing process within the current operating model. Structures are unchanged. Authority relationships are unchanged. The AI replaces or accelerates a specific task. Timeline: 3 to 12 months. Example: automated document classification that speeds up an existing compliance workflow. The people doing the work remain the same. The decisions remain the same. The output is faster.

Adjacency extends into a neighbouring domain. The operating model is modified but not replaced. New stakeholders are required. Existing roles shift. Timeline: 6 to 24 months. Example: AI customer service agents that change how escalation decisions are made. The work is recognisable but the decision-making processes have changed. Someone who was not previously involved now has authority. Someone who previously had authority no longer does.

Transformation changes how the organisation operates. New capabilities, new structures, new incentive systems, new authority relationships. Timeline: 18 to 48 months. Example: AI as a core capability that reorganises how work is distributed, evaluated, and governed. The people doing the work, the skills they need, the way performance is measured, and the lines of accountability have all changed.

The most actionable diagnostic I use is the reversibility test. If the AI were removed tomorrow, would the organisation's processes revert to their previous state? If yes, it is an optimisation. If reverting would be painful but possible, it is an adjacency. If reverting is unthinkable because the organisation has fundamentally reorganised around the capability, it is a transformation.

The reversibility test works because it cuts through the ambiguity that the other frameworks leave unresolved. It does not ask what the technology does. It asks what the organisation would have to undo.
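To make the test concrete, here is a minimal sketch in Python. The names (Reversibility, InitiativeClass, classify) and the three-way answer are illustrative choices of mine, not part of a formal model; the point is that the input to the classification is a single organisational question, not a description of the technology.

    from enum import Enum

    class Reversibility(Enum):
        # Answer to: if the AI were removed tomorrow, what happens?
        REVERTS_CLEANLY = "processes revert to their previous state"
        PAINFUL_BUT_POSSIBLE = "reverting is painful but possible"
        UNTHINKABLE = "the organisation has reorganised around the capability"

    class InitiativeClass(Enum):
        OPTIMISATION = "optimisation"      # 3-12 months, current operating model
        ADJACENCY = "adjacency"            # 6-24 months, operating model modified
        TRANSFORMATION = "transformation"  # 18-48 months, operating model rebuilt

    def classify(removal_outcome: Reversibility) -> InitiativeClass:
        # Classify by what the organisation would have to undo,
        # not by what the technology does.
        if removal_outcome is Reversibility.REVERTS_CLEANLY:
            return InitiativeClass.OPTIMISATION
        if removal_outcome is Reversibility.PAINFUL_BUT_POSSIBLE:
            return InitiativeClass.ADJACENCY
        return InitiativeClass.TRANSFORMATION

    # Example: an initiative the organisation could not unwind without
    # rebuilding decision processes classifies as a transformation.
    print(classify(Reversibility.UNTHINKABLE).value)  # -> "transformation"

The deliberate limitation is that classify takes no information about the technology at all; that is the test's point.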

What the failure data shows

Read the failure data with the classification lens and a pattern becomes visible.

RAND's 2024 study interviewed 65 experienced data scientists and ML engineers. The number one root cause of failure: stakeholders misunderstand the problem that needs to be solved. Not the technology. Not the data. The problem definition. Eighty-four percent of interviewees cited leadership-driven failures as the primary cause. The most common form: instructing the data science team to solve the wrong problem.

This is a classification error. When an organisation classifies a transformation as an optimisation, it defines the problem in optimisation terms: faster throughput, lower cost, fewer errors. The data science team optimises for these metrics. The model works. And the initiative fails anyway, because the actual problem was never about throughput. It was about reorganising how decisions get made.

McKinsey's 2024 AI findings show a related pattern. Only 17% of organisations report that 5% or more of EBIT comes from generative AI. Organisations that redesigned workflows before selecting models were twice as likely to succeed. Workflow redesign is a classification decision. It means recognising that the initiative is not an optimisation of the existing workflow but at minimum an adjacency requiring structural change.

The Scrum.org analysis of 166 AI anti-patterns found that 65% have organisational causes and 22% have technical causes. Nearly two-thirds of the failure patterns are not about the technology at all.

Digital transformation showed the same profile: McKinsey has long put failure rates at around 70%, and Bain's 2024 analysis reached 88%. The root cause is structural, not technical. Organisations treated digital transformation as a technology deployment problem when it was an organisational redesign problem. The same misclassification, a decade earlier, with a different technology.

AI's version of this problem is wider than its predecessors. The gap between the maturity of AI technology and the maturity of organisational practice for absorbing it is larger than the equivalent gap was for cloud computing, mobile, or social media. Each of those technologies had a period of ambiguity about what they required organisationally. AI's ambiguity period is compressed into a shorter timeframe with exponentially larger investment. The cost of misclassification is correspondingly higher.

The diagnostic

The failure is not in the technology. It is not in the strategy. It is in the classification: the pre-strategic decision about what kind of problem this is.

Every corporate AI initiative has two identities. What the organisation approved, and what the initiative actually requires. The gap between these two is the misclassification, and it cascades through every downstream decision.

Three questions to surface the gap; the sketch after the list shows one way to combine the answers.

  • What did your organisation approve? Look at the business case, the timeline, the governance structure, the success metrics. These tell you the classification the organisation applied. A 6-month timeline with ROI targets and IT governance says optimisation. A 24-month timeline with cross-functional steering says adjacency. If your initiative has the first set of constraints but the second set of requirements, you have a misclassification.

  • What does the initiative actually require? Apply the reversibility test. If you removed the AI capability tomorrow, what would need to change? If the answer is "nothing structural," you have an optimisation. If the answer includes "we would need to re-hire, re-train, or reconstruct decision processes," you have something larger.

  • Where is the gap? The distance between what was approved and what is required is the misclassification. It predicts the initiative's failure mode, its timeline overrun, and the political conflict that will emerge when the mismatch becomes visible.
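Combining the answers is mechanical once they are honest. The sketch below reuses the InitiativeClass enum and classify function from the earlier example and is a hypothetical illustration: the Approval fields, the timeline-and-governance heuristic, and the classification_gap scoring are simplifications of mine, not a formal instrument.

    from dataclasses import dataclass

    @dataclass
    class Approval:
        # What the organisation approved: read from the business case.
        timeline_months: int
        governance: str  # e.g. "IT", "cross-functional", "executive"

    def approved_classification(a: Approval) -> InitiativeClass:
        # Rough proxy: a short timeline under IT governance reads as
        # optimisation; a longer, cross-functional mandate reads as adjacency.
        if a.timeline_months <= 12 and a.governance == "IT":
            return InitiativeClass.OPTIMISATION
        if a.timeline_months <= 24:
            return InitiativeClass.ADJACENCY
        return InitiativeClass.TRANSFORMATION

    def classification_gap(approved: InitiativeClass,
                           required: InitiativeClass) -> int:
        # Distance between what was approved and what the initiative requires.
        order = [InitiativeClass.OPTIMISATION,
                 InitiativeClass.ADJACENCY,
                 InitiativeClass.TRANSFORMATION]
        return order.index(required) - order.index(approved)

    # Example: approved as a 6-month IT project, but the reversibility test
    # says transformation -- a two-step gap, the widest possible.
    approved = approved_classification(Approval(timeline_months=6, governance="IT"))
    required = classify(Reversibility.UNTHINKABLE)
    print(classification_gap(approved, required))  # -> 2

A gap of zero means the approval matches the requirement; a positive gap means the initiative was approved as something smaller than it is, which is the pattern this article describes.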

Henderson and Clark demonstrated 35 years ago that established firms fail primarily from misclassifying the type of innovation they face. The same mechanism is operating now. The technology is different. The classification error is identical.

Alexandra Najdanovic is the founder of Aieutics, working with founders and corporate innovation teams on strategic transformation and AI-readiness. The Critical Path Layers framework referenced in this article is her proprietary diagnostic model for innovation sequencing.

Further reading

  • RAND Corporation, "The Root Causes of Failure for Artificial Intelligence Projects and How They Can Succeed" (2024). Peer-reviewed study of 65 experienced AI practitioners. The most rigorous bottom-up analysis of AI failure available.

  • Henderson, R. M. and Clark, K. B., "Architectural Innovation: The Reconfiguration of Existing Product Technologies and the Failure of Established Firms," Administrative Science Quarterly, 35(1), 1990. The foundational study on how misclassification of innovation type causes firm-level failure.

  • Nagji, B. and Tuff, G., "Managing Your Innovation Portfolio," Harvard Business Review, 90(5), 2012. The innovation ambition matrix and the 70/20/10 framework, useful for portfolio allocation but limited for classifying AI's organisational impact.

  • Wardley, S., "Bits or Pieces" blog series and Wardley Maps methodology. The co-evolution of practice concept is the single most useful lens for understanding why AI infrastructure maturity does not imply organisational readiness.

  • AlixPartners, "Disruption Index 2026." 3,200+ executives across 11 countries. The contradiction between its failure data and its prescription for bolder adoption is itself a diagnostic.

Next

The AI-Native Paradox: Why AI Is Breaking the Signals Founders and Investors Rely On