AI Agents Can't Do Consulting Work Yet—But Their Creator Says Give Them Until 2026

The promise of artificial intelligence replacing high-paid knowledge workers has been one of the most tantalizing—and most contested—narratives in the technology sector. Now, one of Silicon Valley’s most closely watched AI startups is offering a surprisingly candid admission: its AI agents largely failed when put to the test on real consulting tasks. But the company’s CEO insists the failures are temporary, and that a dramatic transformation of the consulting industry is just months away.

Mercor, the AI-powered talent marketplace valued at $2 billion after a recent funding round led by Benchmark and with backing from Peter Thiel and Ben Horowitz, has been running experiments pitting its AI agents against tasks typically handled by management consultants. The results, according to CEO Brendan Foody, were humbling. As reported by Business Insider, the AI agents struggled with the kind of ambiguous, multi-step analytical work that forms the bread and butter of firms like McKinsey, Bain, and Boston Consulting Group.

A Candid Admission From a $2 Billion Startup

Foody’s transparency is notable in an industry where executives routinely overpromise on AI capabilities. In his account to Business Insider, Foody described how Mercor tested its AI agents on consulting-style engagements—the kind of work that involves synthesizing large volumes of unstructured data, developing strategic frameworks, and producing polished client-ready deliverables. The agents, he said, could handle discrete, well-defined subtasks but fell apart when confronted with the holistic judgment and contextual reasoning that consulting work demands.

The specific shortcomings are instructive. AI agents could pull data, generate summaries, and even draft slide decks. But they struggled to understand the underlying business problem, failed to ask clarifying questions the way a seasoned consultant would, and produced recommendations that lacked the nuance clients expect when they’re paying upward of $500 an hour. The gap between “impressive demo” and “billable work product” turned out to be enormous—a reality check that resonates across the enterprise AI sector.

Why Consulting Is Such a Hard Nut to Crack

Management consulting has long been considered one of the white-collar professions most vulnerable to AI disruption. The work involves pattern recognition, data analysis, and the synthesis of best practices—all areas where large language models theoretically excel. Yet the Mercor experiments reveal why the reality is far more complicated. Consulting engagements are rarely about answering a single, well-formed question. They require navigating organizational politics, understanding unstated client objectives, and iterating on deliverables through a feedback loop that demands emotional intelligence as much as analytical horsepower.

The major consulting firms themselves have been investing heavily in AI, but largely as a tool to augment their human consultants rather than replace them. McKinsey launched its own generative AI platform called Lilli, while BCG has partnered with OpenAI and built internal tools to accelerate research and analysis. These firms see AI as a leverage multiplier—enabling a junior analyst to do the work of two or three—rather than a wholesale substitute for human judgment. Mercor’s findings, perhaps inadvertently, validate that cautious approach.

Foody’s Bold 2026 Prediction

Despite the current shortcomings, Foody is not backing down from his long-term thesis. He told Business Insider that he expects AI agents to be capable of replacing consultants on many standard engagements by 2026. His confidence rests on the rapid pace of improvement in foundational AI models, the development of better agent architectures that can decompose complex problems into manageable steps, and Mercor’s own proprietary data on how humans successfully complete knowledge work.

Mercor’s unique position in the market gives it an unusual vantage point. The company operates a platform that matches businesses with skilled freelancers and contractors, and it has accumulated a vast dataset on how human workers approach and complete tasks. Foody believes this data is the key ingredient that will allow Mercor to train AI agents that don’t just mimic surface-level outputs but actually replicate the reasoning processes of top-tier consultants. It’s an ambitious claim, but one grounded in a concrete data advantage that most pure-play AI labs lack.

The Broader Race to Automate Professional Services

Mercor is far from the only company chasing this prize. The professional services automation space has attracted enormous venture capital investment over the past 18 months. Startups like Cognition, whose Devin product is marketed as an AI software engineer, and Harvey, which targets legal work, are pursuing similar strategies in adjacent professional domains. The common thesis is that AI agents (autonomous systems capable of executing multi-step workflows with minimal human oversight) will eventually absorb large categories of knowledge work currently performed by highly educated, highly compensated professionals.

But the track record so far has been mixed at best. Devin faced scrutiny after independent testers found its capabilities fell short of marketing claims. Harvey has gained traction in legal research but has not yet displaced junior associates at major law firms. Across the board, the pattern is consistent: AI agents perform impressively on narrow, well-defined tasks but struggle with the open-ended, judgment-intensive work that defines professional services. The question is whether this is a fundamental limitation or merely a temporary one that will be overcome as models improve.

What the Consulting Firms Are Actually Worried About

Inside the major consulting firms, the conversation about AI is more nuanced than the public narrative suggests. Partners at top firms privately acknowledge that AI will eventually compress the time required for many standard analyses—market sizing, competitive benchmarking, financial modeling—that currently occupy armies of junior consultants. But they also point out that the most valuable part of consulting has never been the analysis itself. It’s the relationships, the credibility, the ability to tell a CEO something difficult and be believed.

This relational dimension of consulting is precisely what AI agents cannot replicate, at least not yet. When a McKinsey partner walks into a boardroom, they bring not just a slide deck but decades of pattern recognition across hundreds of client engagements, the social authority to challenge executive assumptions, and the political savvy to navigate competing stakeholder interests. These are capabilities that emerge from embodied human experience, and no amount of training data can easily substitute for them. Mercor’s Foody seems to acknowledge this implicitly by focusing on the more routine, analytical components of consulting rather than the advisory relationship itself.

The Economic Stakes Are Staggering

The global management consulting market is worth an estimated $300 billion annually, according to industry analyses. Even partial automation of the analytical and research components of consulting work could represent a multi-billion-dollar disruption. For the consulting firms, the threat is not that AI will eliminate their business overnight, but that it will erode the economic model that depends on billing large teams of junior consultants at premium rates. If an AI agent can do in hours what a team of analysts does in weeks, the justification for those fees collapses.

This is why the most forward-thinking consulting firms are racing to reposition themselves as AI-enabled rather than AI-vulnerable. Accenture has committed $3 billion to AI investments. Deloitte has built dedicated AI practices across all its business lines. The strategic calculus is clear: if automation is coming for the analytical grunt work, the firms that survive will be those that have already shifted their value proposition toward higher-order strategic advice and implementation support that AI cannot easily replicate.

What Mercor’s Experiment Tells Us About the State of AI

Perhaps the most valuable takeaway from Mercor’s candid assessment is what it reveals about the current state of AI agent technology more broadly. The hype cycle around autonomous AI agents has reached a fever pitch, with breathless predictions about the imminent obsolescence of entire professions. Mercor’s experience suggests that the reality is far more incremental. AI agents are genuinely useful for well-structured, repetitive tasks. They are genuinely poor at the kind of ambiguous, context-dependent reasoning that characterizes high-value professional work.

Foody’s 2026 timeline is aggressive but not implausible, given the pace of advancement in model capabilities. OpenAI, Anthropic, and Google DeepMind are all investing billions in improving the reasoning abilities of their foundational models. If those efforts bear fruit, the downstream implications for companies like Mercor—and for the consulting industry—could be profound. But the history of AI is littered with predictions that proved premature, and the gap between laboratory benchmarks and real-world professional performance has consistently been wider than technologists expected.

For now, the consultants can breathe easy—but they would be wise to spend the next 12 months preparing for a world where their analytical monopoly is no longer guaranteed. The AI agents aren’t ready today. The question that should keep every managing partner up at night is whether they’ll be ready tomorrow.
