Anthropic's Margin Squeeze: Inference Costs Bite as Revenue Surges

As artificial intelligence companies race to dominate the market, The Information reports that Anthropic has cut its 2025 gross margin projection to 40% from an earlier estimate of 50%, after inference costs exceeded expectations by 23%. The company’s revenue, meanwhile, rocketed to $4.5 billion in 2025, a 12-fold increase, according to internal projections cited by Sri Muppidi in The Information. The contrast underscores the mounting pressure on AI labs to balance explosive demand with soaring operating expenses.

Anthropic’s challenges stem from higher-than-anticipated spending on inference, the work of running AI models for users. ‘Those inference costs were 23% higher than the company had anticipated,’ The Information notes. The firm calculates gross margin by subtracting those costs, along with other expenses tied to product sales, from revenue. Enterprise demand has fueled revenue growth, pushing annualized revenue toward $3 billion as reported by Reuters in May 2025, but the cost of serving that demand has run ahead of plan.
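The margin arithmetic described above can be sketched in a few lines. This is a rough illustration, not Anthropic’s accounting: the function names are hypothetical, and it assumes the same $4.5 billion revenue figure under both the 40% and 50% margins, which the reporting does not state. The resulting cost figures are implied by the arithmetic, not reported numbers.

```python
def gross_margin(revenue: float, cost_of_revenue: float) -> float:
    """Gross margin as a fraction of revenue: (revenue - costs) / revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return (revenue - cost_of_revenue) / revenue


def implied_cost_of_revenue(revenue: float, margin: float) -> float:
    """Cost of revenue (inference plus other product sales costs) implied by a margin."""
    return revenue * (1.0 - margin)


# Revenue in billions of dollars, as cited in the article.
REVENUE_2025 = 4.5

# Assumption: revenue is held fixed at $4.5 billion for both margin figures.
cost_at_40 = implied_cost_of_revenue(REVENUE_2025, 0.40)
cost_at_50 = implied_cost_of_revenue(REVENUE_2025, 0.50)

print(f"Implied cost of revenue at 40% margin: ${cost_at_40:.2f}B")
print(f"Implied cost of revenue at 50% margin: ${cost_at_50:.2f}B")
print(f"Difference: ${cost_at_40 - cost_at_50:.2f}B")
```

Under that fixed-revenue assumption, a ten-point drop in gross margin corresponds to ten percent of revenue in additional cost. The sketch does not try to reconcile this with the 23% inference overrun, since gross margin also reflects other product sales expenses and the earlier revenue forecast may have differed.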

Posts on X echo this tension, with users like Rohan Paul highlighting Anthropic’s ‘downward margin reset’ and Dario Amodei’s warnings about a ‘cone of uncertainty’ in AI investments. The firm, backed by Amazon and Google, relies heavily on their cloud infrastructure, and it is on those servers that inference spending ran over budget.

Inference Expenses Eclipse Projections

The core issue lies in the compute-intensive nature of serving AI queries. Inference, the process of generating responses from trained models like Claude, now dominates costs as user volumes surge. The Information details how Anthropic’s projections, shared internally last month, reflect this reality amid aggressive expansion.

Earlier optimism had painted a rosier picture, with projections of $70 billion in revenue and $17 billion in profit by 2028, as noted in X posts citing The Information. Yet, reality intervened: total spending on R&D, marketing, and administration hit $7 billion in 2025, outpacing even the robust revenue gains.

Industry observers point to broader trends. Hacker News discussions debate whether labs like Anthropic and OpenAI are losing money on inference, scrutinizing metrics like model FLOPS utilization and GPU efficiency.

Hardware Innovations Target Bottlenecks

Startups like D-Matrix are tackling these pain points head-on. In The New Stack, the company champions in-memory compute to alleviate AI inference bottlenecks. Traditional architectures shuttle data between memory and processors, creating latency; D-Matrix’s approach keeps computation within the memory chips, cutting energy use and speeding responses.

‘AI inference is hitting a wall with memory bandwidth,’ The New Stack quotes D-Matrix executives, emphasizing how their chips could cut costs by processing tokens faster. This aligns with Anthropic’s plight, where inference overran budgets on hyperscaler hardware.

Memory constraints are a hot topic in 2026 news. CNBC reports Morgan Stanley favoring memory stocks as capacity emerges as a key AI buildout bottleneck, with firms like Micron poised for gains.

Enterprise Shift Drives Revenue Surge

Anthropic’s business-focused strategy is paying off. AI CERTs News highlights accelerating profitability from enterprise contracts, and posts on X suggest Claude Code is nearing $1 billion in annual recurring revenue. Reuters confirmed $3 billion in annualized revenue in mid-2025, validating generative AI’s enterprise appeal.

Yet profitability remains elusive in the short term. Projections show losses narrowing, with breakeven eyed by 2027, three years ahead of rivals, per The Information. CEO Dario Amodei has stressed efficiency bets, telling CNBC the firm’s ‘do more with less’ philosophy keeps it competitive.

Funding talks underscore scale ambitions. The New York Times reports negotiations for $10 billion at a $350 billion valuation, amid IPO speculation as per KraneShares.

Competitive Pressures and Efficiency Wars

OpenAI offers a benchmark. SaaStr analyzes its reported 70% compute margin in late 2025, per The Information, though B2B startups lag. Anthropic’s tilt toward efficiency, projecting lower compute spend than OpenAI, was flagged by Rohan Paul on X citing The Information.

Ed Zitron on Bluesky (bsky.app) critiques the hype, arguing AI firms overbuild capacity amid uncertain demand. This mirrors Amodei’s ‘cone of uncertainty,’ where data center bets lock in costs years ahead.

Posts on X from Sri Muppidi reiterate the margin drop, and Techmeme aggregates sources on the 40% target despite cost overruns.

Path Forward Amid Uncertainty

Optimism persists for margins. Tanay Jaipuria’s newsletter dissects 2025 AI gross margins, suggesting improvements via optimization. Anthropic’s research, like its economic index, measures real-world AI productivity gains, potentially justifying premiums.

Memory tech evolution could help. 24/7 Wall St. predicts a 2026 ‘memory explosion’ for inference, boosting stocks like Micron. D-Matrix’s in-memory push positions it to ease bandwidth woes plaguing Anthropic.

For insiders, the lesson is clear: revenue growth dazzles, but inference economics will determine who survives. As Anthropic navigates this, its margin recalibration signals a maturing industry grappling with compute’s true price.
