Inside Google's AI Pivot: From Search Giant to Infrastructure Kingpin, and Why the Industry Is Watching Closely

SAN FRANCISCO — Google wants to be the backbone of the artificial intelligence era. Not just one of the players. The backbone.

That was the unmistakable message delivered at Google Cloud Next 2025, the company’s annual cloud conference held this week in Las Vegas, where executives laid out an aggressive vision that positions Google not merely as an AI model maker but as the essential infrastructure provider for every company building with artificial intelligence. The sheer volume of product announcements — more than 200 by one count — was itself a statement of intent. So was the audience: over 35,000 attendees, a record for the event, packed into the Mandalay Bay Convention Center to hear how Google plans to weave AI into every layer of its cloud platform.

The conference, rebranded this year as “All Things AI,” according to The Register, made clear that Google Cloud CEO Thomas Kurian sees this moment as a once-in-a-generation opportunity to close the gap with Amazon Web Services and Microsoft Azure. Both competitors still command larger shares of the cloud market. But Google believes AI changes the math.

The Hardware Bet: Custom Silicon and the Race for Compute

At the core of Google’s strategy is a bet on custom hardware. The company unveiled its seventh-generation Tensor Processing Unit, called Ironwood, which it described as the most performant AI accelerator it has ever built. Ironwood TPUs are designed specifically for inference workloads — the computationally expensive process of running trained AI models to generate outputs in real time. Google says a single Ironwood chip delivers 4,614 teraflops of compute, a staggering figure that underscores how much raw processing power inference now demands.

But the chip itself is only part of the story. Google also detailed plans for what it calls an Ironwood “pod” — a cluster of 9,216 TPUs connected via its custom interconnect fabric. The idea is that customers running massive AI models shouldn’t have to worry about stitching together thousands of individual accelerators. Google handles that complexity.
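To put the pod figure in perspective, the two numbers Google cited can be multiplied out. The sketch below is back-of-the-envelope arithmetic only: it assumes the per-chip figure is a peak rating and that compute scales perfectly linearly across all 9,216 chips, which real workloads will not achieve once interconnect and utilization overheads are counted. The constant and function names are illustrative, not anything Google publishes.

```python
# Figures quoted in the article; everything else here is an assumption.
CHIP_TFLOPS = 4_614          # per-chip compute, as stated by Google
CHIPS_PER_POD = 9_216        # TPUs in one Ironwood pod
TFLOPS_PER_EXAFLOP = 1_000_000


def pod_peak_exaflops(chip_tflops: int = CHIP_TFLOPS,
                      chips: int = CHIPS_PER_POD) -> float:
    """Theoretical aggregate compute of a pod, assuming perfect linear scaling."""
    return chip_tflops * chips / TFLOPS_PER_EXAFLOP


print(f"{pod_peak_exaflops():.1f} exaflops")  # prints "42.5 exaflops"
```

That works out to roughly 42.5 exaflops per pod on paper, a ceiling rather than a measured throughput.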

This is a direct challenge to Nvidia’s dominance in AI hardware. While Google has offered TPUs for years, the Ironwood generation represents an escalation — an explicit pitch to enterprises that they don’t need Nvidia GPUs to run frontier AI workloads. The timing matters. Nvidia’s data center revenue has exploded, driven by insatiable demand for its H100 and B200 chips. Google is essentially arguing that for inference at scale, its custom silicon offers a better price-performance ratio.

Whether that argument holds up under real-world scrutiny remains to be seen. Nvidia’s CUDA software environment has deep roots in the developer community, and switching costs are real. Still, Google’s willingness to invest billions in custom chip development signals it isn’t backing down.

The hardware announcements didn’t stop at TPUs. Google also introduced its A4 virtual machines powered by Nvidia’s latest Blackwell GPUs, a nod to the reality that many customers still want Nvidia hardware. It’s a both-and strategy: build your own chips while also offering the competition’s best products. Pragmatic. And potentially expensive to maintain.

Google’s infrastructure push extends to networking and storage, too. The company talked about its Jupiter networking fabric, which it says delivers petabit-scale bandwidth between TPU pods. For customers training or running models with hundreds of billions of parameters, network bandwidth between accelerators is often the bottleneck. Google is betting that its vertically integrated stack of custom chips, custom networking, and custom storage gives it an edge that rivals assembling commodity hardware can’t match.

The financial stakes are enormous. Google Cloud’s annual revenue run rate now exceeds $40 billion, and the AI infrastructure business is its fastest-growing segment. Every major cloud provider is racing to build enough capacity to meet demand from enterprises deploying AI. Capital expenditure across the three major cloud providers is expected to exceed $200 billion in 2025 alone, according to recent analyst estimates. Google’s share of that spend will depend heavily on whether Ironwood and its surrounding infrastructure can deliver on the performance promises made this week.

Software, Agents, and the Platform Play

Hardware is the foundation. But Google’s ambitions extend well beyond chips and data centers.

The company announced a sweeping set of software tools aimed at making it easier for enterprises to build, deploy, and manage AI applications. Central among these was an expansion of its Vertex AI platform, which now supports what Google calls “agent-to-agent” communication — a framework that allows multiple AI agents to collaborate on complex tasks without human intervention at each step.

This is where the industry is heading. Fast. The concept of autonomous AI agents — software programs that can reason, plan, take actions, and hand off tasks to other agents — has moved from research curiosity to commercial priority in less than a year. Google, OpenAI, Microsoft, and Anthropic are all racing to build the tooling that makes multi-agent systems practical for enterprise use.

Google’s approach leans heavily on its Gemini model family. At the conference, the company previewed Gemini 2.5 Pro and Flash variants, along with a new Gemini model it describes as having “Deep Think” capabilities for complex reasoning tasks. The models are tightly integrated with Vertex AI, meaning developers building on Google Cloud get optimized access to the latest Gemini versions.

There’s a strategic logic here that goes beyond model performance benchmarks. By tying its best models to its cloud platform, Google creates a flywheel: developers who want the newest Gemini capabilities build on Google Cloud, which drives cloud revenue, which funds more AI research, which produces better models. Microsoft has a similar dynamic with OpenAI and Azure. The difference is that Google controls both the model development and the cloud infrastructure end to end, without relying on a separate model provider.

The agent-focused announcements also included new tools for grounding AI outputs in enterprise data, retrieval-augmented generation improvements, and expanded support for running AI workloads across hybrid and multi-cloud environments. That last point is significant. Many large enterprises run workloads across AWS, Azure, and Google Cloud simultaneously. Google’s willingness to support multi-cloud AI deployment is a concession to market reality — and a recognition that locking customers in isn’t a viable strategy when switching costs for AI inference are lower than for traditional cloud workloads.

Security got significant attention as well. Google introduced new AI-powered security tools within its Google Unified Security platform, including threat detection models trained on its own security telemetry data. The pitch: as AI-generated attacks grow more sophisticated, defending against them requires AI-native security tools. It’s a reasonable argument, though every major cloud and security vendor is making similar claims.

And then there’s the enterprise search and workplace AI angle. Google announced major updates to its Workspace platform, including AI agents that can operate within Gmail, Docs, and Sheets to automate routine business tasks. The vision is that knowledge workers will increasingly delegate tasks — scheduling, summarization, data analysis, email drafting — to AI agents embedded directly in the tools they already use. Microsoft is pushing the same vision with Copilot across its 365 suite. The battle for the AI-augmented workplace is very much on.

One announcement that drew attention from developers: Google is offering a “context caching” feature for its Gemini models that lets applications store frequently used context (think long documents, codebases, or conversation histories) to reduce latency and cost on subsequent API calls. For applications making thousands of inference calls per minute, context caching could meaningfully reduce operating expenses. It’s a technical detail, but the kind of technical detail that wins over engineering teams evaluating which cloud to build on.
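Why caching matters for the bill can be seen with a toy cost model. The sketch below is not Google’s pricing or API: the function name, the per-token price, and the cached-token discount are all hypothetical, and it ignores any storage fee a provider might charge for holding the cache. It simply compares resending a shared context on every call with paying a reduced rate for that context after the first call.

```python
def prompt_cost(context_tokens: int, query_tokens: int, calls: int,
                price_per_token: float, cached_discount: float = 0.0) -> float:
    """Input-token cost for `calls` requests that share one large context.

    cached_discount is the (hypothetical) fraction knocked off the price of
    context tokens on every call after the first; 0.0 means no caching.
    """
    if calls <= 0:
        return 0.0
    # The first call always pays full price for context plus query.
    first_call = (context_tokens + query_tokens) * price_per_token
    # Later calls pay a discounted rate for the context, full rate for the query.
    later_context = context_tokens * price_per_token * (1.0 - cached_discount)
    later_call = later_context + query_tokens * price_per_token
    return first_call + later_call * (calls - 1)


# Illustrative numbers only: a 200,000-token codebase queried 1,000 times.
without_cache = prompt_cost(200_000, 500, 1_000, price_per_token=1e-6)
with_cache = prompt_cost(200_000, 500, 1_000, price_per_token=1e-6,
                         cached_discount=0.75)
print(f"without cache: ${without_cache:,.2f}, with cache: ${with_cache:,.2f}")
```

The larger the shared context is relative to each query, the more of the total cost the discount touches, which is why the feature appeals most to applications that repeatedly hit the same long documents or codebases.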

The Competitive Calculus

Google Cloud Next 2025 didn’t happen in a vacuum. Microsoft held its own AI-focused events in recent weeks, touting Copilot adoption numbers and Azure AI growth. AWS re:Invent, held last December, showcased Amazon’s own custom AI chips — Trainium2 — and its Bedrock platform for model hosting. The three hyperscalers are converging on remarkably similar strategies: custom silicon, proprietary models, agent frameworks, enterprise integration.

The differentiation, such as it is, comes down to execution and ecosystem relationships. AWS has the largest installed base. Microsoft has the Office franchise and the OpenAI partnership. Google has arguably the deepest AI research bench — DeepMind and Google Brain, now unified, have produced some of the most important advances in machine learning over the past decade — and it controls the most popular consumer AI surface in the world: Google Search.

But consumer AI strength doesn’t automatically translate to enterprise cloud wins. Google Cloud has historically been the third-place player, behind AWS and Azure, in overall cloud market share. AI is its best opportunity to change that ranking. Whether the Ironwood TPU, Gemini models, and Vertex AI platform are enough to peel enterprise workloads away from entrenched competitors is the multi-billion-dollar question.

The early signs are mixed. Google Cloud’s growth rate has accelerated, and the company has signed high-profile AI deals with companies like Mercedes-Benz, Spotify, and several large financial institutions. But AWS and Azure are growing their AI businesses quickly too. This isn’t a zero-sum market — total spending on cloud AI infrastructure is expanding rapidly — but relative positioning matters for long-term competitive dynamics.

There’s also the question of trust. Enterprises evaluating AI platforms aren’t just looking at benchmarks and pricing. They’re assessing reliability, support, and the provider’s long-term commitment to the space. Google has a well-earned reputation for killing products that don’t hit internal targets. That history makes some enterprise buyers cautious, even as Google’s cloud division insists it is fully committed to the business.

So where does this leave Google? In a strong position, but not a dominant one. The AI infrastructure market is still forming, and no single vendor has locked it up. Google’s vertical integration — from chip design to model training to application-layer tools — gives it structural advantages that are hard to replicate. Its research capabilities are world-class. Its cloud platform is technically excellent.

But technical excellence alone has never been enough to win enterprise markets. Distribution, sales execution, partner relationships, and customer trust matter just as much. Google knows this. The record attendance at Cloud Next, the aggressive product cadence, the “All Things AI” rebrand — all of it points to a company that understands the stakes and is putting everything behind this moment.

The next twelve months will tell us whether the bet pays off. AI infrastructure spending shows no signs of slowing down. The companies that capture the largest share of that spending will define the next era of enterprise technology. Google is making its case. Now it has to deliver.
