Inside Amazon's Trainium Lab: How a Quiet Chip Project Landed Anthropic, OpenAI, and Even Apple

AUSTIN, Texas — The building doesn’t look like much from the outside. A low-slung concrete structure on a corporate campus east of downtown, it could pass for a regional insurance office. Inside, behind two layers of badge-controlled doors and a biometric scanner, Amazon Web Services is designing and testing what it believes is the future of artificial intelligence hardware.

This is Amazon’s Trainium lab, and until last week, almost no one outside the company had seen it.

TechCrunch was granted an exclusive tour of the facility in late March, revealing for the first time the scale and ambition of a project that has quietly attracted commitments from three of the most important names in AI: Anthropic, OpenAI, and Apple. The implications for the semiconductor industry — and for Nvidia’s dominance of AI training hardware — are significant. Perhaps more significant than Wall Street currently appreciates.

Amazon’s custom silicon effort isn’t new. The company introduced its first Inferentia chip in 2019 and the original Trainium in 2020. Both were greeted with polite skepticism. Early performance benchmarks were unimpressive. Software support was thin. Engineers at major AI labs dismissed the chips as a sideshow — interesting, maybe, but not something you’d stake a frontier model on.

That calculus has changed.

The third-generation Trainium3 chip, now in volume production, delivers what Amazon claims is a 4x improvement in training throughput over its predecessor, with energy efficiency gains that make Nvidia’s H100 look, in the words of one AWS engineer quoted by TechCrunch, “like a space heater by comparison.” Those are fighting words in an industry where Nvidia’s GPUs have been treated as the only serious option for training large language models. But the customer list suggests the performance claims aren’t just marketing bravado.

Anthropic’s involvement runs deepest. The Claude developer, in which Amazon has invested more than $4 billion, has been co-designing workloads with the Trainium team for over two years, according to TechCrunch’s reporting. Anthropic engineers have had direct input into the chip’s instruction set architecture and its interconnect fabric — the high-speed networking layer that allows thousands of chips to work in concert on a single training run. This isn’t a casual cloud computing relationship. It’s a hardware partnership.

OpenAI’s presence on the customer roster is more surprising. The Microsoft-backed company has historically relied almost exclusively on Nvidia hardware, supplemented by Microsoft’s own Maia chips for some inference workloads. But according to TechCrunch, OpenAI has signed a multi-year agreement to use Trainium3 clusters for a portion of its training infrastructure, a move that diversifies its supply chain at a moment when Nvidia chips remain allocation-constrained.

And then there’s Apple.

The iPhone maker’s AI ambitions have been something of a slow burn. Apple Intelligence, the company’s on-device AI framework, relies primarily on Apple’s own M-series and A-series silicon for inference. But training the models that power those features requires massive cloud-based compute. Apple has traditionally purchased that compute from Google Cloud. The shift toward AWS and Trainium, first reported by TechCrunch, signals a strategic realignment that likely reflects both pricing advantages and Apple’s legendary desire to reduce dependence on any single supplier — especially one that also happens to be a competitor in consumer hardware.

So what exactly makes Trainium3 compelling enough to pull these customers away from Nvidia’s gravitational field?

The answer, based on TechCrunch’s tour and conversations with AWS executives, comes down to three things: power efficiency, interconnect speed, and cost. The Trainium3 chip is fabricated on TSMC’s N3E process node, the same advanced manufacturing technology used in Apple’s latest iPhone processors. Each chip contains approximately 100 billion transistors — roughly comparable to Nvidia’s Blackwell B200 — but is designed from the ground up for transformer-based model training rather than general-purpose GPU computing. That specialization allows Amazon’s engineers to strip out hardware features irrelevant to AI workloads and dedicate more die area to the matrix multiplication engines that dominate modern training runs.

The interconnect story may matter even more. Training a frontier model like GPT-5 or Claude 4 requires distributing work across tens of thousands of chips simultaneously. The speed at which those chips can communicate determines how efficiently the overall system performs. Nvidia’s NVLink and InfiniBand technologies have long set the standard here. Amazon’s response is a custom interconnect called NeuronLink, which the company claims delivers 2.4 terabits per second of bidirectional bandwidth between chips in the same rack, with a proprietary optical fabric connecting racks across the data center. During the tour, AWS showed TechCrunch a test cluster of 16,384 Trainium3 chips operating as a single logical unit — a scale that, if the performance numbers hold, would make it one of the most powerful AI training systems in existence.

Cost is the third leg. Amazon won’t disclose specific pricing for Trainium3 instances, but multiple sources told TechCrunch that the per-token training cost is roughly 40% lower than equivalent Nvidia-based configurations on AWS. For companies spending hundreds of millions — in some cases billions — of dollars annually on training compute, a 40% reduction is not a rounding error. It’s a strategic weapon.

Nvidia, for its part, isn’t standing still. The company’s Blackwell Ultra chips, expected to ship in volume later this year, promise their own generational leap in performance. CEO Jensen Huang has repeatedly emphasized that Nvidia’s advantage extends far beyond raw silicon to include CUDA, the software platform that has become the de facto standard for AI development. Rewriting training codebases to run on non-Nvidia hardware is expensive and time-consuming. That switching cost has been Nvidia’s moat for years.

But Amazon appears to be eroding it. AWS has invested heavily in its Neuron SDK, the software layer that allows AI frameworks like PyTorch and JAX to run on Trainium hardware with minimal code changes. According to TechCrunch, Anthropic’s engineers reported that porting their training pipeline to Trainium3 required approximately three weeks of work — far less than the months-long effort that earlier generations of custom AI chips demanded. If that experience is representative, the CUDA lock-in argument weakens considerably.

The financial stakes are enormous. Nvidia generated $61 billion in data center revenue in fiscal 2026, a figure that has grown at triple-digit percentages for three consecutive years. Any meaningful share loss to Amazon’s captive silicon — or to competing efforts from Google (TPUs), Microsoft (Maia), and Meta (MTIA) — would slow Nvidia’s top-line growth and, by extension, threaten its $2.8 trillion market capitalization. Wall Street analysts have largely modeled Nvidia’s forward earnings on the assumption of a continued near-monopoly in AI training hardware. The emergence of a credible alternative, validated by customers of the caliber of Anthropic, OpenAI, and Apple, introduces a variable that current consensus estimates may not fully reflect.

Amazon’s motivations are equally clear. AWS remains the company’s profit engine, generating over $30 billion in operating income last year. But competition from Microsoft Azure and Google Cloud has intensified, particularly in AI workloads. Offering proprietary hardware that delivers better performance per dollar than Nvidia-based instances gives AWS a differentiation story that its rivals can’t easily replicate. Microsoft has its Maia chips, yes, but they aren’t yet available to external customers. Google’s TPUs are powerful but remain tightly coupled to Google Cloud’s own infrastructure and software stack. Amazon is betting that Trainium can be both a first-party advantage for AWS and an open platform that attracts the broader AI development community.

There are risks. Custom silicon programs are capital-intensive and unforgiving. A single misstep in chip design — a memory bandwidth bottleneck, a thermal management flaw, a software compatibility gap — can set a program back years and waste billions. Amazon has stumbled before: the original Graviton processor received mixed reviews, and the first Trainium chip was widely regarded as underpowered. The company’s track record suggests it learns from these failures, but past performance is no guarantee of future results, as the prospectus writers like to say.

The geopolitical dimension adds another layer. U.S. export controls on advanced AI chips have constrained Nvidia’s ability to sell its most powerful hardware to Chinese customers. Amazon’s Trainium chips, manufactured by TSMC in Taiwan, face similar restrictions but are designed exclusively for use within AWS data centers — meaning that although Amazon does not own the fab, it controls where every finished chip is deployed. That degree of control could prove advantageous in a world where chip export policy is increasingly a tool of national security strategy.

During the Austin tour, TechCrunch noted the presence of a testing area where engineers were validating Trainium4 prototypes — the next generation, reportedly on track for late 2027. Amazon’s chip development cadence has accelerated from roughly 30 months between generations to 18 months, approaching the pace Intel and AMD maintain for their server processors. If that cadence holds, the gap between Trainium and Nvidia’s best offerings could continue to narrow, or potentially reverse.

The broader pattern here is unmistakable. The largest consumers of AI compute — the hyperscalers and the frontier model developers — are moving aggressively to reduce their dependence on a single chip supplier. Not because Nvidia’s products are inadequate, but because concentration risk in a supply chain this critical is simply unacceptable at the scale these companies now operate. Amazon’s Trainium effort is the most advanced expression of this trend, but it’s part of a larger movement that includes Google’s TPU v6, Microsoft’s Maia 200, and Meta’s ongoing MTIA development.

For Nvidia investors, the question isn’t whether alternatives will emerge. They already have. The question is how quickly those alternatives reach performance parity — and whether the market has priced in a world where Nvidia’s share of AI training compute declines from 90%-plus to something closer to 60% or 70%.

For Amazon, the Trainium bet represents something bigger than a chip. It’s a statement that the most important technology of the next decade — artificial intelligence — will run on infrastructure that Amazon designs and controls. Not rented from Nvidia. Not assembled from off-the-shelf parts. Designed from scratch, in a concrete building in Austin, by a team that until very recently almost no one knew existed.

The tour is over. The competition is just beginning.
