Elon Musk’s xAI has pushed GPU clusters to extremes. Colossus, its Memphis supercomputer, started with 100,000 Nvidia H100s. Built in 122 days. Doubled to 200,000 GPUs 92 days later. Now mixes 150,000 H100s, 50,000 H200s, and 30,000 GB200s. The world’s largest single coherent AI training setup.
Speed defined the project. Suppliers quoted 18 to 24 months for 100,000 GPUs. xAI demanded six. They repurposed an old Electrolux factory. Input power: 15 megawatts. Needed: 150. Rented generators. Leased a quarter of U.S. mobile cooling capacity. Power draw during training could drop 50% within 100 milliseconds. Generators couldn’t cope. Solution: Tesla Megapacks with custom software to smooth fluctuations. Networking? Cables for 100,000 GPUs. Teams worked four shifts, 24/7. Musk slept in the data center and ran cables himself.
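How a battery buffer can sit between a spiky training load and slow-ramping generators is easier to see in a toy simulation. The sketch below is not xAI's or Tesla's software. The function name, the ramp limit, and the time steps are hypothetical. Only the 150-megawatt demand and the 50% drop come from the figures above.

```python
def smooth_with_battery(load_mw, max_ramp_mw_per_step):
    """Split a load trace between a ramp-limited generator and a battery.

    Returns (generator_mw, battery_mw). A positive battery value means the
    battery is discharging into the load; a negative value means it is
    absorbing surplus generator output.
    """
    generator_mw = []
    battery_mw = []
    gen = load_mw[0]  # assume the generator starts matched to the load
    for load in load_mw:
        # The generator chases the load but may only move a limited amount per step.
        delta = load - gen
        delta = max(-max_ramp_mw_per_step, min(max_ramp_mw_per_step, delta))
        gen += delta
        generator_mw.append(gen)
        # The battery covers whatever gap remains.
        battery_mw.append(load - gen)
    return generator_mw, battery_mw


# Hypothetical trace: demand falls from 150 MW to 75 MW, then recovers.
load_trace = [150, 150, 75, 75, 75, 150, 150]
generator, battery = smooth_with_battery(load_trace, max_ramp_mw_per_step=10)
print(generator)  # [150, 150, 140, 130, 120, 130, 140]
print(battery)    # [0, 0, -65, -55, -45, 20, 10]
```

In this toy model the generators never move more than 10 megawatts per step. The battery soaks up the difference, which is the role the article describes for the Megapacks.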
“The general principles of first-principle thinking apply to software, hardware, anything really,” Musk said. “Often, we were told something was impossible, but once we broke it down into its constituent elements, we could solve those.” Now Colossus trains Grok. Supports X. Aids SpaceX. Plans call for 1 million GPUs.
Scale Brings Brutal Efficiency Hurdles
Raw size isn’t enough. Utilization lags. xAI President Michael Nicolls admitted in a staff memo that Model FLOPs Utilization (MFU) hit just 11%. Industry averages: 35-45%. Target: 50%. GPUs sit idle without perfect orchestration. Power. Cooling. Networking. All must align. Traditional Ethernet delivers 60% throughput at scale. xAI’s Nvidia Spectrum-X Ethernet hits 95%. Zero latency degradation. No packet loss. BlueField-3 SuperNICs per GPU. RoCE v2 congestion control.
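What an 11% MFU means in practice can be worked out with simple arithmetic. The sketch below treats MFU as sustained model FLOP/s divided by theoretical peak FLOP/s, the usual definition. The function names are illustrative, and the 200,000-GPU count is the doubled cluster size reported earlier in the article, used here only as an example input.

```python
def model_flops_utilization(achieved_flops_per_s, peak_flops_per_s):
    """MFU: useful model FLOP/s actually sustained, as a fraction of hardware peak."""
    if peak_flops_per_s <= 0:
        raise ValueError("peak_flops_per_s must be positive")
    return achieved_flops_per_s / peak_flops_per_s


def effective_gpus(total_gpus, mfu):
    """Number of GPUs running at 100% of peak that would do the same useful work."""
    return total_gpus * mfu


REPORTED_MFU = 0.11   # figure from the staff memo cited above
TARGET_MFU = 0.50     # stated target
EXAMPLE_GPUS = 200_000  # illustrative input, taken from the cluster size above

print(round(effective_gpus(EXAMPLE_GPUS, REPORTED_MFU)))  # equivalent fully-used GPUs today
print(round(effective_gpus(EXAMPLE_GPUS, TARGET_MFU)))    # equivalent at the target
print(round(TARGET_MFU / REPORTED_MFU, 2))                # speedup from hitting the target
```

On these inputs, reaching the target would deliver roughly four and a half times the useful training throughput from the same hardware.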
One ex-EPA official called the unpermitted gas turbines illegal. “That is a violation of the law,” said Bruce Buckheit. Over a dozen run in Southaven, Mississippi. They power Colossus 1 and 2. Emit pollutants near schools. Residents report asthma flares. Noise. xAI seeks permits for 41 more. Could spew 6 million tons of greenhouse gases yearly. 1,300 tons of other pollutants. Company didn’t comment. Mississippi claims turbines are “portable.” EPA disagrees.
Colossus draws 250 megawatts now. 35 gas turbines provide 420 megawatts of capacity. 208 Megapacks back it up. First substation: 97 days to build. Solaris supplies more. Wastewater recycling plant, the world’s largest ceramic membrane bioreactor, processes graywater. Protects aquifers. Supplies surplus to locals.
Cooling mixes liquid and air. Supermicro 4U systems. Closed-loop water. 119 chillers handle 200 megawatts. Racks pack 64 GPUs each. 1,500 racks total. Memory bandwidth: 194 petabytes/second. Storage: over one exabyte.
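The rack and chiller figures can be cross-checked with a few lines of arithmetic. The sketch below reads the 1,500 figure as a rack count, which is an assumption. The variable names are illustrative.

```python
GPUS_PER_RACK = 64   # from the article
RACK_COUNT = 1_500   # assumption: "1,500" is the number of racks
CHILLERS = 119       # from the article
COOLING_MW = 200     # from the article

gpus_in_racks = GPUS_PER_RACK * RACK_COUNT
mw_per_chiller = COOLING_MW / CHILLERS

print(gpus_in_racks)             # 96000
print(round(mw_per_chiller, 2))  # 1.68
```

The product, 96,000 GPUs, is close to the initial 100,000-GPU build rather than the current total, so the rack figures presumably describe the first phase. Each chiller would carry about 1.7 megawatts on average.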
But challenges persist. Fault tolerance across thousands of nodes. Checkpoint I/O. Cosmic ray bit flips. Mismatched firmware. Tangled cables. xAI engineers debug on the fly. Sustained runs last months.
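Fault tolerance and checkpoint I/O trade off against each other: checkpoint too rarely and a failure wastes hours of work, too often and the cluster spends its time writing to storage. A common rule of thumb, Young's approximation, puts the interval at the square root of twice the checkpoint cost times the mean time between failures. The sketch below applies it under the assumption of independent node failures at a constant rate. All numbers are hypothetical and nothing here describes xAI's actual practice.

```python
import math


def cluster_mtbf_hours(node_mtbf_hours, node_count):
    """Mean time between failures for the whole cluster.

    Assumes independent nodes with constant failure rates, so the cluster
    fails node_count times as often as a single node.
    """
    return node_mtbf_hours / node_count


def young_checkpoint_interval_hours(checkpoint_cost_hours, mtbf_hours):
    """Young's approximation: interval = sqrt(2 * checkpoint cost * MTBF)."""
    return math.sqrt(2.0 * checkpoint_cost_hours * mtbf_hours)


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
mtbf = cluster_mtbf_hours(node_mtbf_hours=50_000, node_count=12_500)
interval = young_checkpoint_interval_hours(checkpoint_cost_hours=0.5, mtbf_hours=mtbf)
print(mtbf)      # 4.0
print(interval)  # 2.0
```

The point of the exercise: as the node count grows, the cluster-wide time between failures shrinks, and so does the sensible gap between checkpoints.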
Compute Flexes Across Musk’s Empire
Colossus powers more than Grok. SpaceX taps it too. Recent deal: Cursor, hot AI coding startup. xAI rents tens of thousands of chips. Cursor trains models there. SpaceX holds $60 billion buy option. Or pays $10 billion fee. Colossus delivers compute equal to one million H100s, SpaceX claims. Two Cursor execs jumped to xAI, reporting to Musk.
“Combining Cursor’s product and distribution to expert software engineers with SpaceX’s Colossus supercomputer,” the companies said. Cursor’s valuation soared—from $2.5 billion to $29.3 billion. Now eyes $60 billion.
Wall Street toured Colossus recently. SpaceX shows off amid IPO prep. Merged with xAI earlier. Losses hit $4.94 billion in 2025 on $18.67 billion revenue. AI infra bets.
Training ramps. Seven models train in parallel: Imagine V2, variants. Sizes: 1T, 1.5T, 6T, 10T parameters. 10T rivals Anthropic’s biggest. $1.5 billion cost. Colossus 2: first gigawatt cluster. Vertical stacks cut latency. $659 million investment.
NAACP sued over Memphis site. Pollution fears. Temps rise. Few jobs added. But Memphis eyes AI hub status.
xAI rewrites rules. Others hoard GPUs—95% idle industry-wide. xAI scales aggressively. Fixes bottlenecks. Turns factory into frontier compute beast. Grok advances. Musk’s web tightens. Compute wars intensify.


WebProNews is an iEntry Publication