The GPU Has a Memory Problem: How Rowhammer Attacks Just Jumped From CPUs to Nvidia's Best Hardware

For more than a decade, Rowhammer has haunted the world of computer memory. The attack — which exploits the physical properties of DRAM chips by rapidly accessing rows of memory cells to cause electrical interference in adjacent rows — was long considered a CPU problem. A nuisance for processors. A well-documented vulnerability with an expanding library of mitigations.

Not anymore.

Researchers from the Graz University of Technology in Austria and the University of Rennes in France have demonstrated that Rowhammer is very much a GPU problem too, and they’ve done it on some of Nvidia’s most capable hardware. Their work, which targets Nvidia’s discrete GPUs including the high-end A100 data center accelerator, is presented as the first successful Rowhammer attack against the memory of a discrete GPU. The implications stretch across cloud computing, artificial intelligence infrastructure, and any multi-tenant environment where GPUs are shared among users who don’t trust each other.

“Our work shows that Rowhammer, which is well-studied on CPUs, is a serious threat on GPUs as well,” the researchers stated in their paper, as reported by TechRadar. The team (Lukas Giner, Fabian Thomas, Florian Hammler, Daniel Gruss, and Michael Schwarz) presented their findings in a paper whose title, fittingly enough, signals that this is new territory for the security community.

To understand why this matters, you have to understand what Rowhammer actually does at the silicon level. Modern DRAM stores data as electrical charges in tiny capacitors, each representing a single bit. These capacitors are arranged in dense rows. When a processor repeatedly reads from one row — hammering it — the electrical disturbance can cause bits in neighboring rows to flip. A zero becomes a one. A one becomes a zero. That’s it. That’s the whole trick. But from that simple physical phenomenon, attackers have built an arsenal of exploits that can escalate privileges, escape sandboxes, steal cryptographic keys, and corrupt data with surgical precision.
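To make the mechanism concrete, here is a deliberately simplified toy model in Python. It is not the researchers' code and it does not touch real hardware. The class `ToyDram`, the function `double_sided_hammer`, and the threshold value are all assumptions made for illustration. Real DRAM flips bits probabilistically, and the number of activations needed varies by chip. The sketch uses "double-sided" hammering, a common pattern in the Rowhammer literature in which the two rows on either side of a victim are activated in alternation.

```python
# Toy model only: the threshold below is an assumed, illustrative number.
# Real DRAM flips bits probabilistically and thresholds vary by chip.
HAMMER_THRESHOLD = 50_000  # neighbor activations per refresh window that flip a bit


class ToyDram:
    def __init__(self, rows):
        self.rows = list(rows)                # each row stored as an int of bits
        self.disturbance = [0] * len(rows)    # accumulated disturbance per row

    def activate(self, r):
        """Opening row r disturbs its physical neighbors r-1 and r+1."""
        for victim in (r - 1, r + 1):
            if 0 <= victim < len(self.rows):
                self.disturbance[victim] += 1
                if self.disturbance[victim] == HAMMER_THRESHOLD:
                    self.rows[victim] ^= 1    # flip the lowest bit of the victim

    def refresh(self):
        """A refresh recharges every cell, clearing accumulated disturbance."""
        self.disturbance = [0] * len(self.rows)


def double_sided_hammer(dram, victim, pairs):
    """Alternate activations of the two rows that sandwich the victim."""
    for _ in range(pairs):
        dram.activate(victim - 1)
        dram.activate(victim + 1)


dram = ToyDram([0, 0, 0, 0, 0])
double_sided_hammer(dram, victim=2, pairs=25_000)
print(dram.rows)  # row 2 has flipped; rows 0 and 4 saw only half the disturbance
```

The point of the model is the race against the refresh: if `refresh()` runs before the victim's counter reaches the threshold, no bit flips. An attacker therefore needs to pack as many activations as possible into a single refresh window.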

On CPUs, the attack surface has been thoroughly mapped. Google’s Project Zero first demonstrated practical Rowhammer exploits in 2015, and since then researchers have found ways to trigger bit flips from JavaScript in browsers, from virtual machines targeting hypervisors, and from unprivileged user processes targeting the operating system kernel. CPU vendors and DRAM manufacturers have responded with a mix of hardware and software defenses: Target Row Refresh (TRR), error-correcting code (ECC) memory, and various kernel-level mitigations that limit how aggressively programs can access memory.

GPUs, though, are a different animal entirely.

The researchers demonstrated that GPUs present unique advantages for mounting Rowhammer attacks — advantages that in some ways make them better attack platforms than CPUs. Graphics processors are designed from the ground up for massive parallelism. An Nvidia A100 contains thousands of CUDA cores that can hammer memory rows simultaneously, generating far more memory accesses per second than a CPU ever could. The memory subsystem is different too. GPU memory (typically GDDR or HBM) operates at higher bandwidths and with different timing characteristics than the DDR4 and DDR5 found in standard server configurations.

And here’s where it gets uncomfortable for cloud providers. GPUs are increasingly shared resources. Amazon, Google, Microsoft, and others now offer fractional GPU access, where multiple tenants share the same physical GPU. Nvidia itself has promoted Multi-Instance GPU (MIG) technology on its A100 and H100 accelerators as a way to partition a single GPU among multiple users. The researchers’ work raises hard questions about whether those isolation boundaries are sufficient when an attacker can manipulate the physical properties of the underlying memory.

The attack the team developed works without any special privileges. A malicious user with ordinary access to a shared GPU, the kind of access anyone renting cloud GPU time would have, can craft memory access patterns that induce bit flips in memory regions belonging to other tenants or to the GPU’s own management structures. The researchers showed they could reliably trigger bit flips on the Nvidia A100, a workhorse of AI training and inference clusters worldwide. They also tested consumer-grade Nvidia GPUs and found them vulnerable as well.

This isn’t purely theoretical. The researchers built end-to-end proof-of-concept attacks demonstrating concrete security impacts. They showed that bit flips in GPU memory can corrupt the results of machine learning computations, alter the behavior of neural networks during inference, and potentially allow an attacker to break out of the memory isolation that’s supposed to keep one tenant’s data separate from another’s. In a world where companies are sending proprietary data and models to cloud GPU instances for training and inference, the ability to corrupt or exfiltrate that data through a hardware-level attack is a serious concern.

The timing of this research is notable. GPU security has received relatively little attention compared to CPU security, even as GPUs have become the most strategically important chips in the data center. The explosive growth of generative AI has made Nvidia’s data center GPUs the most sought-after computing resources on the planet, with companies spending billions to secure allocations. Yet the security model for these devices remains immature compared to what exists for CPUs, where decades of adversarial research have driven substantial hardening.

ECC memory, which is present on data center GPUs like the A100, was long assumed to be a sufficient defense against Rowhammer. The researchers challenged that assumption. While ECC can correct single-bit errors, Rowhammer attacks can be tuned to produce multi-bit flips that overwhelm ECC’s correction capabilities. The team demonstrated techniques for generating patterns of bit flips that either evade ECC detection entirely or produce uncorrectable errors that crash the system — a denial-of-service vector in its own right.
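Why multi-bit flips defeat ECC is easiest to see on a small example. The Python sketch below implements a textbook single-error-correcting, double-error-detecting (SECDED) code, an extended Hamming(8,4) code that protects 4 data bits. This is an assumption chosen for readability. It is not the A100's actual ECC scheme, which protects much wider words, and the function names `encode`, `decode`, and `flip` are made up for this illustration. The failure modes have the same shape, though: one flip is repaired, two flips are detected but cannot be repaired, and three flips can look like a single error and be "corrected" into the wrong data.

```python
DATA_POSITIONS = (3, 5, 6, 7)   # codeword positions holding data bits
PARITY_POSITIONS = (1, 2, 4)    # Hamming parity bits; position 0 is overall parity


def syndrome_of(code):
    """XOR of the positions (1..7) of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if code[pos]:
            s ^= pos
    return s


def encode(data4):
    """Encode an int 0..15 as an 8-bit SECDED codeword (list of bits)."""
    code = [0] * 8
    for shift, pos in enumerate(DATA_POSITIONS):
        code[pos] = (data4 >> shift) & 1
    s = syndrome_of(code)
    for p in PARITY_POSITIONS:
        code[p] = 1 if s & p else 0   # choose parity bits so the syndrome is 0
    code[0] = sum(code[1:]) % 2       # overall parity makes the total even
    return code


def decode(code):
    """Return (status, data). Status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    s = syndrome_of(code)
    odd = sum(code) % 2
    if s == 0 and not odd:
        status = "ok"
    elif odd:
        code[s] ^= 1                  # assume exactly one flip, at position s
        status = "corrected"
    else:
        status = "uncorrectable"      # even number of flips: detect only
    data = sum(code[pos] << shift for shift, pos in enumerate(DATA_POSITIONS))
    return status, data


def flip(code, *positions):
    """Return a copy of the codeword with the given bit positions inverted."""
    out = list(code)
    for pos in positions:
        out[pos] ^= 1
    return out


word = encode(0b1011)
print(decode(flip(word, 6)))        # one flip: repaired, original data returned
print(decode(flip(word, 3, 6)))     # two flips: flagged as uncorrectable
print(decode(flip(word, 3, 5, 6)))  # three flips: reported "corrected", data is wrong
```

The last case is the dangerous one. The decoder reports success, so nothing upstream learns that the data has changed.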

Nvidia was notified of the findings before publication, following standard responsible disclosure practices. But the nature of the vulnerability makes it extraordinarily difficult to patch in software. Rowhammer is a hardware problem rooted in the physics of DRAM manufacturing. As memory density increases — as the capacitors get smaller and closer together — the problem tends to get worse, not better. DRAM manufacturers have been engaged in a years-long cat-and-mouse game with Rowhammer researchers on the CPU side, and now that same dynamic appears poised to play out for GPU memory as well.

The defense options are limited and imperfect. On the hardware side, DRAM manufacturers can implement more aggressive refresh rates, which reduces performance and increases power consumption. They can adopt newer mitigation schemes like Per-Row Activation Counting (PRAC), which tracks how frequently each row is accessed and proactively refreshes neighbors of heavily hammered rows. But these defenses add cost and complexity, and they’ve historically been implemented with the CPU threat model in mind — not the massively parallel access patterns that GPUs generate.
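The idea behind activation counting can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the mechanism as DRAM vendors specify it: real counters live inside the memory device and differ in detail. The class name `RowActivationCounter` and the threshold are hypothetical.

```python
class RowActivationCounter:
    """Toy per-row counter: refresh a row's neighbors once it gets too hot."""

    def __init__(self, num_rows, threshold):
        self.counts = [0] * num_rows
        self.threshold = threshold
        self.early_refreshes = []     # log of victim rows refreshed early

    def on_activate(self, row):
        self.counts[row] += 1
        if self.counts[row] >= self.threshold:
            # The aggressor crossed the threshold: recharge its neighbors
            # before enough disturbance can accumulate to flip a bit.
            for victim in (row - 1, row + 1):
                if 0 <= victim < len(self.counts):
                    self.early_refreshes.append(victim)
            self.counts[row] = 0

    def on_periodic_refresh(self):
        """The regular refresh recharges every row, so all counts reset."""
        self.counts = [0] * len(self.counts)


prac = RowActivationCounter(num_rows=8, threshold=4)
for _ in range(4):
    prac.on_activate(3)
print(prac.early_refreshes)  # neighbors of row 3 were refreshed early
```

The cost is visible even in the toy: every activation updates a counter, and every early refresh is time the memory is not serving requests. Under the heavily parallel access patterns of a GPU, both overheads grow.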

On the software side, GPU driver and runtime developers could implement access pattern throttling to prevent the rapid, repeated memory accesses that trigger bit flips. But doing so would directly conflict with the performance characteristics that make GPUs valuable in the first place. The entire point of a GPU is to access memory as fast and as often as possible. Any mitigation that slows down memory access patterns is a mitigation that degrades GPU performance.

Cloud providers face a particularly awkward calculus. Multi-tenant GPU sharing is economically essential — it’s what makes GPU cloud computing affordable for smaller customers. But if Rowhammer attacks can cross tenant boundaries on shared GPUs, providers may need to rethink their isolation strategies. Physical isolation — giving each tenant their own dedicated GPU — is the most secure option but also the most expensive, and it would effectively eliminate the efficiency gains that multi-tenant architectures provide.

So what happens next? The research community will almost certainly begin probing GPU Rowhammer more aggressively. The Graz team, led by Daniel Gruss, has a long track record in hardware security research — Gruss was one of the key researchers behind the Spectre and Meltdown CPU vulnerabilities disclosed in 2018. When this group identifies a new attack surface, others follow quickly.

Nvidia, for its part, has been investing more in security as its GPUs have become critical infrastructure. The company’s Confidential Computing initiative, which uses hardware encryption to protect data in GPU memory even from the cloud provider’s own administrators, represents one potential layer of defense. But Confidential Computing is designed to protect against software-based attacks and privileged insiders — it’s not clear how well it holds up against a physical-layer attack like Rowhammer that manipulates the actual electrical state of memory cells.

AMD and Intel, both of which are pushing their own GPU and accelerator products into the data center, should be paying close attention. The fundamental physics of Rowhammer don’t discriminate by vendor. If Nvidia’s GPUs are vulnerable, there’s every reason to expect that AMD’s Instinct accelerators and Intel’s Gaudi chips face similar risks. The researchers focused on Nvidia hardware because of its dominant market position, but the underlying DRAM technology is shared across the industry.

For enterprises running AI workloads on shared infrastructure, the practical advice right now is limited. Monitor for unusual memory error rates, which could indicate Rowhammer activity. Prefer dedicated GPU instances over shared ones for sensitive workloads. And keep a close eye on GPU vendor security advisories in the coming months, because this research is likely to trigger a wave of follow-up work.
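As a rough illustration of what monitoring for unusual error rates could look like, the Python sketch below flags sudden jumps in a series of cumulative corrected-error counts sampled at a fixed interval. How those counts are collected depends on the platform and is left out. The function name `flag_error_spikes` and its default parameters are assumptions chosen for the example, not recommended values.

```python
def flag_error_spikes(cumulative_counts, window=5, factor=3.0, min_errors=2):
    """Return indices of samples whose error increase is unusually large.

    cumulative_counts: corrected-error totals sampled at a fixed interval.
    A sample is flagged when its increase is at least `min_errors` and
    exceeds `factor` times the average increase over the previous
    `window` intervals.
    """
    # Per-interval increases; a counter reset (negative delta) counts as 0.
    deltas = [max(0, b - a)
              for a, b in zip(cumulative_counts, cumulative_counts[1:])]
    flagged = []
    for i, delta in enumerate(deltas):
        history = deltas[max(0, i - window):i]
        baseline = sum(history) / len(history) if history else 0.0
        if delta >= min_errors and delta > factor * baseline:
            flagged.append(i + 1)     # index into cumulative_counts
    return flagged


samples = [0, 0, 1, 1, 1, 2, 2, 40, 41]
print(flag_error_spikes(samples))     # flags the sample where the count jumps to 40
```

A spike in corrected errors does not prove an attack, since failing hardware produces the same signal. It is a reason to investigate, and to consider moving sensitive workloads off the affected device.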

The broader lesson is one the security community has learned before, painfully, with CPUs: hardware that was designed purely for performance, without adversarial threat modeling, will eventually become a liability. GPUs were built to render pixels and crunch numbers as fast as possible. They were not built to be secure multi-tenant computing platforms. And yet that’s exactly what they’ve become, almost overnight, as the AI boom has transformed them from specialized graphics hardware into general-purpose computing engines that process some of the most sensitive data in the world.

Rowhammer on GPUs isn’t the end of anything. It’s the beginning of a new front in hardware security — one where the stakes are measured in the integrity of AI models, the confidentiality of training data, and the reliability of the infrastructure that an increasing share of the global economy depends on. The researchers in Graz and Rennes have fired the starting gun. The race to defend against what they’ve found is just getting underway.
