PyTorch's Quiet Bet on Safetensors Could Reshape How AI Models Are Stored and Loaded

For years, the default way to save and load PyTorch models has relied on Python’s pickle serialization format — a method so deeply embedded in machine learning workflows that most practitioners never questioned it. That’s changing now. PyTorch is moving to adopt Safetensors as a built-in, first-class option for model serialization, a shift that addresses longstanding security vulnerabilities and performance bottlenecks that have plagued the field’s most popular deep learning framework.

The change, reported by Phoronix, stems from a pull request merged into the PyTorch codebase that integrates Safetensors support directly into the framework’s core save and load functions. No more third-party wrappers. No more workarounds. Developers will be able to use torch.save() and torch.load() with Safetensors natively, making the transition as frictionless as possible for existing codebases.

Why does this matter? The answer lies in the fundamental problems with pickle.

Pickle, Python’s built-in object serialization protocol, can execute arbitrary code during deserialization. That’s not a theoretical risk — it’s an architectural feature. When you load a pickled file, you’re trusting that the file’s creator didn’t embed malicious payloads. In a world where researchers routinely download pretrained models from public repositories like Hugging Face Hub, this trust model is badly broken. A poisoned model file could compromise an entire training pipeline, exfiltrate data, or establish persistent access to a machine. Security researchers have demonstrated these attacks repeatedly, and the machine learning community has been slow to respond with structural fixes rather than warnings.
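
To make the mechanism concrete, here is a minimal, deliberately harmless sketch using only Python's standard library. The class name `Innocent` is hypothetical, and the payload merely evaluates `6 * 7`; a real attacker would substitute something like a shell command.

```python
import pickle


class Innocent:
    """Hypothetical stand-in for an object stored in a model file."""

    def __reduce__(self):
        # pickle records this (callable, args) pair in the byte stream and
        # calls it at load time. Here the callable is the builtin eval().
        return (eval, ("6 * 7",))


blob = pickle.dumps(Innocent())

# Loading does not rebuild an Innocent instance. It runs eval("6 * 7")
# and hands back whatever that call returns.
result = pickle.loads(blob)
print(type(result).__name__, result)
```

The loader never asked to evaluate anything; the instruction came from the file. That is the whole problem in a dozen lines.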

Safetensors, originally developed by Hugging Face, takes a fundamentally different approach. The format stores tensor data in a flat binary layout with a simple JSON header describing tensor names, shapes, and data types. There is no executable code, and so no opportunity for arbitrary code execution during loading. The file either contains valid tensor data or it doesn’t. What remains for a loader to get right is parsing the header and bounds-checking the offsets it declares, a far smaller attack surface than a general-purpose object deserializer.
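
The layout is simple enough to sketch with the standard library alone. The following is an illustration of the published file structure (an 8-byte little-endian header length, a JSON header, then raw tensor bytes), not the reference implementation: the helper names `write_minimal` and `read_header` are made up for this example, and the sketch skips the validation and header padding a real writer would add.

```python
import json
import os
import struct
import tempfile


def write_minimal(path, tensors):
    """Write {name: (dtype, shape, raw_bytes)} in a Safetensors-style layout."""
    header, chunks, offset = {}, [], 0
    for name, (dtype, shape, raw) in tensors.items():
        # Offsets are relative to the start of the data section.
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        offset += len(raw)
        chunks.append(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    with open(path, "wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # u64 header size
        f.write(header_bytes)                          # JSON description
        f.write(b"".join(chunks))                      # raw tensor bytes


def read_header(path):
    """Return the parsed JSON header without touching any tensor data."""
    with open(path, "rb") as f:
        (size,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(size))


path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
weight = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)  # a 2x2 float32 tensor
write_minimal(path, {"weight": ("F32", [2, 2], weight)})

header = read_header(path)
print(header)
```

Everything the reader learns about the file comes from plain JSON and integer offsets. Nothing in the file names a function to call.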

Performance is the other half of the equation. Because tensor bytes are stored raw and contiguous, Safetensors loaders can use memory-mapped I/O, which makes loading large models dramatically faster. Instead of deserializing an entire file into memory, the operating system maps the file directly into the process’s address space, allowing tensors to be accessed on demand. For models that run into the tens or hundreds of gigabytes, which is increasingly the norm with large language models, this difference is not trivial. Loading times drop from minutes to seconds in some benchmarks.
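
A rough sketch of the on-demand idea, again with only the standard library: the file is mapped rather than read, and only the bytes of the requested tensor are copied out. The function name `read_tensor` is hypothetical, the sketch handles float32 data only, and production loaders hand the mapped bytes to a tensor library instead of unpacking them into Python floats.

```python
import json
import mmap
import os
import struct
import tempfile

# Build a tiny two-tensor file in the Safetensors-style layout.
a = struct.pack("<3f", 1.0, 2.0, 3.0)
b = struct.pack("<2f", 10.0, 20.0)
header = {
    "a": {"dtype": "F32", "shape": [3], "data_offsets": [0, 12]},
    "b": {"dtype": "F32", "shape": [2], "data_offsets": [12, 20]},
}
header_bytes = json.dumps(header).encode("utf-8")
path = os.path.join(tempfile.mkdtemp(), "demo.safetensors")
with open(path, "wb") as f:
    f.write(struct.pack("<Q", len(header_bytes)) + header_bytes + a + b)


def read_tensor(path, name):
    """Map the file and copy out only the bytes belonging to one tensor."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            (size,) = struct.unpack("<Q", mm[:8])
            meta = json.loads(mm[8:8 + size])[name]
            start, end = meta["data_offsets"]
            base = 8 + size  # data section begins right after the header
            raw = mm[base + start:base + end]
    count = (end - start) // 4  # 4 bytes per float32 value
    return list(struct.unpack(f"<{count}f", raw))


values_b = read_tensor(path, "b")
print(values_b)
```

Reading tensor `b` never touches the bytes of tensor `a`. Scale that up to a checkpoint with hundreds of tensors and the appeal for large models is clear.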

The integration into PyTorch’s core wasn’t inevitable. Safetensors has been available as a standalone Python package since Hugging Face released it in 2022, and the Hugging Face Transformers library adopted it as the default format some time ago. But PyTorch itself — the foundational framework underneath most of these higher-level libraries — continued to rely on pickle for its native save/load operations. That created an awkward split: best practice said to use Safetensors, but the framework’s own API pushed you toward pickle unless you went out of your way.

The merged pull request eliminates that friction. According to the Phoronix report, the implementation adds a Safetensors format option to PyTorch’s serialization API, complementing the weights_only parameter that torch.load() already provides for restricting what a pickle-based checkpoint may load. Developers can specify Safetensors as their preferred format when saving, and PyTorch will handle the rest. The design preserves backward compatibility, so existing pickle-based checkpoints will continue to load, while making the safer, faster option readily accessible.

This matters beyond just developer convenience. The AI industry is grappling with supply chain security in ways it largely ignored during the initial boom. Model files are software artifacts, and like any software artifact, they can be vectors for attack. The U.S. government’s focus on AI safety, the European Union’s AI Act, and growing enterprise adoption of large models all create pressure for more secure-by-default tooling. A framework-level endorsement of Safetensors sends a clear signal to the broader community: pickle for model weights should be treated as a legacy practice.

Hugging Face has been pushing this message for a while. The company’s model hub now displays security warnings for models uploaded in pickle format and encourages maintainers to convert to Safetensors. Many of the most popular model repositories — including those for Llama, Mistral, and Stable Diffusion variants — already ship Safetensors versions of their weights. But without native PyTorch support, there was always a gap between what the model hosting platforms recommended and what the underlying framework made easy.

That gap is closing.

There are nuances to this transition. Safetensors is designed specifically for tensor data — it doesn’t handle arbitrary Python objects, optimizer states with complex structures, or custom class instances. For full training checkpoints that include optimizer state, learning rate schedulers, and other metadata, pickle or alternative formats may still be necessary. But for the most common use case — saving and loading model weights — Safetensors is strictly superior on both security and performance axes.

The implementation also raises questions about the broader model serialization space. ONNX, GGUF (used by llama.cpp), and other formats serve different purposes but overlap in some areas with Safetensors. PyTorch’s decision to bring Safetensors into its core effectively crowns it as the preferred format for weight storage within the PyTorch community, which dominates research and increasingly production deployment. That’s a significant endorsement that could further consolidate the format’s position as an industry standard.

And the timing isn’t accidental. As models grow larger and deployment becomes more distributed — across cloud instances, edge devices, and federated training setups — the cost of insecure or slow serialization compounds. A format that’s both safe and fast isn’t a nice-to-have anymore. It’s infrastructure.

Meta, which maintains PyTorch, has been investing heavily in making the framework production-ready for large-scale deployments. The addition of native Safetensors support fits into a broader pattern of hardening PyTorch for enterprise and safety-critical applications. Recent PyTorch releases have emphasized compilation (torch.compile), distributed training improvements, and better integration with deployment pipelines. Safetensors integration is the serialization piece of that puzzle.

For practitioners, the transition should be straightforward. If you’re already using Safetensors through the Hugging Face libraries, little changes except that the underlying framework now speaks the format natively. If you’ve been saving models with torch.save() and pickle, you’ll soon have a one-parameter switch to move to a more secure format. The barrier to adoption just dropped to nearly zero.

So what’s the bottom line? PyTorch’s adoption of Safetensors removes one of the last major friction points preventing the ML community from moving away from pickle-based model serialization. It’s a security upgrade, a performance upgrade, and a standardization move all wrapped into a single pull request. Not flashy. But consequential.

The machine learning industry has a habit of tolerating known risks until a framework-level change forces better defaults. This is one of those changes. And it’s overdue.
