Internet Archive Turns To Books

    June 7, 2011

Keeping track of the history of the Internet is a task that has fallen to the Internet Archive, and they do a fantastic job of preserving where the Internet has been and how many of the popular sites we visit started out. For example, check out the history of Google. You’ll immediately notice that the original version of the search engine looks absolutely nothing like it does in 2011, which is a good example of why the service provided by the Internet Archive is an important one.

With that in mind, you can understand the excitement when the Internet Archive announced they would begin archiving books as well. The goal of the project, termed the “Physical Archive,” is, to put it simply, to “preserve one copy of every published work.” Part of the preservation process includes digitizing the content within the books, with the following goals going forward:

Because we expect day-to-day access to these materials to occur through digital means, our physical archive is designed for long-term preservation of materials with only occasional, collection-scale retrieval. Because of this, we can create optimized environments for physical preservation and organizational structures that facilitate appropriate access. A seed bank might be conceptually closest to what we have in mind: storing important objects in safe ways to be used for redundancy, authority, and in case of catastrophe.

The blog entry discussing the project indicates that not all written works will make it into the physical archive. The number of unique titles in literature is estimated to be around 100 million, and the Internet Archive’s goal is to preserve over 10 million of these individual works. To facilitate this process, the Internet Archive needed a physical containment unit capable of keeping these printed works protected and, well, dry. So they turned to the shipping industry for ideas:

Based on this technical literature and specifications from depositories around the world, Tom McCarty, the engineer who designed the Internet Archive’s Scribe book-scanning system, began to design, build, and test a modular storage system in Oakland California. This system uses the infrastructure developed around the most used storage design of the 20th century, the shipping container. Rows of stacked shipping containers are used like 40′ deep shelving units. In this configuration, a single shipping container can hold around 40,000 books, about the same as a standard branch library, and a small building can hold millions of books.

An example of the storage facility in question:


Storage takes place like so:

  • Books are cataloged, and have acid free paper inserts with information about the book and its location,
  • Boxes store approximately 40 books with labeling on the outside,
  • Pallets hold 24 boxes each,
  • Modified 40′ shipping containers are used as secure and individually controllable environments of 50 or 60 degrees Fahrenheit and 30% relative humidity,
  • Buildings contain shipping containers and environmental systems,
  • Non-profit organizations own and protect the property and its contents.
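
The figures quoted above allow a rough back-of-the-envelope estimate of the scale involved. The sketch below is only an illustration: the constants come from the quoted passages (approximately 40 books per box, 24 boxes per pallet, around 40,000 books per container, and a goal of over 10 million works), while the function names are hypothetical and the results are approximations, not numbers published by the Internet Archive.

```python
import math

# Figures taken from the quoted blog passages; all are approximate.
BOOKS_PER_BOX = 40             # "approximately 40 books" per box
BOXES_PER_PALLET = 24          # "Pallets hold 24 boxes each"
BOOKS_PER_CONTAINER = 40_000   # "around 40,000 books" per container
TARGET_BOOKS = 10_000_000      # goal of "over 10 million" works (lower bound)


def books_per_pallet() -> int:
    """Books on one fully loaded pallet."""
    return BOOKS_PER_BOX * BOXES_PER_PALLET


def pallets_per_container() -> float:
    """Implied pallet count per container, derived from the quoted totals."""
    return BOOKS_PER_CONTAINER / books_per_pallet()


def containers_needed(total_books: int = TARGET_BOOKS) -> int:
    """Minimum number of containers needed to hold total_books."""
    return math.ceil(total_books / BOOKS_PER_CONTAINER)


if __name__ == "__main__":
    print(f"Books per pallet: {books_per_pallet()}")
    print(f"Implied pallets per container: {pallets_per_container():.1f}")
    print(f"Containers for {TARGET_BOOKS:,} books: {containers_needed()}")
```

Because the stated goal is "over" 10 million works and the per-box and per-container figures are themselves approximate, the container count should be read as a rough lower bound.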

While such a task should be commended, is the Internet Archive’s goal any different from Google’s Library Project? Perhaps the biggest difference between the two is that the Internet Archive has shown no indication of seeking monetary gain from its physical archive project, an issue Google has had to address in legal venues. That makes the Internet Archive’s approach a little more altruistic than Google’s.

Being the non-profit that it is, the Internet Archive is also soliciting donations to assist in the project’s undertaking. Considering the goal, it’s a cause worthy of donation.