Internet Archiving By Search Engines Illegal?

    July 20, 2005

A number of friends have been talking about the kerfuffle over the Canadian bill that would protect the copyright of web publishers against the archiving activities of search engines.

Many have been forwarding the link to the CNET article and blogging about it. Most are outraged at this crazy Canadian attempt to make “search engines illegal.”

Internet law expert Eric Goldman, though, suggests the article is misguided, because archiving activities might already be violating US law.

The article’s sloppy title, “Cache a page, go to jail,” is part of the problem. Search engines might coyly refer to “caching” pages when they are actually “archiving” them, as Goldman points out.

Caching is done on local machines or on ISPs’ servers in order to (for example) improve the speed of browsing. Such caching does not seem to break with basic browsing and copyright conventions.

Come to think of it, this makes intuitive sense. By allowing my site to be spidered by search engines, I give those indexes the right to point to my site, not to take and publish pages from that site. Google archives such pages, which is a great help when a page gets taken down, but there really is some question as to its legality. We all love the Wayback Machine, but do companies consent to having their content published by someone else? No. I realize that the content liberators of the world will say that this view is shortsighted, but I’m not talking about vision (and neither is Goldman); I’m wondering about what the law actually says.

It appears that Canada’s proposed law is not so out of step with what is already on the books elsewhere.

Linking and archiving are two very different animals. One is directing people to content; the other is snatching up that content and making it available for your own potential business gain.

At the end of the day, though, it may prove difficult to prohibit archiving, since theoretically you could find many ways around it, like taking photos of every single web page in existence, and archiving those. Your Googles of the world are already out there taking photos of lots of stuff and figuring out new ways to “organize the world’s information.” Somewhere, there has to be a line between organizing information and violating copyright, though. Likely, the onus will be placed on copyright holders to put their content behind a protected wall, as if it were subscriber- or buyer-only material, if they want to avoid having it archived.

There are many gray areas in this realm, such as Amazon’s “Search Inside the Book” and Google Video Search. I’m glad I’m not a copyright lawyer right now.

Andrew Goodman is Principal of Page Zero Media, a marketing consultancy which focuses on maximizing clients’ paid search marketing campaigns.

In 1999 Andrew co-founded, an acclaimed “guide to portals” which foresaw the rise of trends such as paid search and semantic analysis.