AP Says Scrapers Targets, Not Bloggers

By: WebProNews Staff - June 12, 2009

The Associated Press plans soon to sic a scraper-bot on the Web to find swiped AP content. While no one would argue with taking on scraper sites, the vagueness of AP news editor Ted Bridis might be worth considering.

In an interview with Ars Technica, Bridis talked of a new technology (that writer Matthew Lasar cleverly described as a “search-and-maybe-threaten bot”) that is on the horizon for the AP. The technology will identify and flag webpages copying entire AP articles. Upon flagging, AP lawyers would review.
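The article does not say how the AP's technology decides that a page copies an entire article. One common way to separate wholesale copying from short excerpting is to compare overlapping word windows ("shingles") and measure how much of the original reappears on the page. The sketch below is only an illustration under that assumption; the function names, the 5-word window, and the 0.8 threshold are all hypothetical and not anything the AP has described.

```python
import re


def shingles(text: str, k: int = 5) -> set[tuple[str, ...]]:
    """Return the set of k-word shingles (overlapping word windows) in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        # Very short texts yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def containment(article: str, page: str, k: int = 5) -> float:
    """Fraction of the article's shingles that also appear on the page."""
    article_shingles = shingles(article, k)
    if not article_shingles:
        return 0.0
    return len(article_shingles & shingles(page, k)) / len(article_shingles)


def flag_for_review(article: str, page: str, threshold: float = 0.8) -> bool:
    """Flag a page for human review when it reproduces most of the article.

    The threshold is an arbitrary illustrative value, not a legal standard.
    """
    return containment(article, page) >= threshold
```

Under these assumptions, a page that reproduces the full text scores near 1.0 and would be flagged, while a page quoting a sentence and adding its own commentary scores far lower and would not. A human reviewer would still make the final call, as the article describes.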

A scraper-bot would be nice, a positive technology to evolve from this fiasco. What do you think?

Bridis insisted the news organization would not be going after bloggers or publications excerpting a paragraph of AP content and linking to the original. He admitted the AP sometimes borrows excerpts from newspapers and crafts its own story around them.

But Bridis stopped there and made no such concessions about usage of headlines and AP ledes. If the AP argues the so-called “hot news misappropriation” doctrine, this could affect search engines, aggregators, and sites like the Drudge Report that display headlines and the first line of an article.

Also on the radar would be articles written based on AP content, especially commercial websites rewriting with hedges like “the AP has reported” or the “AP said.” That’s where the vagueness is troubling, and where the lines are fairly blurry. It’s hard to tell whether the emphasis is on commercial use or on the attribution method. It is also unclear what is meant by "rewriting." Does he define rewriting only as reporting the facts with only a word or two changed (i.e., plagiarism)? Or does Bridis also count as rewriting retelling a story in different words, or even summarizing facts?

Should the AP be able to dictate which facts are fair to retell, which styles are acceptable to retell them, which sentences are acceptable to excerpt, and how attribution is to be made? Let us know.

Depending on how these questions are answered, Bridis could be drawing a line between blogs and news sites, essentially saying nonprofit bloggers can quote and refer but commercial news sites cannot. He’s also drawing a line between textual storytelling and verbal storytelling. Bridis seems to suggest any commercial, textual relay of information wouldn’t be considered “fair” use, so long as the AP can, in a decentralized communication universe, prove it was the only outfit that knew certain facts. That argument is rather stunning considering the AP is a distributor of news first written elsewhere in the world at local publications.

What’s extra interesting is that though the AP has criticized fair use as a “misguided” legal theory, the organization itself is insisting on a legal theory of its own with the “hot news” doctrine, which is mostly a semantic device to create a separate category for “facts,” which are not copyrightable in the first place.

Ninety years ago, the AP sued William Randolph Hearst’s International News Service (INS) for swiping breaking news the AP had gathered and distributing the news on its own. Over a lengthy court battle reaching the Supreme Court, the “hot news” doctrine was born. Though the AP essentially lost the suit because the courts found that facts could not be copyrighted, hot news (a scoop) was designated as a special kind of property to which the outlet breaking the news had exclusive rights for a limited amount of time. Just how long these special kinds of facts are protected is unclear, especially in the Internet age, when hot news gets cold much faster.

To succeed in its efforts, the AP will have significant legal hurdles in front of it. The organization will have to redefine fair use; get a court to uphold that some facts are protected and set some kind of timetable for that protection; explain how textually reporting facts to an Internet audience is different from reporting facts to any other audience by any other method; find a logical differentiation between bloggers and journalists, and between Internet forums/social networks and water cooler conversations; and convince courts that previous precedents regarding aggregating, linking and snippeting should be overturned, all while avoiding federal charges of anticompetitive behavior.

Those are some pretty tall hurdles, and likely a 90-year-old argument from a different world isn’t going to be able to jump them.

This has heated debate written all over it. Sound off in the comment section. 

  • Zander Dondal

    I think the idea sucks. All of the content that AP uses actually belongs to the American people and they have stolen that content themselves.

    AP is already a content thief, now they call others thieves?

    Personally I believe the public should castrate AP executives.


  • http://www.cpasitesolutions.com/index.php Kenny

    It doesn’t matter if AP considers it a misguided legal theory or not, Fair Use is a legal precedent and a cornerstone of IP law. Let them howl until they’re blue. The courts will slap them down if they try to challenge fair use.

    I completely support IP owners protecting their rights. As long as that’s all they try to do I’m behind AP. Unfortunately AP has shown a remarkable lack of adaptability and has done very little to adjust its business model to conform to the new demands of the information age. This campaign of theirs has a stink of desperation. It looks like AP is jumping the shark, and I won’t be half surprised to see them try to take fair use (and by extension, free speech) with them when they go down.

  • http://www.wiredgypsy.com M.E.

    Plagiarism is plagiarism. There is nothing different or unique about the Internet in that regard. Take a class in journalism or read a book on the subject. Gaining facts from a source and rewriting is totally fine. Basically, don’t be lazy. The only difference between the Internet and past mediums is not that the AP is somehow dated; it is that so many techies are so ignorant about this.

    • News Dispundit

      Oh.. I don’t think “techies”, as you label them, are ignorant. I think that most are pretty savvy and cognizant of what is plagiarism and what is not.

      What the “newsies” complain about is that they lost their monopoly on controlling information.

      The difference between now and then is that the cost barriers to publishing have been obliterated and newspapers (who in many cases still don’t understand modern communication) no longer have the monopoly on the conversation of the day.

      My journalism class taught me doctrines of Fair Use and what is plagiarism and what AP wants isn’t equitable or fair.

      Sorry.. but I agree that AP doesn’t get to set the rules. AP is letting their lawyers run their business and that will spell doom for them in the long run.

      I certainly think that AP’s strategy is going to cost them big. They just may end up losing their business model.

  • amd

    As long as the AP is reporting on events that happen, the copyright really belongs to people associated with the events. Their naming them should give them no ownership, which belongs elsewhere. Copying large portions of articles, however, is copyright infringement.


  • http://www.iphlogger.com iPhlogger.com

    Google has google bot

    Yahoo has slurp

    Now attorneys have “Sue”…as in they will Sue your ( | ) if your “revenue score” is high enough they can make money.

    Just say’n….

  • http://www.wildlifeprotectaust.org.au Pat OBrien

    AP is definitely out of touch with modern communications. We use complete news excerpts from media outlets all over the World for our weekly non-commercial newsletter Wildlife Bytes. It focuses on wildlife news of interest to our subscribers. I think this attempt by AP to monopolise news shows again how many people are disenchanted with media monopolies. In fact I believe that’s why they fill their newspapers up with all sorts of junk to attract readers, and only devote a couple of pages to actual news. I also understand newspaper readership is crashing, as more people turn to the Internet to catch up with the news, and even more people don’t believe what they read in the papers anyway. Pat OBrien, Australia

  • http://petitpub.com/blog Petit

    AP will indeed have a hard time proving that their story, republished by others, is in fact stolen. Legally that is.

    Nevertheless I think citations should be labeled as such and the source mentioned and, in the case of the web, linked as well.

    Citations are not in extenso, but excerpts, and sites living as parasites on original sites by automatically copying articles in their full text, should really be bashed as hard as possible.

    Referring to an article by AP or any other source by retelling the story in the manner of:
    “AP says today that and draws the conclusion that “, is something else. Here we have a reader that spreads the word about the matter of the original article and with comments of his own. Also of course linking back to the original article. This is promoting the original and should be legitimate and “fair use”.

    I don’t think we, AP or anyone, can address this by legal means. It is more a matter of netiquette.

  • http://www.peterjcrowley.com Peter J. Crowley

    Facts interesting term for AProvda to toss out like the facts for invading Iraq? Simple regurgitation of lies doesn’t constitute a fact. There has to be a better name for Lawyer Bots than Sue.
    enjoy pjc
    to quote a very famous detective “Just the facts mam”

  • Thomas

    Poor AP, whose future is now apparently managed by, and at the mercy of, total absolute idiots. To AP: Today, in this new and wonderfully amazing age known as “the 21st century”, the only reason anyone ever reads your articles and news is when those articles pop up on Google News or any of the other thousands of hub-sites that your new efforts will essentially scare off. And what will be left? An AP without an audience. In this century, the benefit of getting as many people to read your content as possible is almost never related directly to the content – but to the advertising and revenue models built around it. If AP wants to write, then it had better learn – and learn fast – that, umm.. geee.. *readers* might be the next critical component for business. Rather than drive away readers, get some management with brains to help you figure out how to monetize the unchangeable realities of today. If AP provides good quality, some number of people *will* seek you out and perhaps even subscribe directly or yield obvious revenue streams (in addition to clicking any related ads and pay-for-click deals you should be negotiating). But if you want to hide your content behind a wall, thinking people will jump those hurdles just because you’re AP, then you’re wrong. On the web today, it’s free access first… Impress me (your reader and potential customer) and maybe I’ll sign up later. Otherwise, I’ll just click the next link out of 20 on page 1 of Google results.

  • Guest

    Well, anyone that knows a little about publishing web pages is way ahead of the AP. They could set their website to prevent specific “bots” (like the AP bot) from accessing their site. How would AP deal with this?

  • http://traficblognet.com Scott Reynolds

    I agree the Internet does not need millions of copies of the same article posted all over creation. There should be guidelines or even rules for copying and linking of articles and information for the good of the internet and its users. I believe that could be set up so everyone benefits.

    But is that what humans want?

    We need the trickery and competition. That one chance we may make it through the maze to find the truth. The opportunity to think it was their fault. The idea “I was first so it is mine”.

    In a free world. Once it is published, it is the world’s. Isn’t that what they want? To be seen by the world?

  • http://www.bluegrassmerchants.com Mike Lawson

    It seems it’s time for the AP horse-and-buggy to get off the Information Highway. They’re fixin’ to be in a wreck like they ain’t ever seen. If they’re not going to buy a vehicle that fits the flow of the traffic, they at least need to get a neighbor who has one to bring them to town.

    I grew up reading newspapers and periodicals in print. They had a smell to them that exuded knowledge, authority and integrity. I had relatives that worked at the Courier Journal in Louisville and the writers there were looked on as celebrities in the community. But everything changes and communication is no exception.

    I’m sure AP and the others are simply repeating the cries of the scribes in the era of Gutenberg. It’s a natural human reaction to react rashly when your safety and security are threatened. The stability of a future in the printed news field has been compromised, to say the least. There is none.

    Print news media only has two choices to choose from really: integrate into 21st century technology or go back to school and learn a new trade in a different field.

    I imagine there will always be newspapers (or their New Wave equivalents) for local and regional news. But the day of international monsters is all but gone. They have simply outlived their usefulness and are old, withered and of a poor disposition.

    The Talking Heads on television will be next. Perhaps they are already being phased out as well. I know I have phased most of them out as purveyors of gossip, shock and other useless morsels of information of little value. Except the weather; even journalists can’t seem to taint that.

  • http://www.firmalatter.dk/latterkurser-og-workshops/bryd-vanen.htm FirmaLatter

    If reporters report, who “owns” the news? The media? Of course not. The media is simply a means to bring the “news” to the market square.

    For centuries we have been relying on paper and ink for that. Now it’s time for 1’s and 0’s.

    The whole idea of the internet is presenting information and linking, so if someone says “you can only link to this place, and not to that place” we have struck either censorship, or greed.

    I believe that AP is in the latter category with this idea. Nothing is won by going after those that live off the system. Everything is won by being better today than we were yesterday. No one will ever be able to control others for good – it’s like swimming upstream: eventually you will tire, and go with the flow.

    AP should report, and let others report what AP reported.

  • Guest

    I’m sorry, but I’m really confused. Is the AP’s entire reason for being not to gather and disseminate news?

    I’m all for cracking down on plagiarism, which is rife online, but they’re in such an uproar about preventing people from actually using their content.

    Ultimately, it comes down to money. They’re fine with their paid members reprinting their content but heaven forbid someone follow their example and excerpt/summarise/link.

    They’re bound to become obsolete unless they get with the times.

    Just a humble opinion from South Africa.

  • http://www.kevinwebb22.com Kevin

    This was probably snuck into the stimulus bill and taxpayers will be covering it.

  • http://www.usaidtube.com Kenneth

    The idea of a system for tracking people copying articles is pointless, given that a blog post, a tweet, or even a news event becomes public domain once it hits the web and is therefore open for anyone to see, unless the site has restricted viewing rights. That makes copying and pasting said material fair to email or share with others for general interest, as long as the article is not amended.

    However, the idea of using this bot to track people copying website profiles and reusing them as their own to impersonate someone else, for the purpose of causing harm or stealing in any form, would be a great idea, especially if these people were tried under the laws related to identity theft and fraud.

    Perhaps this would be a better use of lawyers’ time, instead of wasting court time with pumped-up charges of possible copyright theft to make the corporations richer from unfair pursuit of a poor person sharing material that corporations are already profiting from.

    Furthermore, internet providers should have to pay a % of their monthly fees into a fraud protection fund for the people hurt by scammers, which could be used to set up an internet policing squad for the recovery of stolen goods, etc.

    I once spoke of this to my Government many years ago and designed a model for use, asking them to gather all the leaders of the world’s countries that use the internet to sign an accord for stamping out this criminal activity. To date, nothing has been done for the internet user. You might want to consider that idea.

    Kenneth Donaldson

  • http://www.stonerscolony.com FaTe

    Not so long ago it was bloggers, then it was the AP topping their own headlines, and now that they have failed at all that, they move to idea #godKnowsByNow….

    The AP failed and will continue to do so, I’m just wondering how long it’ll take them to realise it.

  • http://thecomputergal.com Nora McDougall-Collins

    Newspapers and other news businesses should receive fair compensation for the work they put into gathering and distributing the news. On the other hand, they don’t create the news. If you see a crime being committed and a reporter sticks a mike in front of you, do you get paid for helping them make a profit?

    However, they do have to pay people to gather, write, photograph, etc. Instead of blocking websites from “stealing” their news, why not make it so that some of the ads that pay for the work are required to be posted along with the news? Advertisers should love that type of viral marketing!

  • Guest

    Piracy is not a new concept to business, and business should stop acting as if it were. If anything, piracy increases the competition in markets and supports capitalism. However, the amount needs to be regulated, as it is today, to limit the amount of corruption and attempt to provide accurate compensation for the working class. Although, this may not be concrete, because when the entertainment industry is examined the people seem to receive a large portion of the money traffic, however the earnings are hyperpolarized. Meaning you had better be the best to cash in; a basic necessity for the promotion of capitalism.