Can You Trust Wikipedia?

    September 6, 2007

Trust – reliance on the integrity, strength, ability, surety, etc., of a person or thing; confidence.

How confident are you that a particular Wikipedia page has reliable information? How sure are you of the abilities of all the people who may have edited that page? Thanks to Luca de Alfaro and colleagues at the University of California, Santa Cruz, you may soon be able to know which parts of a given Wikipedia page you can and can't trust.

de Alfaro and his team are developing software that colors text in hues of orange when it may be less than trustworthy. The deeper the orange, the less you may want to trust that particular text. You can see a demo of the software by visiting the Wikipedia trust coloring demo page. Some pages are pretty clean, so you may have to click through a few random pages before you see much in the way of orange.

It works by first evaluating the reputation of the author.

We compute the reputation of Wikipedia authors according to how long their contributions last in the Wikipedia. Specifically, authors whose contributions are preserved, or built-upon, gain reputation; authors whose contributions are undone lose reputation.

and then using that reputation to compute the trust of each word of each revision.

We compute the trust value of each word of a revision according to the reputation of the original author of the word, as well as to the reputation of any authors that have edited the page, especially if the edit is in the proximity of the word.
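The two steps above can be sketched as a toy Python model. To be clear, everything here is an illustrative assumption on my part, not the actual UCSC algorithm: the function names, the simple +1/-1 reputation update, the averaging rule, and the five-word proximity radius are all made up to show the shape of the idea.

```python
# Toy sketch (not the real UCSC algorithm): authors gain or lose
# reputation depending on whether their text survives, and each word's
# trust combines the original author's reputation with that of nearby
# later editors. All formulas here are illustrative assumptions.

def update_reputation(reputation, author, survived):
    """Nudge an author's reputation up if their contribution
    survived later revisions, down if it was undone."""
    delta = 1.0 if survived else -1.0
    reputation[author] = reputation.get(author, 0.0) + delta
    return reputation

def word_trust(word_index, original_author, edits, reputation, radius=5):
    """Trust of one word: start from the original author's reputation,
    then fold in the reputation of any editor whose edit landed within
    `radius` words of this one."""
    trust = reputation.get(original_author, 0.0)
    for pos, editor in edits:
        if abs(pos - word_index) <= radius:
            # A nearby edit pulls the word's trust toward the
            # editor's own reputation.
            trust = (trust + reputation.get(editor, 0.0)) / 2.0
    return trust

# Example: "alice" wrote the word; "bob" later edited two words away.
rep = {}
rep = update_reputation(rep, "alice", survived=True)
rep = update_reputation(rep, "bob", survived=False)
print(word_trust(10, "alice", edits=[(12, "bob")], reputation=rep))
```

In this sketch, a low-reputation editor touching text near a word drags that word's trust score down, which is roughly the behavior the coloring visualizes: recently edited, unvetted text shows darker orange until it survives review.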

de Alfaro’s goal isn’t to show that Wikipedia shouldn’t be trusted, but rather to build trust through transparency, so that “nobody can single-handedly modify information without some traces of that being available for some time afterwards.”

After looking at a few random pages, I found that some show more shades of orange than others. I’m not always sure what to make of the orange text. Some edits were clearly made for grammatical reasons, and while a given word or two shows a deep orange, removing those words wouldn’t alter the facts at all. In other places the highlighted text is more tied to the ‘facts’ of the page, and seeing it with an orange background does make you question the snippet a little.

The trust coloring probably isn’t perfect, but the team is still fine-tuning the algorithms, and the results will likely improve with time.

Combined with the recent release of the Wikiscanner, which shows how many edits a particular user or IP address has made, the coloring software should help inspire more trust in Wikipedia entries, or at the very least show which pages should perhaps be viewed with some skepticism. It should also hold authors more accountable.

You can find more information in AP Tech Writer Brian Bergstein’s article, and do give the demo a try. No matter what you think of Wikipedia, the trust coloring is interesting and might make you think a little differently about what you can and can’t trust.