Saturday, March 03, 2007

Where the wisdom of crowds fails

Richard Bennett has an interesting post about Wikipedia and the decentralization of knowledge collection titled "Teaching the hive mind to discriminate." He argues that while Wikipedia is good at accumulating the knowledge of a large number of individuals, it also collects their "prejudice, mistaken beliefs, wishful thinking, and conformance to tradition." It is unrealistic to expect that these erroneous beliefs will automatically be weeded out because "expertise is not as widely dispersed as participation":
So the real question about information and group scaling is this: are there procedures for separating good information from false information (“discrimination”) that are effective enough to allow groups to be scaled indefinitely without a loss of information quality? It’s an article of faith in the Wikipedia “community” that such procedures exist, and that they’re essentially self-operative. That’s the mythos of “emergence”, that systems, including human systems, automatically self-organize in such a way as to reward good behavior and information and purge bad information. This seems to be based on the underlying assumption that people being basically good, the good will always prevail in any group.
Readers of this blog know that I would argue that many religious and political beliefs are examples that support Bennett's position.

On a related point, Ed Felten has a recent post about how reputation systems on the Internet can be manipulated, referencing a pair of articles at Wired by Annalee Newitz. A common flaw is that the reputation of the raters themselves is either not taken into account or is easily manipulated. If there were a way of reliably weighting the expertise of raters within appropriate knowledge domains, that could provide a method of discrimination to sort out the good information from the bad.
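To make the idea of weighting raters by domain expertise concrete, here is a minimal sketch in Python. It does not describe how any real reputation system works: the function name `weighted_score`, the `expertise` table, and the numeric weights are all hypothetical, and the sketch simply assumes that reliable per-domain weights already exist, which is exactly the hard part.

```python
def weighted_score(ratings, expertise, domain, default_weight=0.0):
    """Average the scores in `ratings`, weighting each by the rater's expertise.

    ratings:   list of (rater, score) pairs
    expertise: hypothetical table {rater: {domain: weight}}, weights >= 0
    domain:    the knowledge domain the rated item belongs to
    Returns None when no rater carries any weight in the domain.
    """
    total = 0.0
    weight_sum = 0.0
    for rater, score in ratings:
        # Raters with no recorded expertise in this domain get the default weight.
        weight = expertise.get(rater, {}).get(domain, default_weight)
        total += weight * score
        weight_sum += weight
    if weight_sum == 0:
        return None
    return total / weight_sum


# Illustrative data only: one assumed expert and two assumed non-experts.
ratings = [("alice", 1.0), ("bob", 5.0), ("carol", 5.0)]
expertise = {
    "alice": {"biology": 0.9},
    "bob": {"biology": 0.05},
    "carol": {"biology": 0.05},
}

plain_mean = sum(score for _, score in ratings) / len(ratings)
print(plain_mean)
print(weighted_score(ratings, expertise, "biology"))
```

With these made-up numbers the plain average is pulled toward the two non-expert ratings, while the weighted score stays close to the expert's. The sketch also shows where manipulation would move to: whoever controls the `expertise` table controls the outcome.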

This is a subject that my planned (but never completed) Ph.D. dissertation in epistemology (on social epistemology, specifically on obtaining knowledge based on the knowledge of others) at the University of Arizona should have touched upon.

One philosopher who had touched on this subject at the time I was working on my Ph.D. (back in the early 1990s) was Philip Kitcher, whose book The Advancement of Science: Science without Legend, Objectivity without Illusions (1993, Oxford University Press) contains a chapter titled "The Organization of Cognitive Labor" (originally published as "The Division of Cognitive Labor" in the Journal of Philosophy, 87(1990):5-21).

7 comments:

Tim said...

I think Richard makes a good point in general, but he seems oblivious to the fact that Wikipedia already has a good filtering mechanism: the "no original research" rule. That is, they outsource the verification of information to widely accepted third parties. Determining whether a given bit of information is true or false is difficult, but verifying that the New York Times and the Washington Post agree on something is not.

Of course, you'll have subjects where the authoritative sources disagree, or where there's disagreement about what counts as an authoritative source. But even in those cases, the "neutral point of view" rule provides a pretty good second line of defense against incompetent editors. If you can't find a consensus among authoritative sources, you simply give a fair-minded summary of each perspective and let the reader decide where to go.

So the point that the wisdom of crowds can lead us astray is obviously true in general, but I don't think Wikipedia is a very good example, because its stated goal is to organize the large amount of information that's already widely accepted. That's not a task that requires especially discriminating judgment.

Tim said...

By the way, this is Tim Lee. Can I humbly request the option to sign comments with name and website rather than Google account? I don't use my Google account much, so it's not a very helpful way to identify myself.

Jim Lippard said...

Tim:

Sorry about the Google account requirement, but there don't seem to be any other good options for Blogger that help stem the tide of comment spam.

I don't think I agree that no original research + NPOV constitute sufficient filters, at least on subjects that require considerable expertise (like science). The former filter still allows items published by non-expert general sources like newspapers, and the latter filter requires a failure to discriminate between positions even when one side is well-supported and the other isn't. There are many areas where there are disputes about what the facts are, even when the facts are overwhelmingly supported to the satisfaction of the relevant experts.

Tim said...

"The former filter still allows items published by non-expert general sources like newspapers, and the latter filter requires a failure to discriminate between positions even when one side is well-supported and the other isn't."

But isn't the latter behavior precisely what you want in an encyclopedia? Even for subjects like intelligent design where one side of the argument is populated almost entirely by idiots, it's still somewhat useful to know that there are a significant number of people who consider the subject controversial. The approach Wikipedia tends to take, which I think is the right approach, is to briefly acknowledge the existence of dissenters, with a link to a separate article about their views, while focusing the body of the article on the mainstream view.

For example, the article on evolution doesn't mention intelligent design at all, but the article on intelligent design classifies it as a creationist viewpoint and includes extensive coverage of criticism of the view. The article on global warming briefly mentions that dissenting viewpoints exist, while focusing the body of the article on the IPCC position. Those interested in learning more about dissenting views can visit the global warming controversy page. The article on the Holocaust has a short section on Holocaust deniers that states (quoting Public Opinion Quarterly): "No reputable historian questions the reality of the Holocaust, and those promoting Holocaust denial are overwhelmingly anti-Semites and/or neo-Nazis."

So in these cases, at least, I don't see any evidence that Wikipedia gives too much credence to poorly-supported dissenting views. It mentions that these views exist, and offers a brief summary for those who are interested, but it doesn't seem to have trouble clearly identifying which viewpoints are the mainstream views.

The CAPTCHA doesn't catch spam? I would think that would be extremely effective.

Jim Lippard said...

Tim: I didn't realize that combination was available--I'm giving it a try (word verification minus Google account requirement). Let's see what happens.

I agree that it's worth knowing that dissent exists--I guess I wasn't considering that NPOV allows you to state the level of acceptance (and who accepts it).

To what extent is the content of particular Wikipedia entries biased towards the views of the editors who are most active/motivated to edit them? I know I've seen creationism entries that appear to have been authored primarily by creationists, minimizing relevant critical information, though the Duane Gish entry's current form is a counterexample (and the discussion page shows creationists complaining about an unfair Wikipedia bias against them).

Tim said...

That's a good point: people who are fans of obscure/crackpot theories are the most likely to write entries about them. The ones that are prominent enough will probably have people come along and add information about criticism of them, but the ones that are really obscure will be somewhat biased in favor of the views in question.

However, the only real alternative is to have no information on those subjects at all. I think having a sympathetic description of a group's beliefs--even if those beliefs are crazy--is better than having no information at all. If the group ever becomes prominent enough to have a real impact, its critics will find the page and add information about opposing viewpoints.

Thanks for enabling anonymous commenting!

Richard Bennett said...

Jim, I think you've hit the key point: the content of Wikipedia entries is ultimately determined by the views of the editors who spend the most time writing, and the rules are really beside the point. And the fact is that the most dedicated editors are either paid shills or insane. Wikipedia editors have contempt for the views of experts, the Original Research ban means nothing, and the NPOV rule is only as good as the people who enforce it. The Net Neutrality and Internet history articles are dominated by people who believe in urban legends and conspiracy theories. That's not at all surprising as the common man lives and dies by conspiracy theories, which are so much easier to apply than evidence-driven hypotheses.

At the end of the day, peer production is only as good as the peers who produce it.