Saturday, March 03, 2007

Where the wisdom of crowds fails

Richard Bennett has an interesting post about Wikipedia and the decentralization of knowledge collection titled "Teaching the hive mind to discriminate." He argues that while Wikipedia is good at accumulating the knowledge of a large number of individuals, it also collects their "prejudice, mistaken beliefs, wishful thinking, and conformance to tradition." It is unrealistic to expect that these erroneous beliefs will automatically be weeded out because "expertise is not as widely dispersed as participation":
So the real question about information and group scaling is this: are there procedures for separating good information from false information ("discrimination") that are effective enough to allow groups to be scaled indefinitely without a loss of information quality? It's an article of faith in the Wikipedia "community" that such procedures exist, and that they're essentially self-operative. That's the mythos of "emergence", that systems, including human systems, automatically self-organize in such a way as to reward good behavior and information and purge bad information. This seems to be based on the underlying assumption that people being basically good, the good will always prevail in any group.
Readers of this blog know that I would argue that many religious and political beliefs are examples that support Bennett's position.

On a related point, Ed Felten has a recent post about how reputation systems on the Internet can be manipulated, referencing a pair of articles at Wired by Annalee Newitz. A common flaw is that the reputations of the raters themselves are either not taken into account or are easily manipulated. If there were a way of reliably weighting the expertise of raters within the appropriate knowledge domains, that could provide a method of discrimination for sorting good information from bad.
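To make the idea concrete, here is a minimal sketch of what expertise-weighted rating aggregation might look like. Everything here is invented for illustration — the rater names, the scores, and the expertise weights are hypothetical, and no real reputation system is claimed to work this way; the hard, unsolved part is of course obtaining trustworthy expertise weights in the first place.

```python
def weighted_score(ratings, expertise):
    """Aggregate ratings (rater -> score) using per-rater domain
    expertise weights (rater -> weight in [0, 1]).

    Returns the expertise-weighted mean score, or None if no rater
    has any weight in this domain.
    """
    total_weight = sum(expertise.get(rater, 0.0) for rater in ratings)
    if total_weight == 0:
        return None
    weighted_sum = sum(score * expertise.get(rater, 0.0)
                       for rater, score in ratings.items())
    return weighted_sum / total_weight


# Three hypothetical raters score an article on a scientific topic.
# The two domain experts rate it low; the novice rates it high.
ratings = {"alice": 0.2, "bob": 0.3, "carol": 0.9}
expertise = {"alice": 0.9, "bob": 0.8, "carol": 0.1}

print(weighted_score(ratings, expertise))
```

Under this scheme the novice's enthusiastic rating barely moves the result, whereas a naive unweighted average would be pulled well upward — which is exactly the kind of discrimination the paragraph above is asking for, and exactly what a manipulable rater-reputation score fails to provide.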

This is a subject that my planned (but never completed) Ph.D. dissertation at the University of Arizona should have touched upon — it was to be on social epistemology, specifically on obtaining knowledge based on the knowledge of others.

One philosopher who had touched on this subject at the time I was working on my Ph.D. (back in the early 1990s) was Philip Kitcher, whose book The Advancement of Science: Science without Legend, Objectivity without Illusions (1993, Oxford University Press) contains a chapter titled "The Organization of Cognitive Labor" (originally published as "The Division of Cognitive Labor" in the Journal of Philosophy, 87(1990):5-21).

4 comments:

Lippard said...

Tim:

Sorry about the Google account requirement, but there don't seem to be any other good options for Blogger that help stem the tide of comment spam.

I don't think I agree that no original research + NPOV constitute sufficient filters, at least on subjects that require considerable expertise (like science). The former filter still allows items published by non-expert general sources like newspapers, and the latter filter requires a failure to discriminate between positions even when one side is well-supported and the other isn't. There are many areas where there are disputes about what the facts are, even when the facts are overwhelmingly supported to the satisfaction of the relevant experts.

Lippard said...

Tim: I didn't realize that combination was available--I'm giving it a try (word verification minus Google account requirement). Let's see what happens.

I agree that it's worth knowing that dissent exists--I guess I wasn't considering that NPOV allows you to state the level of acceptance (and who accepts it).

To what extent is the content of particular Wikipedia entries biased towards the views of the editors who are most active/motivated to edit them? I know I've seen creationism entries that appear to have been authored primarily by creationists, minimizing relevant critical information, though the Duane Gish entry's current form is a counterexample (and the discussion page shows creationists complaining about an unfair Wikipedia bias against them).

Anonymous said...

That's a good point: people who are fans of obscure/crackpot theories are the most likely to write entries about them. The ones that are prominent enough will probably have people come along and add information about criticism of them, but the ones that are really obscure will be somewhat biased in favor of the views in question.

However, the only real alternative is to have no information on those subjects at all. I think having a sympathetic description of a group's beliefs--even if those beliefs are crazy--is better than having no information at all. If the group ever becomes prominent enough to have a real impact, its critics will find the page and add information about opposing viewpoints.

Thanks for enabling anonymous commenting!

Anonymous said...

Jim, I think you've hit the key point: the content of Wikipedia entries is ultimately determined by the views of the editors who spend the most time writing, and the rules are really beside the point. And the fact is that the most dedicated editors are either paid shills or insane. Wikipedia editors have contempt for the views of experts, the Original Research ban means nothing, and the NPOV rule is only as good as the people who enforce it. The Net Neutrality and Internet history articles are dominated by people who believe in urban legends and conspiracy theories. That's not at all surprising as the common man lives and dies by conspiracy theories, which are so much easier to apply than evidence-driven hypotheses.

At the end of the day, peer production is only as good as the peers who produce it.