Monday, August 07, 2006

AOL releases user search data, tied to individual users

AOL has published logs showing web activity data for 650,000 users: 20 million searches in about 800MB. Although the AOL screen names were converted to random numbers, each number is consistent across an individual user's activity, and in many cases that is no doubt sufficient to identify the individual based on ego surfing and other activity.

As TechCrunch points out:
The most serious problem is the fact that many people often search on their own name, or those of their friends and family, to see what information is available about them on the net. Combine these ego searches with porn queries and you have a serious embarrassment. Combine them with “buy ecstasy” and you have evidence of a crime. Combine it with an address, social security number, etc., and you have an identity theft waiting to happen. The possibilities are endless.
The Paradigm Shift blog notes an instance of an AOL user who appears to be plotting to kill his wife (though there are, of course, possible innocent explanations). Commenters note that over 100 users used search terms which included references to child porn. There is no doubt that this will be used to argue for greater release of data to the government with fewer safeguards against misuse; commenters have already made the claim that "if you don’t do anything wrong, then you have nothing to be afraid of - even if people can view your search history." Commenter Robert follows up with a good response:
Do you ever search for your SSN#, phone number and/or name on line to see if it was posted without your consent? Do you ever worry your day care provider might be a child molester so you search for child molestation and the care takers name or their business name? Do you ever want to find ways to explain sex to your teen age daughter? Gee I wonder what those search terms might look like? Are you famous? Imagine if you type in the name of restaurant you want to go to and the word paparazzi to see if they are known to hang there. Let’s hope they do not see that? Oh, do you have a rare disease or maybe you are pregnant and are looking for clinic in your area so you type in your zip code? In a rural areas that might leave oh 1-30 people it could be? Oh, maybe you think your son is gay? I wonder what you would search for then? Do you have any fetishes or other unusual hobby that might be embarrassing for people to know about but is not illegal. Remember that rural issue again? Getting it yet, because I could go on and on. This is an personal invasion at its most basic level. Not only does it expose personal details of peoples lives, but it is open to wild misinterpretations. Take the wife killing search. Has anyone thought they were simply looking for news they had heard of on the topic, looking for a good book they had heard about with that topic whose title they could not remember, were a wife worried their husband was thinking about this, or maybe that it was exactly what they were looking for but it was only a private fantasy that let them cool off one day after an angry argument? Without context any term can seem scandalous or even criminal. Finally, there is the greater issue. When you start taking away more and more privacy. Each time you chip away at the greater fundamental concept that you deserve this right at all.
Releasing this data to the general public was sheer idiocy on AOL's part (and apparently a mistake), and demonstrates that an AOL account is not a good idea even when it's free.

The data has been downloaded hundreds of times and is now being redistributed on other websites.

UPDATE August 8, 2006: AOL has admitted and apologized for its mistake. An article gives some more examples of the kind of information that can be gleaned from the search records.