AOL Searcher No. 4417749 is Exposed

Posted by Marshall on August 09, 2006 | Link It

I first suspected that search engine query logs were going to be a hot topic a couple of years ago, when I heard someone from Google speak about how they’re kept and the limited access Google employees have.

In the case of AOL, which is powered by Google, both AOL and Google have a copy of the information about every search that is conducted and who executed it.  I don’t mean to imply that Google or AOL knew who searcher 4417749 was by name - but they knew enough to determine that the searches all came from the same searcher.

Graphic: What Revealing Search Data Reveals

"No. 4417749 conducted hundreds of searches over a three-month period on topics ranging from “numb fingers” to “60 single men” to “dog that urinates on everything.”

And search by search, click by click, the identity of AOL user No. 4417749 became easier to discern. There are queries for “landscapers in Lilburn, Ga,” several people with the last name Arnold and “homes sold in shadow lake subdivision gwinnett county georgia.”

It did not take much investigating to follow that data trail to Thelma Arnold, a 62-year-old widow who lives in Lilburn, Ga., frequently researches her friends’ medical ailments and loves her three dogs. “Those are my searches,” she said, after a reporter read part of the list to her.

So if it can be done in Thelma Arnold’s case - what’s to stop it from being done to you and me?  Nothing really - just the will of someone, or some group, to get hold of the search queries and decode them to tie them back to the individual who made them.
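To make that concrete, here is a rough Python sketch of the kind of decoding I mean. It is only an illustration under stated assumptions: the row layout (an ID plus the query text), the second user, and the list of place terms are all made up for the example, and the queries are just the ones quoted above. The point is that once every query carries the same ID, grouping them and looking for place names takes a few lines.

```python
from collections import defaultdict

# Hypothetical log rows: (pseudonymous user ID, query text). The queries are
# the ones quoted in the article; the row layout is an assumption.
LOG = [
    ("4417749", "numb fingers"),
    ("4417749", "60 single men"),
    ("4417749", "dog that urinates on everything"),
    ("4417749", "landscapers in Lilburn, Ga"),
    ("4417749", "homes sold in shadow lake subdivision gwinnett county georgia"),
    ("0000001", "weather tomorrow"),  # made-up second user for contrast
]

# Hypothetical list of place terms someone might search the log for.
PLACE_TERMS = ("lilburn", "gwinnett", "shadow lake")


def group_by_user(rows):
    """Collect every query under the ID that issued it."""
    history = defaultdict(list)
    for user_id, query in rows:
        history[user_id].append(query)
    return dict(history)


def location_clues(queries, terms=PLACE_TERMS):
    """Return the place terms that appear anywhere in a user's queries."""
    text = " ".join(queries).lower()
    return [term for term in terms if term in text]


histories = group_by_user(LOG)
clues = {user_id: location_clues(qs) for user_id, qs in histories.items()}
```

Here `clues` ends up mapping each ID to the place terms found in its history. No single query gives the searcher away; it is the shared ID that lets the small hints pile up.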

"Several bloggers claimed yesterday to have identified other AOL users by examining data, while others hunted for particularly entertaining or shocking search histories. Some programmers made this easier by setting up Web sites that let people search the database of searches.

John Battelle, the author of the 2005 book “The Search: How Google and Its Rivals Rewrote the Rules of Business and Transformed Our Culture,” said AOL’s misstep, while unfortunate, could have a silver lining if people began to understand just what was at stake. In his book, he says search engines are mining the priceless “database of intentions” formed by the world’s search requests.

“It’s only by these kinds of screw-ups and unintended behind-the-curtain views that we can push this dialogue along,” Mr. Battelle said. “As unhappy as I am to see this data on people leaked, I’m heartened that we will have this conversation as a culture, which is long overdue.”

Companies want to target you by what you’re searching for, and the government wants the information to target crime for the sake of national security.  On the other hand, we have a situation that could not have been foreseen 200+ years ago - where the queries of everyone who searches can be exposed.

I think new ways to protect identity from search engines may be needed.  It’s one thing for Google to say, “We’ll lock up our data, and only the most trusted employees can touch it, and only in certain ways.”  That may work for Google - but look what happened when AOL took its part of that same data (since it’s powered by Google) and published portions of it on the Web, in the name of research.

And since Google powers many search engines, as does Yahoo, portions of search query data are insecure - just waiting for the right buyer, or the right mistake, to make them public.


