FAQ: AOL's Search Gaffe and You
Ryan Singel

Word spread last week that researchers at AOL had released three months' worth of search logs that contained nearly 20 million search histories detailing the online lives of 658,000 customers.

The data included information on subscribers who used AOL's browser, but not those who had used AOL's portal. AOL user IDs were replaced with pseudonymous numbers and the data was organized by a user's search history. The data set included the time and date of a search, the search terms and the result, if any, clicked on.

AOL has apologized and taken down the data, but it is now widely available on the internet and some have set up search engines that query the records.

For those worried about what companies or federal investigators might do with such records in the future, here's a primer on how search logs work and how to avoid being writ large within them.

Why did AOL release the records?
AOL's research arm released the records in order to help academic search researchers. Researchers use such records -- known as a corpus -- to test new search methods and tweaks.

I'm an AOL user. Did AOL release my search terms?
So far AOL has not contacted its users to let them know whether they were among those affected. You can try contacting AOL customer service at 1-703-265-1000, or consult the page of tips published by the Electronic Frontier Foundation.

AOL says it anonymized the data by replacing the AOL user ID with a randomized number. Is it possible for someone to figure out who I am just from my searches?
Possibly. Reporters for The New York Times tracked down a Georgia woman based solely on a review of the AOL logs. Wired News was also able to determine the identity of one 14-year-old from his queries and knows of one woman who was identified by an outside party and notified that she had sensitive financial data revealed by the logs.

Why do search engines save logs of search terms?
Search companies use logs and data-mining techniques to tune their engines and deliver focused advertising, as well as to create cool features such as Google Zeitgeist. They also use them to help with local searches and return more relevant, personalized search results.

How does a search engine tie a search to a user?
If you have never logged in to a search engine's site, or a sister service like Google's Gmail offering, the company probably doesn't know your name. But it connects your searches through a cookie, which has a unique identifying number. Using its cookies, Google will remember all searches from your browser. It might also link searches by a user's internet protocol address.

How long do cookies last?
It varies, but 30 years is about average. AOL drops a cookie in your browser that will expire in 2034. Yahoo used to set a six-month cookie but now its tracker expires in 2037. A new cookie from Google expires in 2036.

What if you sign in to a service?
If you sign in on AOL's, Google's or Yahoo's personalized homepage, the companies can then correlate your search history with any other information, such as your name, that you give them. If you use their e-mail or calendar offerings, the companies can tie your searches to your correspondence and life activities. Together these can provide a more complete understanding of your life than many of your friends or family members have.

Why should anyone worry about this leak or bother to disguise their search history?
Some people simply don't like the idea of their search history being tied to their personal lives.

Some people check to see if their Social Security or credit card numbers are on the internet by searching for them. Ironically, for more than a few AOL users, the leak of the search terms means that this sensitive information is now on the web.
Others don't know what the information could be used for, but worry that the search companies could find surprising uses for that data that might invade privacy in the future.

The government could also use its recently broadened subpoena power to get trillions of records, and any evidence of any crime could be used against you, even if the reason for the original request was to fight terrorism. For example, if you use Google's Gmail and web-optimizing software, the company could correlate everyone you've e-mailed, all the websites you've visited after a search and even all the words you misspell in queries.

Search histories could also be subpoenaed and used as evidence in a civil court proceeding such as a divorce or business dispute.

What's the first thing people who worry about their search history should do?
Cookie management helps. Those who want to avoid a permanent record should delete their cookies at least once a week. Other options might be to obliterate certain cookies when a browser is closed and avoid logging in to other services, such as web mail, offered by a search engine.

Does cookie management guarantee that a search engine can't string together a search history?
No. If you destroy your cookie and then quickly get a new one, a search engine that logs your IP address could easily connect your old searches with your new ones. While many broadband subscribers have dynamic IP addresses that are subject to change, these dynamic addresses are much more stable than most people suspect.

How do you manage cookies with your browser?
In Firefox, you can go into the privacy preference dialog and open Cookies. From there you can remove your search engine cookies and click the box that says: "Don't allow sites that set removed cookies to set future cookies." For even better control -- such as being able to keep certain cookies and automatically throw away others when you close your browser -- try the CookieCuller plug-in.
In Safari, try the free and versatile PithHelmet plug-in. You can let some cookies in temporarily, decide that some can last longer, or prohibit some sites, including third-party advertisers, from setting cookies at all.

While Internet Explorer's tools are not quite as flexible, you can manage your cookies through the Tools menu.

Has the government ever requested such records before?
Yes. One attempt was made public last fall when Google fought a subpoena from the Justice Department, which asked for similar records from AOL, MSN, Yahoo and Google. The feds wanted the records to help defend an ongoing court challenge to the Child Online Protection Act. Google largely won that battle, but Yahoo, MSN and AOL all turned over records to the government.

The government may have also asked for large quantities of search records as part of antiterrorism efforts, but those subpoenas and warrants typically come with gag orders that would prevent the search engines from publicly discussing them.

Have search histories ever been used to prosecute someone?
Robert Petrick was convicted in November 2005 of murdering his wife, in part based on evidence that he had Googled the words "neck," "snap" and "break." But police obtained his search history from an examination of his computer, not from Google.

Can I see mine?
Usually, no. But if you want to trace your own Google search history and see trends, and you don't mind if the company uses the information to personalize search results, you can sign up for Google's beta Personalized Search service.

Could search histories be used in civil cases?
Certainly. Google may well be fighting the government simply on principle -- or, as court papers suggest, to keep outsiders from using the company's proprietary database for free. But a business case can also be made that if users knew the company regularly turned over its records wholesale to the government, they might curtail use of the site.
A related question is whether Google or any other search engine would fight a subpoena from a divorce attorney, or protest a more focused subpoena from local police who want information on someone they say is making methamphetamine.

What if I want more anonymity than simply deleting my cookie when I'm searching?
If you are doing any search you wouldn't print on a T-shirt, consider using Tor. A formerly EFF-sponsored service, Tor helps anonymize your web traffic by bouncing it between volunteer servers. It masks the origin of your traffic and makes it easier to evade filters such as those installed by schools or repressive regimes.

The service has its drawbacks. While it can be very useful for a journalist in China, data services can be slower or have greater latency because of the extra stops the data makes and a general dearth of servers.

Is Tor perfectly anonymous?
No. Computers leak data. Tor, combined with the Privoxy proxy server (which comes bundled with Tor), reduces some of that leakage, but still isn't foolproof. But when used with Firefox, Tor and Privoxy can provide a mostly anonymous web-browsing experience.

Are there other options?
Anonymizer offers a limited free browsing service and sells software, both of which are supposed to protect your anonymity but have suffered serious performance issues.

There are other proxy servers on the internet, but you have to judge for yourself whether you trust them, and some websites actively block anonymous browsing.

Maxthon Browser, a tabbed browser built on top of Internet Explorer, has a fairly easy way of surfing through a proxy connection, which can be useful for masking your IP address and getting around censorware, but the privacy conscious still need to monitor cookies.
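
Why do both the cookie and the IP address matter?
The answers above say that deleting a cookie does not help if the search engine also logs your IP address, and that a proxy hides your IP address but not your cookies. The Python sketch below is only an illustration of that reasoning, not a description of how any real search engine processes its logs. The log layout (cookie ID, IP address, query), the function name link_histories and the sample entries are all hypothetical, and the IP addresses come from ranges reserved for documentation.

```python
from collections import defaultdict


def link_histories(log):
    """Group queries whose log entries share a cookie ID or an IP address.

    `log` is a list of (cookie_id, ip_address, query) tuples. This layout is
    an assumption made for illustration, not a real search-log format.
    """
    parent = {}

    def find(key):
        # Follow parent links to the group's representative (with path halving).
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]
            key = parent[key]
        return key

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Every entry ties its cookie to its IP address, so chains of shared
    # cookies and shared addresses end up in one group.
    for cookie_id, ip_address, _query in log:
        union(("cookie", cookie_id), ("ip", ip_address))

    groups = defaultdict(list)
    for cookie_id, _ip_address, query in log:
        groups[find(("cookie", cookie_id))].append(query)
    return list(groups.values())


# Hypothetical sample log.
sample_log = [
    ("cookie-A", "203.0.113.7", "first query"),
    ("cookie-A", "203.0.113.7", "second query"),
    ("cookie-B", "203.0.113.7", "third query"),    # cookie deleted, same IP
    ("cookie-B", "198.51.100.9", "fourth query"),  # same cookie, new IP via proxy
    ("cookie-C", "192.0.2.44", "unrelated query"),
]

for history in link_histories(sample_log):
    print(history)
```

In this sample, the first four queries end up in a single history even though the cookie changed once and the IP address changed once, because each change left the other identifier in place. Only the entry that shares neither a cookie nor an address with the rest stays separate.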