Monday, September 22, 2008

Google takes more baby steps in online privacy game

Through a series of blog posts, topped off by two announcements this week about retaining user-related data, Google is inching toward alleviating the concerns of many users and privacy groups, trumpeting each step along the way.

In one blog post on Monday, a Google legal and engineering team announced a "significant" shortening of Google's IP retention policies, involving plans to start anonymizing users' addresses on servers after nine months instead of the previous 18 months.

Then in another entry on the same day, Google's senior VP of operations, Urs Hölzle, said Google will soon start anonymizing IP addresses, as well as other information collected from Google Suggest users, within 24 hours of each search.

Used in Google's new Chrome browser as well as in the Firefox Web browser, Google Search, and Google Toolbar, Google Suggest "guesses what you're typing and offers suggestions in real time," Hölzle noted.

"To provide its recommendations, Google Suggest needs to know what you've already typed, so these partial queries are sent to Google. For 98% of these requests, we don't log any data at all and simply return the suggestions. For the remaining 2% of cases (which we select randomly), we do log data, like IP addresses, in order to monitor and improve the service," he wrote.

Now, though, in a change expected to be deployed within a month, Google will start to anonymize the logged data within about 24 hours "in the 2% of Google Suggest requests we use," according to the senior VP.
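For readers curious how such a scheme could work in practice, here is a minimal Python sketch of the two steps Hölzle describes: logging a random 2% of requests, then anonymizing what was logged once it is 24 hours old. Google has not said how it anonymizes Google Suggest data, so zeroing the last octet of an IPv4 address is purely an assumption made for illustration, and every function and field name below is hypothetical.

```python
import ipaddress
import random
from datetime import datetime, timedelta

SAMPLE_RATE = 0.02                     # share of requests logged, per the blog post
ANONYMIZE_AFTER = timedelta(hours=24)  # age at which logged data is anonymized


def should_log(rng=random):
    """Randomly select roughly 2% of requests for logging."""
    return rng.random() < SAMPLE_RATE


def anonymize_ipv4(ip):
    """Zero the last octet of an IPv4 address (assumed technique, not Google's)."""
    network = ipaddress.ip_network(f"{ip}/24", strict=False)
    return str(network.network_address)


def anonymize_old_entries(entries, now):
    """Return log entries, with IPs anonymized on those at least 24 hours old."""
    result = []
    for entry in entries:
        if now - entry["logged_at"] >= ANONYMIZE_AFTER:
            # Copy the entry so the caller's original record is left untouched.
            entry = {**entry, "ip": anonymize_ipv4(entry["ip"])}
        result.append(entry)
    return result
```

In this sketch, a request is logged only when `should_log()` returns true, and a periodic job would pass the stored entries through `anonymize_old_entries()` along with the current time.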

For Google searches themselves, however, IP addresses logged on servers will now be anonymized after nine months, as opposed to 24 hours -- but this is down from a previous level of 18 months. Before moving to the 18-month retention period, Google kept user data from searches on hand for an indefinite length of time.

"It was a difficult decision [to anonymize IP addresses after nine months] because the routine server log data we collect has always been a critical ingredient of innovation," according to a blog post this week co-authored by three Google staffers: Peter Fleischer, global privacy counsel; Jane Horvath, senior privacy counsel; and Alma Whitten, software engineer.

What's the reason for this confusing discrepancy in retention times between Google Suggest and Google searches? "In the case of Google Suggest we decided it's possible to provide a great service while anonymizing data almost immediately. But in other cases -- such as our core Web search -- storing data like IP addresses for a time is crucial to make improvements to search quality, improve security, fight fraud and reduce spam," Hölzle maintained.

On the whole, Google seems to have grown more transparent in recent months about the reasons behind modifications to its privacy policies. Fleischer has been quoted elsewhere as admitting that Google only shortened retention to 18 months after having been pressured by regulators in the European Union.

"[But] some in the community of EU data protection regulators continued to be skeptical of the legitimacy of logs retention and demanded detailed justifications for this retention. Many of these privacy leaders also highlighted the risks of litigants using court-ordered discovery to gain access to logs, as in the recent Viacom suit," Google's legal and engineering team wrote this week.

In the Viacom suit, the media and entertainment empire gained access to YouTube's logs after alleging that YouTube and its owner Google committed copyright infringement by allowing unauthorized viewing of movie clips and sports highlights. The move resulted in a boycott of Viacom and angry postings by some YouTube users.

Now, though, Google has agreed to anonymize the IP addresses logged from Google Search after only nine months, because, "after months of work our engineers developed methods for preserving more of the data's utility while also anonymizing IP addresses sooner," according to the blog post from the legal/engineering team.

"We haven't sorted out all of the implementation details, and we may not be able to use precisely the same methods for anonymizing as we do after 18 months, but we are committed to making it work," they said.

In contrast, Marissa Mayer, Google's VP for search products and user experience, doesn't appear to have been as candid in July in her announcement of the addition of a link from Google's home page to Google's privacy policies.

"Today we're making a home page change by adding a link to our privacy overview and policies. Google values our users' privacy first and foremost. Trust is the basis of everything we do, so we want you to be familiar and comfortable with the integrity and care we give your personal data," Mayer wrote.

Somehow, Mayer didn't mention that a coalition of 14 privacy groups had written a letter to Google CEO Eric Schmidt about a month before, demanding that a link to the privacy policy be added to Google's home page, and charging that, without it, Google was in violation of the California Online Privacy Protection Act.
