2004-05-20

Gmail Issues Explained

I'd like to take a moment to discuss some of the issues surrounding Google's giant new e-mail service, Gmail, which has drawn so much controversy over the last month and a half.

As I see it, there are two main issues still on a lot of people's minds: Gmail's policy of scanning e-mail content to choose advertisements, and its retention of e-mails after users delete them. In this post, I hope to show that both issues are misunderstood, and that Gmail is really no more harmful than any similar webmail service on the Internet.

I'll start with the deletion policy. Many people are worried that Google intends to keep every e-mail you send and receive indefinitely, even if you select the e-mail and delete it yourself. This is because Gmail's privacy policy says that e-mails may remain on Google's servers after the user deletes them. However, as was later revealed, this was simply a disclaimer referring to their backup system.

Because we keep back-up copies of data for the purposes of recovery from errors or system failure, residual copies of email may remain on our systems for some time, even after you have deleted messages from your mailbox or after the termination of your account.

[Source]

They back up the data on their servers regularly, and all Google is saying is that when you delete an e-mail, the backup copies of that e-mail will remain until the next backup is made. Cofounder Sergey Brin has said plainly that the backups will eventually be deleted:

Steve Gillmor: Is it possible to delete messages, or does everything continue to reside in AllMail?

Sergey Brin: Oh, no, no, that was just poor wording on our part. It's just that we make a variety of backups, and we can't guarantee instantaneous deletion. Stuff that's on tapes, and those are offline—we eventually delete it, but we can't guarantee an instantaneous deletion.

[Source]

Most similar Internet services keep regular backups of users' data. If they didn't, there would be a constant risk of losing your e-mails in something like a hard drive crash (which does happen from time to time). Gmail is no different. And because backups are usually made at regular intervals, not every time you receive or delete an e-mail, at any moment the backups might reflect the contents of your account as they were a day, a week, or even a month ago. So when you delete an e-mail, it doesn't disappear from the backups immediately. You just have to wait for the older backups to be replaced by newer ones, after which the deleted e-mail is gone from their servers completely.
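To make the timing concrete, here is a minimal sketch in Python of a mailbox with periodic backups. It is only an illustration: the `Mailbox` class and its method names are made up for this post, and it assumes the simplest possible scheme, where each backup overwrites the previous one. Google has not published how its backups actually work, and a real system likely keeps several generations on tape.

```python
class Mailbox:
    """Toy model: one live copy of the mail plus one backup snapshot."""

    def __init__(self):
        self.live = set()    # message ids the user can currently see
        self.backup = set()  # message ids in the most recent snapshot

    def receive(self, message_id):
        self.live.add(message_id)

    def delete(self, message_id):
        # Deleting only touches the live copy; the snapshot is untouched.
        self.live.discard(message_id)

    def run_backup(self):
        # Assumption: each backup replaces the previous snapshot entirely.
        self.backup = set(self.live)

    def copy_exists(self, message_id):
        return message_id in self.live or message_id in self.backup


box = Mailbox()
box.receive("msg-1")
box.run_backup()                                 # snapshot now contains msg-1
box.delete("msg-1")                              # user deletes the e-mail
after_delete = box.copy_exists("msg-1")          # residual copy in the backup
box.run_backup()                                 # next scheduled backup
after_next_backup = box.copy_exists("msg-1")     # no copy left anywhere
```

In this toy model the deleted message survives only for the gap between the deletion and the next backup run, which is the "residual copy" the privacy policy is describing.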

Now, an issue that has been given much more attention is the fact that Gmail scans the content of your incoming e-mails (as well as the e-mails in your Sent folder) in order to place relevant advertisements on the side of the page. People immediately became concerned that Gmail was reading their e-mails, as if it were learning their secrets and tracking personal messages. However, once you understand how Gmail delivers these advertisements, you can see that these privacy concerns are not serious, and that there is no evidence that Gmail stores any form of personal profile generated from the content of a user's e-mails.

Essentially, in terms of what it does to your message and the database functions it performs, it's no different from your typical spam filter or even a spell checker. For the sponsored links, Google basically has a database of keywords and the websites that correspond to those keywords. Gmail does a quick scan through your e-mail content, ignoring everything but words that match one of the keywords in the database. If it finds one, it just grabs the website information that corresponds to it and prints that information onto the page. It also does a few other minor things, like matching text in the e-mail against text from your other e-mails in order to determine whether it's part of a conversation (basically, if a group of text at the beginning or end of an e-mail matches exactly with that of another e-mail, the two are grouped together). There's no evidence that Gmail does any form of profile building during this process. For the most part, it's just simple text matching. Saying that Gmail's advertisement system is a privacy concern is like saying that any kind of display formatting (which, to some extent, appears in all webmail systems) is a privacy concern.
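For the curious, the kind of text matching described above can be sketched in a few lines of Python. This is a guess at the general idea, not Google's code: the `AD_INDEX` table, the `example.com` listings, and both function names are invented for illustration, and the conversation check is simplified to comparing the first and last non-empty lines of two messages.

```python
import re

# Hypothetical keyword database: keyword -> sponsored listing.
AD_INDEX = {
    "flights": "Example Travel - www.example.com/travel",
    "camera": "Example Cameras - www.example.com/cameras",
}


def sponsored_links(body, index=AD_INDEX):
    """Return listings for keywords found in the message, in order of
    first appearance. Every other word is ignored and nothing is saved."""
    links = []
    for word in re.findall(r"[a-z']+", body.lower()):
        listing = index.get(word)
        if listing is not None and listing not in links:
            links.append(listing)
    return links


def same_conversation(a, b):
    """Simplified grouping rule: two messages belong together if their
    first or last non-empty lines match exactly."""
    lines_a = [line.strip() for line in a.splitlines() if line.strip()]
    lines_b = [line.strip() for line in b.splitlines() if line.strip()]
    if not lines_a or not lines_b:
        return False
    return lines_a[0] == lines_b[0] or lines_a[-1] == lines_b[-1]


links = sponsored_links("Found cheap flights! Bring your camera, the flights are early.")
grouped = same_conversation("Lunch on Friday?\n\nSure, see you there.",
                            "Lunch on Friday?\n\nCan we do Thursday?")
```

Notice that neither function keeps anything between calls. The lookup takes a message, returns a list of links to display, and forgets the message, which is the point: matching keywords does not by itself require building a profile of the user.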

It's true that if Google wanted to, they could take the content of the e-mails and actually build a rough profile of the user with it. However, that's totally unrelated to the advertising system, and any service that deals with private communication has the ability to do the exact same thing. Generally, companies don't, and there's no reason to suspect that Google in particular is doing it.

1 comment

Anonymous

Hey

I dig your coverage on Google. Keep it up. It's i.jesus here. I started the Google group http://groups-beta.google.com/group/g-fan/ (G Fan) and you're my first poster on it, so thanks for checking it out.

I started it 'cause I think Google is bloody interesting. Figure this out. Has there ever been a library in history with information of such integrity, and so much of it? Let alone being a cinch to find.

I'm also interested in working there, but I'm an undergrad, so what's one to do until then? You wouldn't happen to know if they have any scholarships?

If you would like to become a moderator email me from the group.

Thanks

I,Jesus.
