A few miles away from me there lives a famously eccentric rich kid named John Hormel, heir to the family that got rich making canned meat for the Armed Forces, a product eventually sold commercially as "Spam". He is reportedly very upset with how this term is being used these days.
According to the 1991 edition of The New Hacker's Dictionary, the term "spam" was originally used by computer geeks to mean "to crash a program by overrunning a fixed-size buffer with excessively large input data". The joke was the comparison to a can of Spam overstuffed with processed meat. Somewhere along the line the term came to refer to excessive amounts of junk e-mail, a problem getting worse every day.
The first documented use of e-mail for advertising occurred in 1978, when Digital Equipment Corporation (DEC) sent an invitation to a product demo to hundreds of users of the Arpanet (a precursor to the Internet). The Arpanet community was outraged by this use of e-mail.
The e-mail protocol that allowed Digital to do this is virtually identical to the one used today, and that is part of the problem.
Five years ago, I thought junk e-mail wasn't so bad because "with two clicks of the mouse it was gone". But the problem has now gotten so bad that e-mail is almost not worth using anymore. Scanning e-mails by hand has become too time-consuming, and I have had to spend money on McAfee SpamKiller to help sort the good from the bad.
Anatomy of a Spam
Here is the source of a typical spam. I got this one just this morning.
Received: from [184.108.40.206] (helo=mx2.mail.yahoo.com) by efwd.dnsix.com with smtp (Exim 3.36 #2) id 18qlP4-0003H4-00 for firstname.lastname@example.org; Wed, 05 Mar 2003 18:46:38 -0800
Content-Type: multipart/mixed; boundary="_separator_"
Subject: 100% bigger
From: janny <email@example.com>
Date: Wed, 05 Mar 2003 18:46:38 -0800
This is a multi-part message in MIME format.
I've color-coded portions for explanation:
Red (the From: address) - Don't even bother responding; this address either does not exist or belongs to an innocent party.
Green (the Received: line) - This is supposed to tell you where the e-mail originated. The highlighted address is the real origin, but a traceroute back to this address resulted in the error "can't find 220.127.116.11 - non-existent domain". A traceroute to efwd.dnsix.com circled the globe twice before reporting "Destination Unreachable". Spammers often relay their e-mails through an unsuspecting e-mail server without its owner's knowledge.
Grey (the "for" address) - My e-mail address. Since multiple To: and Cc: addresses are a sure sign of spam, most spam is now sent one address at a time by dedicated computers.
Blue (the Content-Type: line) - Multipart e-mails are commonly used so that a text version and an HTML version of the same message can be sent to a receiver who may not be capable of displaying HTML (strictly speaking that is the job of multipart/alternative; multipart/mixed, used here, is meant for attachments). Since the second part is usually just an HTML version of the plain text in the first part, many spam elimination programs, including the one I use, only look at the first part. Here the relevant text is hidden in the second part in order to duck scanning programs.
Purple - An increasingly popular trick is to Base64-encode the text. Base64 is a MIME transfer encoding intended for sending binary attachments through text-only mail systems, but because the encoded text looks like garbled letters to e-mail sorters, it is another way to avoid scanning.
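The last two tricks only work against a scanner that stops at the first part or cannot decode Base64. As a minimal sketch of the alternative, here is how a scanner could walk every part of a message and decode it before looking for key words, using Python's standard email package. The message below is a made-up stand-in modeled on the headers above (the sender address and body text are assumptions, not the original spam).

```python
import base64
from email import message_from_string

# Hypothetical spam text, Base64-encoded the way the sample hides its pitch.
HIDDEN = base64.b64encode(b"<b>100% guaranteed!</b>").decode("ascii")

# Hypothetical message modeled on the headers shown above.
RAW = (
    "From: janny <sender@example.invalid>\n"
    "Subject: 100% bigger\n"
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="_separator_"\n'
    "\n"
    "This is a multi-part message in MIME format.\n"
    "--_separator_\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello\n"
    "--_separator_\n"
    "Content-Type: text/html\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    + HIDDEN + "\n"
    "--_separator_--\n"
)


def scannable_text(raw):
    """Return the decoded text of every text/* part, not just the first one."""
    message = message_from_string(raw)
    chunks = []
    for part in message.walk():
        # The multipart container itself has no text of its own; skip it.
        if part.get_content_maintype() != "text":
            continue
        # decode=True undoes Base64 or quoted-printable transfer encoding.
        data = part.get_payload(decode=True)
        charset = part.get_content_charset() or "latin-1"
        chunks.append(data.decode(charset, errors="replace"))
    return "\n".join(chunks)


if __name__ == "__main__":
    print(scannable_text(RAW))
```

A key-word filter run over the result of scannable_text() sees the hidden second part as ordinary text, so neither the multipart trick nor the Base64 trick hides anything from it.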
A Review of SpamKiller
For the better part of a year, I've been using McAfee SpamKiller to do most of the dirty work. There are filters built into all major e-mail clients, but setting up all the filters needed to kill the majority of the spam would be difficult and time-consuming. SpamKiller comes preprogrammed to get rid of the majority of spam that comes to your mailbox. It does this by searching for key words in the subject, in the sender's address, and in the message itself. It sends the good mail to Outlook Express, while the bad is stored separately for either further review or automatic deletion.
While it has a constantly updated list of filters, I found this program unable to detect the worst spam using the built-in filters. As pointed out above, it does not handle multipart e-mails very well, nor can it scan Base64-encoded e-mails. I created custom filters to get rid of these two cases (If header content-transfer-encoding contains "Base64" then kill, and If header content-type contains "Multipart" then kill). They have helped quite a bit, but occasionally legitimate messages are multipart too.
I also created a filter to catch the W32.Klez e-mail virus, which is still spreading persistently (If message text contains "TVqQAAMAAAAEAAAA//" then kill). Since then my ISP has started using Brightmail to scan for viruses in attachments, so now I just kill messages containing "This message has been processed by Brightmail(TM)".
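The odd-looking string in the Klez filter is not random. As far as I can tell, it is simply what the first bytes of a typical Windows executable look like after Base64 encoding. The sketch below checks that, assuming the common "MZ" header bytes found at the start of most Windows program files (the exact bytes are an assumption about a typical file, not something taken from Klez itself).

```python
import base64

# The filter string from the text above.
SIGNATURE = "TVqQAAMAAAAEAAAA//"

# Assumed first 14 bytes of a typical Windows executable ("MZ" header).
MZ_HEADER = bytes.fromhex("4d5a90000300000004000000ffff")


def encoded_header():
    """Base64 text that a mail client would produce for those header bytes."""
    return base64.b64encode(MZ_HEADER).decode("ascii")


def looks_like_attached_program(message_text):
    """True if the raw message text contains the Base64-encoded header."""
    return SIGNATURE in message_text


if __name__ == "__main__":
    print(encoded_header())                        # TVqQAAMAAAAEAAAA//8=
    print(encoded_header().startswith(SIGNATURE))  # True
```

If that reading is right, the rule catches most Windows programs sent as Base64 attachments, not only Klez, which for most home users is probably no great loss.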
As it stands, the vast majority of the junk e-mail is caught, but at the cost of some legitimate mail. I have also been forced to set Outlook Express to preview all e-mail in plain text as a precaution. Everyone should do this, especially if you have kids in the house, to avoid displaying pornographic pictures embedded in HTML-based e-mail.
There are other commercial spam filters that work better. One is called MailGuardian, a service that checks the originating address of every e-mail you receive and flags those from known spammers with a change in the subject line. You can then use filtering software to get rid of the spam as you see fit. Reports are that it is very reliable, but the service costs $30 a year, whereas SpamKiller is a one-time fee.
What all this adds up to is that junk e-mail is ruining e-mail as a good form of communication. Can we pass laws to prevent this? We already have, and guess what, the spammers don't care!
A Statistical Solution!
The problem with e-mail filtering software is that it is really easy for it to get things wrong. If, for example, you were to e-mail this page to a friend, and I happen to have typed the word "Viagra" somewhere on it, your message might get picked up as spam.
I have seen many spams purposely misspell key words or write them like "s-ex". Phrases like "Unsubscribe" are also disappearing from messages. So the spammers slip through, while a personal e-mail that innocently uses a key spam word might get tagged as spam.
There is a better way, using statistical analysis techniques from the 18th century. This column explains it better than I can.
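The 18th-century technique is presumably Bayes' theorem. As a rough sketch of the idea (my own toy version, not the method from the column), the filter below counts how often each word appears in mail you have marked as spam and in mail you have marked as good, then combines the evidence from every word in a new message. The class name, the training messages, and the smoothing choice are all assumptions for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class BayesFilter:
    """Toy naive Bayes filter. Train on at least one spam and one good message."""

    def __init__(self):
        self.words = {"spam": Counter(), "good": Counter()}
        self.messages = {"spam": 0, "good": 0}

    def train(self, text, label):
        self.words[label].update(tokenize(text))
        self.messages[label] += 1

    def spam_probability(self, text):
        vocabulary = set(self.words["spam"]) | set(self.words["good"])
        spam_total = sum(self.words["spam"].values()) + len(vocabulary)
        good_total = sum(self.words["good"].values()) + len(vocabulary)
        # Start from how common spam is overall, then add evidence per word.
        log_odds = math.log(self.messages["spam"] / self.messages["good"])
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # never seen in training: no evidence either way
            # Add-one smoothing keeps a single unseen word from deciding alone.
            p_spam = (self.words["spam"][word] + 1) / spam_total
            p_good = (self.words["good"][word] + 1) / good_total
            log_odds += math.log(p_spam / p_good)
        # Convert log-odds to a probability without overflowing.
        if log_odds >= 0:
            return 1.0 / (1.0 + math.exp(-log_odds))
        return math.exp(log_odds) / (1.0 + math.exp(log_odds))


def demo_filter():
    """Build a filter from four made-up training messages."""
    f = BayesFilter()
    f.train("free viagra click here", "spam")
    f.train("100% guaranteed free offer", "spam")
    f.train("here is the page about spam and viagra filters", "good")
    f.train("lunch on friday at noon", "good")
    return f


if __name__ == "__main__":
    f = demo_filter()
    print(f.spam_probability("free offer, click here"))
    print(f.spam_probability("did you read the page about viagra filters"))
```

The point of the second test message is that the word "viagra" alone does not condemn it: the surrounding words look like my normal mail, so the overall score stays low. A real filter would need far more training mail than four messages.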
The Final Solution!
The ultimate solution may require a radical change in e-mail. Right now e-mail files are sent to your e-mail server for you to download. What if instead the sender's e-mail server held the mail to be sent and only sent its address to the recipient's e-mail client? Then when you check your e-mail, all you get is the addresses where your e-mail is stored, and your e-mail is retrieved from those addresses.
The advantage is that the sender cannot be disguised, because to retrieve the e-mail we need to know the sender's real address. Known spammers' addresses can then easily be blocked.
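To make the idea concrete, here is a small simulation of such a "pull" scheme. It is only a sketch of the proposal above, not an existing protocol: the class names, the server addresses, and the dictionary standing in for the network are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notice:
    """What the recipient receives: where the mail is held, not the mail itself."""
    server: str
    message_id: str


class SendingServer:
    """Hypothetical server that holds outgoing mail until it is fetched."""

    def __init__(self, address):
        self.address = address
        self._held = {}

    def hold(self, message_id, body):
        self._held[message_id] = body
        return Notice(self.address, message_id)

    def retrieve(self, message_id):
        return self._held.get(message_id)


class Client:
    """Hypothetical mail client that fetches mail from the addresses it is given."""

    def __init__(self, network, blocked=()):
        self.network = network      # stand-in for the Internet: address -> server
        self.blocked = set(blocked)
        self.notices = []

    def notify(self, notice):
        self.notices.append(notice)

    def check_mail(self):
        delivered = []
        for notice in self.notices:
            if notice.server in self.blocked:
                continue            # known spammer: never even fetched
            server = self.network.get(notice.server)
            if server is None:
                continue            # forged address: nothing there to fetch
            body = server.retrieve(notice.message_id)
            if body is not None:
                delivered.append(body)
        self.notices.clear()
        return delivered


def demo_inbox():
    friend = SendingServer("mail.friend.example")
    spammer = SendingServer("mail.spammer.example")
    network = {friend.address: friend, spammer.address: spammer}
    me = Client(network, blocked={"mail.spammer.example"})
    me.notify(friend.hold("1", "Lunch on Friday?"))
    me.notify(spammer.hold("2", "100% bigger"))
    me.notify(Notice("mail.forged.example", "3"))  # disguised sender
    return me.check_mail()
```

In this toy run only the friend's message comes back from demo_inbox(): the spammer's notice is dropped because the address is blocked, and the forged notice leads nowhere.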
The problem is that the whole e-mail protocol would have to change at the same time. Every e-mail server would have to be updated, and every e-mail program would have to be updated as well. That's over a billion updates worldwide!
The thing is, the bigger a problem junk e-mail becomes, the more likely it is that a radical change like this will happen.
Any Program Can Kill Spam
As a public service, here are the filters I use with the most hits for junk e-mail. Set up your e-mail client to detect these and a big chunk of your e-mail problem will be alleviated:
Sender name starts with a number
Message contains "TVqQAAMAAAAEAAAA//" (Klez virus)
Sender name is mostly numbers
Message contains "unsubscribe"
Message contains "remove in the subject" or "subject=remove" or "remove.htm"
Subject is all uppercase
Sender name is blank
Subject contains "adv"
Subject contains "free"
Message contains "100% free" or "100% guaranteed" or "money back guarantee" or "no obligation"
Message contains "1-800" or "1-888" or "1-877"
Subject contains "viagra" or "penis"
Message "To:" does not contain "@", is missing or is blank
Message contains "debt free" or "work from home"
Message contains "feedback form"
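If your e-mail client cannot express some of these rules, they are easy to write out by hand. The sketch below implements most of the list as plain Python checks. The function name and the way a message is passed in (sender name, To: line, subject, and body as separate strings) are assumptions for illustration, not how any particular mail program works.

```python
KLEZ_SIGNATURE = "TVqQAAMAAAAEAAAA//"  # case-sensitive, so checked separately

SUBJECT_WORDS = ("adv", "free", "viagra", "penis")

BODY_PHRASES = (
    "unsubscribe", "remove in the subject", "subject=remove", "remove.htm",
    "100% free", "100% guaranteed", "money back guarantee", "no obligation",
    "1-800", "1-888", "1-877", "debt free", "work from home", "feedback form",
)


def matched_rules(sender_name, to, subject, body):
    """Return a description of every rule from the list that the message trips."""
    hits = []
    name = sender_name.strip()
    if not name:
        hits.append("sender name is blank")
    elif name[0].isdigit():
        hits.append("sender name starts with a number")
    if name and sum(ch.isdigit() for ch in name) > len(name) / 2:
        hits.append("sender name is mostly numbers")
    if "@" not in to:
        hits.append('"To:" is missing, blank, or has no "@"')
    if subject.isupper():
        hits.append("subject is all uppercase")
    low_subject = subject.lower()
    for word in SUBJECT_WORDS:
        if word in low_subject:
            hits.append('subject contains "%s"' % word)
    if KLEZ_SIGNATURE in body:
        hits.append("message contains the Klez signature")
    low_body = body.lower()
    for phrase in BODY_PHRASES:
        if phrase in low_body:
            hits.append('message contains "%s"' % phrase)
    return hits


if __name__ == "__main__":
    print(matched_rules("4you2", "", "FREE OFFER",
                        "Click to unsubscribe. 100% guaranteed!"))
    print(matched_rules("Alice", "bob@example.invalid", "Lunch on Friday",
                        "See you at noon."))
```

Because these are plain substring matches, they share the weakness described earlier: a subject containing "advice" or "freedom" trips the "adv" and "free" rules, so it is safer to move matches to a review folder than to delete them outright.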