Build a spam firewall with Linux
At the risk of stating the obvious, I hate spam. Even one is enough to make my blood pressure rise. I have always believed that the best way to control spam is to not get any, so I have jealously guarded my primary e-mail address given to me by my ISP. I only give it to my friends and business associates, and even then I don't give it to all of them. The ones who like to send those annoying 'Fw:' messages to everybody they've ever met and throw 500 addresses in the To: field for God and everybody to see will never get my real address. (The one exception is that I do allow several Security Focus mailing lists to come to my primary address for convenience.)
For web sites that require registration, mailing lists, and all other uses that require public disclosure of my e-mail, I use a number of accounts on free services like Yahoo, Hotmail, and MSN. Hey, better to clog up their servers than mine, right? :D
Unfortunately, one of my ex-girlfriends had a penchant for sending me lots and lots of cute little web pages that have that "Send this page to a friend!" button. Grrrrrr...... so now it seems like I'm on just about every mailing list in the known universe. For years, I have used e-mail clients like PMMail and Evolution that have powerful filtering capabilities to filter out the small amount of spam I received, but as the amount of spam grew and the filters became more and more complex, it became too burdensome to maintain different sets of filter rules for 2 or 3 different OSes, all using different mail clients and filtering methods. Additionally, I found it inconvenient to be working in Linux and have to reboot to access an e-mail that I retrieved via POP3 with PMMail in Windows.
Then around Christmas, I came into possession of a couple of old P-233 boxes, which have just been sitting around my house for a couple of months now. It seemed a shame to have them just sitting in a closet, and considering my problem, I decided to put one of them to work for me.
I have always known that SpamAssassin was a great tool for filtering spam, but because it's designed primarily for use on a mail server rather than a workstation, it's never been much use to me. And I have always known that IMAP was a very good solution for people that need a central repository for their e-mail so it can be accessed from a variety of locations, but it seemed like a bit of overkill for my needs. But with my spam problem growing by the day, and with a spare box sitting in the closet, it seemed to me that the time was right to combine these two ideas into a comprehensive e-mail solution.
The solution had to meet a few requirements:
* It had to run on Linux. Gentoo is my distribution of choice.
* It had to be able to download my e-mail from several different ISP accounts into one local account for easy checking.
* It had to have hooks for invoking SpamAssassin and other filtering tools. I might want to add antivirus scanning later.
* I had to be able to access my e-mail over an SSL-enabled IMAP connection for security.
* It had to be free.
That sounds pretty simple, but actually working it out turned out to be incredibly tedious and time consuming. Most of the SpamAssassin documentation I found online was geared toward users who are running a full-blown public mail server with SMTP and a registered domain. There was no need for me to run sendmail or postfix because I have no domain, and documentation on using fetchmail with SpamAssassin is very sparse.
I looked at several free IMAP servers, and I finally decided on courier-imap. I rejected imap-uw because of its terrible security record and unfortunate development process, and I rejected cyrus-imap because I never could get it configured properly. It seemed to have some problems authenticating the connection with pwcheck and PAM, so I just skipped it in favor of courier, which had a much simpler configuration anyway. Cyrus was a real bear to configure.
For mail retrieval I settled on getmail instead of the venerable fetchmail. I never could get fetchmail to retrieve my mail: it would query the server and report the number of messages, but it never actually downloaded anything and got stuck on the first message. After a little searching on Google, I discovered that fetchmail has no mail delivery agent (MDA) of its own and, by default, hands each message to an SMTP server on the local machine such as sendmail or postfix to do the actual delivery to the user's mailbox. Getmail has its own MDA and it worked perfectly the first time for me, so I went with it. For filtering, I chose a combination of procmail, SpamAssassin, and Razor.
I will begin from the assumption that you have a working Linux installation connected to your network and that you have already taken basic hardening steps, including a packet filtering script that leaves ports 22 and 143 open on the LAN side (plus 993, the standard port for IMAP over SSL, if you use the SSL service). I started with a clean Gentoo installation, i.e., with only the base system emerged.
The first step is to install all the packages, which on Gentoo is very easy. SpamAssassin is the only package for me that has to be downloaded and compiled by hand, but it requires the perl HTML parser package. Your mileage may vary:
# emerge courier-imap
# emerge getmail
# emerge procmail
# emerge razor
# emerge HTML-Parser
# cd /usr/src
# tar xvfz Mail-SpamAssassin-2.52.tar.gz
# cd Mail-SpamAssassin-2.52
# perl Makefile.PL
# make install
Portage takes care of all dependencies and installs everything. If installation isn't that easy on your system, all I can say is that you're using the wrong distribution. :D
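Once the packages are in, a quick check that the expected commands are on the PATH can save some head-scratching later. The sketch below is only an illustration: missing_tools is a made-up helper, and the command names are the ones these packages normally provide, which may differ on your distribution.

```shell
#!/bin/sh
# missing_tools: print each named command that is not found on the PATH.
# (Hypothetical helper for illustration; not part of any package above.)
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names; adjust the list for your distribution.
missing_tools getmail procmail spamd spamc maildirmake razor-check
```

If nothing is printed, every command in the list was found.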
The next order of business is to get the IMAP server up and running. There's no point in fetching a bunch of mail if there's no place to put it. We have to make one small change to /etc/courier-imap/authdaemonrc. Find the line that begins with authmodulelist and edit it to use only the "authpam" module like so, because we want users to be authenticated by their normal login:
authmodulelist="authpam"
authmodulelistorig="authcustom authcram authuserdb authpam"
Then we make the SSL certificate (courier-imap ships a mkimapdcert script for this), add the services for courier-imapd and courier-imapd-ssl to the default runlevel so they will start at boot, and start the daemons. If you use an RPM-based distro like Red Hat or Mandrake, most of this will probably be done for you and it will be a simple matter of starting the services.
# rc-update add courier-imapd default
* courier-imapd added to runlevel default...
* Caching service dependencies...
* rc-update complete.
# rc-update add courier-imapd-ssl default
* courier-imapd-ssl added to runlevel default...
* Caching service dependencies...
* rc-update complete.
# /etc/init.d/courier-imapd start
* Starting courier-imapd...
# /etc/init.d/courier-imapd-ssl start
* Starting authdaemond.plain...
* Starting courier-imapd over SSL...
Adding SpamAssassin to the runlevel is a little different, but not much harder. We want to start spamd in daemon (-d) mode. The one-line script below is a shortcut that ignores the start/stop argument the init system passes to it:
# echo /usr/bin/spamd -d > /etc/init.d/spamd
# chmod +x /etc/init.d/spamd
# rc-update add spamd default
* spamd added to runlevel default...
Next, we have to create user foo to receive the mail and give foo a home directory. All of the packages we are using use the Maildir format instead of the traditional mbox found on older *nixes. Each user's mail will go to a predefined mail directory in the user's home directory instead of /var/mail/xxxx. The standard directory is $HOME/Maildir but Gentoo uses $HOME/.maildir, so you may have to adjust accordingly for your distro. The maildirmake utility will create foo's INBOX and a Spam folder to catch the spam (be sure you are foo and not root):
$ maildirmake /home/foo/.maildir
$ maildirmake /home/foo/.maildir/.Spam
From here, it's just a matter of creating a few simple configuration files for getmail and procmail in foo's home directory. Use your favorite text editor to create the following files.
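The getmailrc listing itself is not reproduced here, so what follows is a stand-in rather than the original file. It is a minimal sketch in the getmail 4 rc-file syntax (older releases use a different layout), and the server name, user name, and password are placeholders. It is written to a scratch directory so it can be reviewed before being copied to ~/.getmail/getmailrc.

```shell
#!/bin/sh
# Write a sample getmailrc to a scratch directory for review.
# Assumptions: getmail 4 syntax; pop.isp.example, foo and bar are placeholders.
rc_dir=$(mktemp -d)
cat > "$rc_dir/getmailrc" <<'EOF'
[retriever]
type = SimplePOP3Retriever
server = pop.isp.example
username = foo
password = bar

[options]
# Leave mail on the server until the setup is proven to work.
delete = false
read_all = false

[destination]
# Hand every message to procmail for filtering and delivery.
type = MDA_external
path = /usr/bin/procmail
unixfrom = true
EOF
chmod 600 "$rc_dir/getmailrc"   # the file holds a password
echo "sample written to $rc_dir/getmailrc"
```

With this syntax, each ISP account would get its own rc file, selected at run time with getmail's --rcfile option.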
Getmail will retrieve all mail from the ISP's server, and the last line hands it off to procmail for processing. Then create /home/foo/.procmailrc like so:
# Use maildir-style mailbox in user's home directory
DEFAULT=$HOME/.maildir/
:0fw
* < 256000
| /usr/bin/spamc -a
:0
* ^X-Spam-Status: Yes
$HOME/.maildir/.Spam/
The DEFAULT line tells procmail where the default INBOX is. The next 3 lines tell procmail to invoke SpamAssassin only if the message is less than 256k in size, to prevent scanning of messages with large attachments. On a P-233, SpamAssassin scanning is SLOOOOOOWWWWWW even on messages this size, so scanning a message with an MP3 attached would be a nightmare. The last 3 lines tell procmail that if, after analyzing the mail, SpamAssassin has added the 'X-Spam-Status: Yes' line to the header, it goes into the bucket. SpamAssassin has Razor tests integrated into it, so there is no need to do anything with Razor as long as it's installed.
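To see how the spam recipe decides where a message goes, the same header test can be mimicked in plain shell. This is only an illustration of the logic, not how procmail is implemented; route_message and the two sample messages are made up for the example.

```shell
#!/bin/sh
# route_message FILE: print "Spam" if the header block carries
# "X-Spam-Status: Yes", otherwise print "INBOX". (Hypothetical helper.)
route_message() {
    # Headers end at the first blank line, so stop reading there.
    # -i mirrors procmail's default case-insensitive matching.
    if sed '/^$/q' "$1" | grep -i '^X-Spam-Status: Yes' >/dev/null; then
        echo Spam
    else
        echo INBOX
    fi
}

# Two made-up messages to exercise the function.
msg_dir=$(mktemp -d)
printf 'Subject: hi\nX-Spam-Status: Yes, hits=7.2\n\nbuy now\n' > "$msg_dir/spam.txt"
printf 'Subject: lunch\n\nX-Spam-Status: Yes appears only in the body\n' > "$msg_dir/ham.txt"
route_message "$msg_dir/spam.txt"
route_message "$msg_dir/ham.txt"
```

The second message shows why only the headers are examined: the phrase appearing in a message body should not send it to the Spam folder.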
That's it. Running the getmail command as user foo should now return something like this:
POP3 greeting: +OK Intermail POP3 server ready.
POP3 user response: +OK please send PASS command
POP3 PASS response: +OK foo is welcome here
POP3 stat response: 16 messages, 15655 octets
POP3 list response: +OK 16 messages
msg #1/16 : len 2513 ... retrieved ... delivered to postmaster
msg #2/16 : len 2145 ... retrieved ... delivered to postmaster
By default, getmail leaves the mail on the server, which is a good safety net in case we screw something up royally. Once you are satisfied that everything is working properly, you can run 'getmail -d' to delete mail after retrieval. Now just fire up your favorite mail client and set it up to check the address of your server (192.168.1.2 for me), IMAP protocol, user=foo, pwd=bar and you should be golden. Spam should be in the Spam folder and everything else should be in INBOX.
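A quick way to confirm where the messages ended up is to count the files in each maildir. The sketch below assumes the Gentoo-style ~/.maildir layout used in this article; count_messages and the MAILDIR_ROOT override are made-up names for the example.

```shell
#!/bin/sh
# count_messages DIR: count message files in a maildir's new/ and cur/.
# (Hypothetical helper; tmp/ is skipped because it holds partial deliveries.)
count_messages() {
    # find prints one line per regular file; wc -l counts the lines.
    find "$1/new" "$1/cur" -type f 2>/dev/null | wc -l | tr -d ' '
}

# MAILDIR_ROOT is an optional override invented for this sketch.
inbox="${MAILDIR_ROOT:-$HOME/.maildir}"
echo "INBOX: $(count_messages "$inbox")"
echo "Spam:  $(count_messages "$inbox/.Spam")"
```

Running it before and after a getmail run shows how many new messages landed in each folder.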
After a little testing, I found that a few spams were slipping through into my INBOX, so I did a little editing of /home/foo/.spamassassin/user_prefs to lower the spam threshold from 5 to 3 and increase the score for the HTML_WEB_BUGS test to 3.00, and it seems to be working perfectly now. No spam has gotten through, and nothing legitimate has gotten caught.
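For reference, the two changes described above would look roughly like this in user_prefs. This assumes the SpamAssassin 2.x option name required_hits (later releases call it required_score), and the sketch writes to a scratch file rather than touching a live configuration.

```shell
#!/bin/sh
# Write the two settings to a scratch file for review.
prefs_file=$(mktemp)
cat > "$prefs_file" <<'EOF'
# Tag mail as spam at 3 points instead of the default 5.
required_hits 3
# Weight the web-bug test more heavily.
score HTML_WEB_BUGS 3.00
EOF
cat "$prefs_file"
```

After reviewing the output, the two settings can be merged by hand into /home/foo/.spamassassin/user_prefs.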
There is one little thing that remains unresolved: I haven't figured out how to run getmail in a cron job as user foo every 10 minutes. I tried to run it as a regular cron job, but all the mail that got tagged by SpamAssassin came out owned by root, so I couldn't read it as foo. So if somebody can throw me a bone and finish this up with the cron part, I would be most grateful. :D
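One possible way to handle the cron job, offered as a sketch rather than a tested fix: put the entry in foo's own crontab instead of root's, so that getmail, procmail, and the delivered files all run under and belong to foo. The /usr/bin/getmail path is an assumption, and cron_entry is a made-up helper.

```shell
#!/bin/sh
# cron_entry MINUTES COMMAND: print a crontab line that runs COMMAND
# every MINUTES minutes and discards its output. (Hypothetical helper.)
cron_entry() {
    printf '*/%s * * * * %s >/dev/null 2>&1\n' "$1" "$2"
}

# Build the entry in a scratch file so it can be reviewed first.
cron_file=$(mktemp)
cron_entry 10 /usr/bin/getmail > "$cron_file"
cat "$cron_file"
# To install it, run as foo (not root). Note that this replaces
# any crontab foo already has:
#   crontab "$cron_file"
```

Root can reach the same crontab with 'crontab -u foo -e' on cron implementations that support the -u option.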