Sorting SPAM

28 Feb 2007

I've been using SpamAssassin for a while to help identify SPAM. About a week ago, every message that SpamAssassin flagged as SPAM started showing up in my Inbox instead of in my SPAM folder.

Well, it irritated me enough a moment ago to actually take a look at the full headers of just such a message. Here are the headers added by SpamAssassin:

X-Spam-Flag: YES
X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on
       dark-templar.lamontpeterson.net
X-Spam-Level: ***********************
X-Spam-Status: Yes, score=23.0 required=4.0 tests=BAYES_80,DRUGS_ERECTILE,
       DRUGS_ERECTILE_OBFU,HTML_MESSAGE,RCVD_IN_BL_SPAMCOP_NET,URIBL_AB_SURBL,
       URIBL_JP_SURBL,URIBL_SBL,URIBL_SC_SURBL,VIA_GAP_GRA autolearn=no version=3.1.8
X-Spam-Report:
       *  2.5 VIA_GAP_GRA BODY: Attempts to disguise the word 'viagra'
       *  2.0 BAYES_80 BODY: Bayesian spam probability is 80 to 95%
       *      [score: 0.8180]
       *  0.0 HTML_MESSAGE BODY: HTML included in message
       *  1.6 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
       *      [Blocked - see <http://www.spamcop.net/bl.shtml?201.83.176.249>]
       *  1.6 URIBL_SBL Contains an URL listed in the SBL blocklist
       *      [URIs: tersho.com]
       *  3.8 URIBL_AB_SURBL Contains an URL listed in the AB SURBL blocklist
       *      [URIs: tersho.com]
       *  4.1 URIBL_JP_SURBL Contains an URL listed in the JP SURBL blocklist
       *      [URIs: tersho.com]
       *  4.5 URIBL_SC_SURBL Contains an URL listed in the SC SURBL blocklist
       *      [URIs: tersho.com]
       *  2.4 DRUGS_ERECTILE_OBFU Obfuscated reference to an erectile drug
       *  0.5 DRUGS_ERECTILE Refers to an erectile drug

(Now that’s one spammy piece of SPAM!)

OK, so I took a look at my ~/.mailfilter file on the server:

### SPAM
if ( /^X-Spam-Flag: *(yes|YES) / )
{
   to "$HOME/mail/.SPAM/"
}

Many of my readers may be eagle-eyed enough to spot the problem right away. If you said, “Hey, you’ve got a superfluous space after your closing parenthesis in your regular expression there,” then you got it.

That regex was meant to match either “yes” or “YES” (the match is case sensitive). I wrote it that way because at some point long ago I had a rule on a system that used “yes”, but SpamAssassin today produces “YES”, and I didn’t want the filter to miss messages over something like that. With the stray space, though, it only matches when a space follows the flag value, and the header above ends right after “YES”.

I decided to improve the regex further, matching each letter in either case, so that I’m less likely to have to “fix” it again:

### SPAM
if ( /^X-Spam-Flag: *[yY][eE][sS]/ )
{
   to "$HOME/mail/.SPAM/"
}
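
For anyone who wants to see the difference between the two patterns without touching a live mail filter, here is a minimal sketch in Python. It is only an approximation: Python's re module stands in for maildrop's own pattern matching, which may differ in details, and the names OLD_PATTERN, NEW_PATTERN, is_flagged and SAMPLE_HEADERS are made up for this illustration.

```python
import re

# Assumption: Python's re module is used as a stand-in for maildrop's
# pattern matching; the two engines may not behave identically.

# The original rule, including the stray space after the closing parenthesis.
OLD_PATTERN = re.compile(r"^X-Spam-Flag: *(yes|YES) ", re.MULTILINE)

# The revised rule: no trailing space, each letter accepted in either case.
NEW_PATTERN = re.compile(r"^X-Spam-Flag: *[yY][eE][sS]", re.MULTILINE)

# A few of the headers from the message above, one per line.
SAMPLE_HEADERS = (
    "X-Spam-Flag: YES\n"
    "X-Spam-Checker-Version: SpamAssassin 3.1.8 (2007-02-13) on\n"
    "       dark-templar.lamontpeterson.net\n"
)


def is_flagged(pattern, headers):
    """Return True if any header line matches the given pattern."""
    return pattern.search(headers) is not None


if __name__ == "__main__":
    print("old rule matches:", is_flagged(OLD_PATTERN, SAMPLE_HEADERS))
    print("new rule matches:", is_flagged(NEW_PATTERN, SAMPLE_HEADERS))
```

In this sketch the old pattern fails on the sample because nothing but the end of the line follows “YES”, while the new pattern matches it and would also accept mixed-case values such as “Yes”.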

Problem solved.

BTW: the term SPAM originally came to be used in the computer world because of the Monty Python Spam sketch.

