[unisog] SP*M Detection Methods & Processes

Joseph Brennan brennan at columbia.edu
Tue Sep 26 02:56:21 GMT 2006

Russell Fulton <r.fulton at auckland.ac.nz> wrote:

> Anyway, I'd be interested in hearing from anyone who is using commercial
> products as to how they are coping with the current wave of image spam.

I've noticed that the great majority of our spam rejections now come
from blocklists or other simple checks, not SpamAssassin.

Take yesterday.  We rejected 1.2 million messages and accepted 376,000.
The latter includes mail sent by our users with authentication.  What
stopped the 1.2 million?  Here:

389,000		host sending us the mail was in Spamhaus blocklist
344,000		no valid recipients
110,000		domain in sender address did not exist
106,000		host sending us the mail was in NJABL blocklist
 69,000		URI in message body was in SURBL blocklist
 33,000		Message-ID of virus and spamware (only a-z before @)
 21,000		URI in message body was in Spamhaus blocklist
 18,000		executable file (almost all viruses)
  9,000		access denied by local access.db listings
  9,000		host sending us the mail was in DSBL blocklist
  7,000		sender address was a nonexistent address in our domain
  6,000		HELO claimed to be one of our own hosts
  4,000		Date was ridiculous (e.g. timezone -0280)

Now that's 1.1 million without SpamAssassin.  The value of Spamhaus
should be obvious: it is worth the few hundred a year it costs.  We
started using SURBL recently and it has been very valuable in catching
the image spam and other medical products spam.
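
Several of the simpler checks in the list above need nothing more than
a look at one header.  A minimal Python sketch of two of them follows.
The function names are hypothetical, and the definition of a
"ridiculous" timezone (minutes above 59 or hours above 14) is an
assumption; only the -0280 example comes from the tally above.

```python
import re

# Only lowercase a-z between an optional "<" and the "@".
_AZ_ONLY_LOCAL_PART = re.compile(r"^<?[a-z]+@")

# A numeric zone offset such as -0400, preceded by whitespace.
_TZ_OFFSET = re.compile(r"(?:^|\s)[+-](\d{2})(\d{2})(?!\d)")


def message_id_looks_like_spamware(message_id: str) -> bool:
    """True when the part before the @ is made up solely of a-z."""
    return _AZ_ONLY_LOCAL_PART.match(message_id.strip()) is not None


def date_timezone_is_ridiculous(date_header: str) -> bool:
    """True when the numeric offset cannot be a real timezone.

    Assumption: minutes above 59 or hours above 14 count as ridiculous.
    Headers with no numeric offset are not judged by this sketch.
    """
    match = _TZ_OFFSET.search(date_header)
    if match is None:
        return False
    hours, minutes = int(match.group(1)), int(match.group(2))
    return minutes > 59 or hours > 14
```

Checks like these are plain string tests, so they can run before any
heavier content scanning.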

As to NJABL and DSBL, we check in the order Spamhaus, NJABL, DSBL, and
once we get a hit we stop, so NJABL and DSBL would have caught more than
shown if we had continued.  The order was determined earlier this year
after we tried all three for a while.  Spamhaus caught the most.
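
The check-in-order, stop-at-first-hit logic can be sketched in Python
as below.  This is an illustration, not the filter we run: the function
names are hypothetical, and the zone names in the usage note are
placeholders to be replaced with the real DNSBL zones.  The query name
is the usual DNSBL form, the IPv4 octets reversed and prepended to the
zone.

```python
import socket
from typing import Callable, Optional, Sequence


def dnsbl_query_name(ipv4: str, zone: str) -> str:
    """Build the DNSBL query name: reversed octets followed by the zone."""
    octets = ipv4.split(".")
    if len(octets) != 4:
        raise ValueError("expected a dotted-quad IPv4 address")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(name: str) -> bool:
    """True when the name resolves, which a DNSBL uses to signal a listing.

    Any lookup failure is treated as "not listed" in this sketch.
    """
    try:
        socket.gethostbyname(name)
        return True
    except OSError:
        return False


def first_blocklist_hit(
    ipv4: str,
    zones: Sequence[str],
    lookup: Callable[[str], bool] = is_listed,
) -> Optional[str]:
    """Query zones in order and return the first one listing the host."""
    for zone in zones:
        if lookup(dnsbl_query_name(ipv4, zone)):
            return zone  # stop here; later zones are never queried
    return None
```

Calling first_blocklist_hit("192.0.2.1", ["first.example",
"second.example", "third.example"]) queries the zones left to right, so
the list with the best hit rate goes first and the later lists only see
what it missed.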

Our filtering is done with the MIMEDefang milter.  Part of the strategy
is to use the inefficient SpamAssassin tests as little as possible.

Image spam not caught by the blocklists gets caught by three scoring
tests that matched this many messages:

 57,000		CU_HASCID		score 0.1
 56,000		EXTRA_MPART_TYPE	score 2.5
 52,000		CU_META_45		score 4.5

CU_HASCID = (local test) message has a cid: link in it.
EXTRA_MPART_TYPE = standard SpamAssassin test, scored higher than its default.
CU_META_45 = (local test) adds score when both of the above hit, the
  subject contains "RE:" or "FW:", and the subject is < 30 characters.

Probably all the messages hitting CU_META_45 got rejected, so roughly
52,000 were caught this way.
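
The way the three tests combine can be sketched in Python as follows.
This is not the actual rule set, which lives in SpamAssassin
configuration.  The function names are hypothetical, and matching "RE:"
and "FW:" without regard to case is an assumption.  The scores are the
ones listed above.

```python
SCORES = {"CU_HASCID": 0.1, "EXTRA_MPART_TYPE": 2.5, "CU_META_45": 4.5}


def cu_meta_45_hits(has_cid: bool, extra_mpart_type: bool, subject: str) -> bool:
    """Meta test: both base tests hit, and the subject is a short RE:/FW: one."""
    subject = subject.strip()
    upper = subject.upper()  # assumption: case-insensitive match
    reply_or_forward = "RE:" in upper or "FW:" in upper
    return has_cid and extra_mpart_type and reply_or_forward and len(subject) < 30


def image_spam_score(has_cid: bool, extra_mpart_type: bool, subject: str) -> float:
    """Sum the scores of whichever of the three tests hit."""
    total = 0.0
    if has_cid:
        total += SCORES["CU_HASCID"]
    if extra_mpart_type:
        total += SCORES["EXTRA_MPART_TYPE"]
    if cu_meta_45_hits(has_cid, extra_mpart_type, subject):
        total += SCORES["CU_META_45"]
    return total
```

A message that hits all three tests picks up 0.1 + 2.5 + 4.5 = 7.1
points from them, while one that hits only the two base tests gets 2.6.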

Joseph Brennan
Columbia University Information Technology
