[unisog] SP*M Detection Methods & Processes

Alan Amesbury amesbury at oitsec.umn.edu
Fri Sep 22 19:08:09 GMT 2006


Bill Martin wrote:

> Well, for us it is that time of year again to experience the
> "Goldilocks" syndrome with our spam detection process.  I'm sure you are
> all familiar with it...  you know, the "it's too much", "it's not
> enough", and still others saying "it's just right"...  the unfortunate
> part is, for us, #1 & #2 total more than #3, so we find ourselves again
> evaluating and comparing...
> 
> Given that, we are looking to compare what we are doing w/ other
> universities, both large and small.
> 
> Our current architecture consists of multiple gateways, running Amavis,
> handing off to SpamAssassin, an anti-virus package and of course our
> MTA.
> 
>    MTA
>      +-->Amavis
>             + SpamAssassin
>             + AntiVirus
>    MTA <---------+
>     +
>     V
> Delivery
[snip]

For low- to mid-volume systems handling e-mail for human and most
electronic consumers, I use an architecture like this:

	MTA (proxy)
	AV (proxy)
	MTA (queue)
		+ SpamAssassin
		+ other stuff?
	MTA (queue) <----+
	(whatever)


The idea behind this is that content flagged by the AV software never
gets queued; it is rejected immediately (a 5xx code is returned) so that
the sender can deal with it.  It has these effects:

	1) Messages flagged by the AV software are rejected in
	   such a way that the *sender* must deal with them.

	2) No messages are ever dropped by my system (no RFC
	   violation there).

	3) Legitimate senders of messages erroneously tagged by
	   the AV software know immediately that their messages
	   got rejected.


I really like this setup, because it rejects stuff the AV software
doesn't like as early in the process as possible, and does it in a way
that forces the sender to deal with the problem.  More importantly, the
bad traffic is rejected *without* resulting in those irritating
automated "Our software found a virus in e-mail from you.  This message
scanned by Lame-O AntiVirus!" messages when the sender information is
forged (unless, of course, the sending machine is something like an open
relay that then attempts a DSN).
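
To make the reject-before-queue idea concrete, here is a minimal Python
sketch of the decision the proxying stage has to make while the sending
host is still connected.  The function names, the verdict strings, and
the exact reply texts are hypothetical, not taken from any particular
product.  The temporary 451 reply for a scanner failure is also my
assumption; it is one way to avoid silently dropping mail when the AV
software itself is unavailable.

```python
def smtp_reply(verdict):
    """Map an AV verdict to the SMTP reply sent before anything is queued.

    `verdict` is a hypothetical string: "clean", "infected", or anything
    else, which is treated as a scanner error.
    """
    if verdict == "clean":
        # Accept; the message moves on to the queueing MTA.
        return (250, "2.0.0 Ok")
    if verdict == "infected":
        # Permanent failure: the sending host keeps responsibility.
        return (550, "5.7.1 Message rejected: content flagged by AV scanner")
    # Assumption: on scanner trouble, ask the sender to retry later
    # rather than accepting unscanned mail or discarding it.
    return (451, "4.3.0 Scanner unavailable, try again later")


def is_permanent_reject(code):
    """True for 5xx replies, which tell the sender not to retry."""
    return 500 <= code < 600


if __name__ == "__main__":
    for verdict in ("clean", "infected", "scanner-error"):
        code, text = smtp_reply(verdict)
        print(verdict, "->", code, text)
```

Because the 5xx reply goes out during the SMTP conversation, no bounce
is ever generated locally, which is why forged sender addresses don't
receive backscatter from this setup.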

The biggest downside:  scalability.  It simply doesn't scale for
high-volume sites.  AV software is notoriously resource hungry.  The
proxying steps don't actually queue e-mail, so each connection means
there's an AV scanner instance and up to two MTA instances taking up
resources to accept e-mail.  Using ClamAV and Postfix on a dual
PentiumIII with 2GB RAM seems to handle up to around 50K messages/day,
but it starts to get a bit shaky above that point.
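
For a rough feel of why this gets shaky, note that 50K messages/day
averages only about 0.58 messages per second, but each in-flight
connection holds an AV scanner and up to two MTA instances for its whole
duration.  The Python sketch below estimates concurrent sessions as
arrival rate times session length.  The session length and the peak
factor are purely illustrative assumptions, not measurements from my
systems; plug in your own numbers.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def concurrent_sessions(msgs_per_day, avg_session_seconds, peak_factor):
    """Estimate simultaneous SMTP sessions during a busy period.

    Uses rate * duration, with the daily average rate scaled by a
    peak factor.  All three inputs are things you should measure.
    """
    avg_rate = msgs_per_day / SECONDS_PER_DAY
    return avg_rate * peak_factor * avg_session_seconds


if __name__ == "__main__":
    daily = 50_000
    # Assumed values, for illustration only.
    sessions = concurrent_sessions(daily, avg_session_seconds=5.0,
                                   peak_factor=5.0)
    print(f"average rate: {daily / SECONDS_PER_DAY:.2f} msgs/s")
    print(f"estimated concurrent sessions at peak: {sessions:.1f}")
    # Each session ties up one AV scanner and up to two MTA instances.
    print(f"AV instances: ~{sessions:.0f}, MTA instances: up to ~{2 * sessions:.0f}")
```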

For certain non-human consumers of e-mail, though, this may not be
appropriate.  Some AV software (e.g., ClamAV) tags phishing scams as well
as viruses, so putting something like that in front of your abuse at ...
address may actually prevent people from being able to complain about
stuff like that.

One other thing to consider:  on the external MTA, think about rejecting
obvious garbage.  For example, a sending host connecting with "HELO
localhost" is in blatant violation of RFC2821, section 3.6, which
explicitly states that "[o]nly resolvable, fully-qualified, domain names
(FQDNs) are permitted when domain names are used in SMTP."  We
explicitly reject e-mail from hosts calling themselves "localhost" and
"localhost.localdomain", and also reject connections from hosts that use
*our* mailhub's name, IP address, or domain as an argument in the SMTP
"HELO" statement.  Postfix definitely has the ability to enforce this
sort of thing, and I'm pretty sure that similar things can be done with
the other commonly available MTAs.  Besides, *this* scales pretty well.  :-)
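
The HELO checks described above amount to something like the following
Python sketch.  It is only an illustration of the logic, not how
Postfix implements it, and the hostnames and address used in the
example (mailhub.example.edu, example.edu, 192.0.2.25) are
placeholders standing in for your own mailhub's name, domain, and IP.

```python
# Names no remote host should ever present in HELO.
BOGUS_NAMES = {"localhost", "localhost.localdomain"}


def helo_rejected(helo_arg, own_names, own_ips):
    """Return a reason string if the HELO argument should be rejected,
    or None if it passes these particular checks."""
    name = helo_arg.strip().lower().rstrip(".")
    # Address literals arrive as "[192.0.2.25]"; compare the bare address.
    if name.startswith("[") and name.endswith("]"):
        bare = name[1:-1]
    else:
        bare = name
    if name in BOGUS_NAMES:
        return "bogus local name"
    if bare in own_ips:
        return "claims our IP address"
    if name in own_names:
        return "claims our hostname or domain"
    return None


if __name__ == "__main__":
    # Placeholder values; substitute your own.
    own_names = {"mailhub.example.edu", "example.edu"}
    own_ips = {"192.0.2.25"}
    for arg in ("localhost", "[192.0.2.25]", "example.edu",
                "mail.example.org"):
        print(arg, "->", helo_rejected(arg, own_names, own_ips))
```

These are cheap string comparisons made before any message data is
transferred, which is why this kind of filtering scales so much better
than content scanning.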

Hope that's at least marginally useful.


--
Alan Amesbury
University of Minnesota

