[unisog] Extremely High Spam Statistics

Valdis.Kletnieks at vt.edu Valdis.Kletnieks at vt.edu
Wed Feb 21 06:34:24 GMT 2007

On Tue, 20 Feb 2007 15:53:29 EST, Daniel Feenberg said:
> Is it possible that much of the difference across MTAs is due to some
> counts excluding mail to non-existent accounts, and some including such
> mail? Dictionary spam can be quite high volume, and those messages are not
> something we would include as spam or non-spam.

Or other similar "what do we count" issue.  We're currently accepting on
the order of 2M msgs/day, and flagging something like 60% or 70% as spam.
On the other hand, that does *not* include some 4M to 5M connections a day
that come from sources that we've judged so spammy that we just drop them
on the floor the instant they try to EHLO at us.  Even if each of those
connections just sent us one RCPT TO: that would raise the spam level up
to close to 90%, and it would go even higher if they had multiple RCPT TOs.
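
To put rough numbers on that, here is a minimal sketch of the arithmetic. It
assumes every dropped connection would have delivered spam, one recipient
each, and it treats an accepted message and a dropped connection as the same
unit. The function name and parameters are made up for illustration; only the
2M, 60%-70%, and 4M-5M figures come from the numbers above.

```python
def spam_share(accepted, flagged_fraction, dropped_connections, rcpts_per_dropped=1):
    """Spam fraction once dropped connections are counted as spam deliveries.

    Assumption: every dropped connection would have carried spam to
    rcpts_per_dropped recipients.
    """
    dropped = dropped_connections * rcpts_per_dropped
    spam = accepted * flagged_fraction + dropped
    total = accepted + dropped
    return spam / total


# Low end: 2M accepted, 60% flagged, 4M dropped connections.
low = spam_share(2_000_000, 0.60, 4_000_000)
# High end: 2M accepted, 70% flagged, 5M dropped connections.
high = spam_share(2_000_000, 0.70, 5_000_000)

print(f"{low:.1%} to {high:.1%}")  # 86.7% to 91.4%
```

Raising rcpts_per_dropped above 1 pushes both ends of the range higher.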

For that matter - are you guys computing based on what percentage of MAIL FROMs
are attached to a spammy DATA, or counting RCPT TOs? If you have a sample of
50 one-recipient legit mails, and one spam with 50 recipients, counting the
MAIL FROMs makes it about 2% spam, counting RCPT TOs makes it 50%.
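
The same sample worked through both ways, as a small sketch. The
(is_spam, recipient_count) tuple layout and the function names are
assumptions for illustration, not anything a particular MTA logs.

```python
def spam_fraction_by_mail_from(messages):
    """Count each message (one MAIL FROM) once, however many recipients it has."""
    spam = sum(1 for is_spam, _ in messages if is_spam)
    return spam / len(messages)


def spam_fraction_by_rcpt_to(messages):
    """Count each recipient (one RCPT TO) separately."""
    spam = sum(rcpts for is_spam, rcpts in messages if is_spam)
    total = sum(rcpts for _, rcpts in messages)
    return spam / total


# 50 legit mails with one recipient each, plus one spam with 50 recipients.
sample = [(False, 1)] * 50 + [(True, 50)]

by_mail_from = spam_fraction_by_mail_from(sample)  # 1 of 51 messages
by_rcpt_to = spam_fraction_by_rcpt_to(sample)      # 50 of 100 recipients

print(f"MAIL FROM: {by_mail_from:.1%}, RCPT TO: {by_rcpt_to:.1%}")
# MAIL FROM: 2.0%, RCPT TO: 50.0%
```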

