Spam: DNS blocklists are the oldest means of spam-blocking, and are still exceedingly useful; nowadays, many of these are fully automated systems, using proxy-detection algorithms and sensing patterns in mailer behaviour indicative of spam.
A few months back on the ASRG list, there was a discussion of DNSBL accuracy; I posted some SpamAssassin figures, based on our ‘mass-check’ tests, but noted that they were computed by checking current DNSBL contents against a corpus of saved mail, so, given the time elapsed between delivery and lookup, they were not 100% representative.
These figures are a lot better. Since August, I’ve been collecting real-time DNSBL hit data on my mail, as it is delivered at my SpamAssassin installation. In other words, it’s live accuracy data — it’s using just what the DNSBLs had listed at scan time.
Note, however, that it’s still incomplete:
- some DNSBLs were not measured; these are just the default DNSBL list in SpamAssassin 2.60, excluding RCVD_IN_NJABL_DIALUP (which I had to remove because I can’t parse out accurate data).
- it’s only 1 person’s hand-classified mail.
- SpamAssassin tests more than just the ‘delivering’ SMTP relay; it’ll also look backwards through the headers, at earlier relays, to catch spam sent via mailing lists. This is different from what’s used with most traditional DNSBL-supporting systems.
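To illustrate that last point, here is a rough Python sketch of walking back through the Received headers and building one DNSBL query name per relay. It is not SpamAssassin's actual code (which is Perl and has more careful header parsing); the sample message, the regex, the helper names and the `dnsbl.example` zone are all made up for illustration. The only part that is standard is the query format: reverse the IPv4 octets and append the blocklist's zone.

```python
import ipaddress
import re
from email import message_from_string

# Matches a bracketed IPv4 literal such as "[192.0.2.10]" in a Received header.
# Assumption: real-world Received headers vary a lot; this only handles the
# common bracketed form.
BRACKETED_IPV4 = re.compile(r"\[(\d{1,3}(?:\.\d{1,3}){3})\]")

# Hypothetical message with two hops: the delivering relay first, then an
# earlier relay behind a mailing list.
SAMPLE = (
    "Received: from list.example.org (list.example.org [192.0.2.10])\n"
    "\tby mx.example.net with ESMTP; Sat, 25 Oct 2003 23:11:52 -0700\n"
    "Received: from unknown (HELO sender) ([198.51.100.7])\n"
    "\tby list.example.org with SMTP; Sat, 25 Oct 2003 23:11:40 -0700\n"
    "Subject: test\n"
    "\n"
    "body\n"
)


def relay_ips(raw_message):
    """Return relay IPv4 addresses from Received headers, newest hop first."""
    msg = message_from_string(raw_message)
    ips = []
    for header in msg.get_all("Received", []):
        for candidate in BRACKETED_IPV4.findall(header):
            try:
                ip = str(ipaddress.IPv4Address(candidate))
            except ipaddress.AddressValueError:
                continue  # looked like an IP but is not a valid address
            if ip not in ips:
                ips.append(ip)
    return ips


def dnsbl_query_name(ip, zone):
    """Build the DNS name to look up: reversed octets followed by the zone."""
    reversed_octets = ".".join(reversed(ip.split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    for ip in relay_ips(SAMPLE):
        # "dnsbl.example" is a placeholder zone, not a real blocklist.
        print(dnsbl_query_name(ip, "dnsbl.example"))
```

A traditional MTA-level check would only query the first address in that list; checking the later ones as well is what lets a listing hit on spam that arrived via a mailing list.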
But the results should still be quite useful.
The time period covered:
- Thu, 21 Aug 2003 17:11:30 -0700 (PDT)
- Sat, 25 Oct 2003 23:11:52 -0700 (PDT)
Recap of the fields:
- OVERALL% = percentage of all messages that hit the test
- SPAM% = percentage of spam messages that hit the test
- HAM% = percentage of ham (non-spam) messages that hit the test
- S/O = Spam/Overall = Bayesian probability of spam
- RANK = artificial ranking figure, ignore this!
- SCORE = default SpamAssassin 2.60 score
- NAME = name of test. Figuring out the exact DNSBL should be pretty obvious ;)
OVERALL%    SPAM%     HAM%    S/O  RANK  SCORE  NAME
   21839     1993    19846  0.091  0.00   0.00  (all messages)
 100.000   9.1259  90.8741  0.091  0.00   0.00  (all messages as %)
   5.989  59.0567   0.6601  0.989  1.00   2.25  RCVD_IN_BL_SPAMCOP_NET
   3.869  37.7822   0.4636  0.988  0.96   1.10  RCVD_IN_DSBL
   0.751   8.2288   0.0000  1.000  0.95   4.30  RCVD_IN_OPM_HTTP
   1.964  20.2709   0.1260  0.994  0.95   1.10  RCVD_IN_NJABL_PROXY
   0.659   7.1751   0.0050  0.999  0.95   0.64  RCVD_IN_NJABL_SPAM
   0.614   0.0000   0.6752  0.000  0.94  -0.10  RCVD_IN_BSP_OTHER
   0.050   0.5519   0.0000  1.000  0.94   4.30  RCVD_IN_OPM_SOCKS
   0.027   0.3011   0.0000  1.000  0.94   4.30  RCVD_IN_OPM_WINGATE
   0.119   0.0000   0.1310  0.000  0.94  -4.30  RCVD_IN_BSP_TRUSTED
   0.939   9.7341   0.0554  0.994  0.94   4.30  RCVD_IN_OPM
   1.081  10.9383   0.0907  0.992  0.93   1.52  RCVD_IN_SORBS_SOCKS
   1.062  10.7376   0.0907  0.992  0.93   1.27  RCVD_IN_SBL
   0.229   2.4084   0.0101  0.996  0.93   1.10  RCVD_IN_SORBS_MISC
   0.618   6.3221   0.0453  0.993  0.93   1.10  RCVD_IN_SORBS_HTTP
   0.595   5.9709   0.0554  0.991  0.92   4.30  RCVD_IN_OPM_HTTP_POST
   0.078   0.7526   0.0101  0.987  0.90   2.60  RCVD_IN_SORBS_ZOMBIE
   0.815   7.5263   0.1411  0.982  0.89   1.39  DNS_FROM_RFCI_DSN
   3.594  24.8369   1.4613  0.944  0.81   2.55  RCVD_IN_DYNABLOCK
   1.685  11.4400   0.7054  0.942  0.78   0.10  RCVD_IN_RFCI
   0.380   2.4586   0.1713  0.935  0.75   1.31  RCVD_IN_NJABL_RELAY
   6.182  33.9689   3.3911  0.909  0.73   0.10  RCVD_IN_NJABL
  10.422  44.4054   7.0090  0.864  0.63   0.10  RCVD_IN_SORBS
   0.037   0.1505   0.0252  0.857  0.54   2.80  RCVD_IN_SORBS_WEB
   2.344   4.1144   2.1667  0.655  0.17   0.00  RCVD_IN_SORBS_SPAM
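For anyone wanting to check the arithmetic: the S/O figures above appear to match SPAM% / (SPAM% + HAM%), which means they are normalised so that the roughly 10:1 ham-to-spam ratio in this corpus does not drag them down. The short Python sketch below reproduces a few rows on that assumption; the function names are mine and this is not the script that generated the table.

```python
def spam_over_overall(spam_pct, ham_pct):
    """S/O, assumed here to be SPAM% / (SPAM% + HAM%).

    Because both inputs are already percentages of their own class,
    the result does not depend on how much spam vs. ham is in the corpus.
    """
    total = spam_pct + ham_pct
    if total == 0:
        raise ValueError("test hit no messages, S/O is undefined")
    return spam_pct / total


def overall_pct(spam_pct, ham_pct, n_spam, n_ham):
    """OVERALL%: share of all messages hit, recombined from per-class rates."""
    hits = spam_pct / 100 * n_spam + ham_pct / 100 * n_ham
    return 100 * hits / (n_spam + n_ham)


if __name__ == "__main__":
    # Corpus sizes from the "(all messages)" row.
    n_spam, n_ham = 1993, 19846
    # (NAME, SPAM%, HAM%) copied from the table.
    rows = [
        ("RCVD_IN_BL_SPAMCOP_NET", 59.0567, 0.6601),
        ("RCVD_IN_DYNABLOCK", 24.8369, 1.4613),
        ("RCVD_IN_SORBS_SPAM", 4.1144, 2.1667),
    ]
    for name, spam_pct, ham_pct in rows:
        so = spam_over_overall(spam_pct, ham_pct)
        overall = overall_pct(spam_pct, ham_pct, n_spam, n_ham)
        print(f"{overall:8.3f} {so:6.3f}  {name}")
```

Note that this is not the same as the raw fraction of hits that were spam: for RCVD_IN_BL_SPAMCOP_NET that would be about 1177 of 1308 hits, or roughly 0.90, rather than 0.989.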