Justin's Linklog

Spam: DNS blocklists are the oldest means of spam-blocking, and are still exceedingly useful; nowadays, many of these are fully automated systems, using proxy-detection algorithms and sensing patterns in mailer behaviour indicative of spam.

A few months back on the ASRG list, there was a discussion of DNSBL accuracy; I posted some SpamAssassin figures, based on our 'mass-check' tests, but noted that they were computed using current DNSBL contents against a corpus of saved mail, so due to the time delta, were not 100% representative.

These figures are a lot better. Since August, I've been collecting real-time DNSBL hit data on my mail, as it is delivered at my SpamAssassin installation. In other words, it's live accuracy data -- it's using just what the DNSBLs had listed at scan time.

(DNS blocklist accuracy figures continued...)

Note, however, that it's still incomplete:

some DNSBLs were not measured; these are just the default DNSBL list in SpamAssassin 2.60, excluding RCVD_IN_NJABL_DIALUP (which I had to remove because I can't parse out accurate data).
it's only 1 person's hand-classified mail.
SpamAssassin tests more than just the 'delivering' SMTP relay; it'll also look backwards through the headers, at earlier relays, to catch spam sent via mailing lists. This is different from what's used with most traditional DNSBL-supporting systems.

But the results should still be quite useful.

The time period covered:

Thu, 21 Aug 2003 17:11:30 -0700 (PDT)
Sat, 25 Oct 2003 23:11:52 -0700 (PDT)

Recap of the fields:

SPAM% = percentage of messages hit that were spam
HAM% = percentage of messages hit that were spam
S/O = Spam/Overall = Bayesian probability of spam
RANK = artificial ranking figure, ignore this!
SCORE = default SpamAssassin 2.60 score
NAME = name of test. Figuring out the exactly DNSBL should be pretty obvious ;)

OVERALL%   SPAM%     HAM%     S/O    RANK   SCORE  NAME
21839     1993    19846    0.091   0.00    0.00  (all messages)
100.000   9.1259  90.8741    0.091   0.00    0.00  (all messages as %)
5.989  59.0567   0.6601    0.989   1.00    2.25  RCVD_IN_BL_SPAMCOP_NET
3.869  37.7822   0.4636    0.988   0.96    1.10  RCVD_IN_DSBL
0.751   8.2288   0.0000    1.000   0.95    4.30  RCVD_IN_OPM_HTTP
1.964  20.2709   0.1260    0.994   0.95    1.10  RCVD_IN_NJABL_PROXY
0.659   7.1751   0.0050    0.999   0.95    0.64  RCVD_IN_NJABL_SPAM
0.614   0.0000   0.6752    0.000   0.94   -0.10  RCVD_IN_BSP_OTHER
0.050   0.5519   0.0000    1.000   0.94    4.30  RCVD_IN_OPM_SOCKS
0.027   0.3011   0.0000    1.000   0.94    4.30  RCVD_IN_OPM_WINGATE
0.119   0.0000   0.1310    0.000   0.94   -4.30  RCVD_IN_BSP_TRUSTED
0.939   9.7341   0.0554    0.994   0.94    4.30  RCVD_IN_OPM
1.081  10.9383   0.0907    0.992   0.93    1.52  RCVD_IN_SORBS_SOCKS
1.062  10.7376   0.0907    0.992   0.93    1.27  RCVD_IN_SBL
0.229   2.4084   0.0101    0.996   0.93    1.10  RCVD_IN_SORBS_MISC
0.618   6.3221   0.0453    0.993   0.93    1.10  RCVD_IN_SORBS_HTTP
0.595   5.9709   0.0554    0.991   0.92    4.30  RCVD_IN_OPM_HTTP_POST
0.078   0.7526   0.0101    0.987   0.90    2.60  RCVD_IN_SORBS_ZOMBIE
0.815   7.5263   0.1411    0.982   0.89    1.39  DNS_FROM_RFCI_DSN
3.594  24.8369   1.4613    0.944   0.81    2.55  RCVD_IN_DYNABLOCK
1.685  11.4400   0.7054    0.942   0.78    0.10  RCVD_IN_RFCI
0.380   2.4586   0.1713    0.935   0.75    1.31  RCVD_IN_NJABL_RELAY
6.182  33.9689   3.3911    0.909   0.73    0.10  RCVD_IN_NJABL
10.422  44.4054   7.0090    0.864   0.63    0.10  RCVD_IN_SORBS
0.037   0.1505   0.0252    0.857   0.54    2.80  RCVD_IN_SORBS_WEB
2.344   4.1144   2.1667    0.655   0.17    0.00  RCVD_IN_SORBS_SPAM

Archives

Real-time DNS blocklist accuracy figures