BBC’s iPlayer — what a mess

I haven’t paid a whole lot of attention to the BBC’s "iPlayer" project, since, as a non-UK resident, I’m not allowed to use it anyway. But this interview at Groklaw with Mark Taylor, President of the UK Open Source Consortium, was really quite eye-opening. Here are some choice snippets.

On the management team’s Microsoft links:

The iPlayer is not what it claimed to be, it is built top-to-bottom on a Microsoft-only stack. The BBC management team who are responsible for the iPlayer are a checklist of senior employees from Microsoft who were involved with Windows Media. A gentleman called Erik Huggers who’s responsible for the iPlayer project in the BBC, his immediately previous job was director at Microsoft for Europe, Middle East & Africa responsible for Windows Media. He presided over the division of Windows Media when it was the subject of the European Commission’s antitrust case. He was the senior director responsible. He’s now shown up responsible for the iPlayer project.

On their attempts to bullshit the BBC Trust on the cross-platform issue:

In the consultations that the BBC Trust made, there were 10,000 responses from the public. And the overwhelming majority of them, over 80% — which is an unheard-of figure in these kind of things — said, we don’t like the platform. We don’t like it being single-platform. So it’s a big issue. And the BBC Trust said to us, "Why the vehemence? Why have people reacted this way?" And I explained the ‘Auntie’ analogy. It’s people don’t expect that from the BBC. It’s got this huge history of integrity, doing the right thing, standing up to bullies. (laughter) They’ve done this for a very long time. And people find that it’s surprising. And they said, "Yeah, but," you know, the BBC guys said, "Well, trust us. This is going to be cross-platform." And we said, "Well, how? It’s completely single-platform." They say that, but we haven’t been able to find anyone who’s been able to explain how they’re going to achieve that at the moment, even though they’re entirely locked into one single platform.

(aside: MS did this at one point with Internet Explorer — remember, there was some mystery team in Germany that supposedly had IE ported to Solaris, hence it therefore qualified as ‘cross-platform’.)

On the architecture of the product:

Q: it’s a Verisign Kontiki architecture, it’s peer-to-peer, and in fact one of the more worrying aspects is that you have no control over your node. It loads at boot time under Windows, the BBC can use as much of your bandwidth as they please (laughter), in fact I think OFCOM … made some kind of estimate as to how many hundreds of millions of pounds that would cost everyone […]. There is a hidden directory called "My Deliveries" which pre-caches large preview files, it phones home to the Microsoft DRM servers of course, it logs all the iPlayer activity and errors with identifiers in an unencrypted file. Now, does this assessment agree with what you’ve looked at?

Mark Taylor: Yes.

Q: What are the privacy implications for an implementation like this?

Mark Taylor: Well, just briefly going back to the assessment thing, yes it does log precisely RSS and stuff like that and more importantly, anyone technically informed who’s had a look at it — even more importantly, the user’s assessment as well and — frankly horrified if you go and spend some time in the BBC iPlayer forums, it’s eye-opening to see the sheer horror of the users, some of them technically not — you know, relatively early-stage users — but when it gets explained to them by some of the longer-using users of it, it’s concentrated misery. (laughter)

[…]

it’s a remarkable thing with them as well, there’s a lot of pain going on in the user forums, and some of the main technical support questions in there are "how do I remove Kontiki from my computer?" See, it’s not just while iPlayer is running that Kontiki is going, it’s booted up. When the machine boots up, it runs in the background, and it’s eating people’s bandwidth all the time. (laughter) In the UK we still have massive amounts of people who’ve got bandwidth capping from their ISPs and we’ve got poor users on the online forums saying, "Well, my internet connection has just finished, my ISP tells me I’ve used up all of my bandwidth."

Q: It uses up their quota, but they can’t throttle it, they can’t reduce it —

Mark Taylor: No, they can’t throttle it. […] It’s malware as well as spyware.

And to top this off, there’s a (frankly insane) budget of UKP 130,000,000 to build this — that’s $266,000,000 — for something that could be built better by just hiring the guys behind UKNova and simply negotiating with the rights-holders directly.

Holy crap. Talk about a technical disaster masquerading as a solution to a business problem…

Plug: Decorama stickers

Plug plug! We picked up some really cute stencils for the nursery a few months back, but we were a bit daunted by the instructions and only got around to putting them up last week. (We needn’t have worried; it was really easy.)

They’re Decorama vinyl stickers from Bored Inc. I can’t recommend them enough: their art is fantastic, the quality’s great, and Bored Inc. were really friendly and helpful about the whole transaction.

If you’re looking to do something similar, I’d definitely recommend their stuff.

‘Blended threat’ = Storm

Commtouch have apparently released an ‘Email Threats Trend Report’ for the third quarter of 2007, which contains this factoid:

Blended threat messages — or spam messages with links to malicious URLs — accounted for up to 8% of all global email traffic during the peaks of various attacks during the quarter […]

Spam with malware hyperlinks inside: One technique which reached a new high during the quarter was innocent-appearing spam messages that contained hyperlinks to malware-sites. This type of spam utilizes vast zombie botnets to launch ‘drive-by downloads’ and evade detection by most anti-virus engines. Several blended spam attacks of this type focused on leisure-time activities, such as sports and video games. Messages invited consumers to download "fun" software such as NFL game-tracking and video games from what appeared to be legitimate websites. Instead, consumers voluntarily downloaded malware onto their computers.

Those short messages that invited downloads of NFL game-tracking software ("Get Your Free NFL Game Tracker", "Football Fan Essentials", "Are you ready for football season?" etc.), and video games ("Wow, free games!", "New game software, with over 1000 games—FREE", "Holy cow, 1000 free games online" etc.), are all output from the Storm worm; I wouldn’t call it a new kind of "blended threat" per se. I’m surprised that Commtouch didn’t name it; maybe they don’t realise it’s Storm?

I’d say its output is higher than 8% of my incoming spam, although it has reduced its spam output quite a bit recently.

‘Dead spammer’ story a hoax

Update: yep, it’s spam.

Earlier today, Digg and Reddit featured this story:

Alexey Tolstokozhev (btw, in Russian his name means ‘Thick Skin’), a Russian spammer, found murdered in his luxury house near Moscow. He has been shot several times with one bullet stuck in his head. According to authorities, this last head shot is a clear mark of russian hit men (known as "killers" in Russia).

Since then, it’s received plenty of attention — I even posted it to my link blog myself. Unfortunately, I’m now certain it’s a fake. (Igor at the McAfee AVERT blog concurs.)

Here are my reasons:

  • There are still no corroborating stories in the press, several hours later;

  • ‘Alexey Tolstokozhev’ doesn’t appear in ROKSO, or even Google;

  • The entire site claims to have been shut down due to load, all except for that one page — there isn’t a single link that can be reached that works;

  • Indeed, Google has no other pages indexed on that site, which is pretty odd for a weblog;

  • And most fishy of all, the domain was registered yesterday, using a privacy-protection service, on Estdomains (which has a poor reputation).

All very fishy. My guess is that in a week’s time, that page will be a linkfarm, picking up all that Google juice for free. In other words, loonov.com is a spam site…

Update: Greetings, Slashdot comment readers! Hopefully that uncritical article (which was posted after this one) will be fixed to note the hoax soon…

Other voices have since added their agreement — Alex Eckelberry at Sunbelt software added his a few minutes after I posted this, and the Register wrote an article this morning about it.

(BTW, just to save some face — I’d like to note that I smelled a rat at the time I posted it initially, qualifying the link with a sceptical ‘hmm’. I’m not that gullible ;)

Update 2: the /. story was fixed by Zonk: ‘Good story. Unfortunately, probably a fake.’

Scary Storm figure

This study of the Storm worm (via) contains this rather terrifying factoid:

Figure 12 illustrates a time-volume graph of TCP packets, SMTP packets, spam messages, and smtp servers. Our analysis of this graph reveals the following findings. First, we find that except for the first 5 minutes almost all the TCP communication is dominated by spam. Second, we measured that hosts generate on average of 100 successful spam messages per five minutes, which translates to 1200 spam messages per hour or 28,800 messages per day. If we mutiply this by the estimated size for the Storm network (which we suspect varies between 1 million and 5 million, we derive that the total number of spam messages that could be generated by Storm is somewhere between 28 billion and 140 billon per day.

While such numbers might be mind-boggling they are inline with observed spam volumes in the Internet, e.g., overall volume of spam messages in the Internet per day in 2006 was estimated to be around 140 billion [2]; Spamhaus claims to have been blocking over 50 billion spam messages per day in October 2006 [10], and AOL was blocking 1.5 billion spam messages per day in its network in June 2006 [5]. These numbers suggest that Storm could be responsible for anywhere between 17% and 50% of all spam that is generated on the Internet.

28 to 140 billion messages per day. That is a lot of spam.
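The arithmetic behind that range is easy to check. Here's a quick Python sketch using only the figures quoted from the paper (the variable names are just illustrative):

```python
# Figures quoted from the paper; the variable names are illustrative.
MSGS_PER_5_MIN = 100        # successful spam messages per host per 5 minutes
BOTNET_MIN = 1_000_000      # paper's lower estimate of Storm's size
BOTNET_MAX = 5_000_000      # paper's upper estimate of Storm's size

per_hour = MSGS_PER_5_MIN * (60 // 5)   # 12 five-minute slots in an hour
per_day = per_hour * 24

low = per_day * BOTNET_MIN
high = per_day * BOTNET_MAX

print(f"{per_hour} per hour, {per_day} per day, per host")
print(f"{low / 1e9:.1f} to {high / 1e9:.1f} billion per day overall")
```

Strictly that works out at 28.8 to 144 billion, so the paper's "28 billion to 140 billion" is the same sum, rounded down a little.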

Minor nitpick with the paper — it notes that

Storm retrieves emails found in [certain] files and gathers information about possible hosts, users, and mailing lists that are referenced in these files. In particular, it looks for strings like “yahoo.com”, “gmail.com”, “rating@”, “f-secur”, “news”, “update”, “anyone@”, “bugs@”, “contract@”, “feste”, “gold-certs@”, “help@”, “info@”, “nobody@”, “noone@”, “kasp”, “admin”, “icrosoft”, “support”, “ntivi”, “unix”, “bsd”, “linux”, “listserv”, “certific”, “sopho”, “@foo”, “@iana”, “free-av”, “@messagelab”, “winzip”, “google”, “winrar”, “samples” , “abuse”, “panda”, “cafee”, “spam”, “pgp”, “@avp.” , “noreply” , “local”, “root@”, and “postmaster@”.

I would postulate that those strings are a stoplist — that in fact the worm avoids sending spam to addresses containing those strings. The presence of "abuse" and "postmaster" in particular would suggest that.
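To make the distinction concrete, here's a minimal Python sketch of how a substring stoplist would behave. It's purely illustrative: it is not Storm's code, the function name is made up, and only a handful of the strings from the paper's list are included.

```python
# Hypothetical sketch of a substring stoplist; not Storm's actual code.
# The strings are a small subset of those listed in the paper.
STOPLIST = ("abuse", "postmaster@", "root@", "spam", "sopho", "kasp", "icrosoft")

def is_skipped(address: str) -> bool:
    """Return True if the address contains any stoplist substring."""
    lowered = address.lower()
    return any(fragment in lowered for fragment in STOPLIST)

# Made-up harvested addresses for the demonstration.
harvested = [
    "alice@example.com",
    "abuse@example.net",
    "postmaster@example.org",
    "bob@sophos.example",
]
targets = [addr for addr in harvested if not is_skipped(addr)]
print(targets)
```

Under that reading, the abuse desk, the postmaster and the anti-virus vendor never receive the spam, which is exactly what a spammer trying to stay unreported would want.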

Long-lived spam via Yahoo! search

Back in May, I noticed some spam in my Moin Moin wiki, and fixed it.

As this Yahoo! Site Explorer view of taint.org demonstrates, Yahoo!’s search is still partly showing these results. Although the spam content was deleted long ago (example), and the title and text no longer contain those spam keywords, the results still show the spam title and URL.

Annoyingly, I’m still seeing referrer clickthroughs from search.yahoo.com to these deleted pages from lusers looking for porn, as a result. Come on Yahoo!, fix your search to notice the title change at least, so people don’t think the pages still contain porn!

Eircom WEP key-generation algorithm reversed

Over the weekend, this really hit the Irish blogosphere — several Irish guys have apparently figured out the algorithm used by Eircom to generate WEP keys.

I blogged that page in the link-blog this morning, but it’s worth writing about a little more. WEP is apparently easy to crack nowadays, so in a way all those wifi users were insecure anyway — but this is interesting as a case study of how not to write a key generator:

  • Compiled code != secret: the first mistake Eircom made was to generate the WEP key entirely from a little "secret" text, some "secret" shuffles, and the serial number of the hardware. There should always be some randomness in there. Compiled code running on a user’s desktop is not secret.

  • Don’t share secrets: Secondly, it’s a good demo of why you don’t generate two separate key values from the same source data. In this case, both the WEP key and the SSID are generated from the Netopia router’s serial number — and sufficient bits are accidentally exposed in the SSID to enable computation of the WEP key. (This is kind of moot in many cases, since the serial number is also exposed in the MAC address, in even more detail.)
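As a rough illustration of the first point, here's a Python sketch contrasting a key derived purely from a serial number with one that contains real randomness. The derivation below is NOT Eircom's algorithm: the hash, the "secret" text, the serial number and the function names are all made-up stand-ins for the comparison.

```python
import hashlib
import secrets

# Made-up stand-in for a "secret" baked into compiled code.
EMBEDDED_SECRET = b"not-really-secret"

def derived_key(serial: str) -> str:
    """Deterministic: anyone with the code and the serial gets the same key."""
    digest = hashlib.sha1(EMBEDDED_SECRET + serial.encode("ascii")).hexdigest()
    return digest[:26]              # 26 hex digits = 104 bits of WEP key

def random_key() -> str:
    """Random: nothing about the device helps an attacker reproduce it."""
    return secrets.token_hex(13)    # 13 random bytes = 26 hex digits

serial = "01234567"                 # example value, not a real serial number
print(derived_key(serial))          # the same output on every run
print(random_key())                 # different output on every run
```

Once the derivation has been recovered from the binary, the first function is all an attacker needs; the second gives them nothing to reverse.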

As far as I can tell — although it’s not quite clear who did what — that guy Kevin Devine did a pretty great job of reversing this code. Nice one.

I’m impressed that there’s now an app which detects the static tables (S-boxes, constants etc.) used in crypto algorithms — that idea seems very clever in retrospect, hadn’t occurred to me.

Here’s a boards.ie thread where this exploit was discussed; there are plenty more details there, if you’re curious. It seems this has been quietly floating around back-channels since the start of September.

(By the way, am I missing something, or did Eircom ship unstripped binaries for the key generator library? I could swear that when I looked at the Boards thread earlier today, there was a cut-and-paste from IDA Pro listing a function prototype. Oh dear; if so, add that to the ‘case study’ list above. ;)

It seems Eircom are now recommending all customers switch to WPA — good luck with that, since it’ll break all those Nintendo DSes. That won’t be popular!

Update: the original page seems to be down, but here’s the source for the command-line decoder: dessid.c. See also EirWep.

Oh noes!


Sorry to readers of Planet Antispam — it had stopped updating for a week, after the server move. I’d forgotten to restart the cron job… now fixed.

Taint.org Has Moved

I’m moving pretty much all my home sites and infrastructure from the venerable "dogma.boxhost.net" to a new host, "soman.fdntech.com". This weblog has just made the jump. Please leave a comment if you notice anything awry.

There may be a few rough edges, since I upgraded to WordPress 2.2.2 in the process; for example, my sooper-s3kr1t "what is my name" anti-spam protocol was set to not require a preview of all posted comments, or the correct answer — in just over an hour I received 25 spam comments… so it’s good to know it’s working ;)

Dublin-area Intro To Open Streetmap

A last-minute notice — the Irish Linux Users’ Group are organising an introduction to Open Streetmap tomorrow:

Open Streetmap : An Intro

The ILUG committee is organising an introduction to the Open Streetmap project on Saturday, 1st September, 2007 in Dublin.

This will include info on how to use your GPS and upload your data to the project, to contribute to a free and open map of the world.

The Hamlet Pub, Balbriggan (N 53.61396 W 6.20608 degrees)

Sat, 1st Sep 2007 2pm ~ 5pm

If you have a GPS and a laptop, please feel free to bring them. Wireless internet is available in the venue.

To register interest, please e-mail chairman-at-linux.ie

Not Cosmo

So, we were all set to name our new arrival Cosmo, assuming it was a boy. We were certain it was going to be a boy. Guess what? It wasn’t… so now we have to narrow down the girl-name shortlist in a hurry!

Isn’t she lovely? Lots more photees at Flickr.

Anyway, I may be hard to get hold of for a while… this lady will be keeping me busy I think ;)

Update: Looks like the name is Beatrice Lily Mason, although there’s still a fair bit of indecision, unfortunately ;)

Update 2: Beatrice Lily Gray Mason. Final answer!

Stupid Unicode Tricks

Cool Unicode trick, via Mantari — cut and paste this character into a Unicode-aware application (like this post’s comment box!), then type something and see what happens:

‫‬‭‮‪‫‬‭‮҉

My Nokia 770

A couple of weeks back, there was quite a bit of buzz in the Irish blogosphere and elsewhere about the Nokia 770; prices for new N770s had dropped from $290ish to a very reasonable $140 / EUR130-ish price-point. I, along with a good few others, bought one.

I bought mine through Expansys, with a free 1GB RS-MMC memory card. They’ve sold out and no longer have any N770s listed; however, Buy.com still seem to have them in stock, so if you’re interested, you can probably still pick one up. (It seems Nokia is trying to sell off their remaining N770 stock, cheap, with plans to drop support for the software platform. I’m fine with this, but it may put other buyers off.)

I’ve now been using it for a while, and am still happy. ;) Here are my recommended top apps:

Slimserver. Originally designed to operate as the backend software for the Squeezebox thin-client MP3 player, this has a fantastic UI built for the N770, and its MP3 stream output works perfectly on the tablet.

This is by far the neatest way to get at a 6000-song music library without a laptop; there was some talk in the GNOME community of making a decent DAAP client, but so far there are no working results there that I could find. :(

maemo-mapper. This is a fantastic mapping app for the tablet; it presents map tiles downloaded from OpenStreetMap or Google Maps in an N770-optimized format, with the usual nice draggable UI. Bonus: it’ll work offline, so you can follow a route while online, then take the tablet along to help navigate.

Tip: once you start maemo-mapper, click the "Download…" button in the "Repository Manager" and it’ll download details for the 5 most useful map repositories, including Google and Virtual Earth.

FBReader. A very nice document reader; much nicer than trying to read long HTML pages in the builtin web browser, especially since it allows you to turn the device on its side.

In general, the Opera Mini browser works fine; be sure to enable Javascript and set up a swap file on the RS-MMC card first. It does all the basic HTML and rudimentary AJAX; Google Calendar is a no-go, but GMail and even Google Maps work adequately, modulo minor bugs. Plain Old HTML sites like Wikipedia, IMDB and so on all work great.

As long as you’re realistic about the platform, it won’t disappoint — video requires custom transcoding, for example, and proprietary apps like Flash and RealPlayer lag behind their desktop equivalents, but as far as I can tell that’s the case for every embedded platform. (Since I spent a couple of years developing such a platform, I’m quite comfortable with this.)

A really really nifty thing about the N770 is that it’s now entirely hackable: within 30 minutes of powering on, I was able to get a terminal window open with a root prompt, and was adding ext3 partitions to the RS-MMC card. Apps are installed using "apt-get". The terminal even has a word-completion system optimized for the UNIX command-line. Nice ;)

This SomethingAwful thread contains plenty more good tips. I’m happy I bought it — so many of these gadgets can wind up as an overpriced door-stop, but this is easily worth what I paid for it.

Update: this thread at InternetTabletTalk seems pretty chock-full of good advice, too.

Test my auto-generated ruleset

(I posted this to the SA users and dev lists, too.)

I’ve been working on a new way to auto-generate body rules recently (see previous posts). The results are checked into SVN trunk daily in the "rulesrc/sandbox/jm/20_sought.cf" file.

We haven’t had much time to figure out how to produce auto-generated 3.2.x rule updates for our entire ruleset at updates.SpamAssassin.org, so instead of dealing with that, I’ve taken a shortcut around it ;) I’m now making just the "20_sought.cf" ruleset available as a standalone, unofficial sa-update ruleset at sought.rules.yerp.org.

Before using it, you’ll need the GPG key:

  wget http://yerp.org/rules/GPG.KEY
  sudo sa-update --import GPG.KEY

then use this to update:

  sudo sa-update \
        --gpgkey 6C6191E3 --channel sought.rules.yerp.org \
        [...other channels...] \
        --channel updates.spamassassin.org

(similar to how you’d use Daryl’s sa-update version of the SARE rulesets.)

Feel free to run sa-update as frequently as you like.

Please consider it alpha; I may take it down in a few months depending on how it goes, or if we can get it working as part of the core updates. In the meantime though, I’m curious to hear how you get on with it. (In particular, copies of false positives would be very welcome.)

Update: it’s been very successful, so I’d now consider it in production.

The Prime Time Group pump-and-dump

Spamnation.info links to an interesting article by Computerworld’s Gregg Keizer about the massive PRTH.PK spam run.

As usual, there is no shortage of suckers:

The spam blast did drive up Prime Time’s share price from Monday’s low of around 7 cents to Wednesday’s high of 11 cents, a 57% jump. Thursday morning, however, the bottom dropped out, and the stock fell to under 7 cents. Trading volumes peaked Wednesday as well, at around 1.7 million shares, substantially higher than any day in the month prior. "You can actually see the wave of activity in the stock and compare it with the volume of spam that we trapped," said [Sophos analyst Ron] O’Brien.

But here’s an interesting new tactic by the good guys:

Last Wednesday afternoon, Prime Time announced that it was ordering a Non Objecting Beneficial Owners (NOBO) list to get a clearer picture of who owned its shares. "The NOBO list will be used to determine the naked short positions in Prime Time Group Inc.," the company said in a statement. "The finding will then be reported to the [National Association of Securities Dealers] to take action against the violators of the naked short regulations."

"Naked short" is a investment term that refers to selling short, essentially a bet that the price will drop, but with a twist: "naked" means that the investor sells short without first making sure he can borrow the shares from another investor holding a "long" position on the stock.

I hope this works; it’d be great to see the profit mechanism behind pump-and-dump spam killed off.

Spamnation notes:

Incidentally, the greeting card spam that built the botnet used to promote PRTH.PK and CYTV.OB also continues. It has iterated through another couple of generations: the current incarnation tells recipients to collect their custom Musical ecard or custom Movie-quality ecard or other variants on that theme. We’ve seen about 150 of these in the past three days, suggesting that the unknown senders are probably well on their way to building up another botnet for their next stock spam run.

Spreading trojans via greeting-card spam is a trademark of the gigantic Storm botnet, AFAIK: SecureWorks info, MessageLabs info, spam levels causing DDoS for Canadian networks, DDoS threat for EDU sector.

The Haughey 419 returns

A few months back, Blogorrah noted an amazing 419 scam, claiming to be a missive from ex-Taoiseach of Ireland Charlie Haughey’s wife, Maureen. It’s really quite appropriate Charlie becoming the subject of a scam himself, given what he did to this country. But anyway… over the weekend, a new variant on the theme emerged:

From Mrs Maureen Haughey, ROI

My Dear Friend,

I am Maureen Haughey, widow of former Taoiseach of the Republic of Ireland, Charles J. Haughey and daughter of former Taoiseach of the Republic of Ireland and heir to de Valera, Sean F. Lemass.The Press has written a lot about unresolved mysteries and corruption surrounding Charles’s dealings, but I tell you something,my Charlie was a good man. He was human and he did whatever he did.

People marvel why I stuck with Charlie and didn’t speak during the mess that came with the exposure of his affairs with Terry Keane (I just hate to think of her). I had to stand by him through the tribunal times…. it was to do with what I’m doing now. No one knew the details of all Charlie’s financial dealings but me. I remain the only one who knows all who got loans from Charlie and didn’t come back to pay when he was disgraced. I am the only one who knows about these monies and the other Ansbacher accounts.

I write to you, an old weary woman, sick and almost tired of living. My end is near but I will not depart until my final mission is accomplished and I also write this with an unshaken belief in the power of aspirations and dreams of a human being. The Irish government thinks it can shave and reduce me to a poor widow but I have the winning ace. A few years ago, when we weren’t sure if my Charlie would be convicted, he kept some money in trust for me in a Security and Finance company. He did not open the account in our names so it will not be traced to us to enable the past remain the past. The name on the account is Cedric de Vregille. I never thought Charlie would leave me so soon and it never occurred to me to ask if this name were fictitious or not or a name of any of his friends. I have tried to find this man but to no avail. The amount he deposited in this name is 30,000,000 (Thirty Million Euros).

I want an honest person to come forward and lay claims to this amount, moreover to use the funds as instructed by me. I have all the documents needed, I just need a face for the name. I have mapped out 30% of the funds for you, as you will help us (you and I) execute this job.

As soon as I receive your acceptance for this work I shall give you necessary details of my solicitor who will facilitate the release of the funds in your name. Please reply me via my personal email: maureen_haughey67@yahoo.co.uk


For my security and the sake of letting sleeping dogs lie, I strongly advice that you keep our dealings confidential. You can read more about my charlie from:

http://www.ireland.com/focus/haughey/ITstories/story11.htm

http://www.teachersparadise.com/ency/en/wikipedia/c/ch/charles_haughey.html

http://www.everything2.com/index.pl?node_id=548983&lastnode_id=0

Thank You.


Message sent using UebiMiau 2.7.2

It was sent via a webmail system at nildram.co.uk, from a proxy in Australia.

The writing is amazingly ornate — ‘I write to you, an old weary woman, sick and almost tired of living’, ‘the Irish government thinks it can shave and reduce me to a poor widow but I have the winning ace’, etc. Very odd stuff. Also, it looks spell-checked. And, once again, poor old cyclist Cedric de Vregille gets dragged into it, too! I wonder what he did to deserve that ;)

If you fancy scambaiting, ‘maureen_haughey67@yahoo.co.uk’ is the one to go for. These guys seem to be having a good go of it: ‘The thought of the Irish government trying to shave an old woman has shocked and appauled me, so I will assist in anyway possible.’ ha!

Rule Discovery Progress Update

Back in March, I wrote a post about a new rule discovery algorithm I’d come up with, based on the BLAST bioinformatics algorithm. I’m still hacking on that; it’s gradually meandering towards production status, as time permits, so here’s an update on that progress.

There have been various tweaks to improve memory efficiency; I won’t go into those here, since they’re all in SVN history anyway. But the results are that the algorithm can now extract rules from 3500 spam and 50000 ham messages without consuming more than 36 MB of RAM, or hitting disk. It can also now generate a SpamAssassin rules file directly, and apply a basic set of QA parameters (required hit rate, required length of pattern, etc.).

On top of this, I’ve come up with a workflow to automatically generate a usable batch of rules, on a daily basis, from a spam and ham corpus. This works as follows:

  • Take a sample of the past 4 days’ traffic from our spamtrap network. Today this was about 3000 messages.

  • Add the hand-vetted spam from my own accounts over the same period, about 3400 messages (this helps reduce bias, since spamtraps tend to collect a certain type of spam).

  • Discard spams that scored over 10 points (to concentrate on the stuff we’re missing).

  • Pass the remaining 3517 spams, and text strings from over 50000 nonspam messages, into the "seek-phrases-in-log" script, specifying a minimum pattern length of 30 characters, and a minimum hitrate of 1% (in today’s corpus, a rule would have to hit at least 34 messages to qualify).

  • That script gronks for a couple of minutes, then produces an output rules file, in this case containing 28 rules, for human vetting. (Since I’ve started this workflow, I’ve only had to remove a couple of rules at this step, and not for false positives; instead, they were leaking spamtrap addresses.)

  • Once I’ve vetted it, I check it into rulesrc/sandbox/jm/20_sought.cf for testing by the SpamAssassin rule QA system.
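The length and hit-rate cut-offs in that workflow amount to a simple filter over candidate patterns. Here's a minimal Python sketch of the idea, assuming candidates arrive as (pattern, spam hits) pairs. It is not the seek-phrases-in-log script itself; the function name, the data layout and the sample data are all invented for illustration.

```python
# Illustrative only: not the real seek-phrases-in-log implementation.
MIN_PATTERN_LENGTH = 30     # characters, as in the workflow above
MIN_HIT_RATE = 0.01         # 1% of the spam corpus

def qualifying(candidates, corpus_size):
    """Keep patterns that are long enough and hit enough spam messages."""
    kept = []
    for pattern, spam_hits in candidates:
        long_enough = len(pattern) >= MIN_PATTERN_LENGTH
        frequent_enough = spam_hits / corpus_size >= MIN_HIT_RATE
        if long_enough and frequent_enough:
            kept.append(pattern)
    return kept

# Invented sample data for a hypothetical 1000-message spam corpus.
candidates = [
    ("Home Networking For Dummies 3rd Edition $10 ", 42),    # long and frequent
    ("click here", 300),                                      # too short
    ("a sufficiently long phrase that is rarely seen", 3),    # too rare
]
print(qualifying(candidates, corpus_size=1000))
```

A filter like this only covers the spam side; checking the survivors against the ham corpus is a separate step, and the human vetting still comes after both.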

The QA results for the ruleset from yesterday (Aug 3) can be seen here, and give a pretty good idea of how these rules have been performing over the past week or two; out of the nearly 70000 messages hit by the rules, only 2 ham mails are hit — 0.0009%.

In fact, I measured the ruleset’s overall performance in the logs from the 4 mass-check contributors who provided up-to-date data in yesterday’s nightly mass-check: bb-jm, jm, daf, dos, and theo (all SpamAssassin committers).

Contributor   Hits    Spams    Percent
bb-jm         4249    24996    17.00%
jm            3450    14994    23.00%
daf           1236    35563     3.48%
dos          32867   100223    32.79%
theo         28077   382562     7.34%

(bb-jm and jm are both me; they scan different subsets of my mail.)

The "Percent" column measures the percentage of their spam collection that is hit by at least one of these rules; it works out to an average of 16.72% across all contributors. This is underestimating the true hitrate on "fresh" spam, too, since the mass-check corpora also include some really old spam collections (daf’s collection, for example, looks like it hasn’t been updated since the start of July).
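For anyone who wants to check the table, each percentage is just hits divided by spams, and the 16.72% figure is the plain (unweighted) mean of the five rows. A short Python sketch using the numbers from the table:

```python
# (contributor, hits, spams), copied from the table.
rows = [
    ("bb-jm", 4249, 24996),
    ("jm", 3450, 14994),
    ("daf", 1236, 35563),
    ("dos", 32867, 100223),
    ("theo", 28077, 382562),
]

# Per-contributor hit rate, as a percentage of that contributor's spam.
percents = {name: 100 * hits / spams for name, hits, spams in rows}
mean_percent = sum(percents.values()) / len(percents)

# Pooled rate: total hits over total spams, so big corpora weigh more.
total_hits = sum(hits for _, hits, _ in rows)
total_spams = sum(spams for _, _, spams in rows)
pooled_percent = 100 * total_hits / total_spams

for name, pct in percents.items():
    print(f"{name:6} {pct:6.2f}%")
print(f"mean   {mean_percent:6.2f}%")
print(f"pooled {pooled_percent:6.2f}%")
```

Recomputed this way, the jm row comes out a hair above the 23.00% shown, so the first two rows may have been rounded to whole numbers first; the mean is 16.72% either way. The pooled figure is lower, at about 12.5%, because theo's very large corpus dominates it.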

Even better, a look at the score-map for these rules shows that they are, indeed, hitting the low-scoring spam that other rules don’t hit.

That’s pretty good going for an entirely-automated ruleset!

The next step is to come up with scores, and publish these for end-user use. I haven’t figured out how this’ll work yet; possibly we could even put them into the default "sa-update" channel, although the automated nature of these rules may mean this isn’t a goer.

If you’re interested, the hits-over-time graph for one of the rules (body JM_SEEK_ICZPZW / Home Networking For Dummies 3rd Edition \$10 /) can be viewed here.

Host monitoring with Jaiku

A few weeks back, we were having trouble with dogma, our shared server where taint.org is hosted, which would occasionally be unavailable for unknown reasons. We needed to monitor its availability so that it could be fixed when it crashed again, and we’d be able to investigate quickly. Since it was happening mostly out of working hours, SMS notification was essential.

Normally, that kind of monitoring is pretty basic stuff, and there are plenty of services out there that can do it, from Host-Tracker.com to the more complex self-hosted apps like monit and Nagios. But looking around, I found that none of them offered SMS notification for free, and since this was our personal-use server, I wasn’t willing to sign up for a $10-per-month paid account to support it, or buy any hardware to act as a private SMS gateway.

Instead, I thought of Jaiku — the Finnish company which offers a microblogging/presence platform similar to Twitter. Jaiku had a couple of cool features:

  • SMS notifications
  • it’s possible to broadcast messages to a "channel", which others could subscribe to, IRC-style
  • it has an open API

This would allow me to notify any interested party of dogma’s downtime, allowing subscribers to subscribe and unsubscribe using whatever notification systems Jaiku support.

With a little perl and LWP, I rigged up a quick monitoring script to check http://taint.org/ via HTTP, and report if it was unavailable over the course of 5 retries in 50 seconds. If it was broken, the script sends a JSON-formatted POST request to Jaiku’s "presence.send" method, informing the target channel of the issue. (Perl source here.)
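For anyone who wants to try something similar, here is a rough sketch of that retry-then-notify logic in Python rather than the original Perl. It is not the script linked above: the JSON field names are made up for illustration, and the `send` callback stands in for whatever actually delivers the POST to Jaiku, whose request format I am not reproducing here. Five probes about ten seconds apart approximates the "5 retries in 50 seconds" behaviour.

```python
import json
import time
import urllib.error
import urllib.request


def is_up(url, timeout=10):
    """Return True if an HTTP GET of `url` completes without an error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts and HTTP 4xx/5xx.
        return False


def check_with_retries(url, probe=is_up, retries=5, delay=10, sleep=time.sleep):
    """Treat the site as down only if every one of `retries` probes fails."""
    for attempt in range(retries):
        if probe(url):
            return True
        if attempt < retries - 1:
            sleep(delay)
    return False


def build_alert(channel, url):
    """Build a JSON alert body. The field names here are hypothetical."""
    return json.dumps({"channel": channel, "message": "%s is unreachable" % url})


def monitor_once(url, channel, send, probe=is_up, sleep=time.sleep):
    """Run one monitoring pass; call `send(body)` only if the site is down."""
    if check_with_retries(url, probe=probe, sleep=sleep):
        return True
    send(build_alert(channel, url))
    return False
```

Run from cron every few minutes, `monitor_once("http://taint.org/", "#dogmastatus", send)` would stay silent while the site answers and fire a single alert per failed pass. Requiring all five probes to fail keeps a single dropped request from triggering an SMS.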

The ‘#dogmastatus’ channel is here; as you can see, we fixed the problem with dogma just over 2 weeks ago ;)

It’s worth noting that I had to set up an additional user, "downtimebot", on Jaiku to send the messages — otherwise I’d never see them on my configured mobile phone! Jaiku uses the optimisation that, if I sent the message, there’s no need to cc me with a copy of what I just sent; logical enough.

Anyway, if you’re interested in dogma’s availability (there might be one or two taint.org readers who are), feel free to add yourself to the #dogmastatus channel and receive any updates.

Update: Fergal noted that it’s pretty simple to use Cape Clear’s assembly framework to perform an HTTP ping test with output to Jabber/XMPP. Nifty!

A fishy Challenge-Response press release

I have a Google News notification set up for mentions of "SpamAssassin", which is how I came across this press release on PRNewsWire:

Study: Challenge-Response Surpasses Other Anti-Spam Technologies in Performance, User Satisfaction and Reliability; Worst Performing are Filter-based ISP Solutions

NORTHBOROUGH, Mass., July 17 /PRNewswire/ — Brockmann & Company, a research and consulting firm, today released findings from its independent, self-funded "Spam Index Report– Comparing Real-World Performance of Anti-Spam Technologies."

The study evaluated eight anti-spam technologies from the three main technology classes: filters, real-time black list services and challenge-response servers. The technologies were evaluated using the Spam Index, a new method in anti-spam performance measurement that leverages users’ real-world experiences.

[…] The report finds that the best performing anti-spam technology is challenge-response, based on that technology’s lowest average Spam Index score of 160.

[…] Filter – Open Source software-(Spam Index: 388): This technology is frequently configured to work in conjunction with PC email client filters. The server adds SPAM to the subject line so that the client filter can move the message into the junk folder. This class of software includes projects such as ASSP, Mail Washer and SpamAssassin, among others.

The "Spam Index" is a proprietary measurement of spam filtering, created by Brockmann and Company. A lower "Spam Index" score is better, apparently, so C/R wins! (Funny that. The author, Peter Brockmann, seems to have some kind of relationship with C/R vendor Sendio, being quoted in Sendio press releases like this one and this one, and providing a testimonial on the Sendio.com front page.)

However, there’s a fundamental flaw with that "Spam Index" measurement: it’s designed to make C/R look good. Here’s how it’s supposed to work. Take these four measurements:

  • Average number of spam messages each day x 20 (to get approximate number per work-month)
  • Average minutes spent dealing with spam each day x 20 (to get approximate minutes per work-month)
  • Number of resend requests last month
  • Number of trapped messages last month

Then sum them, and that gives you a "Spam Index".
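Written out as code, the calculation described above is nothing more than an unweighted sum. This is a minimal sketch based only on the four measurements listed in the press release; the function and parameter names are mine, and the sample numbers are made up purely to show the arithmetic.

```python
def spam_index(spam_per_day, minutes_per_day, resend_requests, trapped_messages,
               workdays_per_month=20):
    """Sum the four survey measurements into a single 'Spam Index' figure."""
    return (spam_per_day * workdays_per_month       # spam messages per work-month
            + minutes_per_day * workdays_per_month  # minutes spent per work-month
            + resend_requests                       # resend requests last month
            + trapped_messages)                     # trapped messages last month


# Made-up example: 5 spams/day, 2 minutes/day, 3 resend requests, 40 trapped.
print(spam_index(5, 2, 3, 40))  # 100 + 40 + 3 + 40 = 183
```

Note that every term carries the same weight of 1 once the daily figures are scaled up to a month, and nothing in the sum counts challenges sent to third parties.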

First off, let’s translate that into conventional spam filter accuracy terms. The ‘minutes spent dealing with spam each day’ measures false negatives, since having to ‘deal with’ (i.e. delete) spam means that the spam got past the filter into the user’s inbox. The ‘number of trapped messages’ means, presumably, both true positives (spam marked correctly as spam) and false positives (nonspam marked incorrectly as spam). The ‘number of resend requests last month’ also measures false positives, although it will vastly underestimate them.

Now, here’s the first problem. The "Spam Index" therefore considers a false negative as about as important as a false positive. However, in real terms, if a user’s legit mail is lost by a spam filter, that’s a much bigger failure than letting some more spam through. When measuring filters, you have to consider false positives as much more serious! (In fact, when we test SpamAssassin, we consider FPs to be 50 times more costly than a false negative.)
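To make the difference concrete, here is a small sketch of what weighting false positives 50 times more heavily does to a comparison. It is only an illustration of the 50:1 ratio mentioned above, not SpamAssassin’s actual evaluation code, and the two filters and their error counts are invented.

```python
def weighted_cost(false_positives, false_negatives, fp_weight=50):
    """Error cost where each false positive counts `fp_weight` times a false negative."""
    return fp_weight * false_positives + false_negatives


# Invented example: filter A loses 1 legit mail and misses 10 spams;
# filter B loses no legit mail but misses 40 spams.
filter_a = (1, 10)
filter_b = (0, 40)

# Counting every error equally, A looks better (11 errors against 40).
print(sum(filter_a), sum(filter_b))

# With the 50:1 weighting, B comes out ahead (60 against 40).
print(weighted_cost(*filter_a), weighted_cost(*filter_b))
```

An equal-weight measure like the "Spam Index" would rank these two filters the opposite way round.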

Here’s the second problem. Spam is sent using forged sender info, so if a spammer’s mail is challenged by a Challenge/Response filter, the challenge will be sent to one of:

  • (a) an address that doesn’t exist, and be discarded (this is fine); or
  • (b) to an invalid address on an innocent third-party system (wasting that system’s resources); or
  • (c) to an innocent third-party user on an innocent third-party system (wasting that system’s resources and, worst of all, the user’s time).

The "Spam Index" doesn’t measure the latter two failure cases in any way, so C/R isn’t penalised for that kind of abusive traffic it generates.

Also, if a good, nonspam mail is challenged, either

  • (a) the sender will receive the challenge and take the time to jump through the necessary hoops to get their mail delivered ("visit this web page, type in this CAPTCHA, click on this button" etc.); or
  • (b) they’ll receive the challenge, and not bother jumping through hoops (maybe they don’t consider the mail that important); or
  • (c) they’ll not be able to act on the challenge at all (for example, if an automated mail is challenged).

Again, the "Spam Index" doesn’t measure the latter two failure cases.

In other words, the situations where C/R fails are ignored. Is it any wonder C/R wins when the criteria are skewed to make that happen?

Stop with the fake phish data

An anonymous friend in the anti-phishing community writes:

For those of you who blog and/or have contacts in the general computer user ‘go fight ’em’ community:

Is there any way you can get the word out that dropping a couple hundred fake logins on a phishing site is NOT appreciated??

It creates havoc for those monitoring the drop since it’s an unbelievable waste of time and resources to clean up the file. Also, for those drop files that ‘recycle’ after every 10 entries, valid data is lost.

It also creates havoc for those who get these files and try to notify victims. They waste time, too .. pulling legit info from amongst the trash.

I know there are programs out there that create/dump this stuff onto sites and some who call themselves ‘phish phighters’ enjoy the harassment aspect. But it wastes the time/effort of those who are seriously working these things.

New Science Gallery in Dublin

I just got this missive from the new Science Gallery at Trinity College Dublin:

The SCIENCE GALLERY is seeking EXPRESSIONS OF INTEREST for Festival of Light projects.

Calling all techno-artists, playful scientists, renegade engineers, architects, sculptors, lighting designers, fashion designers, guerilla projectionists and inventors…

The Science Gallery at Trinity College Dublin is developing a two week FESTIVAL OF LIGHT as its launching programme in January 2008 which will celebrate the art, science and technology of light through a range of installations and events in the Science Gallery and around Dublin’s city centre.

We are seeking proposals for installations, events and workshops. You can download our Expression of Interest form here. We would like this to reach far and wide so please forward this onto anyone you think may be interested in submitting!

If you would like to discuss your ideas with us or would like further information prior to submitting an Expression of Interest Submission please contact Elizabeth Allen at elizabeth.allen /at/ sciencegallery.org .

I’m looking forward to seeing what happens with this; hope it works out well.

T9 in Ireland

Tobias DiPasquale notes (http://blog.cbcg.net/articles/2007/07/11/damn-they-thought-of-that-too) that the iPhone’s dictionary can correct the word ‘f***ing’ right out of the box. Handy!

The vagaries of various companies’ autocompletion dictionaries are always worth a comment. I’ve noticed that swearing is generally omitted, presumably for prudish reasons to do with tabloid PR fears. But as an Irishman, I find it particularly galling that Nokia’s T9 dictionary cycles through the following entries for "pints":

  • Shots
  • Pious
  • Riots
  • Pints

When I type "pints" (which happens a lot), believe me, I never mean to type "pious". Stupid phone!
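The reason those four words share a slot is that they all map to the same key sequence on a standard phone keypad. A quick sketch that checks this, assuming the usual letter layout (2=abc through 9=wxyz); it says nothing about how Nokia orders the candidates, which presumably depends on their own word-frequency data.

```python
# Standard phone keypad letter assignments.
T9_KEYS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in T9_KEYS.items() for letter in letters
}


def t9_digits(word):
    """Return the keypad digit sequence a user presses to type `word`."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


def collisions(words):
    """Group words by their digit sequence, so clashes end up in one list."""
    groups = {}
    for word in words:
        groups.setdefault(t9_digits(word), []).append(word)
    return groups


print(collisions(["shots", "pious", "riots", "pints"]))
# {'74687': ['shots', 'pious', 'riots', 'pints']}
```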

Planet Antispam unborked

Those of you who visit Planet Antispam may have noticed that it hadn’t been updating in a few days. Somehow or other, the Planet software had corrupted its cache, and was dying with this error:

Traceback (most recent call last):
  File "planet.py", line 167, in ?
    main()
  File "planet.py", line 160, in main
    my_planet.run(planet_name, planet_link, template_files, offline)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 240, in run
    channel = Channel(self, feed_url)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 527, in __init__
    self.cache_read_entries()
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 569, in cache_read_entries
    item = NewsItem(self, key)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 845, in __init__
    self.cache_read()
  File "/home/planet/antispam/planet-2.0/planet/cache.py", line 74, in cache_read
    self._type[key] = self._cache[cache_key + " type"]
  File "/usr/lib/python2.3/bsddb/__init__.py", line 116, in __getitem__
    return self.db[key]
KeyError: 'tag:blogger.com,1999:blog-9336495.post-117499582419244211 feedburner_origlink type'

Ah, Berkeley DB, always good for the infrequent inscrutable, yet fatal, error. A wipe of the contents of the cache directory, and it seems to be working again.
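If you need to do the same, the wipe amounts to deleting everything inside the cache directory while leaving the directory itself in place. A minimal sketch, assuming you pass in the path your own Planet configuration uses for its cache (I am not naming a path here, since it varies by install); the next run then has to rebuild the cache from the feeds.

```python
import shutil
from pathlib import Path


def wipe_cache(cache_dir):
    """Delete everything inside `cache_dir`, keeping the directory itself.

    Returns the number of top-level entries removed.
    """
    cache_dir = Path(cache_dir)
    removed = 0
    for entry in cache_dir.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)   # remove subdirectories recursively
        else:
            entry.unlink()         # remove plain files and symlinks
        removed += 1
    return removed
```

Be careful to point it at the cache directory and nothing else; it deletes without asking.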

Unfortunately, I had to drop the RSS feed for Aunty Spam; it seems the domain has lapsed, and I can’t seem to find an RSS feed that contains just the spam-related Aunty Spam posts any more.

‘I Go Chop Your Dollar’ star arrested

The Register is reporting that ‘Nigerian comedian and actor Nkem Owoh’ has been arrested in Amsterdam as a suspected 419 scammer:

Nigerian comedian and actor Nkem Owoh was one of the 111 suspected 419 scammers arrested in Amsterdam recently as part of a seven month investigation, dubbed Operation Apollo.

Owoh became a well known star within the Nigerian film industry, sometimes colloquially known as Nollywood because of its trite plots, poor dialogue, terrible sound, and low production standards.

Owoh starred in the 2003 film Osuofia, and a year later was one of several actors temporarily banned from appearing in movies by Nigeria’s Association of Movie Marketers and Producers because he demanded excessive fees and unreasonable contract demands.

Owoh became internationally known for his song "I Go Chop Your Dollar", the anthem for 419 scammers ("Oyinbo man I go chop your dollar, I go take your money and disappear / 419 is just a game, you are the loser, I am the winner", full lyrics here), which was banned in Nigeria after many complaints.

The song was the title track from the comedy, "The Master", starring Owoh as a scheming 419er.

The alleged scammers are suspected of running a series of lottery-based (AKA 419-lite) scams.

Here’s the video for "I Go Chop Your Dollar".

It’s not exactly cut and dried, though. This thread suggests that he wasn’t arrested for fraud; instead that the Dutch authorities detained pretty much everyone at his concert. This article suggests similar:

The Netherlands police were said to have stormed the venue of the show in a helicopter about 2a.m and arrested practically everybody at the venue. […]

"Over 200 of them (Nigerians) were arrested that night. It was a big haul; they came with helicopter and cars and circled the whole area. As I speak with you, over 70 of those apprehended that night have been deported for possession of expired or fake immigration papers.

"Osuofia was also whisked away but was released hours after," the source said.

Update: It appears Osuofia was not arrested after all; lots more details here.

Hunting the wily mangosteen

A few weeks ago, I was in Tesco Clearwater when I spotted something I wasn’t expecting; a tray of fruit labelled "Mangosteen".

Mangosteen are delicious. In Thailand, they’re called "the queen of fruit" (with the oh-so-stinky and not quite as enjoyable Durian as the king). We once spent a week on a Thai beach snacking on bags of the things; they’re so good.

Unfortunately the tray was empty. :(

Ever since then, every time I’ve gone back to that Tesco, there’s been no sign of the mangosteen; not even another empty tray! Thing is, I now know they’re importing them, so I’m really jonesing… if any Dublin taint.org readers happen to spot some, please (a) be sure to buy some for yourself and (b) let us know where you found it!

Linking for charidee

Tom tagged me with another blog link-meme — a worthwhile one, though; the idea is to improve the page rank of charities in Ireland, by linking to them. Fair enough!

The list of charities so far is:

And I’ll add Focus Ireland (who seem to have broken their website!). Thanks to Dorothy for the suggestion.

Who to pass it on to? How’s about Una, James and Donncha?

NSAI invites comments on OOXML/OpenXML standard

Antoin writes:

NSAI (the Irish national standards body) has posted an invitation for comments on its site regarding the proposed new Office Open XML standard (ISO/IEC DIS 29500). NSAI has established an ad hoc committee to consider the matter, and I am a member of that committee, together with a number of far more important and qualified people.

Anyway, we are anxious to hear from anyone who has a view on what way NSAI should vote on this standard when it reaches committee. If you can provide links to any relevant articles, that would also be very helpful. If you have time, please review the documents and leave your comments either here or send them to the committee.

So if you’ve been following the ongoing drama (to be honest, I haven’t), please feel free to make a submission; the deadline is 11 July.

UPS Ireland suck

I’m waiting for a replacement battery from Dell, covered under warranty. Dell service have been great, but UPS, not so much…

On Monday (25th June), after a little back-and-forth to establish that the battery was faulty, I got a mail from Dell saying:

The Part (Battery) will be with you tomorrow pre 17:00 (Next Business Day). Please note that you will require to return the faulty part at the same point of time, the courier person would not be delivering the part until you return the defective part.

Great! That’s good warranty service. I’m happy.

So I wait… and wait. Finally, 2 days later, today (Wednesday 27th), at 17:45, a courier appears to pick up the faulty part. Unfortunately, he doesn’t have the replacement with him.

I go online to see what’s up via online tracking, and see this:

Location             Date        Local Time  Description
DUBLIN, IE           27/06/2007  16:41       A CORRECT STREET NAME IS NEEDED FOR DELIVERY. UPS IS ATTEMPTING TO OBTAIN THIS INFORMATION
                     27/06/2007  4:13        IN-TRANSIT SCAN
                     27/06/2007  4:12        IMPORT SCAN
DUBLIN, IE           26/06/2007  18:31       IMPORT SCAN
                     26/06/2007  5:59        IMPORT SCAN
                     26/06/2007  5:58        OUT FOR DELIVERY
                     26/06/2007  3:59        ARRIVAL SCAN
KOELN (COLOGNE), DE  26/06/2007  4:39        DEPARTURE SCAN
                     26/06/2007  4:14        DEPARTURE SCAN
HERKENBOSCH, NL      25/06/2007  10:09       ORIGIN SCAN
NL                   25/06/2007  14:02       BILLING INFORMATION RECEIVED

So, what, the street name is "INCORRECT" despite one UPS driver having no problem? I suspect someone just couldn’t be arsed.

I rang up UPS, provided a hint, and it seems the delivery is now rescheduled for Friday. So much for "next business day" delivery! Lucky the laptop works on AC without the battery, otherwise I’d be quite annoyed.

I wonder if I can provide feedback to Dell about this? There’s a possibility they might switch courier company if they get enough complaints about crappy service. It also makes me wonder if there’s any decent international parcel delivery service in Ireland. At least UPS haven’t yet required me to schlep over to a "local" depot 5 miles away to pick up the package myself, like An Post does…

How I wound up with a pond

My weekend went like this:

  1. buy a Green Cone composting system
  2. read instructions
  3. find out I had to dig a 3′ by 2′ deep hole
  4. spend all Saturday afternoon digging massive hole in the back garden, horny-handed son of toil style
  5. just as I finish, the skies open
  6. watch in horror as the hole rapidly becomes a pond
  7. since the green cone requires a dry hole, wait for it to drain…
  8. …and wait…
  9. …and wait…

I’m still waiting. :(

I just hope the flooded state of the pit is a side effect of the monsoon levels of rain over the last week, and will drain soon, rather than the normal situation for the garden. Otherwise, I’ll have to fill the hole and give up on the Green Cone entirely… argh. I should have gone for the wormery option, like lisey suggested!

Update: Enda left a good tip in the comments — dig deeper into the clay and fill in with more gravel. I did that and it looks like it’s working… Let’s see if the worms like it. I’ll keep yis posted ;)

How to solve a maze with Photoshop

wow, this is cool. lod3n, confronted by this heinous puzzle, wrote:

‘2 minutes in Photoshop. All too easy. So, where do I pick up my cake?

  1. Increase contrast.
  2. Select the right wall of the maze using the magic wand.
  3. Select > Modify > Expand 4 pixels
  4. Create new layer.
  5. Fill with Red.
  6. Select > Modify > Contract 2 pixels.
  7. Delete. Now you’ve got a line tracing the solution.
  8. Manually clean up the outer edge, and connect the dots.
  9. Cake!’

Here’s the result. Seriously nifty!
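My reading of why this works: in a maze like that one, the walls fall into two connected pieces, and the solution is the corridor that runs between them, so outlining one piece traces the route. Here is a loose grid analogue in Python, not the Photoshop procedure itself. It assumes a toy maze drawn with `#` for walls, with exactly two wall pieces and one-cell-wide corridors; the flood fill plays the part of the magic wand, and "touches both wall pieces" stands in for the expand-and-contract outline.

```python
from collections import deque

# Toy maze: '#' is wall, 'S' and 'E' are the openings, spaces are corridor.
MAZE = [
    "#########",
    "S   #   #",
    "# # # # #",
    "# #   # E",
    "#########",
]


def label_walls(grid):
    """Flood-fill 4-connected wall cells; return {(row, col): piece_id}."""
    labels = {}
    next_id = 0
    for r, row in enumerate(grid):
        for c, ch in enumerate(row):
            if ch != "#" or (r, c) in labels:
                continue
            labels[(r, c)] = next_id
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if (0 <= nr < len(grid) and 0 <= nc < len(grid[nr])
                            and grid[nr][nc] == "#" and (nr, nc) not in labels):
                        labels[(nr, nc)] = next_id
                        queue.append((nr, nc))
            next_id += 1
    return labels


def solution_cells(grid):
    """Return open cells whose 8 neighbours include both wall pieces."""
    labels = label_walls(grid)
    cells = set()
    for r, row in enumerate(grid):
        for c, ch in enumerate(row):
            if ch == "#":
                continue
            touching = {labels[(r + dr, c + dc)]
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                        if (r + dr, c + dc) in labels}
            if len(touching) >= 2:
                cells.add((r, c))
    return cells


path = solution_cells(MAZE)
for r, row in enumerate(MAZE):
    print("".join("." if (r, c) in path else ch for c, ch in enumerate(row)))
```

On this maze the dead-end corridor in the second column touches only one wall piece, so it is left unmarked while the route from `S` to `E` is dotted in. Mazes with loops or free-standing wall islands break the two-piece assumption, and so would trip up this sketch.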

(Update: wow, this got Dugg heavily — 17000 pageviews from Digg alone! Unfortunately that caused a bit of a server meltdown. Should be back now though…)

7digital – a bit risky

Apparently EMI are now offering their DRM-free MP3s via 7digital (http://www.paidcontent.org/entry/419-emi-offers-drm-free-to-more-retailers-7digital-and-passalong-first/), so I thought I’d give the newly-revamped 7digital site a go. Results were a little mixed, unfortunately.

I found a couple of tracks I wanted which were available in MP3 format, clicked the "purchase" button beside them, and they were added to the "basket" on the right-hand side. Pretty typical stuff, if you’ve used EMusic or iTunes. Then I created an account, chose to pay using Paypal, paid a couple of quid and all was well!

The good stuff:

  • the website works great in Firefox on Linux, and was nice and speedy.

  • the range of music seems pretty good; most of the catalogue is WMA-only unfortunately, but most of the new releases now seem to be coming out with MP3 as an option.

  • it’s very easy to pay by credit card or with Paypal.

There were a couple of glitches, however.

First, it allowed me to buy a file, then not give it to me. My first tester track was the Soulwax remix of ‘Standing in the Way of Control’ by Gossip. I happily added it to my basket, checked out, and paid — then when I got to my ‘Your downloads’ page, I was presented with this:

Gossip – Standing In The Way Of Control (Soulwax Nite Version) / 6:54 / Released 24.06.2007

No download links etc… hmm. A quick check of today’s date reveals that the 24th is a week from now — the track hasn’t been released yet! It seems this isn’t yet "available as a digital release" for some reason, despite the fact that as far as I can tell it’s been out for ages on CD. The only way to spot this in advance of purchase is to look at the "Digital release date" on the album info page and compare with today’s date; there’s no other notification that you’ll be buying a prerelease, and will have to wait to get your digital mitts on what you buy. Grrrr.

OK, next one; my other tester track was the title track from the new White Stripes, Icky Thump. At least this one was available. Now, supposedly we’re getting 320kbps MP3s, right? Not so, it seems — this one was 192kbps, a fact that’s only revealed once you’ve already paid for the tracks. Double grrr…

(it turns out, by the way, that only the "EMI content" is delivered in 320kbps format. I guess the other MP3 labels are sticking with 192kbps.)

So, two for two, both of the test downloads turned out to be wonky in one way or another. A bit disappointing. I hope they’ll improve though — there seems to be a new willingness to offer a decent MP3 music-download service there… and this is still more convenient for me than having to boot up a Windows virtual machine to use the iTunes Music Store.

They could really do with signposting exactly what you’re getting more clearly, though; in particular, being able to search by available format and bitrate would really help.

Lyris’ low SpamAssassin threshold

Via jgc’s newsletter, Lyris’ latest ISP Deliverability Report (Q1 2007) makes an interesting point about legitimate bulk mail and SpamAssassin:

Contrary to popular belief among marketers, message content is not a major cause of deliverability challenges for most email marketers. This finding is a result of testing the content of more than 1,705 unique emails, using [Lyris] EmailAdvisor’s content scoring tool. The content scoring function is based on the content scoring rules of the widely adopted Spam Assassin open source project. The emails tested had an average content point score of 1.04 well below the filter’s generally accepted spam identification level of 3.0 or higher.

Now, that’s broadly good advice — SpamAssassin hasn’t really given much strength to signatures found in message body text in the past couple of years, since the signatures from other sources (especially DNS blocklists and URI blocklists) are much more reliable.

However, note the bit I emphasised. Since when is 3.0 the ‘generally accepted spam identification level’? Only the most paranoid user would ever go that low, since at that level, they’d expect to find 2.22% of their nonspam mail going into the spam folder (according to our own tests). In reality, our recommended level has always been 5.0 points, and that’s what we optimise for. I’m mystified as to where they’re getting 3.0 from…

Irish medical tourism

Just got a mail from an old friend, Caelen, who’s got a new start-up going with an interesting angle. Caelen and his (now-) wife, Barbara, spent a while travelling around Asia around the same time as we did. As I noted back in 2003, one thing he tried out, which I found particularly intriguing at the time, was to have some minor surgery in Bangkok:

This may seem foolish at first, but despite being in the heart of South East Asia, in what is generally thought to be a developing country, the Thai medical system is unbelievably good. Not only is it the medical hub for expatriates throughout the region, but tens of thousands fly here each year to have elective surgery, from laser eye treatments to boob jobs and face lifts. There are lots of reasons why they come to Bangkok but invariably quality of surgery and care comes top of the list. Simply put, medical care in Thailand is amongst the best in the world, available at a fraction of the cost.

The Thai government sees health care as the next logical step in its hospitality industry. As holiday makers in Thailand reach saturation point, growth has to come from other sectors and international healthcare has many of the same requirements as the tourism industry: good flight connections, plentiful accommodation and above all staff that are understanding and friendly. Gleaming hospitals, which could be mistaken for 5 star hotels, not only have rooms with all amenities but also have suites, restaurants, shops and cinemas. Menus from the finest restaurants in town are placed in the best rooms. Going to hospital doesn’t mean you have to stop having fun – this is Bangkok after all. This is a long way from the cold greasy egg served by the kitchen’s ‘Miserable Person of the Year’ award winner we get at home.

Back in 2002, this was pretty unprecedented — of course, nowadays, the concept is a lot more widely practiced, what with healthcare costs rising in the US and waiting lists rising in the UK.

I can vouch that the quality of care in Bangkok was fantastic, by all accounts; fastidiously clean and professional. (I never did it myself, but many people I knew at the time took advantage of the opportunity, rather than risk something flaring up in the less, er, reliable settings of Luang Prabang or Phnom Penh.)

Anyway, turns out Caelen has come up with a new site that is related to this — Reva Health Network. He says, ‘basically, we are a medical tourism search engine where consumers can find and compare hospitals and clinics from around the world. We cover everything although the bulk of our business is currently in dental.’

If you’re looking for some work done, it might be worth taking a look; it’s at revahealthnetwork.com.

Update 2010-08-16: They’ve moved! The new URL is http://www.whatclinic.com , which makes much more sense really. Apparently they’re getting 500,000 visitors a month, and proxy through 800 phone calls a day to clinics. Cool; sounds like it’s going well…

IKEA Dublin gets planning permission

Given that I’m trying to get a new house in order, here’s a topic close to my heart right now — massive IKEA store approved for Dublin:

An Bord Pleanála has given the go-ahead for the construction of a massive IKEA outlet in the Ballymun area of Dublin. Legal restrictions on the size of retail developments had already been changed to allow the Swedish furniture giant to build a 30,000 square foot shop in the area. However, several objections were received from the National Roads Authority, Green Party TD Eamon Ryan and a number of businesses which said they would be adversely affected by a huge increase in traffic on the M50 motorway. An Bord Pleanála has now decided to grant permission for the project, subject to 30 conditions aimed at preventing traffic congestion, protecting the visual amenity of the area and promoting sustainable development.

This is long overdue, and something Ireland’s been crying out for — the price and quality of furniture here is dire. I’m glad to see it.

The details are up on An Bord Pleanála’s site, including the Board’s conditions. For ease of reading, I’ve converted it to HTML using OpenOffice.

This one strikes me as potentially annoying:

A schedule of parking charges shall be applied to car park users (other than coaches and buses which shall not be charged for parking during opening hours) […]

At least two months prior to the opening of the proposed development for trading, an initial schedule of charges shall be agreed in writing with the planning authority. Where the daily peak hour two-way traffic flows as measured by the automatic traffic counters do not comply with the thresholds set above, the schedule of parking charges shall be varied as directed by the planning authority until compliance is achieved, save that breaches or non-compliances of a very minor or trivial nature or arising from exceptional circumstances may be disregarded at the discretion of the planning authority.

Reason: To minimise traffic impacts and avoid serious traffic congestion.

Patronising pregnancy

Via Yoz comes this great article: Zoe Williams: Being pregnant and receiving unscientific advice go hand in hand. Here’s a sample:

Listeria has been my particular bugbear ever since a midwife – that is, a trained prenatal professional who, unless I develop complications, represents the highest medical authority I can expect to deal with throughout my pregnancy – told me that I could get listeriosis, thereby brain-damaging my foetus, without knowing about it. Now, listeriosis is an incredibly serious disease, with extremely serious symptoms, taken extremely seriously by epidemiologists nationwide. Get it without noticing it? If I got listeriosis, the national papers would know about it. It would be the third outbreak that has occurred in [the UK] in the past 20 years.

Here are some other things that are wantonly untrue: pasteurisation, in fact, has nothing to do with a cheese’s ability to harbour the listeria bacteria. The bacteria that characterise different cheeses are introduced after the pasteurisation process anyway. Listeria flourishes in moist environments, so parmesan is safe where camembert isn’t, but even rinded and soft cheeses are safe once they have been cooked. But food hygiene is a much more important factor than moisture – raw fish does not come out of the sea carrying listeria, but contracts the bacteria from contact with dirty hands. Of the past two outbreaks of listeria in Britain, one was from butter and the other from lettuce (there have been other instances of product recalls, but no human contamination).

In fact the three worst recorded cases of listeria since 1992 have all been in France, and were all from pork tongue in jelly, which nobody in their right mind would ever eat. Of the past 10 listeriosis outbreaks in America, only two were from cheese, and one of those was a Mexican homemade cheese. The notion that there are pregnant people out there whipping themselves into a frenzy of guilt because they have eaten some gorgonzola is just infuriating.

This patronising "pregnant women mustn’t do X" paranoia is C’s pet hate of the moment; being a (pregnant) scientist, she’s been checking them against Medline, looking into the extent of the real research these claims are based on, and generally writing them off one by one. I’ve been trying to persuade her to write a blog post about this for taint.org, so far with no luck though…

MAAWG Talk

Here’s the talk I gave at MAAWG, entitled New Features in SpamAssassin 3.2.0 Of Interest To Large Receivers:

Abstract:

Many ISPs and mail receivers, at all scales, use SpamAssassin as part of their spam-filtering arsenal. The recent release of SpamAssassin 3.2.0 introduces much new functionality, and some of this is of particular interest to the large-scale mail receiver; in particular, rules compiled to parallel-matching native object code for increased speed, early short-circuiting based on administrator-specified rules, the new "msa_networks" setting to specify MSA hosts or pools, a new ruleset to detect spam/virus backscatter bounces, a way to run SpamAssassin in the Apache httpd server using mod_perl, and support for Amazon’s EC2 virtual server farm. In this talk, I’ll discuss each of these in detail, and discuss why it may be useful to you.

If you were at MAAWG, hope you enjoyed it ;)