Justin's Linklog Posts

Not Cosmo

So, we were all set to name our new arrival Cosmo, assuming it was a boy. We were certain it was going to be a boy. Guess what? It wasn’t… so now we have to narrow down the girl-name shortlist in a hurry!

Isn’t she lovely? Lots more photees at Flickr.

Anyway, I may be hard to get hold of for a while… this lady will be keeping me busy I think ;)

Update: Looks like the name is Beatrice Lily Mason, although there’s still a fair bit of indecision, unfortunately ;)

Update 2: Beatrice Lily Gray Mason. Final answer!

Stupid Unicode Tricks

Cool Unicode trick, via Mantari — cut and paste this character into a Unicode-aware application (like this post’s comment box!), then type something and see what happens:

‫‬‭‮‪‫‬‭‮҉

My Nokia 770

A couple of weeks back, there was quite a bit of buzz in the Irish blogosphere and elsewhere about the Nokia 770; prices for new N770s had dropped from $290ish to a very reasonable $140 / EUR130-ish price-point. I, along with a good few others, bought one.

I bought mine through Expansys, with a free 1GB RS-MMC memory card. They’ve sold out and no longer have any N770s listed; however, Buy.com still seem to have them in stock, so if you’re interested, you can probably still pick one up. (It seems Nokia is trying to sell off their remaining N770 stock, cheap, with plans to drop support for the software platform. I’m fine with this, but it may put other buyers off.)

I’ve now been using it for a while, and am still happy. ;) Here are my recommended top apps:

Slimserver. Originally designed to operate as the backend software for the Squeezebox thin-client MP3 player, this has a fantastic UI built for the N770, and its MP3 stream output works perfectly on the tablet.

This is by far the neatest way to get at a 6000-song music library without a laptop; there was some talk in the GNOME community of making a decent DAAP client, but so far there are no working results that I could find. :(

maemo-mapper. This is a fantastic mapping app for the tablet; it presents map tiles downloaded from OpenStreetMap or Google Maps in an N770-optimized format, with the usual nice draggable UI. Bonus: it’ll work offline, so you can follow a route while online, then take the tablet along to help navigate.

Tip: once you start maemo-mapper, click the "Download…" button in the "Repository Manager" and it’ll download details for the 5 most useful map repositories, including Google and Virtual Earth.

FBReader. A very nice document reader; much nicer than trying to read long HTML pages in the builtin web browser, especially since it allows you to turn the device on its side.

In general, the Opera Mini browser works fine; be sure to enable Javascript and set up a swap file on the RS-MMC card first. It does all the basic HTML and rudimentary AJAX; Google Calendar is a no-go, but GMail and even Google Maps work adequately, modulo minor bugs. Plain Old HTML sites like Wikipedia, IMDB and so on all work great.

As long as you’re realistic about the platform, it won’t disappoint — video requires custom transcoding, for example, and proprietary apps like Flash and RealPlayer lag behind their desktop equivalents, but as far as I can tell that’s the case for every embedded platform. (Since I spent a couple of years developing such a platform, I’m quite comfortable with this.)

A really really nifty thing about the N770 is that it’s now entirely hackable: within 30 minutes of powering on, I was able to get a terminal window open with a root prompt, and was adding ext3 partitions to the RS-MMC card. Apps are installed using "apt-get". The terminal even has a word-completion system optimized for the UNIX command-line – nice ;)

This SomethingAwful thread contains plenty more good tips. I’m happy I bought it — so many of these gadgets can wind up as an overpriced door-stop, but this is easily worth what I paid for it.

Update: this thread at InternetTabletTalk seems pretty chock-full of good advice, too.

Test my auto-generated ruleset

(I posted this to the SA users and dev lists, too.)

I’ve been working on a new way to auto-generate body rules recently (see previous posts). The results are checked into SVN trunk daily in the "rulesrc/sandbox/jm/20_sought.cf" file.

We haven’t had much time to figure out how to produce auto-generated 3.2.x rule updates for our entire ruleset at updates.SpamAssassin.org, so instead of dealing with that, I’ve taken a shortcut around it ;) I’m now making just the "20_sought.cf" ruleset available as a standalone, unofficial sa-update ruleset at sought.rules.yerp.org.

Before using it, you’ll need the GPG key:

  wget http://yerp.org/rules/GPG.KEY
  sudo sa-update --import GPG.KEY                

then use this to update:

  sudo sa-update \
        --gpgkey 6C6191E3 --channel sought.rules.yerp.org \
        [...other channels...] \
        --channel updates.spamassassin.org

(similar to how you’d use Daryl’s sa-update version of the SARE rulesets.)

Feel free to run sa-update as frequently as you like.

Please consider it alpha; I may take it down in a few months depending on how it goes, or if we can get it working as part of the core updates. In the meantime though, I’m curious to hear how you get on with it. (In particular, copies of false positives would be very welcome.)

Update: it’s been very successful, so I’d now consider it in production.

The Prime Time Group pump-and-dump

Spamnation.info links to an interesting article by Computerworld’s Gregg Keizer about the massive PRTH.PK spam run.

As usual, there is no shortage of suckers:

The spam blast did drive up Prime Time’s share price from Monday’s low of around 7 cents to Wednesday’s high of 11 cents, a 57% jump. Thursday morning, however, the bottom dropped out, and the stock fell to under 7 cents. Trading volumes peaked Wednesday as well, at around 1.7 million shares, substantially higher than any day in the month prior. "You can actually see the wave of activity in the stock and compare it with the volume of spam that we trapped," said [Sophos analyst Ron] O’Brien.

But here’s an interesting new tactic by the good guys:

Last Wednesday afternoon, Prime Time announced that it was ordering a Non Objecting Beneficial Owners (NOBO) list to get a clearer picture of who owned its shares. "The NOBO list will be used to determine the naked short positions in Prime Time Group Inc.," the company said in a statement. "The finding will then be reported to the [National Association of Securities Dealers] to take action against the violators of the naked short regulations."

"Naked short" is a investment term that refers to selling short, essentially a bet that the price will drop, but with a twist: "naked" means that the investor sells short without first making sure he can borrow the shares from another investor holding a "long" position on the stock.

I hope this works; it’d be great to see the profit mechanism behind pump-and-dump spam killed off.

Spamnation notes:

Incidentally, the greeting card spam that built the botnet used to promote PRTH.PK and CYTV.OB also continues. It has iterated through another couple of generations: the current incarnation tells recipients to collect their custom Musical ecard or custom Movie-quality ecard or other variants on that theme. We’ve seen about 150 of these in the past three days, suggesting that the unknown senders are probably well on their way to building up another botnet for their next stock spam run.

Spreading trojans via greeting-card spam is a trademark of the gigantic Storm botnet, AFAIK: SecureWorks info, MessageLabs info, spam levels causing DDoS for Canadian networks, DDoS threat for EDU sector.

The Haughey 419 returns

A few months back, Blogorrah noted an amazing 419 scam, claiming to be a missive from ex-Taoiseach of Ireland Charlie Haughey’s wife, Maureen. It’s really quite appropriate that Charlie should become the subject of a scam himself, given what he did to this country. But anyway… over the weekend, a new variant on the theme emerged:

From Mrs Maureen Haughey, ROI

My Dear Friend,

I am Maureen Haughey, widow of former Taoiseach of the Republic of Ireland, Charles J. Haughey and daughter of former Taoiseach of the Republic of Ireland and heir to de Valera, Sean F. Lemass.The Press has written a lot about unresolved mysteries and corruption surrounding Charles’s dealings, but I tell you something,my Charlie was a good man. He was human and he did whatever he did.

People marvel why I stuck with Charlie and didn’t speak during the mess that came with the exposure of his affairs with Terry Keane (I just hate to think of her). I had to stand by him through the tribunal times…. it was to do with what I’m doing now. No one knew the details of all Charlie’s financial dealings but me. I remain the only one who knows all who got loans from Charlie and didn’t come back to pay when he was disgraced. I am the only one who knows about these monies and the other Ansbacher accounts.

I write to you, an old weary woman, sick and almost tired of living. My end is near but I will not depart until my final mission is accomplished and I also write this with an unshaken belief in the power of aspirations and dreams of a human being. The Irish government thinks it can shave and reduce me to a poor widow but I have the winning ace. A few years ago, when we weren’t sure if my Charlie would be convicted, he kept some money in trust for me in a Security and Finance company. He did not open the account in our names so it will not be traced to us to enable the past remain the past. The name on the account is Cedric de Vregille. I never thought Charlie would leave me so soon and it never occurred to me to ask if this name were fictitious or not or a name of any of his friends. I have tried to find this man but to no avail. The amount he deposited in this name is 30,000,000 (Thirty Million Euros).

I want an honest person to come forward and lay claims to this amount, moreover to use the funds as instructed by me. I have all the documents needed, I just need a face for the name. I have mapped out 30% of the funds for you, as you will help us (you and I) execute this job.

As soon as I receive your acceptance for this work I shall give you necessary details of my solicitor who will facilitate the release of the funds in your name. Please reply me via my personal email: maureen_haughey67@yahoo.co.uk


For my security and the sake of letting sleeping dogs lie, I strongly advice that you keep our dealings confidential. You can read more about my charlie from:

http://www.ireland.com/focus/haughey/ITstories/story11.htm

http://www.teachersparadise.com/ency/en/wikipedia/c/ch/charles_haughey.html

http://www.everything2.com/index.pl?node_id=548983&lastnode_id=0

Thank You.


Message sent using UebiMiau 2.7.2

It was sent via a webmail system at nildram.co.uk, from a proxy in Australia.

The writing is amazingly ornate — ‘I write to you, an old weary woman, sick and almost tired of living’, ‘the Irish government thinks it can shave and reduce me to a poor widow but I have the winning ace’, etc. Very odd stuff. Also, it looks spell-checked. And, once again, poor old cyclist Cedric de Vregille gets dragged into it, too! I wonder what he did to deserve that ;)

If you fancy scambaiting, ‘maureen_haughey67@yahoo.co.uk’ is the one to go for. These guys seem to be having a good go of it: ‘The thought of the Irish government trying to shave an old woman has shocked and appauled me, so I will assist in anyway possible.’ Ha!

Rule Discovery Progress Update

Back in March, I wrote a post about a new rule discovery algorithm I’d come up with, based on the BLAST bioinformatics algorithm. I’m still hacking on that; it’s gradually meandering towards production status, as time permits, so here’s an update on that progress.

There have been various tweaks to improve memory efficiency; I won’t go into those here, since they’re all in SVN history anyway. But the results are that the algorithm can now extract rules from 3500 spam and 50000 ham messages without consuming more than 36 MB of RAM, or hitting disk. It can also now generate a SpamAssassin rules file directly, and apply a basic set of QA parameters (required hit rate, required length of pattern, etc.).

On top of this, I’ve come up with a workflow to automatically generate a usable batch of rules, on a daily basis, from a spam and ham corpus. This works as follows:

  • Take a sample of the past 4 days’ traffic from our spamtrap network. Today this was about 3000 messages.

  • add the hand-vetted spam from my own accounts over the same period (this helps reduce bias, since spamtraps tend to collect a certain type of spam), about 3400 messages.

  • discard spams that scored over 10 points (to concentrate on the stuff we’re missing).

  • Pass the remaining 3517 spams, and text strings from over 50000 nonspam messages, into the "seek-phrases-in-log" script, specifying a minimum pattern length of 30 characters, and a minimum hitrate of 1% (in today’s corpus, a rule would have to hit at least 34 messages to qualify).

  • That script gronks for a couple of minutes, then produces an output rules file, in this case containing 28 rules, for human vetting. (Since I’ve started this workflow, I’ve only had to remove a couple of rules at this step, and not for false positives; instead, they were leaking spamtrap addresses.)

  • Once I’ve vetted it, I check it into rulesrc/sandbox/jm/20_sought.cf for testing by the SpamAssassin rule QA system.
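To make the filtering step in that workflow a bit more concrete, here is a deliberately naive Python sketch. It is not the "seek-phrases-in-log" script (which is BLAST-inspired and far more memory-efficient); the function name, the fixed-length substring approach and the toy messages are all assumptions made up for illustration. It only shows the shape of the QA parameters: a minimum pattern length, a minimum spam hit rate, and no hits on ham.

```python
from collections import Counter


def candidate_phrases(spams, hams, min_len=30, min_hit_rate=0.01):
    """Naive illustration only: find fixed-length substrings that occur in
    at least min_hit_rate of the spam messages and in none of the ham."""
    hits = Counter()
    for msg in spams:
        # Use a set so each message counts at most once per substring.
        seen = {msg[i:i + min_len] for i in range(len(msg) - min_len + 1)}
        hits.update(seen)

    min_hits = min_hit_rate * len(spams)
    return sorted(
        phrase
        for phrase, count in hits.items()
        if count >= min_hits and not any(phrase in ham for ham in hams)
    )


# Toy corpus (hypothetical), with small parameters so the effect is visible.
spams = ["buy cheap pills now", "buy cheap pills today", "hello there friend"]
hams = ["hello there friend, lunch?"]
for phrase in candidate_phrases(spams, hams, min_len=9, min_hit_rate=0.5):
    print(repr(phrase))
```

Enumerating every substring like this would fall over on a real corpus, which is exactly why the memory-efficiency work matters; the output of a real run still goes through human vetting before check-in.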

The QA results for the ruleset from yesterday (Aug 3) can be seen here, and give a pretty good idea of how these rules have been performing over the past week or two; out of the nearly 70000 messages hit by the rules, only 2 ham mails are hit — 0.0009%.

In fact, I measured the ruleset’s overall performance in the logs provided by the 4 mass-check contributors who provided up-to-date data in yesterday’s nightly mass-check; bb-jm, jm, daf, dos, and theo (all SpamAssassin committers):

Contributor   Hits     Spams     Percent
bb-jm         4249     24996     17.00%
jm            3450     14994     23.00%
daf           1236     35563     3.48%
dos           32867    100223    32.79%
theo          28077    382562    7.34%

(bb-jm and jm are both me; they scan different subsets of my mail.)

The "Percent" column measures the percentage of their spam collection that is hit by at least one of these rules; it works out to an average of 16.72% across all contributors. This is underestimating the true hitrate on "fresh" spam, too, since the mass-check corpora also include some really old spam collections (daf’s collection, for example, looks like it hasn’t been updated since the start of July).
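For anyone who wants to check the arithmetic, here is a small Python sketch that recomputes the "Percent" column and the average from the Hits and Spams figures in the table above. The variable names are my own; note that the average quoted is the plain mean of the per-contributor percentages, which is not the same thing as a pooled rate over all the spam combined.

```python
# (contributor, hits, spams), copied from the table above.
RESULTS = [
    ("bb-jm", 4249, 24996),
    ("jm", 3450, 14994),
    ("daf", 1236, 35563),
    ("dos", 32867, 100223),
    ("theo", 28077, 382562),
]

# Per-contributor hit rate, as a percentage of that contributor's spam.
rates = {name: 100.0 * hits / spams for name, hits, spams in RESULTS}

# Plain (unweighted) mean of the per-contributor percentages.
unweighted_mean = sum(rates.values()) / len(rates)

# Pooled rate: total hits over total spam, dominated by the larger corpora.
total_hits = sum(hits for _, hits, _ in RESULTS)
total_spams = sum(spams for _, _, spams in RESULTS)
pooled_rate = 100.0 * total_hits / total_spams

for name, rate in rates.items():
    print(f"{name:6s} {rate:6.2f}%")
print(f"mean of percentages: {unweighted_mean:.2f}%")  # 16.72%
print(f"pooled rate:         {pooled_rate:.2f}%")      # 12.52%
```

The pooled figure comes out lower because theo's very large collection, with its lower hit rate, carries more weight there.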

Even better, a look at the score-map for these rules shows that they are, indeed, hitting the low-scoring spam that other rules don’t hit.

That’s pretty good going for an entirely-automated ruleset!

The next step is to come up with scores, and publish these for end-user use. I haven’t figured out how this’ll work yet; possibly we could even put them into the default "sa-update" channel, although the automated nature of these rules may mean this isn’t a goer.

If you’re interested, the hits-over-time graph for one of the rules (body JM_SEEK_ICZPZW / Home Networking For Dummies 3rd Edition \$10 /) can be viewed here.

Host monitoring with Jaiku

A few weeks back, we were having trouble with dogma, our shared server where taint.org is hosted, which would occasionally be unavailable for unknown reasons. We needed to monitor its availability so that it could be fixed when it crashed again, and we’d be able to investigate quickly. Since it was happening mostly out of working hours, SMS notification was essential.

Normally, that kind of monitoring is pretty basic stuff, and there are plenty of services out there, from Host-Tracker.com to the more complex self-hosted apps like monit and Nagios, which can do that. But looking around, I found that none of them offered SMS notification for free, and since this was our personal-use server, I wasn’t willing to sign up for a $10-per-month paid account to support it, or buy any hardware to act as a private SMS gateway.

Instead, I thought of Jaiku — the Finnish company which offers a microblogging/presence platform similar to Twitter. Jaiku had a couple of cool features:

  • SMS notifications
  • it’s possible to broadcast messages to a "channel", which others could subscribe to, IRC-style
  • it has an open API

This would allow me to notify any interested party of dogma’s downtime, allowing subscribers to subscribe and unsubscribe using whatever notification systems Jaiku support.

With a little perl and LWP, I rigged up a quick monitoring script to check http://taint.org/ via HTTP, and report if it was unavailable over the course of 5 retries in 50 seconds. If it was broken, the script sends a JSON-formatted POST request to Jaiku’s "presence.send" method, informing the target channel of the issue. (Perl source here.)
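The real script is Perl, but the logic is simple enough to sketch in a few lines of Python. Treat this as an approximation under stated assumptions: the notification URL and the JSON field names are placeholders I made up (the post doesn't spell out Jaiku's request format beyond the "presence.send" method name), and I'm assuming the 5 attempts are spaced 10 seconds apart.

```python
import http.client
import json
import time
import urllib.request

CHECK_URL = "http://taint.org/"
# Hypothetical placeholder: not the real Jaiku API endpoint.
NOTIFY_URL = "https://jaiku.example.invalid/json"


def is_up(url, timeout=10):
    """Return True if an HTTP GET of url succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (OSError, http.client.HTTPException):
        # URLError/HTTPError are OSError subclasses; covers timeouts too.
        return False


def site_is_down(check=is_up, url=CHECK_URL, retries=5, interval=10,
                 sleep=time.sleep):
    """Report the site as down only if every one of `retries` checks fails."""
    for attempt in range(retries):
        if check(url):
            return False
        if attempt < retries - 1:
            sleep(interval)
    return True


def build_notification(channel, message):
    """Build (but do not send) a JSON POST request for the channel.
    The payload field names here are assumptions, not Jaiku's documented ones."""
    payload = {"method": "presence.send", "channel": channel, "message": message}
    return urllib.request.Request(
        NOTIFY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

From cron, you would call site_is_down() and, if it returns True, pass build_notification("#dogmastatus", "taint.org is down") to urllib.request.urlopen(). Requiring all five checks to fail keeps a single dropped connection from sending everyone an SMS at 3am.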

You can see the ‘#dogmastatus’ channel here — as you can see, we fixed the problem with dogma just over 2 weeks ago ;)

It’s worth noting that I had to set up an additional user, "downtimebot", on Jaiku to send the messages — otherwise I’d never see them on my configured mobile phone! Jaiku uses the optimisation that, if I sent the message, there’s no need to cc me with a copy of what I just sent; logical enough.

Anyway, if you’re interested in dogma’s availability (there might be one or two taint.org readers who are), feel free to add yourself to the #dogmastatus channel and receive any updates.

Update: Fergal noted that it’s pretty simple to use Cape Clear’s assembly framework to perform an HTTP ping test with output to Jabber/XMPP. Nifty!

A fishy Challenge-Response press release

I have a Google News notification set up for mentions of "SpamAssassin", which is how I came across this press release on PRNewsWire:

Study: Challenge-Response Surpasses Other Anti-Spam Technologies in Performance, User Satisfaction and Reliability; Worst Performing are Filter-based ISP Solutions

NORTHBOROUGH, Mass., July 17 /PRNewswire/ — Brockmann & Company, a research and consulting firm, today released findings from its independent, self-funded "Spam Index Report– Comparing Real-World Performance of Anti-Spam Technologies."

The study evaluated eight anti-spam technologies from the three main technology classes — filters, real-time black list services and challenge- response servers. The technologies were evaluated using the Spam Index, a new method in anti-spam performance measurement that leverages users’ real-world experiences.

[…] The report finds that the best performing anti-spam technology is challenge-response, based on that technology’s lowest average Spam Index score of 160.

[…] Filter – Open Source software-(Spam Index: 388): This technology is frequently configured to work in conjunction with PC email client filters. The server adds SPAM to the subject line so that the client filter can move the message into the junk folder. This class of software includes projects such as ASSP, Mail Washer and SpamAssassin, among others.

The "Spam Index" is a proprietary measurement of spam filtering, created by Brockmann and Company. A lower "Spam Index" score is better, apparently, so C/R wins! (Funny that. The author, Peter Brockmann, seems to have some kind of relationship with C/R vendor Sendio, being quoted in Sendio press releases like this one and this one, and providing a testimonial on the Sendio.com front page.)

However, there’s a fundamental flaw with that "Spam Index" measurement: it’s designed to make C/R look good. Here’s how it’s supposed to work. Take these four measurements:

  • Average number of spam messages each day x 20 (to get approximate number per work-month)
  • Average minutes spent dealing with spam each day x 20 (to get approximate minutes per work-month)
  • Number of resend requests last month
  • Number of trapped messages last month

Then sum them, and that gives you a "Spam Index".
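Written out as code, the calculation as described in the press release looks something like the Python sketch below. The function and parameter names are mine, and the two sample users are invented numbers purely to show the arithmetic; they are not figures from the report.

```python
def spam_index(spam_per_day, minutes_per_day, resend_requests,
               trapped_messages, workdays=20):
    """Sum the four measurements as described: the two daily figures are
    scaled to a 20-day work-month, the two monthly figures are used as-is."""
    return (spam_per_day * workdays
            + minutes_per_day * workdays
            + resend_requests
            + trapped_messages)


# Invented example inputs, not data from the report.
filter_user = spam_index(spam_per_day=5, minutes_per_day=2,
                         resend_requests=1, trapped_messages=100)
cr_user = spam_index(spam_per_day=1, minutes_per_day=1,
                     resend_requests=0, trapped_messages=30)

print(filter_user)  # 241
print(cr_user)      # 70
```

Notice that the sum adds message counts to minutes, and that every term carries equal weight, so a correctly trapped spam costs a filter exactly as much as a lost legitimate mail.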

First off, let’s translate that into conventional spam filter accuracy terms. The ‘minutes spent dealing with spam each day’ measures false negatives, since having to ‘deal with’ (ie delete) spam means that the spam got past the filter into the user’s inbox. The ‘number of trapped messages’ means, presumably, both true positives — spam marked correctly as spam — and false positives — nonspam marked incorrectly as spam. The ‘number of resend requests last month’ also measures false positives, although it will vastly underestimate them.

Now, here’s the first problem. The "Spam Index" therefore considers a false negative as about as important as a false positive. However, in real terms, if a user’s legit mail is lost by a spam filter, that’s a much bigger failure than letting some more spam through. When measuring filters, you have to consider false positives as much more serious! (In fact, when we test SpamAssassin, we consider FPs to be 50 times more costly than a false negative.)
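To show what that weighting does in practice, here is a toy Python comparison. The simple linear cost below is an assumption for illustration (the only thing taken from our testing practice is the 50-to-1 ratio), and both filters' error counts are made up.

```python
FP_WEIGHT = 50  # one false positive counted as 50 false negatives


def equal_cost(false_positives, false_negatives):
    """Treat every mistake the same, roughly as the Spam Index does."""
    return false_positives + false_negatives


def weighted_cost(false_positives, false_negatives, fp_weight=FP_WEIGHT):
    """Assumed linear cost: lost legit mail is far worse than missed spam."""
    return fp_weight * false_positives + false_negatives


# Invented error counts for two hypothetical filters.
filter_a = {"false_positives": 2, "false_negatives": 100}
filter_b = {"false_positives": 0, "false_negatives": 150}

print(equal_cost(**filter_a), equal_cost(**filter_b))        # 102 150
print(weighted_cost(**filter_a), weighted_cost(**filter_b))  # 200 150
```

With equal costs, filter A looks better; once lost legitimate mail is weighted properly, filter B wins despite letting more spam through.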

Here’s the second problem. Spam is sent using forged sender info, so if a spammer’s mail is challenged by a Challenge/Response filter, the challenge will be sent to one of:

  • (a) an address that doesn’t exist, and be discarded (this is fine); or
  • (b) to an invalid address on an innocent third-party system (wasting that system’s resources); or
  • (c) to an innocent third-party user on an innocent third-party system (wasting that system’s resources and, worst of all, the user’s time).

The "Spam Index" doesn’t measure the latter two failure cases in any way, so C/R isn’t penalised for that kind of abusive traffic it generates.

Also, if a good, nonspam mail is challenged, either

  • (a) the sender will receive the challenge and take the time to jump through the necessary hoops to get their mail delivered ("visit this web page, type in this CAPTCHA, click on this button" etc.); or
  • (b) they’ll receive the challenge, and not bother jumping through hoops (maybe they don’t consider the mail that important); or
  • (c) they’ll not be able to act on the challenge at all (for example, if an automated mail is challenged).

Again, the "Spam Index" doesn’t measure the latter two failure cases.

In other words, the situations where C/R fails are ignored. Is it any wonder C/R wins when the criteria are skewed to make that happen?

Stop with the fake phish data

An anonymous friend in the anti-phishing community writes:

For those of you who blog and/or have contacts in the general computer user ‘go fight ’em’ community:

Is there any way you can get the word out that dropping a couple hundred fake logins on a phishing site is NOT appreciated??

It creates havoc for those monitoring the drop since it’s an unbelieveable waste of time and resources to clean up the file. Also, for those drop files that ‘recycle’ after every 10 entries, valid data is lost.

It also creates havoc for those who get these files and try to notify victims. They waste time, too .. pulling legit info from amongst the trash.

I know there are programs out there that create/dump this stuff onto sites and some who call themselves ‘phish phighters’ enjoy the harassment aspect. But it wastes the time/effort of those who are seriously working these things.

New Science Gallery in Dublin

I just got this missive from the new Science Gallery at Trinity College Dublin:

The SCIENCE GALLERY is seeking EXPRESSIONS OF INTEREST for Festival of Light projects.

Calling all techno-artists, playful scientists, renegade engineers, architects, sculptors, lighting designers, fashion designers, guerilla projectionists and inventors…

The Science Gallery at Trinity College Dublin is developing a two week FESTIVAL OF LIGHT as its launching programme in January 2008 which will celebrate the art, science and technology of light through a range of installations and events in the Science Gallery and around Dublin’s city centre.

We are seeking proposals for installations, events and workshops. You can download our Expression of Interest form here. We would like this to reach far and wide so please forward this onto anyone you think may be interested in submitting!

If you would like to discuss your ides with us or would like further information prior to submitting an Expression of Interest Submission please contact Elizabeth Allen at elizabeth.allen /at/ sciencegallery.org .

I’m looking forward to seeing what happens with this; hope it works out well.

T9 in Ireland

Tobias DiPasquale notes (http://blog.cbcg.net/articles/2007/07/11/damn-they-thought-of-that-too) that the iPhone’s dictionary can correct the word ‘f***ing’ right out of the box. Handy!

The vagaries of various companies’ autocompletion dictionaries are always worth a comment. I’ve noticed that swearing is generally omitted, presumably for prudish reasons to do with tabloid PR fears. But as an Irishman, I find it particularly galling that Nokia’s T9 dictionary cycles through the following entries for "pints":

  • Shots
  • Pious
  • Riots
  • Pints

When I type "pints" (which happens a lot), believe me, I never mean to type "pious". Stupid phone!
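The reason all four words collide is that they share one key sequence on a standard phone keypad. Here's a quick Python sketch to show it; the mapping is the usual 2=abc … 9=wxyz letter layout, and the helper name is my own. It says nothing about how Nokia orders the candidates, which is the actual complaint.

```python
# Standard phone keypad letter layout.
T9_KEYS = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {
    letter: digit for digit, letters in T9_KEYS.items() for letter in letters
}


def t9_digits(word):
    """Return the keypad digit sequence you would press to type word."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in word.lower())


for word in ["Shots", "Pious", "Riots", "Pints"]:
    print(word, t9_digits(word))  # every word prints 74687
```

So the dictionary has four candidates for 74687 and has to rank them somehow; it just ranks them wrongly for this particular user.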

Planet Antispam unborked

Those of you who visit Planet Antispam may have noticed that it hadn’t been updating in a few days. Somehow or other, the Planet software had corrupted its cache, and was dying with this error:

Traceback (most recent call last):
  File "planet.py", line 167, in ?
    main()
  File "planet.py", line 160, in main
    my_planet.run(planet_name, planet_link, template_files, offline)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 240, in run
    channel = Channel(self, feed_url)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 527, in __init__
    self.cache_read_entries()
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 569, in cache_read_entries
    item = NewsItem(self, key)
  File "/home/planet/antispam/planet-2.0/planet/__init__.py", line 845, in __init__
    self.cache_read()
  File "/home/planet/antispam/planet-2.0/planet/cache.py", line 74, in cache_read
    self._type[key] = self._cache[cache_key + " type"]
  File "/usr/lib/python2.3/bsddb/__init__.py", line 116, in __getitem__
    return self.db[key]
KeyError: 'tag:blogger.com,1999:blog-9336495.post-117499582419244211 feedburner_origlink type'

Ah, Berkeley DB, always good for the infrequent inscrutable, yet fatal, error. A wipe of the contents of the cache directory, and it seems to be working again.

Unfortunately, I had to drop the RSS feed for Aunty Spam; it seems the domain has lapsed, and I can’t seem to find an RSS feed that contains just the spam-related Aunty Spam posts any more.

‘I Go Chop Your Dollar’ star arrested

The Register is reporting that ‘Nigerian comedian and actor Nkem Owoh’ has been arrested in Amsterdam as a suspected 419 scammer:

Nigerian comedian and actor Nkem Owoh was one of the 111 suspected 419 scammers arrested in Amsterdam recently as part of a seven month investigation, dubbed Operation Apollo.

Owoh became a well known star within the Nigerian film industry, sometimes colloquially known as Nollywood because of its trite plots, poor dialogue, terrible sound, and low production standards.

Owoh starred in the 2003 film Osuofia, and a year later was one of several actors temporarily banned from appearing in movies by Nigeria’s Association of Movie Marketers and Producers because he demanded excessive fees and unreasonable contract demands.

Owoh became internationally known for his song "I Go Chop Your Dollar", the anthem for 419 scammers ("Oyinbo man I go chop your dollar, I go take your money and disappear / 419 is just a game, you are the loser, I am the winner", full lyrics here), which was banned in Nigeria after many complaints.

The song was the title track from the comedy, "The Master", starring Owoh as a scheming 419er.

The alleged scammers are suspected of running a series of lottery-based (AKA 419-lite) scams.

Here’s the video for "I Go Chop Your Dollar".

It’s not exactly cut and dried, though. This thread suggests that he wasn’t arrested for fraud; instead that the Dutch authorities detained pretty much everyone at his concert. This article suggests similar:

The Netherlands police were said to have stormed the venue of the show in a helicopter about 2a.m and arrested practically everybody at the venue. […]

"Over 200 of them (Nigerians) were arrested that night. It was a big haul; they came with helicopter and cars and circled the whole area. As I speak with you, over 70 of those apprehended that night have been deported for possession of expired or fake immigration papers.

"Osuofia was also whisked away but was released hours after," the source said.

Update: It appears Osuofia was not arrested after all; lots more details here.

Hunting the wily mangosteen

A few weeks ago, I was in Tesco Clearwater when I spotted something I wasn’t expecting; a tray of fruit labelled "Mangosteen".

Mangosteen are delicious. In Thailand, they’re called "the queen of fruit" (with the oh-so-stinky and not quite as enjoyable Durian as the king). We once spent a week on a Thai beach snacking on bags of the things; they’re so good.

Unfortunately the tray was empty. :(

Ever since then, every time I’ve gone back to that Tesco, there’s been no sign of the mangosteen; not even another empty tray! Thing is, I now know they’re importing them, so I’m really jonesing… if any Dublin taint.org readers happen to spot some, please (a) be sure to buy some for yourself and (b) let us know where you found it!

Linking for charidee

Tom tagged me with another blog link-meme — a worthwhile one, though; the idea is to improve the page rank of charities in Ireland, by linking to them. Fair enough!

The list of charities so far is:

And I’ll add Focus Ireland (who seem to have broken their website!). Thanks to Dorothy for the suggestion.

Who to pass it on to? How’s about Una, James and Donncha?

NSAI invites comments on OOXML/OpenXML standard

Antoin writes:

NSAI (the Irish national standards body) has posted an invitation for comments on its site regarding the proposed new Office Open XML standard (ISO/IEC DIS 29500). NSAI has established an ad hoc committee to consider the matter, and I am a member of that committee, together with a number of far more important and qualified people.

Anyway, we are anxious to hear from anyone who has a view on what way NSAI should vote on this standard when it reaches committee. If you can provide links to any relevant articles, that would also be very helpful. If you have time, please review the documents and leave your comments either here or send them to the committee.

So if you’ve been following the ongoing drama (to be honest, I haven’t), please feel free to make a submission; the deadline is 11 July.

UPS Ireland suck

I’m waiting for a replacement battery from Dell, covered under warranty. Dell service have been great, but UPS, not so much…

On Monday (25th June), after a little back-and-forth to establish that the battery was faulty, I got a mail from Dell saying:

The Part (Battery) will be with you tomorrow pre 17:00 (Next Business Day). Please note that you will require to return the faulty part at the same point of time, the courier person would not be delivering the part until you return the defective part.

Great! That’s good warranty service. I’m happy.

So I wait… and wait. Finally, 2 days later, today (Wednesday 27th), at 17:45, a courier appears to pick up the faulty part. Unfortunately, he doesn’t have the replacement with him.

I go online to see what’s up via online tracking, and see this:

Location              Date        Local Time  Description
DUBLIN, IE            27/06/2007  16:41       A CORRECT STREET NAME IS NEEDED FOR DELIVERY. UPS IS ATTEMPTING TO OBTAIN THIS INFORMATION
                      27/06/2007  4:13        IN-TRANSIT SCAN
                      27/06/2007  4:12        IMPORT SCAN
DUBLIN, IE            26/06/2007  18:31       IMPORT SCAN
                      26/06/2007  5:59        IMPORT SCAN
                      26/06/2007  5:58        OUT FOR DELIVERY
                      26/06/2007  3:59        ARRIVAL SCAN
KOELN (COLOGNE), DE   26/06/2007  4:39        DEPARTURE SCAN
                      26/06/2007  4:14        DEPARTURE SCAN
HERKENBOSCH, NL       25/06/2007  10:09       ORIGIN SCAN
NL                    25/06/2007  14:02       BILLING INFORMATION RECEIVED

So, what, the street name is "INCORRECT" despite one UPS driver having no problem? I suspect someone just couldn’t be arsed.

I rang up UPS, provided a hint, and it seems the delivery is now rescheduled for Friday. So much for "next business day" delivery! Lucky the laptop works on AC without the battery, otherwise I’d be quite annoyed.

I wonder if I can provide feedback to Dell about this? There’s a possibility they might switch courier company if they get enough complaints about crappy service. It also makes me wonder if there’s any decent international parcel delivery service in Ireland. At least UPS haven’t yet required me to schlep over to a "local" depot 5 miles away to pick up the package myself, like An Post does…

How I wound up with a pond

My weekend went like this:

  1. buy a Green Cone composting system
  2. read instructions
  3. find out I had to dig a 3′ by 2′ deep hole
  4. spend all Saturday afternoon digging massive hole in the back garden, horny-handed son of toil style
  5. just as I finish, the skies open
  6. watch in horror as the hole rapidly becomes a pond
  7. since the green cone requires a dry hole, wait for it to drain…
  8. …and wait…
  9. …and wait…

I’m still waiting. :(

I just hope the flooded state of the pit is a side effect of the monsoon levels of rain over the last week, and will drain soon, rather than the normal situation for the garden. Otherwise, I’ll have to fill the hole and give up on the Green Cone entirely… argh. I should have gone for the wormery option, like lisey suggested!

Update: Enda left a good tip in the comments — dig deeper into the clay and fill in with more gravel. I did that and it looks like it’s working… Let’s see if the worms like it. I’ll keep yis posted ;)

How to solve a maze with Photoshop

wow, this is cool. lod3n, confronted by this heinous puzzle, wrote:

‘2 minutes in Photoshop. All too easy. So, where do I pick up my cake?

  1. Increase contrast.
  2. Select the right wall of the maze using the magic wand.
  3. Select > Modify > Expand 4 pixels
  4. Create new layer.
  5. Fill with Red.
  6. Select > Modify > Contract 2 pixels.
  7. Delete. Now you’ve got a line tracing the solution.
  8. Manually clean up the outer edge, and connect the dots.
  9. Cake!’

Here’s the result. Seriously nifty!
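Why does this work? As far as I can tell, in a maze with one entrance, one exit and no loops, the walls fall into exactly two connected pieces, and the solution is the corridor that runs between them; dead ends only ever touch one piece. Here is a rough cell-grid analogue in Python. It is a sketch, not the Photoshop steps verbatim: the maze is a made-up example, the function names are hypothetical, and instead of expand-then-contract it grows both wall pieces by one cell and keeps the corridor cells that both reach.

```python
from collections import deque

# A tiny hypothetical maze: '#' is wall, ' ' is corridor.
# Entrance at the top, exit at the bottom, one dead end on the fourth row.
MAZE = [
    "# #######",
    "# #     #",
    "# # ### #",
    "#   #   #",
    "####### #",
    "#       #",
    "# #######",
]

# The eight surrounding cells, so growing a selection works like a 3x3 brush.
NEIGHBOURS = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def wall_piece(walls, start):
    """'Magic wand': flood-fill the connected piece of wall containing start."""
    seen = {start}
    queue = deque([start])
    while queue:
        r, c = queue.popleft()
        for dr, dc in NEIGHBOURS:
            cell = (r + dr, c + dc)
            if cell in walls and cell not in seen:
                seen.add(cell)
                queue.append(cell)
    return seen


def expand(cells):
    """Grow a selection by one cell in every direction."""
    return cells | {(r + dr, c + dc) for r, c in cells for dr, dc in NEIGHBOURS}


def solve(maze):
    """Return the corridor cells that touch both wall pieces."""
    walls = {(r, c) for r, row in enumerate(maze) for c, ch in enumerate(row) if ch == "#"}
    corridors = {(r, c) for r, row in enumerate(maze) for c, ch in enumerate(row) if ch == " "}
    first = wall_piece(walls, min(walls))
    second = walls - first  # assumes the walls really do form exactly two pieces
    return expand(first) & expand(second) & corridors


def render(maze, path):
    """Redraw the maze with '.' on every cell of the path."""
    return ["".join("." if (r, c) in path else ch for c, ch in enumerate(row))
            for r, row in enumerate(maze)]


if __name__ == "__main__":
    print("\n".join(render(MAZE, solve(MAZE))))
```

On this example the two dead-end cells on the fourth row stay blank, since they only border one wall piece, and everything else gets a dot. A maze with loops or free-standing wall islands breaks the two-piece assumption, so this would need more care there.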

(Update: wow, this got Dugg heavily — 17000 pageviews from Digg alone! Unfortunately that caused a bit of a server meltdown. Should be back now though…)

7digital – a bit risky

Apparently EMI are now offering their DRM-free MP3s via 7digital (http://www.paidcontent.org/entry/419-emi-offers-drm-free-to-more-retailers-7digital-and-passalong-first/), so I thought I’d give the newly-revamped 7digital site a go. Results were a little mixed, unfortunately.

I found a couple of tracks I wanted which were available as MP3 format, clicked the "purchase" button beside them, and they were added to the "basket" on the right-hand side. Pretty typical stuff, if you’ve used EMusic or iTunes. Then I created an account, chose to pay using Paypal, paid a couple of quid and all was well!

The good stuff:

  • the website works great in Firefox on Linux, and was nice and speedy.

  • the range of music seems pretty good; most of the catalogue is WMA-only unfortunately, but most of the new releases now seem to be coming out with MP3 as an option.

  • it’s very easy to pay by credit card or with Paypal.

There were a couple of glitches, however.

First, it allowed me to buy a file, then not give it to me. My first tester track was the Soulwax remix of ‘Standing in the Way of Control’ by Gossip. I happily added it to my basket, checked out, and paid — then when I got to my ‘Your downloads’ page, I was presented with this:

Gossip – Standing In The Way Of Control (Soulwax Nite Version) / 6:54 / Released 24.06.2007

No download links etc… hmm. A quick check of today’s date reveals that the 24th is a week from now — the track hasn’t been released yet! It seems this isn’t yet "available as a digital release" for some reason, despite the fact that as far as I can tell it’s been out for ages on CD. The only way to spot this in advance of purchase is to look at the "Digital release date" on the album info page and compare with today’s date; there’s no other notification that you’ll be buying a prerelease, and will have to wait to get your digital mitts on what you buy. Grrrr.

OK, next one; my other tester track was the title track from the new White Stripes, Icky Thump. At least this one was available. Now, supposedly we’re getting 320kbps MP3s, right? Not so, it seems — this one was 192kbps, a fact that’s only revealed once you’ve already paid for the tracks. Double grrr…

(it turns out, by the way, that only the "EMI content" is delivered in 320kbps format. I guess the other MP3 labels are sticking with 192kbps.)

So, two for two, both of the test downloads turned out to be wonky in one way or another. A bit disappointing. I hope they’ll improve though — there seems to be a new willingness to offer a decent MP3 music-download service there… and this is still more convenient for me than having to boot up a Windows virtual machine to use the iTunes Music Store.

They could really do with signposting exactly what you’re getting more clearly, though; in particular, being able to search by available format and bitrate would really help.

Lyris’ low SpamAssassin threshold

via jgc’s newsletter, Lyris’ latest ISP Deliverability Report (Q1 2007) makes an interesting point about legitimate bulk mail and SpamAssassin:

Contrary to popular belief among marketers, message content is not a major cause of deliverability challenges for most email marketers. This finding is a result of testing the content of more than 1,705 unique emails, using [Lyris] EmailAdvisor’s content scoring tool. The content scoring function is based on the content scoring rules of the widely adopted Spam Assassin open source project. The emails tested had an average content point score of 1.04 well below the filter’s generally accepted spam identification level of 3.0 or higher.

Now, that’s broadly good advice — SpamAssassin hasn’t really given much strength to signatures found in message body text in the past couple of years, since the signatures from other sources (especially DNS blocklists and URI blocklists) are much more reliable.

However, note the bit I emphasised. Since when is 3.0 the ‘generally accepted spam identification level’? Only the most paranoid user would ever go that low, since at that level, they’d expect to find 2.22% of their nonspam mail going into the spam folder (according to our own tests). In reality, our recommended level has always been 5.0 points, and that’s what we optimise for. I’m mystified as to where they’re getting 3.0 from…

Irish medical tourism

Just got a mail from an old friend, Caelen, who’s got a new start-up going with an interesting angle. Caelen and his (now-) wife, Barbara, spent a while travelling around Asia around the same time as we did. As I noted back in 2003, one thing he tried out, which I found particularly intriguing at the time, was to have some minor surgery in Bangkok:

This may seem foolish at first, but despite being in the heart of South East Asia, in what is generally thought to be a developing country, the Thai medical system is unbelievably good. Not only is it the medical hub for expatriates throughout the region, but tens of thousands fly here each year to have elective surgery, from laser eye treatments to boob jobs and face lifts. There are lots of reasons why they come to Bangkok but invariably quality of surgery and care comes top of the list. Simply put, medical care in Thailand is amongst the best in the world, available at a fraction of the cost.

The Thai government sees health care as the next logical step in its hospitality industry. As holiday makers in Thailand reach saturation point, growth has to come from other sectors and international healthcare has many of the same requirements as the tourism industry: good flight connections, plentiful accommodation and above all staff that are understanding and friendly. Gleaming hospitals, which could be mistaken for 5 star hotels, not only have rooms with all amenities but also have suites, restaurants, shops and cinemas. Menus from the finest restaurants in town are placed in the best rooms. Going to hospital doesn’t mean you have to stop having fun – this is Bangkok after all. This is a long way from the cold greasy egg served by the kitchen’s ‘Miserable Person of the Year’ award winner we get at home.

Back in 2002, this was pretty unprecedented — of course, nowadays, the concept is a lot more widely practiced, what with healthcare costs rising in the US and waiting lists rising in the UK.

I can vouch that the quality of care in Bangkok was fantastic, by all accounts; fastidiously clean and professional. (I never did it myself, but many people I knew at the time took advantage of the opportunity, rather than risk something flaring up in the less, er, reliable settings of Luang Prabang or Phnom Penh.)

Anyway, turns out Caelen has come up with a new site that is related to this — Reva Health Network. He says, ‘basically, we are a medical tourism search engine where consumers can find and compare hospitals and clinics from around the world. We cover everything although the bulk of our business is currently in dental.’

If you’re looking for some work done, it might be worth taking a look; it’s at revahealthnetwork.com.

Update 2010-08-16: They’ve moved! The new URL is http://www.whatclinic.com, which makes much more sense really. Apparently they’re getting 500,000 visitors a month, and proxy through 800 phone calls a day to clinics. Cool, sounds like it’s going well…

IKEA Dublin gets planning permission

Given that I’m trying to get a new house in order, here’s a topic close to my heart right now — massive IKEA store approved for Dublin:

An Bord Pleanála has given the go-ahead for the construction of a massive IKEA outlet in the Ballymun area of Dublin. Legal restrictions on the size of retail developments had already been changed to allow the Swedish furniture giant to build a 30,000 square foot shop in the area. However, several objections were received from the National Roads Authority, Green Party TD Eamon Ryan and a number of businesses which said they would be adversely affected by a huge increase in traffic on the M50 motorway. An Bord Pleanála has now decided to grant permission for the project, subject to 30 conditions aimed at preventing traffic congestion, protecting the visual amenity of the area and promoting sustainable development.

This is long overdue, and something Ireland’s been crying out for — the price and quality of furniture here is dire. I’m glad to see it.

The details are up on An Bord Pleanala’s site, including the Board’s conditions. For ease of reading, I’ve converted it to HTML using OpenOffice.

This one strikes me as potentially annoying:

A schedule of parking charges shall be applied to car park users (other than coaches and buses which shall not be charged for parking during opening hours) […]

At least two months prior to the opening of the proposed development for trading, an initial schedule of charges shall be agreed in writing with the planning authority. Where the daily peak hour two-way traffic flows as measured by the automatic traffic counters do not comply with the thresholds set above, the schedule of parking charges shall be varied as directed by the planning authority until compliance is achieved, save that breaches or non-compliances of a very minor or trivial nature or arising from exceptional circumstances may be disregarded at the discretion of the planning authority.

Reason: To minimise traffic impacts and avoid serious traffic congestion.

Patronising pregnancy

Via Yoz comes this great article: Zoe Williams: Being pregnant and receiving unscientific advice go hand in hand. Here’s a sample:

Listeria has been my particular bugbear ever since a midwife – that is, a trained prenatal professional who, unless I develop complications, represents the highest medical authority I can expect to deal with throughout my pregnancy – told me that I could get listeriosis, thereby brain-damaging my foetus, without knowing about it. Now, listeriosis is an incredibly serious disease, with extremely serious symptoms, taken extremely seriously by epidemiologists nationwide. Get it without noticing it? If I got listeriosis, the national papers would know about it. It would be the third outbreak that has occurred in [the UK] in the past 20 years.

Here are some other things that are wantonly untrue: pasteurisation, in fact, has nothing to do with a cheese’s ability to harbour the listeria bacteria. The bacteria that characterise different cheeses are introduced after the pasteurisation process anyway. Listeria flourishes in moist environments, so parmesan is safe where camembert isn’t, but even rinded and soft cheeses are safe once they have been cooked. But food hygiene is a much more important factor than moisture – raw fish does not come out of the sea carrying listeria, but contracts the bacteria from contact with dirty hands. Of the past two outbreaks of listeria in Britain, one was from butter and the other from lettuce (there have been other instances of product recalls, but no human contamination).

In fact the three worst recorded cases of listeria since 1992 have all been in France, and were all from pork tongue in jelly, which nobody in their right mind would ever eat. Of the past 10 listeriosis outbreaks in America, only two were from cheese, and one of those was a Mexican homemade cheese. The notion that there are pregnant people out there whipping themselves into a frenzy of guilt because they have eaten some gorgonzola is just infuriating.

This patronising "pregnant women mustn’t do X" paranoia is C’s pet hate of the moment; being a (pregnant) scientist, she’s been checking them against Medline, looking into the extent of the real research these claims are based on, and generally writing them off one by one. I’ve been trying to persuade her to write a blog post about this for taint.org, so far with no luck though…

MAAWG Talk

Here’s the talk I gave at MAAWG, entitled New Features in SpamAssassin 3.2.0 Of Interest To Large Receivers:

Abstract:

Many ISPs and mail receivers, at all scales, use SpamAssassin as part of their spam-filtering arsenal. The recent release of SpamAssassin 3.2.0 introduces much new functionality, and some of this is of particular interest to the large-scale mail receiver; in particular, rules compiled to parallel-matching native object code for increased speed, early short-circuiting based on administrator-specified rules, the new "msa_networks" setting to specify MSA hosts or pools, a new ruleset to detect spam/virus backscatter bounces, a way to run SpamAssassin in the Apache httpd server using mod_perl, and support for Amazon’s EC2 virtual server farm. In this talk, I’ll discuss each of these in detail, and discuss why it may be useful to you.

If you were at MAAWG, hope you enjoyed it ;)

DSPAM acquired by Sensory Networks

whoa, didn’t see that coming. Quoting Jonathan Zdziarski via jgc’s newsletter:

…The [DSPAM] project had grown to a point where it would take others – with enough free time – to bring DSPAM to the next level as a widely accepted enterprise-class solution, and [I] decided that it would be in the best interest of the project to entrust it to someone with the technical knowhow and dedication to reach these goals. Many of you are aware of my work in the past with Sensory Networks in developing a hardware-accelerated version of DSPAM (capable of supporting multi-megabit speeds in large carrier environments). I’ve spent a considerable amount of time with SN’s team over the past several years and when we initially discussed working together, they had shown to be very excited and motivated about the project.

After careful consideration and many discussions at length, I decided to allow Sensory Networks to acquire the rights to the project, and continue development on it with their own team. SN has displayed a strong commitment to the open source community and has been working closely with other leading projects such as Snort, Clam Antivirus, and SpamAssassin. They assured me that the project will remain open-source and available to all, and at the same time the project will receive exposure in commercial environments it has not seen before, as many of you have been asking for. We’ve now completed the acquisition for the project, and I’d like to encourage you to support them in helping them move forward as it grows into new areas.

More details at zdziarski.com.

Dealing with backscatter, revisited

Back in January, I wrote about how I deal with email backscatter nowadays. Since then, I’ve made a notable tweak.

The change is that I no longer reject "null-sender" traffic during the SMTP transaction. Rejecting it turned out to break Exim’s implementation of Sender Address Verification, which performs the SAV check using a MAIL FROM of <>, rendering it indistinguishable from a bounce during the SMTP transaction.

Now, I’ve complained about SAV, but I have to be pragmatic anyway (Postel’s law and all that!) — so it was better to just allow other sites to perform SAV lookups against our server, and fix the anti-bounce stuff some other way.

The new method (below) does this, by allowing null-sender SMTP traffic just fine; it detects bounces in Postfix if they arrive via SMTP in RFC-3464 format, and bounces that slip past are then dealt with in a more CPU-intensive manner using the SpamAssassin "VBounce" ruleset (which is part of the now-released SpamAssassin 3.2.0, btw).

This increases the load, since some bounces cannot be rejected at MAIL FROM time now, and instead we have to wait ’til DATA — but CPU hasn’t been a problem recently, so this is ok.

Here are the updated instructions:

In Postfix

In my Postfix configuration, on the machine that acts as MX for my domains — edit ‘/etc/postfix/header_checks’, and add these lines:

/^Content-Type: multipart\/report; report-type=delivery-status\;/  REJECT no third-party DSNs
/^Content-Type: message\/delivery-status; /     REJECT no third-party DSNs

Edit ‘/etc/postfix/main.cf’, and ensure it contains:

header_checks = regexp:/etc/postfix/header_checks

Then run:

sudo /etc/init.d/postfix restart

This catches most of the bounces: RFC-3464-format Delivery-Status-Notification messages (http://www.faqs.org/rfcs/rfc3464.html) from other mail servers.
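To get a feel for what those two header_checks patterns will and won't match, here is a minimal Python sketch. It only transcribes the two regexps above into Python syntax; the sample headers are made up for illustration, and `would_reject` is a hypothetical helper name, not anything Postfix provides.

```python
import re

# The two patterns from header_checks, transcribed into Python syntax.
# Postfix regexp tables match case-insensitively by default, hence IGNORECASE.
DSN_PATTERNS = [
    re.compile(r"^Content-Type: multipart/report; report-type=delivery-status;", re.IGNORECASE),
    re.compile(r"^Content-Type: message/delivery-status; ", re.IGNORECASE),
]


def would_reject(header_line):
    """Return True if either pattern matches this (already unfolded) header line."""
    return any(p.search(header_line) for p in DSN_PATTERNS)


# Hypothetical sample headers, invented for this example.
SAMPLES = [
    'Content-Type: multipart/report; report-type=delivery-status; boundary="XYZ"',
    "Content-Type: message/delivery-status; charset=us-ascii",
    "Content-Type: text/plain; charset=us-ascii",
    'Content-Type: multipart/report; boundary="XYZ"; report-type=delivery-status',
]

if __name__ == "__main__":
    for header in SAMPLES:
        print(would_reject(header), header)
```

The last sample shows one way a bounce can slip through: the patterns expect `report-type` to be the first parameter. To test the real table rather than a transcription, `postmap -q "<header line>" regexp:/etc/postfix/header_checks` queries it directly.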

In SpamAssassin

As before, install the Virus-bounce ruleset and set it up. This will catch challenge-response mails, "out of office" noise, "virus scanner detected blah" crap, and bounce mails generated by really broken groupware MTAs — the stuff that gets past the Postfix front-line.

Dead laptop time

Argh. My Thinkpad’s power socket must have received a knock during the move. It no longer works with either of the two power bricks I have here — so it looks like it’s time to either (a) buy a soldering iron and some screwdrivers (incl Torx ones?) or (b) renew my IBM warranty service and send it in for some fixing :(

Bad timing.

Update: oh look, it’s working again! phew. I guess I should probably set aside some time for warranty service here anyway though…

Back

Hey — I’m back, rested and full of tasty, tasty Niçois and Provencal cuisine.

I got back just in time to vote, for what good that did with Bertie’s gang leading strongly in the current counts… argh!

For what it’s worth, I gave Patricia McKenna a preference, in the end. I was reminded that she’d been entirely on our side on software patents during her time as an MEP — so credit where it’s due, there; on top of that, a vote for the Greens is better than a vote going to Sinn Fein, after all, no matter what. ;)

Carbon offsetting

I’m off to Nice on vacation for two weeks, starting tomorrow — back on May 25th. See ya then!

In the meantime, and appropriately enough given the jet fuel I’ll be consuming, here’s some interesting stuff from my mate Eoin on carbon offsetting…

‘It’s a fecking minefield to figure out. There are many conflicting standards, some of which sound impressive but are useless in reality.

Steer clear of tree planting, especially outside Europe; even a well-run forestry in Europe will take decades to make any difference.

The best quality-mark appears to be the CDM Gold Standard. The Gold Standard is a recent introduction, a response to the weak, conflicting Kyoto standards and many ad hoc government ones. Gold Standard specifically excludes tree plantations.

The following operators are the only ones I found that are Gold Standarded and also pass the bullshit smell test (which is far more stringent ;-) thanks to all who supplied links etc. — eoin

  • My Climate — Seem good. run out of Switzerland. Professional vibe. Mainly projects in the developing world.
  • Atmosfair — like the swiss one except smaller and German. Again, seems professional, their projects page in particular reads well. Doing a German schools project as well as developing world ones.
  • Climate Friendly — Aussies. Mainly wind power, in Oz & NZ. Again seem good, have been around for a few years. Website is decent if a bit all over the place.
  • Sustainable Travel International — more an eco-holidays travel agent than offsetting per se. Useful bookmark.
  • Puretrust.org.uk — These guys seem good. Interesting business model. They buy high quality carbon credits, from mainly Gold Standard providers, and retire these credits. Permanent retirement, I think, though this wasn’t 100% clear on their site. So they both support the providers directly by doing business with them, and also jack up the market price by reducing supply. This supply choke isn’t something that the rest of them do, at first glance anyway. Clever idea. As the market price gets higher it will put pressure on companies to reduce their emissions, not just buy their way out of it.’

Now it’s worth noting that this is the state of play as of May 2007; it’ll definitely change pretty quickly as time goes on. Good info, though.

Eircom broadband — it’s never easy

Argh, it’s never easy.

After this post, the consensus was that nowadays, Eircom have a pretty good quality of service for their DSL offerings, taking both price and service into account. I was happy enough to go with that, so I ordered their "Eircom broadband always on 2MB and Eircom talktime anytime bundle", back around the middle of April.

I had a great call with the sales agent, Hazel. Everything went swimmingly, we were all set for the modem to be delivered and the service to be up and running in 10 working days, by April 30th. I asked for an order reference number and she said I didn’t need one, it was all handled in their system. Great!

Unfortunately it seems the call centre staff never got that quality-of-service memo.

Come May 1st, there was no sign of the modem, so I rang Eircom’s order line to see how things were going. To my horror, the staff I talked to told me that there was no record of my previous order, or call… it was as if that call had never taken place at all. No part of the order had even started.

As a result, I’ve had to reorder from scratch. The previous 10 working days we’ve waited count for nothing. (The agents lie through their teeth about this, though: one agent says they’ll send it out in the "next 3-5 days", the next agent insists that we have to wait the full 10 days, and the next says somewhere in between; anything to get us off the line within 4 minutes.)

This is bad news, since we’re waiting on the broadband to move in — since I work from home, we can’t move in until we have a good ‘net connection.

We can’t even make a complaint to Eircom about this fuckup, because they refuse to take complaints without the original order number to reference — the one that "Hazel" told me wasn’t needed anymore. Now that’s bureaucracy. Attempts at escalation just wound up with a dead end, where supervisors had no names and had left the office at 10am anyway. >:(

Best of all, their online complaints system now takes a maximum message length of 400 characters, so you can’t even provide a detailed written complaint online anymore. (That is, not unless you submit the complaint in 15 separate parts…)

What a fiasco.

So we now have to wait until May the 15th. We’ve submitted the complaint via the aforementioned 15 parts, and postally; if they don’t take action on those, we’ll complain to Comreg (and let’s see what that’s worth).

But here’s a question — assuming they fail to deliver the second order within time this time around, can we cancel at that stage? There’s a minimum contract length of 6 months, but since the service hasn’t been delivered, I would hope that hasn’t started yet. The terms and conditions document says:

"Ready for Service date" (otherwise "RFS date") means the date on which eircom establishes the Facility for the Customer.

3.1 This Agreement shall commence on the Ready for Service date and shall be for the Initial Period. Provided that this Agreement has not been terminated in accordance with its terms or in accordance with the Regulations, this Agreement shall thereafter automatically renew for successive six-month periods. For the purposes of this clause 3, a six-month period will be calculated from the anniversary of the RFS date.

3.2 The Customer may cancel its order for the Facility at any time prior to the RFS date. In the event of such cancellation by the Customer it shall be obliged to return any Kit, which may have been provided to it by eircom. Any Kit shall be returned to eircom by posting it to the freepost address detailed in the welcome pack. In the event of any Kit not being returned to eircom within fourteen (14) days of the cancellation of the Order for the Facility, the Customer shall be charged by eircom and shall pay to eircom such sum as is set out in the Regulations as being the charge payable in respect of the non-return of any Kit.

So I guess as long as the facility — the ADSL line — is not up and running, I’m clear to cancel, right? It’s a little worrying that the "facility" doesn’t include the "kit" — ie. the broadband modem, though; if they fuck up sending out the modem, but the line is up, am I liable for 200 Euros?

In terms of who are viable options to switch to — in my opinion it’s got to be fixed wireless, since everyone else now would have to go via Eircom’s exchanges anyway, and be delayed there. So — Irish Broadband. I know they had some pretty massive problems 2 or 3 years ago, but recently I’ve been hearing good things about them, Boards.ie has some reasonably good-sounding recent experiences, and half of my new neighbours (srsly!) are using them with great results. Anyone got recent news about how useful they are with service quality and install speed for their Breeze product in the D9/D11 area?

Alternatively, Ripwave might make a reasonable stop-gap option? 120 euros is the minimum fee (6 months at 18.95 per month), which is better than the money I’m paying now to live in two houses…

Alternatively anyone know an Eircom engineer in D9/D11 that can nip over to the exchange and plug in my connection on the DSLAM? ;)

Moin Moin attachment spam

Here’s a new trick used by the web spammers: attachments on a Moin Moin wiki. The taint.org/wk RecentChanges list (http://taint.org/wk/RecentChanges?max_days=60) illustrates it well:

2007-05-07  set bookmark
[UPDATED]       UserPreferences         04:17   Info    ?StepStep [1-21]        
  #01 Upload of attachment 'big-cocks.html'.
  #02 Upload of attachment 'big-cock.html'.
  #03 Upload of attachment 'big-boobs.html'.
  #04 Upload of attachment 'big-ass.html'.
  #05 Upload of attachment 'bdsm.html'.
  #06 Upload of attachment 'bbw.html'.
  #07 Upload of attachment 'bang-bros.html'.
  #08 Upload of attachment 'bangbros.html'.
  #09 Upload of attachment 'baby.html'.
  #10 Upload of attachment 'asian-porn.html'.
  #11 Upload of attachment 'asian-girls.html'.
  #12 Upload of attachment 'anime-porn.html'.
  #13 Upload of attachment 'anime-girls.html'.
  #14 Upload of attachment 'angelina-jolie.html '.
  #15 Upload of attachment 'amature.html'.
  #16 Upload of attachment 'amatuer.html'.
  #17 Upload of attachment 'adult-videos.html'.
  #18 Upload of attachment 'adult-stories.html' .
  #19 Upload of attachment 'adult-games.html'.
  #20 Upload of attachment '69.html'.
  #21 Upload of attachment '3d.html'.

Great. Lots of spam. This first started appearing on Feb 27 2007, in a multi-upload attack on a single page ("FindPage"), from IP address 212.26.129.162; then reoccurred on Apr 27 and May 7 from the (insecure open proxy) proxy.drevlanka.ru.

Annoyingly my "subscribe to wiki changes" patch doesn’t catch this — these aren’t gatewayed through as "changes" via mail for review. I need to fix that in my copious free time. :(

Also, the RecentChanges RSS feed doesn’t list them, although the HTML form does.

So unfortunately, the only way I can see to block this is either to review by visiting the RecentChanges page in a web browser regularly (how retro!), and delete them retrospectively, or simply to turn off attachments entirely — which is what I’ve done, by editing "wikiconfig.py" and adding:

    actions_excluded = ['AttachFile']
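For anyone who would rather leave attachments switched on and review uploads instead, the RecentChanges text is easy enough to scan. A rough sketch, assuming the page text has been saved somewhere in the same format as the listing above; the function name and regexp are made up for this example and are not part of Moin Moin.

```python
import re

# Matches lines such as:  #20 Upload of attachment '69.html'.
UPLOAD_LINE = re.compile(r"#(\d+)\s+Upload of attachment '([^']*)'")


def attachment_uploads(recent_changes_text):
    """Return the attachment filenames mentioned in a chunk of RecentChanges text."""
    return [m.group(2).strip() for m in UPLOAD_LINE.finditer(recent_changes_text)]


# Three lines copied from the listing above, stray space included.
SAMPLE = """\
  #14 Upload of attachment 'angelina-jolie.html '.
  #20 Upload of attachment '69.html'.
  #21 Upload of attachment '3d.html'.
"""

if __name__ == "__main__":
    for name in attachment_uploads(SAMPLE):
        print(name)
```

Feeding it the page text from a cron job and mailing the result would give a crude notification, though the uploaded files would still need deleting by hand.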

It looks like quite a few other wikis around the web are running into the issue too :(

SpamAssassin 3.2.0!

W00t! SpamAssassin 3.2.0 has finally gone gold!

This release is a big one; it’s the first major release since 3.1.0, back in September 2005, just over a year and a half ago. Here is the release announcement mail (http://www.nabble.com/ANNOUNCE%3A-Apache-SpamAssassin-3.2.0-available-tf3680367.html), containing a list of major changes since version 3.1.8. There are a few major new features that I feel are worth picking out in more detail and editorialising about:

sa-compile

This is a biggie. This new script takes the active SpamAssassin ruleset, and uses code contributed by Matt Sergeant to produce input for re2c. re2c in turn compiles the ruleset into a deterministic finite automaton, which can match multiple regular expressions in parallel. That’s not all, though; re2c then compiles that DFA into C code — which is then compiled into native object code. SpamAssassin will then load that object code and use it to replace the slower perl regexp tests, if it’s available at scan-time.

Now, it’s been a long time since SpamAssassin’s ruleset consisted mainly of rudimentary regular expressions matched against the body text; a good portion of SpamAssassin’s ruleset these days operates against headers, performs network lookups, analyzes URLs extracted from the body, uses the more advanced features supported by Perl’s NFA regexp engine, and so on. But even given that, the effects of ‘sa-compile’ seem to average between a 15% and 25% speedup, in my testing. That’s good ;)

Many of the commercial versions of SpamAssassin include their own body-rule speedups — but this is the first time anything similar has made it into the open source code.
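To illustrate what "matching in parallel" buys you, here is a toy Python sketch of the general idea: build one automaton from many patterns, then walk the text once and report every rule that hit. This is not what re2c or sa-compile actually generate. It is a plain Aho-Corasick matcher, it handles only fixed strings rather than regular expressions, and the rule names and phrases are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build an Aho-Corasick automaton from a {rule_name: literal_text} dict."""
    goto = [{}]      # goto[state][char] -> next state
    fail = [0]       # fail[state] -> state to fall back to on a mismatch
    out = [set()]    # out[state] -> rule names that match when we reach state
    for name, text in patterns.items():
        state = 0
        for ch in text:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(name)
    # Breadth-first pass to fill in the failure links.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] |= out[fail[nxt]]
            queue.append(nxt)
    return goto, fail, out


def scan(text, automaton):
    """Walk the text once and return the names of all rules that matched."""
    goto, fail, out = automaton
    state = 0
    hits = set()
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        hits |= out[state]
    return hits


# Hypothetical body rules, invented for this example.
RULES = {"CLICK_HERE": "click here", "HERE": "here", "PILLS": "cheap pills"}

if __name__ == "__main__":
    automaton = build_automaton(RULES)
    print(sorted(scan("please click here now", automaton)))
```

The point is that the cost of `scan` depends on the length of the text, not on the number of rules, which is roughly where the speedup over running each regexp in turn comes from.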

Short-circuiting

Another good one for performance. There are some rules that, in a well-configured setup, you can reasonably assume will never hit nonspam mail, and others that will never hit spam. For example, a hit on "ALL_TRUSTED" should mean that the message never traversed an untrusted network, therefore it cannot be spam, so why bother applying the expensive tests? It should be reasonable to "short-circuit" and immediately return a "ham" score for that mail.

This new plugin implements that algorithm — and efficiently, too, which historically has been the hard part!

I’ve been using this for a while with a ruleset like this one (http://wiki.apache.org/spamassassin/ShortcircuitingRuleset); in my experience, it’s cut overall CPU time spent scanning mail by 20%.

It is pretty flexible, too; there’s a lot of tweakage that can be done with this functionality to suit your own setup.

Reduced memory footprint

One aim of this release has been to reduce the memory usage of SpamAssassin; the core code now uses less RAM than 3.1.x does, when tested with the same ruleset. (Unfortunately we’ve added lots more rules in the interim, so it’s a bit of a wash overall. ;)

The VBounce anti-bounce ruleset

Detects spurious bounce messages sent by broken mail systems in response to spam or viruses. More info about that here.

Apache-spamd

apache-spamd implements spamd as a mod_perl module. This was contributed by Radoslaw Zielinski, as a Google Summer of Code project last year. Thanks Radoslaw!

There are plenty more new, useful features and rules — these are just the top ones, in my opinion. Pretty cool stuff!

Patricia McKenna and MMR, again

Great! Patricia McKenna just called around, canvassing our area — and just got a serious telling off from the wife ;)

Catherine — unsurprisingly, given that she’s a zoology Ph.D — was fantastic, hitting every key point of the issue: that we’re both long-time Green voters who’ve been forced to not vote Green this time around, due to this MMR issue and the anti-science/pro-hokum angle it represents.

Interestingly, McKenna claimed that her stance on MMR was always her own point of view and wasn’t party policy, and that the claim that it was mentioned on the party website was a rumour put about by the PDs.

While it turns out that Dr. Ruairi Hanley, the author of this letter to the Indo (http://www.unison.ie/irish_independent/stories.php3?ca=53&si=1773565&issue_id=15238), is indeed a PD (didn’t realise that!), Treasa at Winds and Breezes also noted it appearing on the Green Party site, as follows:

Questioning the Benefits of Immunisation

There are significant question marks about the effectiveness of mass immunisation programs. We would launch a major study of the benefits of these programs looking at all aspects of health

So Treasa — are you a stealth PD rumour-monger? ;)

Worth noting that at no time did McKenna reassure C that her policy would not become government policy if the Greens were elected… as an elected representative, surely her own policies would influence the government’s thinking?

Screenclick devolve again

After a short period where things were looking up, Screenclick have once again reverted to type, by ditching the lovely simple Netflix-style queue they seemed to be using, and instead instituting some new kind of bizarre homebrew weirdness.

It looks like a queue, with a line-by-line listing of movies — but then beside each title, there are 3 radio buttons: "High", "Medium", and "Low".

The instructions run as follows:

All titles are sorted in alphabetical order within their priority group
  • High: Please deliver these titles as soon as possible
  • Medium: Please deliver these titles as they become available
  • Low: I don’t mind when you send these titles

So what, does this mean that if I put a title in as "High", I’m going to receive it next, or not, or what? And what’s with the alphabetical order? WTF is going on? Argh.

Anyway, I just got out "Amores Perros", presumably due to this alphabetical ordering thing. Not what I wanted at all. What a mess.