Justin's Linklog Posts

VISA and priorities

A couple of years ago, various anti-spammers discussed how the credit-card payment processing companies were perfectly placed to disrupt the spam economy, by tracking down spammers through “poison pill” transactions. Nothing happened from that, though, and spam is now a bigger problem than ever.

Today, I hear that the Russian MP3 site, AllOfMP3, have lost their account with Visa to process credit-card payments.

In other words, it sounds like the banks are happy enough to close off filesharing, but couldn’t be bothered dealing with spam…

Ireland now has RFID passports

Back in February, I wrote about some Dutch hackers remotely reading Dutch RFID passports, and my email to the Irish Passport Office enquiring about their plans.

They never bothered writing back; I guess they were too busy implementing the damn things :( Their new ‘ePassports’ are now mandatory for new Irish passports:

The chip technology allows the information stored in an Electronic Passport to be read by special chip readers at a close distance.

“special chip readers at a close distance” and/or “random criminals looking for Irish victims at a distance of 30 feet”, I guess.

Here’s the slides for Riscure’s attack on the Dutch passports. Irish passports are similarly using “Basic Access Control”. I wonder if Irish passport numbers are sequential, since that seems to be a key part of their attack?

DIY Glory

It’s been a while since I’ve embarked on a DIY job around the house with quite as much success as the most recent one — laying and tacking down some new carpet in the front hall. The last job was a bike rack, which had to be abandoned after the 4-inch screws proved too loose and threatened to fall out of the wall, leaving gigantic plugs of Polyfilla in their place (I’m sure bad drilling had nothing to do with it).

This has all now been forgotten in the glory of the freshly-laid carpet. Now, every time I walk past the front hall, I have to stick my head in and check out the perfectly-fitted carpet with pride. This can only last so long before my next botch job, of course…

Anti-spam group under attack — via ICANN

[This is a copy of an article I submitted to ICANNWatch.]

Spamhaus, the UK-based non-profit that runs the SBL and XBL anti-spam DNS blocklists, is reportedly facing serious legal trouble in the US.

A US-based spam gang has started legal action to have Spamhaus’ domain name confiscated by ICANN, and reportedly, Spamhaus may have been advised badly by their US legal people; so there is now a danger that they *may* indeed lose their domain, and possibly worse.

Note that Spamhaus is entirely UK-based, bar some mirrors; however, the proposed order is aimed at ICANN, which is US-based. This is the really tricky part; can a US company kill the domain of a non-US group?

According to anti-spam lawyer Matthew Prince, ‘there may be some time before ICANN is formally ordered to shut down the Spamhaus domain, but make no mistake that ICANN’s lawyers will be considering their options beginning first thing Monday, if they haven’t already begun the conference calls tonight’ … ‘In the end, [ICANN’s] decision is likely to be much more about setting a general policy than the specific details of who Spamhaus is or why they are critical for the Internet. ICANN will desperately want to stay out of this dispute, but they are subject to U.S. law and they will probably have attorneys who will argue they need to follow it. All it will take for this to end badly for Spamhaus is one lawyer at ICANN getting a little bit spooked and Spamhaus could lose not only it’s .org but potentially any other TLD that ICANN controls.’

This is interesting — if Spamhaus is forced to close down its domains and US-based mirrors, that will mean that the SBL and XBL blocklists will be down for a while, too. Typically those are used for up-front blocking, and if my servers are any indication, they take care of 75% of incoming spam before it hits any more CPU-intensive filtering.

Without those, there’ll be a lot of sites around the net suddenly dealing with quadrupled spam volumes hitting their MTAs.
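The "quadrupled" figure follows directly from the 75% estimate: if the blocklists stop three quarters of incoming spam up front, only one quarter currently reaches the heavier filters, so losing the blocklists multiplies that load by four. A quick sanity check, assuming the 75% from my own servers holds elsewhere (the function name is just for illustration):

```python
def load_multiplier(blocked_fraction: float) -> float:
    """How much more spam reaches later filters if up-front blocking disappears."""
    if not 0 <= blocked_fraction < 1:
        raise ValueError("blocked_fraction must be in [0, 1)")
    return 1.0 / (1.0 - blocked_fraction)

print(load_multiplier(0.75))  # 4.0
```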

NEDAP voting machines hacked

Here’s a press release from ICTE that’s well worth a read if you still trust voting machines:

Concerns expressed by many IT professionals about the security of the e-voting system chosen for use in Ireland were today shown to be well-founded when a group of Dutch IT Specialists, using documentation obtained from the Irish Department of the Environment, demonstrated that the NEDAP e-voting machines could be secretly hacked, made to record inaccurate voting preferences, and could even be secretly reprogrammed to run a chess program.

The recently formed Dutch anti e-voting group, “Wij vertrouwen stemcomputers niet” (We don’t trust voting computers), has revealed on national Dutch television program “EenVandaag” on Nederland 1, that they have successfully hacked the Nedap machines — identical to the machines purchased for use in Ireland in all important respects.

ICTE representative Colm MacCarthaigh, who has seen and examined the compromised Nedap machine in action in Amsterdam, notes “The attack presented by the Dutch group would not need significant modification to run on the Irish systems. The machines use the same construction and components, and differ only in relatively minor aspects such as the presence of extra LEDs to assist voters with the Irish voting system. The machines are so similar that the Dutch group has been using only the technical reference manuals and materials relevant to the Irish machines as a guide, as those are the only materials publicly available.”

Maurice Wessling, of Wij vertrouwen stemcomputers niet, adds “Compromising the system requires replacing only a single component, roughly the size of a stamp, and is impossible to detect just by looking at the machine”.

Both ICTE and Wij vertrouwen stemcomputers niet view this as yet another demonstration that no voting system which lacks a voter-verified audit trail can be trusted. According to ICTE spokesperson Margaret McGaley “Any system which lacks a means for the voter to verify that their vote has been correctly recorded is fundamentally and irreparably flawed”.

Margaret McGaley highlighted that it is the machines themselves that are at risk. “This particular issue is not about the vote counting software, which we already know must be replaced, this is about the machines that the Taoiseach has claimed were ‘validated beyond any question’. We now have proof that these machines can be made to lie about the votes that have been cast on them. It is abundantly clear that these machines would pose a genuine risk to our democracy if used in elections in Ireland.”

ICTE is repeating its call, which reflects the opinions shared by IT expert groups, including the E-voting group of the Irish Computing Society, that any voting system implemented must include a voter-verified audit-trail.

This is a major exploit. Colm’s earlier mail noted

As we knew already, the machines run on m68k processors, and it’s relatively easy to reverse engineer what all of the registers and inputs correspond to. The Dutch group were able to successfully assemble code to run on the machine, and even burn it on the very EEPROM that comes in the machine.

Since the NEDAP design does not include XBox-style boot-time cryptographic verification of the EEPROM’s contents, undetectable replacement of the operating system is a 2-minute matter of unsticking the trivial ‘seals’ on the voting machine’s access panels, popping out an EEPROM chip, and replacing with a modified one, then closing it up again.
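To make the missing check concrete, here is a minimal sketch of what a boot-time integrity check looks like in principle. It is not how the NEDAP firmware or the Xbox works in detail: the function names, the image bytes, and the idea of a digest stored in tamper-resistant boot ROM are all assumptions for illustration, and a real design would more likely verify a public-key signature than a bare hash.

```python
import hashlib
import hmac

def digest_of(image: bytes) -> bytes:
    """SHA-256 digest of a firmware image."""
    return hashlib.sha256(image).digest()

def boot_allowed(eeprom_image: bytes, trusted_digest: bytes) -> bool:
    """Refuse to boot unless the EEPROM contents match the trusted digest.

    trusted_digest is assumed to live somewhere an attacker with a
    screwdriver cannot rewrite (hypothetical: mask ROM inside the CPU).
    """
    return hmac.compare_digest(digest_of(eeprom_image), trusted_digest)

# Hypothetical firmware images, for illustration only.
official = b"official vote-recording firmware"
modified = b"firmware that plays chess instead"

trusted = digest_of(official)
print(boot_allowed(official, trusted))  # True
print(boot_allowed(modified, trusted))  # False
```

Without a check of this kind, the machine simply runs whatever chip is in the socket, which is the gap the chip-swap attack relies on.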

Once that’s done, the election is rigged, as WVSN have demonstrated.

Update: here’s their paper describing the attack in detail — well worth a read.

a plug for Map24

Nat at O’Reilly Radar mentions that Multimap have added a public API. It’s great to see more sites adding public APIs, but sadly, as I note in a comment there, Multimap isn’t any use for me — they, along with Google and Yahoo!, have really crappy Irish mapping. Their geocoders (the part that turns an English-language address into a GIS coordinate pair) are pretty much non-functional for Ireland.

I moved from the US to Ireland earlier this year and found this pretty frustrating, after the joys of using the US mapping sites to get driving directions etc.

Thankfully, another contender has emerged recently — Map24.

They have a great geocoder for Ireland, and very reliable directions, which are even accurate for some of the more baroque one-way-system traffic-management changes that Dublin’s city planning department have come up with recently. The look and feel of the website is a little clunky in Firefox — not as smooth as Google’s — but it has some nice AJAXy touches now and seems to be heading in the right direction.

Interestingly, they now offer a public API for third-party mashups, and even offer an API for their geocoder — so someone preferring the Google look and feel could mash that up, using Map24 to find the coordinates and Google to display an area map! (Actually, I think that may be how John Handelaar’s earlier hack worked — I note in the comments that he mentions Map24 provide Lycos’ mapping backend. aha.)

Anyway — Map24 — if you’re looking for a good Irish mapping/driving-directions site, it’ll do the trick.

Some p0f Data From Craig

Regarding the use of p0f, passive OS fingerprinting, as an anti-spam measure — on top of this analysis which I linked to a few weeks back, one of the emeritus SA guys, Craig Hughes, sends over some p0f experiences. Handily, this includes a more detailed breakdown by OS release:

I’ve been using the SA p0f plugin for nearly a month or so now both on gumstix’s web server and my hughes-family.org server, and it actually looks like it could be pretty useful. So far I’ve just been scoring 0.001 for each OS to collect data, but here’s the results amavis has logged:

This breakdown shows what %age of the stuff coming in via OS xyz is spam or ham. ie 84.6% of all mail received from Windows-2000 is spam, 14.9% is ham (the rest is viruses). The first numeric column is number of messages of each type. Statistics are only since the last time amavis restarted:

On his home machine (comcast cable modem connection) :

spam.byOS.Windows-2000 438 1/h 84.6 %
spam.byOS.Linux 417 1/h 18.3 %
spam.byOS.Windows-XP 265 1/h 97.8 %
spam.byOS.UNKNOWN 135 0/h 55.1 %
spam.byOS.Windows-XP/2000 24 0/h 100.0 %
spam.byOS.Novell 5 0/h 100.0 %
spam.byOS.Windows-98 3 0/h 60.0 %
spam.byOS.Windows-2003 2 0/h 66.7 %
spam.byOS.FreeBSD 2 0/h 1.3 %
spam.byOS.Solaris 1 0/h 1.8 %
spam.byOS.Windows-SP3 1 0/h 100.0 %
ham.byOS.Linux 1851 6/h 81.2 %
ham.byOS.FreeBSD 143 0/h 96.0 %
ham.byOS.UNKNOWN 102 0/h 41.6 %
ham.byOS.Windows-2000 77 0/h 14.9 %
ham.byOS.Solaris 56 0/h 98.2 %
ham.byOS.NetCache 6 0/h 100.0 %
ham.byOS.Windows-XP 6 0/h 2.2 %
ham.byOS.Tru64 2 0/h 100.0 %
ham.byOS.AIX 2 0/h 100.0 %
ham.byOS.Windows-98 2 0/h 40.0 %
ham.byOS.Windows-2003 1 0/h 33.3 %

On gumstix.com (hosted at some provider in Texas):

spam.byOS.Windows-2000 401 1/h 58.4 %
spam.byOS.Windows-XP 131 0/h 92.9 %
spam.byOS.UNKNOWN 64 0/h 18.7 %
spam.byOS.Windows-XP/2000 29 0/h 96.7 %
spam.byOS.FreeBSD 11 0/h 4.1 %
spam.byOS.Linux 11 0/h 0.5 %
spam.byOS.Windows-98 6 0/h 85.7 %
spam.byOS.Solaris 4 0/h 3.3 %
spam.byOS.Windows-SP3 2 0/h 100.0 %
ham.byOS.Linux 1983 4/h 97.6 %
ham.byOS.UNKNOWN 277 0/h 80.8 %
ham.byOS.Windows-2000 271 0/h 39.4 %
ham.byOS.FreeBSD 253 0/h 93.7 %
ham.byOS.Solaris 116 0/h 96.7 %
ham.byOS.NetCache 40 0/h 100.0 %
ham.byOS.Windows-XP 9 0/h 6.4 %
ham.byOS.Windows-NT 7 0/h 70.0 %
ham.byOS.Novell 3 0/h 100.0 %
ham.byOS.Windows-XP/2000 1 0/h 3.3 %
ham.byOS.Windows-98 1 0/h 14.3 %
ham.byOS.Windows-2003 1 0/h 100.0 %

my home machine has a lot more relayed mail coming to it (all my various craig@* email addresses forward into there) which is probably why the linux spam rate is higher there — the relaying machines are probably running linux and forwarding spam through.

Interesting figures — but I’m still not convinced that the correlation is quite high enough to form a good enough basis for solid anti-spam rules; reliable rules in the SpamAssassin core typically have over 95% accuracy at differentiating ham from spam (at least when we first check them in).
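As a rough check against that 95% bar, the sketch below recomputes a spam ratio per OS from the gumstix.com message counts quoted above. It ignores viruses (the amavis percentages include them, so the numbers differ slightly), and the 95% threshold is just the rule of thumb mentioned here, not a SpamAssassin setting.

```python
# (spam, ham) message counts per OS, copied from the gumstix.com figures.
counts = {
    "Windows-2000": (401, 271),
    "Windows-XP": (131, 9),
    "Windows-XP/2000": (29, 1),
    "Windows-98": (6, 1),
    "FreeBSD": (11, 253),
    "Linux": (11, 1983),
    "Solaris": (4, 116),
}

def spam_ratio(spam: int, ham: int) -> float:
    """Fraction of non-virus mail from this OS that was spam."""
    return spam / (spam + ham)

THRESHOLD = 0.95  # rule-of-thumb accuracy for a core rule

for os_name, (spam, ham) in sorted(counts.items()):
    ratio = spam_ratio(spam, ham)
    verdict = "passes" if ratio >= THRESHOLD else "falls short"
    print(f"{os_name:16s} {ratio:6.1%}  {verdict}")
```

On these counts only the small Windows-XP/2000 bucket (30 messages) clears the bar; Windows-2000, the largest single source of spam, is nowhere near it.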

Update: it’s a natural for use as a Bayes token, though. The way amavisd-new implements p0f support is perfect for this use.

BTW, my guess is that many of the spam hits for “linux” are due to things like Netgear/Linksys routers, running embedded linuces. No evidence, just guessing ;)

Linus on Bayesian filtering

Linus Torvalds, in a post to linux-kernel today:

I’m sorry, but spam-filtering is simply harder than the bayesian word-count weenies think it is. I even used to know something about bayesian filtering, since it was one of the projects I worked on at uni, and dammit, it’s not a good approach, as shown by the fact that it’s trivial to get around.

I don’t know why people got so excited about the whole bayesian thing. It’s fine as one small clause in a bigger framework of deciding spam, but it’s totally inappropriate for a “yes/no” kind of decision on its own.

If you want a yes/no kind of thing, do it on real hard issues, like not accepting email from machines that aren’t registered MX gateways. Sure, that will mean that people who just set up their local sendmail thing and connect directly to port 25 will just not be able to email, but let’s face it, that’s why we have ISP’s and DNS in the first place.

But don’t do it purely on some bogus word analysis.

If you want to do word analysis, use it like SpamAssassin does it – with some Bayesian rule perhaps adding a few points to the score. That’s entirely appropriate. But running bogo-filter instead of spamassassin is just asinine.

Me, I like bogofilter — those guys are cool, and it’s a great anti-spam product for many purposes. But of course I have to agree with Linus that the correct approach in most cases is a bigger picture than just Bayes alone, a la SpamAssassin ;)

Back in one piece

Well, I’m back in Dublin in one piece, after a great honeymoon in Corsica. Lots of stuff to catch up on, so if you’re waiting on a response, sorry, it might take a little longer…

Hitched! Pt. 2

Well, the second half of the wedding — the fun part, with dinner, dancing, friends, and family — went off without a hitch. Our hippy-crap-laden humanist ceremony, celebrated with the aid of our friend Gerry, was a great success; the pianist and various DJs provided fantastic aural accompaniment; and the venue, Markree Castle in County Sligo, was fantastic, taking care of the entire party in every way we hadn’t foreseen and putting up with us far into the early hours of the next day.

That was the most fun I’ve had in yonks, and thanks to everyone who came. (And those who didn’t, due to the whims of US visa conditions — you were much missed.)

Photos will follow once we’re back from the honeymoon, which starts tomorrow morning. later ;)

BarCamp Ireland

Wow, BarCamp Ireland is really shaping up!

Unfortunately, it’s very unlikely that I’ll be able to make it, due to all the wedding/honeymoon activity around that time (and it being down in Cork, which is a bit of a nontrivial journey at the moment). Pity, it looks like it’ll be great — and could probably do with some more talks about open source, to go with all the web2.0/startup content ;)

SpicyLinks and del.icio.us Network Summarization

Ross Mayfield:

Every time I see Gabe Rivera of TechMeme, I ask for the same thing — MeMeme. Give me TechMeme where the core index is based on who I read, about 150 people at any given time, to show me what my friends are interested in.

Funnily enough, that is exactly why I wrote SpicyLinks!

It works pretty well — in fact, nowadays I don’t really bother reading slashdot, Digg, Reddit, et al, particularly frequently, because I know that all the really interesting stuff will be at the top of my newsreader in the SpicyLinks feed.

Anyway, I’ve been calling SpicyLinks a ‘summarizing aggregator’, but the discussion that arose from Ross’ posting inspired me. A little bit of hacking has come up with an interesting twist: take a del.icio.us social network, a CGI script called deliciousnetwork2opml.cgi, and 15 minutes hacking on SpicyLinks to support inclusion of OPML via a remote URI, and hey presto — it’s now a social-network summarising aggregator. ;)

Unblocked

I just found an error in an Apache config file for taint.org, resulting in some of the legacy RSS feed URLs producing invalid data — this meant that anyone subscribed to the Feedburner feed, for example, had been missing out on my witterings. Fixed now — apologies!

Flickr’s Lousy US-Only Maps

Update: This is now fixed. See here for details…

Here’s the 2lmc boys getting rightly annoyed about Flickr’s new mapping feature, which displays geotagged photos overlaid on a mapping UI — as they note, it’s basically a steaming pile of crap outside the US:

However, because Flickr are owned by Yahoo, they’re using their maps. And, like all Yahoo! products, if you’re not American, it sucks.

Compare this lovely data-rich map of SF:

[Image: map of San Francisco]

With this featureless grey blob:

[Image: map of Dublin]

That’s just pathetic — there isn’t a single place name visible, and even the Phoenix Park, the biggest urban park in Europe, is simply displayed just as a light-coloured splat with a road going through it.

It appears the Yahoo! mapping data for the UK and Ireland just isn’t really there. What someone needs to do, is take the geotagging data from Flickr, and overlay it on the far more informative Google map data instead ;):

[Image: Google map of Dublin]

It’s a real shame — I used to rely on Y! Maps to get directions everywhere while in the US. They’re missing out on so many customers here…

Update: good news — the Flickr maps are now things of beauty to match Google’s:

[Image: the updated Flickr map]

Hitched!

Yesterday was spent in the beautiful surrounds of Naas Leisure Centre, attending the Kildare Registry Office for a brief ceremony and some putting of pen to paper — and hey presto, myself and the lovely C are now husband and wife ;) About time, really — we’ve been going out for 13 years, after all.

This is just the legal preliminaries — the big party is two weeks from now, in a castle in Sligo, and it’s shaping up to be a great party. But still, legally, she’s my wife now.

By the way, one bonus of getting the legal stuff out of the way in advance is that we now don’t have to have all the fun marred by legal requirements on the big day. As a result, our mate Gerry, who a few taint.org readers will know, will be presiding over the real wedding ceremony. ;)

The EHIC and Irish government websites

The European Health Insurance Card is dead handy, providing access to healthcare for EU residents while travelling in Europe — it’s definitely worth having one.

There were a few reports in the Irish newspapers last week of an announcement by the Health Service Executive, warning of “a bogus website” which charges a fee of EUR22 to process applications for this:

The HSE also warned that the site is asking applicants to submit detailed financial information. “It has come to the attention of the Health Service Executive that Irish residents are being targeted by a website which is unnecessarily charging people to apply for EHIC cards. The bogus site concerned — http://www.ehic-card.eu/ — is not connected to the HSE,” said the HSE in a statement.

I’d link to the HSE’s press release on the topic, but it’s down, apparently — and that’s pretty indicative of the problem. You see, I’ve been trying to apply for one of these recently.

The HSE has been announcing that there’s no need to use this “bogus site”, since we can just use the “real” site at http://www.ehic.ie/ to apply for one. Here’s what they neglect to mention:

  • (a) that unless you’re a pensioner you can’t apply for one online — you have to print out a form, fill it in, and post it to your local health office.
  • (b) there’s no indication on the site as to what exactly your “Local Health Office” may be, just a long list of mysterious locations.
  • (c) in order to apply, the form demands that you supply all that ‘detailed financial information’ — namely your name, address, date of birth, proof of residency, and PPS number — anyway.
  • (d) the “bogus site” isn’t really all that bogus after all.

If they had a simple and usable online application process, perhaps they wouldn’t be plagued by other sites attempting to offer that service for what is really a quite reasonable EUR22 fee?

This is a pretty frequent phenomenon on Irish governmental websites; a half-assed attempt to bring governmental services online, resulting in shiny informational sites, full of clip-art of smiling people talking on the phone, which all come down to a bottom line of “print this out and post it in” or “call this number” — business as usual. Having said that, at least I can generally still get a human on the phone, which still beats dealing with US government agencies, I guess!

BTW, I notice the HSE claim that it only takes 10 working days for an EHIC to arrive using their system. I applied for mine 3 weeks ago, and there’s been no word yet…

Don’t use bl.spamcop.net as a blocklist

Update: as of Oct 2007, this advice is obsolete. The Spamcop algorithms have been greatly improved, as far as I and others can tell.

I’ve been hearing increasing reports of false positives using bl.spamcop.net.

One today spurred me to check out exactly how many times I’m seeing it misfire on nonspam in my own mail collection. The results have been pretty astonishing.

In my nonspam collection, it fired on 1043 messages out of 8415 in July; 12.4% of the mail. It gets worse for August, though — 884 messages out of 3729 since the start of August. That’s a staggering 23.7% of my nonspam mail this month. ;)

Most of that is due to the listings of GMail and Yahoo! Groups, both of which seem to have been listed for large swathes of the past month and a half.

Now, an important point — it can work pretty well as a single input to a scoring system, like Spamcop itself or SpamAssassin. In fact, I didn’t lose any mail as a result of those listings; SpamAssassin assigns only 1.5 points to the RCVD_IN_BL_SPAMCOP_NET rule, so it’s easily corrected by other rules.

However, people using it to block or reject spam outright, or who’ve changed the score of the RCVD_IN_BL_SPAMCOP_NET rule, need to turn that off ASAP — as they are losing mail.
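The difference between scoring and outright blocking can be sketched in a few lines. The 1.5 points for RCVD_IN_BL_SPAMCOP_NET is the figure quoted above, and 5.0 is SpamAssassin’s usual default threshold; the other rule name and its score are made up for illustration, and this is a toy model, not how SpamAssassin is actually implemented.

```python
SPAM_THRESHOLD = 5.0  # SpamAssassin's usual default required score

# Rule scores: the first is quoted in the text, the second is hypothetical.
SCORES = {
    "RCVD_IN_BL_SPAMCOP_NET": 1.5,
    "HYPOTHETICAL_PILLS_SUBJECT": 4.0,
}

def is_spam_by_score(hits: list[str]) -> bool:
    """Scoring approach: spam only if the total points reach the threshold."""
    return sum(SCORES[rule] for rule in hits) >= SPAM_THRESHOLD

def is_spam_by_block(hits: list[str]) -> bool:
    """Blocking approach: a single blocklist hit rejects the message."""
    return "RCVD_IN_BL_SPAMCOP_NET" in hits

# A legitimate mailing-list message whose relay happens to be listed.
ham_hits = ["RCVD_IN_BL_SPAMCOP_NET"]
print(is_spam_by_score(ham_hits))  # False: 1.5 points is well under 5.0
print(is_spam_by_block(ham_hits))  # True: the mail is lost
```

With scoring, the listing only matters when other rules also fire; with blocking, the listing alone decides.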

More parallel string-match algorithm hacking: re2xs

Last week, Matt Sergeant released a great little perl script, re2xs, which takes a set of simplified regexps, converts them to the subset of regular expression language supported by re2c, then uses that to build an XS module.

In other words, it offers the chance for SpamAssassin rules to be compiled into a trie structure in C code to match multiple patterns in parallel. Given that this is then compiled down to native machine code, it has the potential to be the fastest method possible, apart from using dedicated hardware co-processors.

Sure enough, Matt’s results were pretty good — he says, ‘I managed to match 10k regexps against 10k strings in 0.3s with it, which I think is fairly good.’ ;)

Unfortunately, turning this into something that works with SpamAssassin hasn’t been quite so easy. SpamAssassin rules are free to use the full perl regular expression language — and this language supports many features that re2c’s subset does not. So we need to extract/translate the rule regexps to simplified subsets. This has generally been the case with all parallel matching systems, anyway, so that’s not a massive problem.

More problematically, re2c itself does not support nested patterns — if one token is contained within another, e.g. “FOO” within “FOOD”, then the subsumed token will not be listed as a match. SpamAssassin rules, of course, are free to overlap or subsume each other, so an automated way to detect this is required.

For simple text patterns, this is easy enough to do using substring matching — e.g. “FOOD” =~ /\QFOO\E/ . Unfortunately, once any kind of sophisticated regexp functionality is available, this is no longer the case: consider /FOO*OD/ vs /FOO/ , /F[A-Z]OD/ vs /FO[M-P]/ , /F(?:OO|U)D/ vs /F(?:O|UU)?O/ .
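For the plain-text case, the check really is just a pairwise substring test. A minimal sketch, in Python rather than perl; the pattern list and function name are hypothetical examples, not SpamAssassin rules:

```python
def subsumed_pairs(patterns: list[str]) -> list[tuple[str, str]]:
    """Return (inner, outer) pairs where literal `inner` occurs inside `outer`."""
    return [
        (inner, outer)
        for inner in patterns
        for outer in patterns
        if inner != outer and inner in outer
    ]

print(subsumed_pairs(["FOO", "FOOD", "BAR"]))  # [('FOO', 'FOOD')]
```

This only works for literals: it has no way of telling that /FOO*OD/ and /FOO/ can match the same text.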

The only way to do this is to either (a) fully parse the regexp, build the trie, and basically reimplement most of re2c to do this in advance; or (b) change the trie-generation code in re2c to support states returning multiple patterns, as Aho-Corasick does.
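To show what "states returning multiple patterns" means in option (b), here is a compact Python sketch of Aho-Corasick. It handles literal strings only, and it is neither re2c’s algorithm nor anything from the SpamAssassin code; the function names are my own. The point is the `out` table: a state carries every pattern that ends there, including ones inherited through failure links.

```python
from collections import deque

def build_automaton(patterns):
    """Build Aho-Corasick tables: goto transitions, failure links, outputs."""
    goto = [{}]      # goto[state][char] -> next state
    fail = [0]       # failure link per state
    out = [set()]    # patterns recognised on reaching each state

    # Phase 1: insert every pattern into a trie.
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append(set())
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].add(pat)

    # Phase 2: breadth-first pass to set failure links and merge outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # A state inherits the matches of its failure target, which is
            # what lets one state report several patterns at once.
            out[nxt] |= out[fail[nxt]]
    return goto, fail, out

def find_all(text, patterns):
    """Return the set of patterns that occur anywhere in text."""
    goto, fail, out = build_automaton(patterns)
    state, found = 0, set()
    for ch in text:
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        found |= out[state]
    return found

print(sorted(find_all("SEAFOOD", ["FOO", "FOOD", "OD"])))  # ['FOO', 'FOOD', 'OD']
```

Here "FOO" is reported even though "FOOD" also matches, which is exactly the behaviour missing from re2c’s generated matcher.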

I requested support for this in re2c, but got a brush-off, unfortunately. So work continues…

In other news, that food poisoning thing I had back at the end of June has lingered on. It’s now pretty clear that it isn’t food poisoning or a stomach bug… but I still have no idea what it actually is. No fun :(

“Stretch-to-fit Textareas” Greasemonkey User Script

Here’s another quick-hack Greasemonkey user script I wrote recently.

Stretch-to-fit Textareas is a user script which improves the usability of editable textareas; it causes them to “stretch” vertically to fit their contents, as you type. This behaviour was inspired by that of textareas in FogBugz.

It can be inhibited by turning off the small checkbox to the right of each textarea.

Update: it’s worth noting that this is different from the Resizeable Textareas Firefox extension. Whereas the latter allows the user to resize the textareas by hand, this user script does that action automatically, based on the contents of the field; no manual resize-handle-searching and dragging is required. On the other hand, this user script will only stretch textareas vertically, whereas the extension allows them to be dragged in both dimensions. In fact, the two are complementary — I’m running both, and I suggest you do too ;)

Update 2: here’s a Firefox extension version — Greasemonkey not required!

LKML discusses anti-spam moderation

LKML: Alexey Zaytsev: Time to forbid non-subscribers from posting to the list? — the linux-kernel mailing list discusses list moderation as an anti-spam strategy.

Spam really sucks; anything that deals with email now has to include some set of anti-spam features because of it. The LKML has important features that militate against simply closing the list partially, such as being a point where bug reports are submitted — so this is a thorny issue for them.

For what it’s worth, I have written a system to further automate moderation beyond the basic features provided by Mailman and ezmlm. http://taint.org/wk/ModerateList describes this in detail; in essence, it’s a specialised mail user agent designed to moderate lists quickly and efficiently, with an outboard spam filter built in (SpamAssassin, of course, via its perl API).

I moderate about a thousand messages per week using this (last time I checked), and it takes about 30 seconds per day to do so, so it’s pretty efficient.

In other news: wow, talking to a good accountant can really mitigate complicated tax issues… phew.

Wedding Poems

OK — looks like I’ve found the perfect poem for our wedding ceremony; allow me to present “Gravity of Love”:

One day, one day I asked myself
What is the right number or symbol?
What is the perfect equation?
What truly is LOGIC?
And who decides right reasoning?

In cause of no answer to my quest,
I traveled through the physical and metaphysical,
I traveled through the delusional and mystical
And at last back to the physical.

I made most important invention of my life career
That it’s only in the mysterious equation; logic of love
Any logical; mystical and psychological reasoning can be found.
It’s you in me I only believe that’s true and real

All I can say is — Wow.

Underwhelmed by ScreenClick

For the past few years, I’ve been a very happy user of Netflix, the innovative web site which lets US residents receive DVDs by post for a flat monthly fee. When I got back to Dublin, I was very happy to see that there was a local equivalent, in the form of ScreenClick — so I signed up.

However, I’ve become increasingly disillusioned with their service, for the same reasons as Adrian Weckler writes about here:

Turnaround time: this varies wildly, and can take nearly a week to turn around a DVD from dropping it in the postbox to receiving the next one. Netflix was reliably two days for me, out in suburban Orange County, California; even this Kansas blogger noted that the longest they’d waited was 4 days.

This may seem to be an externality for Screenclick — but really, it shouldn’t be. Their business is built on the postal service, and they have to have decent results for it to work.

The ‘wishlist’ model: Netflix uses a queue, operating on a first-in, first-out model, while Screenclick uses something they call a ‘wishlist’, where the DVDs are delivered based both on position in the list and availability — in other words, you can find you’ve been delivered the DVD at number 10 in your list, instead of whatever’s at the top.

Again, superficially a minor point. However, one important factor is that these services are bought by households, not by individuals. Chez jm, that means that we operated a pretty strict alternating system in our Netflix queue — one movie for me, one movie for the lovely C, repeat. This is now thoroughly scuppered with a random ‘lucky dip’ system. On top of that, forget about watching a serial in order. The end result is a mess.

The website: it’s atrocious, a hodge-podge of ads for third-party sites, press coverage of Screenclick, more ads for Screenclick (hey, I’m already a customer!), and news clippings I couldn’t care less about — with finally a few tiny sidebar boxes containing the things I want (login, search box and wishlist). My impression: it’s designed to sell the company to investors and advertisers, not for customer use.

On top of that, it’s all squished into a tiny window — Irish web designers need to buy bigger screens! That late-’90s Jakob Nielsen thing about users not knowing how to scroll? They’ve learned by now.

That’s not even talking about the awful Javascript that’s used to edit the wishlist ordering, where little buttons need to be clicked repetitively, one by one, to reorder the list. Surely someone took a look around at other sites first — Amazon perhaps — to see how other sites do it?

Anyway, on this count, I sent in a mail containing a batch of bug reports and unsolicited opinions, and got no reply. ;)

Less bang-for-buck: pretty simple. Netflix: 3 movies at a time, more movies in the collection, $17.99 per month; Screenclick, 2 movies at a time, EUR 19.99 ($25.56, $10 more expensive than the equivalent Netflix service) per month. Surprisingly, this is actually a minor issue compared to the others, though, since it’s made plain from the outset.

These may seem to be minor points, but when selling a disposable-income service to consumers, the difference between an essential leisure-time service and a waste of pocket money is a very fine line. Looks like Adrian eventually cancelled. I’m not at that point yet, but it’s heading that way…

What Jeff Killed

What Jeff Killed is a blog from Shadow Hills, CA, documenting the murderous antics of Jeff, a large ginger tomcat:

we provide Jeff with food and water; however, this does little to lessen his killer instinct. To humans, Jeff is an exceptionally good-tempered and friendly cat; to rodents and other small animals, he is death itself. It could be that Jeff likes to bring us gifts to repay our hospitality. Perhaps he is simply a hardwired killing machine. All we know for certain is that he hunts down a wide variety of small animals and disembowels, decapitates, and dines on them. Often.

This was passed on by the lovely C, who noted ‘number of kills is about the same, cat for cat’ — indeed, Bubba, our cat, certainly had a similar career in Irvine, CA. However, I notice that as yet, there are no cases where Jeff has left the entrails and decapitated head of a rabbit lying up against the sandals of the neighbour’s 6 year old daughter… that was fun.

Kick.ie

I just noticed an interesting new site on the Irish web — kick.ie.

It’s closely based on the model of Digg, with a community of contributors who post new stories, comment, and “kick” stories they like so that those stories are given top billing. The interesting twist is that it’s not as general as Digg — instead of having a very broad “news” site, covering all bases, there is a smaller set of topic-focused “kick” sites. Using this model for the relatively-small Irish weblogging scene works pretty well, I think.

It’s nicely done — fast, clean, and with nifty features like RSS feeds throughout, and reader-contributed tagging. Nice work by Gavin Joyce!

Well worth subscribing to.

(Also, it’s cool to see that one of my posts discussing Irish road deaths managed to amass 7 ‘kicks’ a couple of weeks back ;)

Year 2038 Bug Strikes Early

Noted previously in the link-blog — here are more details on the first known instance of the Year 2038 UNIX epoch rollover bug, where AOLserver installs hung due to a 32-year timeout value hitting the end-of-epoch.

It appears that it was caused by an ‘official workaround’ for an Oracle driver bug, where an infinite timeout was desired. Instead of implementing true support for infinite timeouts, the developer just used a very large value — one BILLION seconds, Dr. Evil-style. Unfortunately, this led to the overflow issue.
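To make the arithmetic concrete, here's a minimal Python sketch of the failure, assuming a signed 32-bit time_t and an absolute deadline computed as "now + timeout". The names INT32_MAX, TIMEOUT_SECS and deadline_overflows are mine, purely for illustration; none of this is AOLserver's actual code.

```python
from datetime import datetime, timezone

# Assumption: time_t is a signed 32-bit integer, as on the affected systems.
INT32_MAX = 2**31 - 1
# The "infinite" timeout from the workaround: one billion seconds.
TIMEOUT_SECS = 1_000_000_000


def deadline_overflows(now_epoch: int, timeout: int = TIMEOUT_SECS) -> bool:
    """True if now + timeout no longer fits in a signed 32-bit time_t."""
    return now_epoch + timeout > INT32_MAX


# First second at which a one-billion-second deadline stops fitting.
first_bad = INT32_MAX - TIMEOUT_SECS + 1
first_bad_utc = datetime.fromtimestamp(first_bad, tz=timezone.utc)

if __name__ == "__main__":
    print(first_bad, first_bad_utc.isoformat())
```

The computed instant is 1147483648 seconds since the epoch, i.e. 2006-05-13 01:27:28 UTC, which lines up with the timestamps people report in the thread.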

Here’s some key snippets from the mailing list thread:

Bas Scheffers:

On 17 May 2006, at 21:34, Dossy Shiobara wrote:

Dave Siktberg seems to have narrowed it down to 2006-05-12 21:25.

In what timezone? It sounds like that could equate to “Sat May 13 02:27:28 BST 2006”, or 1147483648 seconds since epoch, which makes it exactly 1,000,000,000 seconds until expiry of 32 bit time. Coincidence? Seems too strange as to a computer that is not a nice round number.

‘Jesus’ Jeff Rogers:

I had problems starting at the exact same time but on Solaris, where they manifested as a EINVAL return from pthread_cond_timedwait. After a day of tracing the problem with debug builds and working with my sysadmin to track what changed (of course, nothing had) I came to the same 1 billion second issue.

Which coincidentally is the expiry time (MaxOpen and MaxIdle) set on my database connections. My system is ACS-derived, so I wouldn’t be surprised if these database settings are common in other ACS-derived systems.

The only bug is that Ns_CondTimedWait doesn’t do any wraparound on the time parameter. All the same, I’ve been enjoying telling people that I hit my first y2038 bug.

Andrew Piskorski:

For those interested in ancient trivia, I think it was TWO bugs, one in the Oracle driver and/or OCI libraries (most likely OCI), and one in AOLserver. I think the workaround dates from before I ever used AOLserver, but I have these old comments in my AOLserver config file:

MaxIdle and MaxOpen:

Setting these to 1000000000 is a historical bug workaround. Could now probably set this to some normal number, or set to 0 to disable entirely. E.g., in this thread Rob Mayoff says:

http://www.arsdigita.com/bboard/q-and-a-fetch-msg?msg%5fid=000Ibq

It is a bug workaround. Many Linux users (including me) saw that when AOLserver tried to close a database connection, it would hang in the Oracle driver. So people started setting MaxOpen and MaxIdle to a very large number to keep connections from closing. You can also set them to zero, but at the time the bug was discovered, AOLserver had a bug that prevented you from setting them to zero.

I believe the bug was also seen, very rarely, on Solaris.

Curtis Galloway managed to get Oracle to investigate. They suggested two workarounds: use IPC or TCP to connect (which is what I do on my system), or set bequeath_detach=yes in sqlnet.ora.

2002/01/10 14:22 EST

Uselessly, the arsdigita thread URL is now a victim of needless website reorganisation, and redirects to their front page. Still, I think that’s enough info.

This is certainly going to be one of the first widely-recorded Y2038 rollover bugs, I think…

A Little Downtime

Quick note: taint.org, and the other sites on the same host, will be down for somewhere between 30 minutes and an hour tomorrow, at 1000 UTC, as the host moves to a new datacenter (and a new IP address).

Handily, the host will also get a hefty RAM upgrade, which should improve matters the next time we get slashdotted ;)

(If you need to get in touch during the downtime, jmason at gmail dot com will be the best bet.)

Update: this is now complete.

‘Small Engine Repair’

Last Friday, I visited the Galway Film Fleadh to see the Irish premiere of a new feature-length movie called Small Engine Repair, which was directed by a mate of mine called Niall Heery.

I loved it — funny, extremely black comedy, reminded me a lot of The Deer Hunter in visual style, but unmistakably Irish at the same time. (Blog movie reviews seem to be out of favour right now, so I’ll leave it at that.)

Here’s hoping it picks up wider distribution very soon — it deserves to be big, I think. Nice one, Niall! Happily, the voters of the Fleadh agreed — it went on to win the Best First Feature award.

Actually, it’s been a good year for friends and family at the Fleadh — I note that my cousin, Eoin Ryan, picked up first prize for Best Irish Short Animation with his excellent short, Demon. Cool!

Road Deaths in Ireland

Road deaths are a hot topic in Ireland. They’re actually lower, per capita, than rates in other countries, but are given plenty of column inches and headlines here, and have become a government priority as a result.

Here’s the latest headline:

[Gay Byrne, head of the Road Safety Authority] claimed young people were ignoring road safety campaigns and that all he could do was to warn people to reduce speed and not to drink and drive. “I don’t know what else we can do. We have done all the horror ads, but there are obviously a great number of people who don’t look at television, listen to radio, or read newspapers and don’t get the message,” he said.

Ads. Great. Well, one thing that could be done is fixing the unsafe roads, and building decent ones; Irish country roads, while picturesque, are unable to deal with the levels of traffic they’re now facing. It’s time to apply modern safety standards, instead of considering a 2-lane boreen to be adequate.

There’s been a bit of improvement here; the roads from Dublin to Sligo, and from Dublin to Dundalk, for example, are both now fantastic, well-designed roads, and safe as a result. But try to get from Sligo to anywhere that isn’t Dublin, and you’re right back on those boreens again — with maniacs overtaking on blind corners into oncoming traffic and so on.

But here’s the real reason for the post. I have to reserve some special scorn for this idiot:

Hotelier Declan Corbett, who employed both siblings, yesterday called on Mr Byrne to resign following his comments.

“I am after coming down from the Frewen family house and if Gay Byrne or Michael McDowell were after witnessing what I saw he wouldn’t be coming out this morning with this ranting and blaming the young people of Ireland,” he said. […]

“Gay Byrne was given this job and he shouldn’t have been given this job. It’s typical Dublin 4 job-for-the-boys. A job like this should be given to someone in rural Ireland – somebody like Sean Og O’hAilpin that young people look up to.”

Sean Og O’hAilpin, eh? As Paul Moloney noted — that’d be the same Sean Og who ended his Gaelic football career when he overtook a car on a bend, at speed, crashing head-on into oncoming traffic? A great example, indeed.

I think that might be the problem.

A Released Perl With Trie-based Regexps!

Good news! From the Perl 5.9.2 ‘perl592delta’ change log:

The regexp engine now implements the trie optimization : it’s able to factorize common prefixes and suffixes in regular expressions. A new special variable, ${^RE_TRIE_MAXBUF}, has been added to fine-tune this optimization.

In other words, the trie-optimization patch contributed by demerphq back in March 2005 is now in a released build of Perl. Yay!

Here’s a writeup of what it does:

A trie is a way of storing keys in a tree structure where the branching logic is determined by the value of the digits of the key. Ie: if we have “car”, “cart”, “carp”, “call”, “cull” and “cars” we can build a trie like this:

        c + a + r + t
          |   |   |
          |   |   + p
          |   |   |
          |   |   + s
          |   | 
          |   + l - l
          |   
          + u - l - l

What the patch does is make /a | list | of | words/ into a trie that matches those words. This means that we can efficiently tell if any of the words are at a given location in a string by simply walking the string and trie at the same time. In many cases we can rule out the entire list by looking at only one character of the input. The current way perl handles this would require looking at N chars where N is the number of words involved. (BTW: That’s the beauty of a trie, its lookup time does not depend on the number of words it stores but rather on the key length of the word being looked up.)
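As a rough illustration of "walking the string and trie at the same time", here's a small Python sketch using nested dicts. It is not how the Perl regexp engine implements it; build_trie, match_at and the END marker are hypothetical names of my own, and the sketch returns the longest stored word at a position, which is a simplification and not necessarily what Perl's alternation would return.

```python
# Sentinel key marking "a stored word ends here"; it can never equal a character.
END = object()


def build_trie(words):
    """Build a nested-dict trie: each key is one character of a word."""
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = True
    return root


def match_at(trie, text, pos):
    """Return the longest stored word starting at text[pos], or None."""
    node = trie
    longest = None
    i = pos
    while True:
        if END in node:
            longest = text[pos:i]
        # Stop as soon as the next character has no branch in the trie.
        if i >= len(text) or text[i] not in node:
            break
        node = node[text[i]]
        i += 1
    return longest


WORDS = ["car", "cart", "carp", "call", "cull", "cars"]
TRIE = build_trie(WORDS)

if __name__ == "__main__":
    print(match_at(TRIE, "a carpet", 2))
    print(match_at(TRIE, "xyz", 0))
```

Note that the "xyz" lookup gives up after inspecting a single character, however many words are stored, which is the point the writeup is making.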

SpamAssassin is, of course, both (a) very regular-expression-intensive and (b) searches a single block of text for a large number of independent patterns in parallel. I’d love to see someone coming up with a patch to SpamAssassin that uses trie-compatible regexps when the perl version is >= 5.9.2, and gets increased performance that way. hint ;)

BTW, the Regexp::Trie module on CPAN is related — in that it, similar to Regexp::Optimizer, Regex::PreSuf, or Regexp::Assemble, will compile a list of words or regular expressions into a super-efficient trie-style regexp. However, without the trie patch to the regexp engine itself, this would be a minor efficiency tweak at best; although having said that, Regexp::Assemble’s POD notes:

You should realise that large numbers of alternations are processed in perl’s regular expression engine in O(n) time, not O(1). If you are still having performance problems, you should look at using a trie. Note that Perl’s own regular expression engine will implement trie optimisations in perl 5.10 (they are already available in perl 5.9.3 if you want to try them out). Regexp::Assemble will do the right thing when it knows it’s running on a trie’d perl. (At least in some version after this one).
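For a feel of what "compile a list of words into a trie-style regexp" means, here's a hedged Python sketch of the general idea: factor out common prefixes, then emit one nested pattern. This is my own toy version, not the algorithm used by Regexp::Trie or Regexp::Assemble; trie_regex and _to_pattern are made-up names, and it handles literal words only.

```python
import re


def trie_regex(words):
    """Build a prefix-factored regex source string from literal words."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}  # empty-string key marks "a word may end here"
    return _to_pattern(trie)


def _to_pattern(node):
    # A node holding only the end marker contributes nothing further.
    if "" in node and len(node) == 1:
        return ""
    optional = "" in node
    alts = [re.escape(ch) + _to_pattern(node[ch])
            for ch in sorted(k for k in node if k != "")]
    # A single mandatory branch needs no grouping.
    if len(alts) == 1 and not optional:
        return alts[0]
    body = "(?:" + "|".join(alts) + ")"
    return body + "?" if optional else body


PATTERN = trie_regex(["car", "cart", "carp", "call", "cull", "cars"])

if __name__ == "__main__":
    print(PATTERN)
```

For the six example words this produces c(?:a(?:ll|r(?:p|s|t)?)|ull), so a non-matching input can be rejected at the first character instead of being tried against each word in turn.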

(PS: interestingly, demerphq mentioned back in March 2005 that he was working on Aho-Corasick matching next. A-C is a great parallel-matching algorithm, and I would imagine it would increase performance yet more. I wonder what happened to that…)

Linksys NSLU2 Contemplation

These days, I shouldn’t have time for after-hours hobby projects; I should be organising weddings and so on. But it’s a compulsion. ;)

As a result, here’s some notes I’ve been keeping on building a home NAS (network-attached storage) server, using the nifty little Linksys NSLU2: http://taint.org/wk/BuildingNasServer

Anyone done this? Care to leave a comment noting the results? I’m curious.

Smithfield’s Decay

I live in Dublin 7, on the north side of Dublin. Historically, the north side has been run-down and under-developed, always losing out to the better-maintained, and better-funded, south side.

A few years ago, though, it looked like this was changing; the Spire in O’Connell St. was erected, new bars and shops opened, and the Luas line was installed. One site, Smithfield Square in Dublin 7, was radically overhauled; its derelict buildings were renovated or knocked down, new construction was going up, and fantastic architecture was being put in place. The future was looking bright.

That was back around 2000/2001; in fact, I remember walking past the avenue of braziers on Millennium night. Fast forward — I’ve been back in Dublin 6 months now, and as far as I can tell, all that has petered out, while I was away. This Frank McDonald article in the Irish Times sums it up perfectly:

The cafes, bars and restaurants that were meant to be part of [Smithfield] are nowhere to be seen. The promoters had promised residents “an entire lifestyle on your doorstep, extended by the possibilities of the city and beyond”. There was to be an eclectic mix of restaurants and stylish bars – “a unique mix of offerings, ranging from food to culture to entertainment and leisure in a family-friendly development”, according to Paddy Kelly.

In November 2003, his son Chris said: “We are hoping it will emulate the New York example where everything – from your launderette, hairdresser and your masseuse – is only a block away, and that people will live, work and socialise within the same area”. On another occasion, London’s Covent Garden was cited as the urban model.

Incredibly, the lower end of Smithfield – through which Luas runs – remains unfinished six years after the rest of it was re-paved in an award-winning scheme by McGarry Ni Eanaigh Architects. It also has a redundant stone-clad structure, which served briefly as a plug-in point for open-air concerts.

The only real entertainment available in the area is the annual Christmas ice rink or the seriously indigenous and pre-existing horse fair, still being held on the first Sunday of every month.

Otherwise, the plaza attracts an assortment of winos, or juvenile offenders on their way to the Children’s Court, handcuffed to prison warders.

The little stage set up for open-air concerts is now covered in graffiti, and hosts a solid crew of junkies and winos; the braziers are no longer lit; the square boasts a permanent encrustation of construction fencing. The fruit and veg market that used to be held in one of the buildings has been bought out and moved on to somewhere on the outskirts of town, replaced by “Fresh”, which — while it sells the odd bit of interesting food, like the nice Bretzel bakery bread — is really just an upscale Spar. Even the local Indian takeaway has dropped in quality, and is now shipping out generic dishes that aren’t even made with Indian spices.

To be quite honest, Smithfield — and much of the north side — gives the impression it’s been abandoned again, after only one or two years of short-term investment, and no long-term thinking.

What happened?

(PS: it’s not over for Dublin 7, though — about a half-mile from Smithfield, a flashy new restaurant is set to open this weekend. But who’s to say that Capel St. won’t find itself similarly forgotten in a year or two?)

Blogorrah

Blurred Keys: Blogorrah.com – the start of empire building with ‘very few overheads’. Blurred Keys, “an Irish media blog”, brings the revelation that Blogorrah “copies” Gawker.com.

Honestly, though, this is blatantly obvious — and I’d consider it unfair to call this “copying”. It’s simply taking a successful format and adapting it to the local market, and doing so very well indeed if you ask me.

Blogorrah is a hilarious read. If you’re Irish and you’re not subscribed, you’re really missing out… it’s the funniest thing on the Irish web these days.

Daily Links Posting Off Again

I’ve turned this off again; even though it provides a nice way for people to comment and discuss link posts (which del.icio.us doesn’t provide, unfortunately), it does tend to break up the flow of the “main” article part of the weblog, and isn’t entirely popular I think.

If you’re interested in the links, your best bet is to read either the main page itself in your browser, where the link-blog appears over there —> , or one of these RSS feeds:

links for 2006-07-04

links for 2006-07-03

links for 2006-07-02

Ecch – that must have been poisonous!

Since consuming a misjudged sossie at a BBQ last Saturday, I’ve been suffering from a stomach bug, causing nausea, sweating and the occasional vomit (never fun). On top of this, I spent Monday to Wednesday in Serbia on a work trip.

The result — I’ve managed to miss the entirety of ApacheCon EU 2006 in Dublin. I considered dropping down to catch the end of it this morning, but had to abort the attempt due to a bout of in-transit nausea.

All in all, a pretty miserable week. :(

Update: here’s something vaguely uplifting — a cover of Europe’s ‘Final Countdown’ in Khmer.

Update 2: wow, that little stomach bug has been wreaking havoc — over the weekend 3 more people in our social group were laid low. Sorry all…