Wisdom Teeth — Complete!

Published November 21, 2005

On Friday, I got my lower-left wisdom tooth extracted. That's the last one that should cause any trouble; there's only one remaining, and it's fully out so shouldn't act up. After a few years of on-again-off-again twinges, and lots of irresponsible putting-off of surgery, I've finally taken care of it.

The downside: I'm totally zonked on painkillers, so I won't be doing much for the next few days apart from what's required for day-to-day day-job stuff.

Urban Dead HUD; added Inventory Sorting

Published November 14, 2005

I've updated the Urban Dead HUD Greasemonkey userscript; it now offers inventory sorting, inspired by Ikko's userscript (albeit a little different in implementation). Here's a screenshot:

Right now, UD is reasonably interesting -- our team of plucky survivors have been helping out with the defence of Caiger Mall, a major mall towards the north-west of the city. We've repulsed the Church of the Resurrection's attempts to wipe us out, but that seems to have made us quite a juicy target; there are now no fewer than three separate Zombie groups ganging up on us. For now, we're still holding out.

Mobile phone repair at Karol Bagh Market

Published November 11, 2005

I love these pictures:

I link-blogged that article ages ago, but I keep thinking of it, so it's worth a proper post in its own right, to expand on that.

These guys work at an Indian mobile phone repair stall in Karol Bagh Market, in Delhi. The blog entry notes:

As in China, many of the mobile phone shops and street kiosks offer mobile phone repair service. Many of these guys can strip and rebuild a mobile phone in minutes. ... a lot of the hyperbole surrounding western hacker culture makes me smile compared to what these guys are doing day in day out.

Also, a commenter notes: 'in india, for about 1$, you can convert a CDMA phone to GSM !! also, they can unlock phones and do a veriety of hacks for little money.'

There are so many lessons I'm getting from it:

I've had a shoe resoled in 5 minutes for next to nothing at a stall not too different from that -- but this is a mobile phone. It's amazing to think of that level of hardware hacking taking place every day at a back-street market stall.
Those phones were doubtless planned, as a product, with a 'ship back to manufacturer' support plan. That clearly isn't going to fly without that developed-world luxury, Fedex. So this is the developing-world street finding its own uses for things, and working around the dependencies on systems that are optimised for the developed world.
It's the flip-side of Joshua Ellis' grim meathook future, where we're not facing down the barrel of a New-Orleans-style descent into barbarity if the power suddenly cuts out; tech can go on. It may be a little chunkier, though, and with more duct tape, but hey.
It's also a beautiful demonstration of how those of us in the developed world who assume that developing-worlders cannot find a use for high tech, are talking shit. (cf. Ethan Zuckerman as a good example of someone who gets this, more than almost anyone else I can think of.)

I think this is one of the most important lessons I learned while travelling through India and SE Asia a few years back -- the developing world is using high tech, and it's not using it in the same ways we do -- or even the ways we anticipated, and we have plenty to learn from them too.

Found at Jan Chipchase's site, which is full of great contemplation on this stuff. (The story on Seoul's selca culture is nuts, too -- it's like Flickr^1000.)

(PS: I have a wisdom tooth extraction scheduled for next Friday... wish me luck. That's another thing you don't want to happen in the developing world, although I daresay it'd rock in Bangkok!)

(Update: clarification -- my cite of Ethan Z was meant as a compliment ;)

IFSO Seminar In Dublin

Published November 4, 2005

Passing this on for readers in Ireland -- this sounds like an interesting event. From the FSFE-IE mailing list:

On the morning of Friday November 18th, IFSO is organising an event hosted by MEP Proinsias De Rossa about preventing software patents in the EU. Topics covered will be:

An analysis of the software patent directive;
a discussion of Free Software and computer security;
an introduction to IFSO/FSFE and their work;
the future of legislative obstacles to the development and distribution of software.

The event will be held in the European Parliament Office in Ireland, and spaces are limited. Participants are therefore asked to register their intent to attend. See here for more details.

Producing Open Source Software

Published November 2, 2005

Plug: Producing Open Source Software, a new book by Karl Fogel (of the Subversion and CVS projects), readable online as HTML or in ground-up wood formats.

It's got a whole load of solid-gold good advice on open-source development best practices, and even includes a section on dealing with the dreaded Reply-To munging issue.

Looks excellent -- this is definitely one to read.

Urban Dead HUD

Published October 29, 2005

I've been playing a bit of Urban Dead recently. Urban Dead is a very low-key, web-based MMORPG -- you play a 3-minute turn once every 24 hours. It needs some rebalancing and some new features, especially given the organised nature of some of the bigger marauding zombie hordes, but I'm still finding it fun.

To scratch a couple of itches, I've written a Greasemonkey user script for UD called the Urban Dead HUD. It adds several nifty features to the user interface:

keyboard accelerator access keys for the action buttons, and your inventory -- very handy when you're attacking an enemy repeatedly;
an on-page long-distance map of the surrounding squares;
a distance tracker, which tracks the distances to "important" locations for you.

There are screenshots on the download page, so you can see what I'm talking about.
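For the curious, the distance tracker idea boils down to very little code. Here's a minimal sketch in Python (not the userscript's actual JavaScript); it assumes the city is a square grid where a diagonal step costs the same as a straight one, and the coordinates and landmark names are made up for illustration:

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]  # (x, y) position on the city grid


def grid_distance(a: Coord, b: Coord) -> int:
    """Moves needed between two squares, assuming diagonal steps cost
    the same as straight ones (Chebyshev distance)."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))


def track_distances(here: Coord, landmarks: Dict[str, Coord]) -> Dict[str, int]:
    """Distance from the current square to each named 'important' location."""
    return {name: grid_distance(here, pos) for name, pos in landmarks.items()}


if __name__ == "__main__":
    # Hypothetical coordinates, purely for illustration.
    print(track_distances((10, 10), {"Mall": (13, 8), "Safehouse": (10, 15)}))
```

If diagonal moves turned out to cost more, swapping in a different distance function would be the only change needed.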

Greasemonkey is a fantastic tool, as is Mark Pilgrim's Dive Into Greasemonkey, which has repeatedly turned out to be an excellent, well-written reference while hacking this. Thanks guys!

trueColor() bug in GD::Graph

Published October 28, 2005

Hacking on a new rule-QA subsystem for SpamAssassin, I came across this bug in GD::Graph. If:

you are drawing a graph using GD::Graph;
outputting in PNG or GIF format;
and the 'box' area -- the margins outside the graph -- keeps coming up as black, instead of white as you've specified;

check your code for calls to GD::Image->trueColor(1), or for a third argument of 1 being passed to the GD::Image->new() constructor. It appears that there's a bug in the current version of GD (or GD::Graph) where graphing to a true-colour buffer is concerned, in that the 'box' area always comes out in black.

(Seen in versions: perl 5.8.7, GD 2.23, GD::Graph 1.43 on Linux ix86; perl 5.8.6, GD 2.28, GD::Graph 1.43 on Solaris 5.10.)

False Positive ‘Reports’ != FP Measurement

Published October 26, 2005

John Graham-Cumming writes an excellent monthly newsletter on anti-spam, concentrating on technical aspects of detecting and filtering spam. Me, I have a habit of sending follow-up emails in response ;)

This month, it was this comment, from a techie at another software company making anti-spam products:

When I look at the stats produced on our spam traps, which get millions of messages per day from 11 countries all over the world, I see our spam catch rate being consistently over 98% and over 99% most of the time. We also don't get more than 1 or 2 false positive reports from our customers per week, which can give an impression of our FP rate, considering the number of mailboxes we protect.

My response:

'Worth noting that a "false positive report from our customer" is NOT the same thing as a "false positive" (although in fairness, [the sender] does note only that it will "give an impression" of their FP rate).

This is something that I've seen increasingly in the commercial anti-spam world -- attempting to measure false positive rates from what gets reported "upstream" via the support channels.

In reality, the false positives are still happening -- it's just that there are obstacles between the end-user noticing them, and the FP report arriving on a developer's desk; changes to the organisational structure, surly tech support staff, or even whether the user was too busy to send that report, will affect whether the FP is counted.

Many FPs will go uncounted as a result, so IMO it is not a valid approach to measurement.'

I've been saying this a lot in private circles recently, so in my opinion that's a good reason to post it here...
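To put some rough numbers on that argument, here's a small Python sketch. It treats each obstacle between a false positive and a developer's desk as an independent drop-off stage; the stage names and all the probabilities are invented for illustration, not measured from anyone's product:

```python
from typing import Iterable


def expected_reports(true_fps: float, stage_probabilities: Iterable[float]) -> float:
    """Expected number of FP reports that reach a developer, given the
    probability of surviving each drop-off stage along the way."""
    expected = float(true_fps)
    for p in stage_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("each stage probability must be between 0 and 1")
        expected *= p
    return expected


if __name__ == "__main__":
    # Hypothetical stages: user notices the FP, user bothers to report it,
    # support staff pass the report upstream.
    stages = [0.5, 0.1, 0.4]
    print(expected_reports(1000, stages))
```

With those made-up figures, a thousand real false positives shrink to about twenty reports. The catch is that you can't run the sum backwards without knowing the drop-off rates, and those are exactly what nobody measures.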

Wired on the Motorola ROKR iTunes phone

Published October 26, 2005

Via Cory at Boing Boing, here's a great Wired post-mortem on how all the corporate vested interests (including Apple!) turned a nice concept for a new, music-playing mobile phone, into a useless, DRM-hogtied, designed-by-committee turd.

That's worth a read, in itself. However, what really blew my mind was this:

Anssi Vanjoki, executive vice president of Nokia and head of its multimedia group, has bad news for the [music] labels. ... He pushes a couple of buttons on the [phone's] keypad. Up pops Symella, a new peer-to-peer downloading program from Hungary. As the name suggests, Symella is a Symbian application that runs on Gnutella, the P2P network that hosts desktop file-sharing apps like BearShare and Limewire. It was created earlier this year by two students at a Budapest engineering school that for four years has been exploring mobile P2P in conjunction with a local Nokia research center.

Symella doesn't come installed on the N91; Vanjoki downloaded it from the university Web site. "Now I am connected to a number of peers," he continues, "and I can just go and search for music or any other files. If I find some music I like and it's 5 megabytes and I want to download it - the carriers will love this. It will give them a lot of traffic."

I had no idea the platform was that open, at this stage. It'll be interesting to see what happens next...

Ouch!

Published October 26, 2005

my new ipod.jpg
Originally uploaded by jmason.

Yep, they really are that easy to scratch, it seems.

UK ATM fraud in the 1990s

Published October 24, 2005

The Register: How ATM fraud nearly brought down British banking. This story is mind-boggling; it claims that UK ATM security had two major issues that have been kept secret since the 1990s:

An insecure data format used for the data on the magnetic stripes in one bank's cards;
Another bank's computing department "going rogue", "cracking PINs and taking money from customers' accounts with abandon" as the story puts it. Yikes.

The latter problem is scary, but in my opinion the former problem is more interesting from a computer security point of view.

This is a classic example of bad data format design, as it left the PIN and the account details individually rewritable -- in other words, an attacker could (and did) change one while keeping the other intact.
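One common way to avoid that class of problem is to bind the fields together with a keyed checksum, so that changing either one on its own is detectable. Here's a minimal sketch in Python; it is not the format any real bank used, and the field names, the key handling and the choice of HMAC-SHA256 are all my own assumptions:

```python
import hashlib
import hmac


def seal(key: bytes, account: str, pin_data: str) -> str:
    """Compute a tag covering BOTH fields; length prefixes stop one field's
    bytes from being reinterpreted as part of the other."""
    message = f"{len(account)}:{account}|{len(pin_data)}:{pin_data}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, account: str, pin_data: str, tag: str) -> bool:
    """True only if neither field has changed since the tag was computed."""
    return hmac.compare_digest(seal(key, account, pin_data), tag)


if __name__ == "__main__":
    key = b"issuer-secret-key"  # hypothetical; would live in the bank's systems, not on the card
    tag = seal(key, "12345678", "opaque-pin-data")
    print(verify(key, "12345678", "opaque-pin-data", tag))  # untouched card
    print(verify(key, "87654321", "opaque-pin-data", tag))  # account swapped
```

The point is only that the check covers the pair rather than each field separately; a card with a swapped account number no longer verifies.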

This British Computer Society abstract provides more details on the who, how and where:

... it was revealed that UKP 130,000 had been stolen from Abbey National cardholders during 1994 and 1995 with counterfeit cards. Andrew Stone, a bank security consultant who had been advising Which?, the magazine of the Consumers' Association, was jailed for five and a half years for the theft. This fraud involved spying on Abbey customers as they used their cards in automated teller machines (ATMs) or cash dispensers... [Stone] recorded card details and personal identification numbers (PINs) using powerful video cameras. The details were then encoded on the magnetic strips of other cards.

Finally, another quote from the Reg story:

why is he telling this explosive story now? Because chip and PIN has been deployed across the UK ATM network. "The vulnerability in the UK ATM network was still there to be exploited -- if someone had chanced upon it."

I wonder if other banking systems worldwide are still vulnerable, however? Did any other banks elsewhere license the vulnerable systems from UK banks, without knowing about these vulnerabilities? How long did it take for them to be fixed, if they were fixed?

Avian Flu, Health vs. IP Protection

Published October 22, 2005

Over at O'Reilly Radar, a question came up as to whether Roche's patent on Tamiflu should be respected if, in the event of a pandemic, people were dying on a large scale due to Roche's inability to produce Tamiflu in sufficient quantities.

James Love of cptech.org recently pointed out that the WTO made an exception for a situation like this, allowing importation of medicines from foreign countries in violation of local patent licenses in the case of an emergency, in a 30 August 2003 decision:

Your country would benefit from importing generic medicines produced under a compulsory license, in order to build up adequate stockpiles or to obtain needed medicines in the event of a crisis.

However, many developed-world countries have explicitly made a commitment never to use this limited TRIPS waiver, namely the following:

Australia, Austria, Belgium, Canada, Denmark, Finland, France, Germany, Greece, Iceland, Ireland, Italy, Japan, Luxembourg, Netherlands, New Zealand, Norway, Portugal, Spain, Sweden, Switzerland, United Kingdom and the US.

Another 10 countries about to join the EU said they would only use the system to import in national emergencies or other circumstances of extreme urgency, and would not import once they had joined the EU: Czech Republic, Cyprus, Estonia, Hungary, Latvia, Lithuania, Malta, Poland, Slovak Republic and Slovenia.

So there you have it; the trade representatives for many developed-world countries took some kind of 'strong IP' high moral stand, and gave up this ability. I'll bet national health authorities are, right now, wandering government halls around the world, looking for trade representative asses to kick...

‘Internet Stamps’: ‘Sender Pays’ Is Back From The Dead

Published October 22, 2005

Jeremy Zawodny mentions that Tim Bray has proposed something he calls 'Internet Stamps' to solve the blog-spam problem; here's Tim's description of how it works:

An Internet Stamp is an assertion, signed by a Post Office, that some chunk of text was issued by someone who paid for the stamp. At least one major Post Office will be required by government statute to sell stamps to anyone in the world for either US$0.01 or EUR 0.01, and no stamp-selling organization will be recognized which sells stamps for less than this amount. For this to work, the number of stamp-selling organizations needs to be small and the organizations stable; another reason why Post Offices are plausible candidates.

It works like this: if you want to buy stamps, you sign up for an account with your Post Office; it works like paper stamps, you buy a bunch at a time in advance, in small amounts like $20 or EUR 10. Then the Post Office offers a Web Service where you connect to a port, authenticate yourself and send along some text; the Post Office decrements your account and sends back the stamp. There are a variety of digesting/signing/PKI techniques that could be applied to implement the stamps; a standard is required but should be easy.
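To make the flow in that description concrete, here's a toy sketch in Python of a Post Office that sells prepaid stamps. Everything here is an assumption for illustration: the class and method names are made up, authentication is left out, and HMAC-SHA256 stands in for whichever "digesting/signing/PKI technique" a real standard would pick (a real scheme would presumably use public-key signatures, so that third parties could check a stamp without holding the secret).

```python
import hashlib
import hmac
from typing import Dict


class PostOffice:
    """Toy stamp issuer: prepaid accounts, one cent per stamp."""

    STAMP_PRICE_CENTS = 1

    def __init__(self, secret: bytes) -> None:
        self._secret = secret
        self._balances: Dict[str, int] = {}

    def deposit(self, account: str, cents: int) -> None:
        """Buy stamps in advance by topping up the account."""
        self._balances[account] = self._balances.get(account, 0) + cents

    def balance(self, account: str) -> int:
        return self._balances.get(account, 0)

    def issue_stamp(self, account: str, text: str) -> str:
        """Decrement the account and return a stamp bound to this exact text."""
        if self.balance(account) < self.STAMP_PRICE_CENTS:
            raise ValueError("insufficient balance for a stamp")
        self._balances[account] -= self.STAMP_PRICE_CENTS
        return self._sign(text)

    def verify_stamp(self, text: str, stamp: str) -> bool:
        """Check that the stamp was issued by this Post Office for this text."""
        return hmac.compare_digest(self._sign(text), stamp)

    def _sign(self, text: str) -> str:
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return hmac.new(self._secret, digest, hashlib.sha256).hexdigest()


if __name__ == "__main__":
    po = PostOffice(b"post-office-secret")  # hypothetical key
    po.deposit("alice", 2000)               # $20 of stamps, bought in advance
    stamp = po.issue_stamp("alice", "Nice post!")
    print(po.verify_stamp("Nice post!", stamp), po.balance("alice"))
```

Note what the sketch leaves out: nothing here stops someone from draining an account whose credentials have been stolen.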

Apparently himself and a few other guys chatted about it at the first Foo Camp, back in 2003. Funnily enough, in the anti-spam community, we were having our own chats about it, but it sounds like our paths didn't cross for some reason...

We call this idea 'sender pays'. Earlier in 2003, in June, John Levine published what I'd consider the canonical wrap-up of why it will not work, in 'An Overview of e-Postage'.

That report demolishes the use of 'sender pays' for e-mail anti-spam, on three main counts:

Creating a transaction system large enough for e-postage would be prohibitively expensive. The nearest parallel is the credit card transaction system, which deals with 1% of the transaction volume per day, and with much larger profit margins to make it worth their while.
The true financial, administrative, and social costs of e-postage are completely unknown. What do you do when a 'bad guy' steals the e-postage stamps off Aunt Millie's hard disk, without her knowledge? How much is the Fraud Handling Department going to cost? Is she just going to be out of luck when this happens? Will you need to use whitelisting and a content-based anti-spam filter as well, to filter out the messages sent using valid, but stolen, stamps?
Users hate micropayments. In short, see Andrew Odlyzko's research.

Now, using it on weblog spam is a little more practical than e-mail spam, for one because it has a lower daily volume of transactions; but these objections still stand, in my opinion.

John Levine is one of the foremost authorities in anti-spam, and this report has been a mainstay of the anti-spam canon for two years. Anyone discussing a new anti-spam concept really ought to know this report backwards and forwards by this stage, and go into some detail as to how their proposal deals with the issues raised, if it's to be taken seriously.

‘I Go Chop Your Dollar’, the video

Published October 21, 2005

Wow! videos.antville.org (via robotwisdom) came through with the goods. Go check out the video for Nkem Owoh (aka Osuofia) singing "I Go Chop Your Dollar", which turns out to be pretty catchy!

Here's the lyrics so all us oyinbos can sing along:

I don suffer no be small
Upon say I get sense
Poverty no good at all, no
Na im make I join this business
419 no be thief, its just a game
Everybody dey play am
if anybody fall mugu, ha! my brother I go chop am

Chorus
National Airport na me get am
National Stadium na me build am
President na my sister brother
You be the mugu, I be the master
Oyinbo I go chop your dollar, I go take your money dissapear
419 is just a game, you are the loser I am the winner
The refinery na me get am,
The contract, na you I go give am
But you go pay me small money make I bring am
you be the mugu, I be the master... na me be the master ooo!!!!

When Oyinbo play wayo, them go say na new style
When country man do im own, them go de shout bring am, kill am, die!
Oyinbo people greedy, I say them greedy
I don see them tire thats why when them fall enter my trap o!
I dey show them fire

Lyrics from here; there's a few other funny comments there too:

just saw the "i go chop your dollar"......i am glad we are blessed with a natural comedian as good as Nkem Owoh.....thank God say oyibo (sic) no sabi pidgin if not dis song for give them small panic........

Heh, looks like the 'small panic' is now underway ;)

‘I Will Eat Your Dollars’

Published October 21, 2005

An excellent, eye-opening interview with Samuel, an ex-419 scammer.

There's even a theme tune:

Their anthem, "I Go Chop Your Dollars," hugely popular in Lagos, hit the airwaves a few months ago as a CD penned by an artist called Osofia:

"419 is just a game, you are the losers, we are the winners.
White people are greedy, I can say they are greedy
White men, I will eat your dollars, will take your money and disappear.
419 is just a game, we are the masters, you are the losers."

Reportedly, Lagos inhabitants paint "This House Is Not For Sale" in big letters on their homes, in case someone posing as the owner tries to put it on the market.

Regarding the workings of the scam:

[Samuel] sent 500 e-mails a day and usually received about seven replies. Shepherd would then take over. "When you get a reply [to a 419 spam], it's 70% sure that you'll get the money," Samuel said.

(via Nelson.)

‘Life Hacking’ and Metacity

Published October 17, 2005

The NY Times story on "life hacking" is a pretty good one, and an excellent intro for anyone who hasn't been religiously reading the changing transcripts of Danny O'Brien's talk and so on.

This line:

Mann has embarked on a 12-step-like triage: he canceled his Netflix account, trimmed his instant-messaging "buddy list" so only close friends can contact him and set his e-mail program to bother him only once an hour.

reminded me of something I ran into recently.

Last month, I switched from Sawfish, the venerable UNIX window manager, to GNOME's Metacity, which is the new(ish) GNOME standard window manager. (I was tired of some long-standing Sawfish crashes, and didn't want to be the last Sawfish user on the planet, which was seeming increasingly likely.)

One interesting UI change is that application windows no longer 'pop up' -- if an app wants to notify you of some important change, it instead can only cause its taskbar button to subtly pulse in the corner of your screen.

Initially, this threw me for a loop, and I rudely (albeit accidentally) ignored my friends on IM and suchlike. But I quickly got the hang of glancing at the taskbar once in a while when I wasn't concentrating on a task; it's now second nature, and has significantly reduced the number of interruptions I find myself experiencing in a typical day.

BTW, in passing: switching WMs is a big deal, user interface-wise. One of the key gating factors, for me, was a feature I use to control windows without laying hands on the dreaded rodent -- namely, a 'move window to screen corner' keyboard shortcut. This patch implements it for Metacity.

I implemented this last year for KWin, too, to resounding disapproval and bitchy comments about how I'm using the mouse all wrong. Meh. I fully expect the Metacity maintainers to throw it out, likewise, leaving me hand-patching WMs for a while yet ;)

Update, Nov 2006: they applied it! yay.

The Adelphi Charter

Published October 17, 2005

I've just finished Sir John Sulston's inspiring book about the Human Genome Project, The Common Thread, in which he discusses how he found himself on one front line of the battle between intellectual 'property' maximalism attempting to grab 'property rights' over the human genome, and the common good, preserving such rights for all humanity and unfettered research. (Thankfully, he -- and therefore the latter side -- won.)

I've been meaning to post a few choice quotes here about it at some stage, but haven't had the time -- I've had to just limit myself to correcting the Wikipedia entry for the Human Genome Project instead. ;)

Anyway, Sir John is in the news again, as part of a new international initiative -- the Adelphi Charter:

Called the Adelphi charter, it is an attempt to lay out those principles. Central among them are the ideas that policy should be evidence-based and that it should respect the balance between property and the public domain, not eliminate the latter to maximise the former.

Very encouraging to see something taking off at this level. I hope it does well, and I hope Ireland and the EU's lawmakers take note, since I've been hearing a lot of IP maximalist party-line from there recently...

I Agree With Goopy

Published October 12, 2005

shiny do-dad
Originally uploaded by goopymart.

Daniel Cuthbert’s Travesty of Justice

Published October 12, 2005

The Samizdata weblog posts more details about the Daniel Cuthbert case, where a UK techie was arrested for allegedly attempting to hack a tsunami-donation site. Here's what happened:

Daniel Cuthbert saw the devastating images of the Tsunami disaster and decided to donate UKP30 via the website that was hastily set up to be able to process payments. He is a computer security consultant, regarded in his field as an expert and respected by colleagues and employers alike. He entered his full personal details (home address, number, name and full card details). He did not receive confirmation of payment or a reference and became concerned as he has had issues with fraud on his card on a previous occasion. He then did a couple of very basic penetration tests. If they resulted in the site being insecure as he suspected, he would have contacted the authorities, as he had nothing to gain from doing this for fun and keeping the fact to himself that he suspected the site to be a phishing site and all this money pledged was going to some South American somewhere in South America.

The first test he used was the (dot dot slash, 3 times) ../../../ sequence. The ../ command is called a Directory Traversal which allows you to move up the hierarchy of a file. The triple sequence amounts to a DTA (Directory Traversal Attack), allows you to move three times. It is not a complete attack as that would require a further command, it was merely a light 'knock on the door'. The other test, which constituted an apostrophe (`) was also used. He was then satisfied that the site was safe as his received no error messages in response to his query, then went about his work duties. There were no warnings or dialogue boxes showing that he had accessed an unauthorised area.

20 days later he was arrested at his place of work and had his house searched.

(His actions were detected by the IDS software used by British Telecom.)
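For readers who haven't met the "dot dot slash" test before, here's a minimal Python sketch of what a correctly written server does with such a request: it normalises the path and refuses anything that ends up outside the document root. The function name and the /var/www root are hypothetical, and a real server would also need to deal with URL decoding and symlinks, which this ignores.

```python
import posixpath
from typing import Optional


def resolve_under_root(doc_root: str, requested: str) -> Optional[str]:
    """Return the normalised file path for a request, or None if the
    request would climb out of doc_root (e.g. via '../' sequences)."""
    root = posixpath.normpath(doc_root)
    candidate = posixpath.normpath(posixpath.join(root, requested.lstrip("/")))
    if candidate == root or candidate.startswith(root + "/"):
        return candidate
    return None


if __name__ == "__main__":
    print(resolve_under_root("/var/www", "donate/index.html"))
    print(resolve_under_root("/var/www", "../../../etc/passwd"))
```

A site that handles this properly just returns an ordinary page or a "not found", which matches the quoted account: no error messages, nothing unusual to see.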

In my opinion, this is a travesty of justice.

His actions were entirely understandable, under the circumstances, IMO. They were not hostile activities in themselves -- they might have been the prelude to hostility, in other cases, but, as his later activity proved, not in this one.

Instead of making parallels with "rattling the doorknob" or "lurking around the back door of a bank", a better parallel would be looking through the bank's front window, from the street!

If only law enforcement took this degree of interest in genuine phishing cases, where innocent parties find their bank accounts emptied by real criminals, like the unprosecuted phisher in Quebec discussed in this USA Today article!

Appalling.

Harpers: The Uses of Disaster

Published October 12, 2005

In this month's Harpers -- The Uses of Disaster contains a passage that rings bells, post-Katrina:

You can see the grounds for that anxiety in the aftermath of the 1985 Mexico City earthquake, which was the beginning of the end for the one-party rule of the PRI over Mexico. The earthquake, measuring 8.0 on the Richter scale, hit Mexico City early on the morning of September 19 and devastated the central city, the symbolic heart of the nation. An aftershock nearly as large hit the next evening. About ten thousand people died, and as many as a quarter of a million became homeless.

The initial response made it clear that the government cared a lot more about the material city of buildings and wealth than the social city of human beings. In one notorious case, local sweatshop owners paid the police to salvage equipment from their destroyed factories. No effort was made to search for survivors or retrieve the corpses of the night-shift seamstresses. It was as though the earthquake had ripped away a veil concealing the corruption and callousness of the government. International rescue teams were rebuffed, aid money was spent on other programs, supplies were stolen by the police and army, and, in the end, a huge population of the displaced poor was obliged to go on living in tents for many years.

However, there's a happy ending there:

That was how the government of Mexico reacted. The people of Mexico, however, had a different reaction. 'Not even the power of the state,' wrote political commentator Carlos Monsiváis, 'managed to wipe out the cultural, political, and psychic consequences of the four or five days in which the brigades and aid workers, in the midst of rubble and desolation, felt themselves in charge of their own behavior and responsible for the other city that rose into view.' As in San Francisco in 1906, in the ruins of the city of architecture and property, another city came into being made of nothing more than the people and their senses of solidarity and possibility. Citizens began to demand justice, accountability, and respect. They fought to keep the sites of their rent-controlled homes from being redeveloped as more lucrative projects. They organized neighborhood groups. And eventually they elected a left-wing mayor -- a key step in breaking the PRI's monopoly on power in Mexico.

Photo Update

Published October 12, 2005

Photoblog! We recently ticked off another of California's national parks with a trip to Joshua Tree, and saw this:

Scary desert people. Also, I got to be in a fractal:

Beardy progress continues, as you can see!

In other pics, Catherine cooked me an amazing birthday cake:

Also: I ate the most sacrilicious food ever -- mochi that tastes like green-tea-filled Eucharist wafer!

Ah, the blessed sacrament of the (green tea) body and (red bean) blood. The textural resemblance really was phenomenal; I guess it never came up in product taste tests. Quite funny. Very tasty too, by the way.

Bruce Sterling on J. G. Ballard

Published September 28, 2005

Ballardian.com just posted an interview with Bruce Sterling about J.G. Ballard by Chris Nakashima-Brown. One of my favourite authors talks about the other -- it's amazing!

A couple of highlights:

... The assumptions behind The Crystal World were so radically different and ontologically disturbing compared to common pulp-derived SF. If you just look at the mechanisms of the suspension of disbelief in The Crystal World, it's like, okay, time is vibrating on itself and this has caused the growth of a leprous crystal ... whatever. There's never any kind of fooforah about how the scientist in his lab is going to understand this phenomenon, and reverse it, and save humanity. It's not even a question of anybody needing to understand what's going on in any kind of instrumental way. On the contrary, the whole structure of the thing is just this kind of ecstatic surreal acceptance. All Ballard disaster novels are vehicles of psychic fulfilment.

....

My suspicion is that in another four to five years you're going to find people writing about climate change in the same way they wrote about the nuclear threat in the 50s. It's just going to be in every story every time. People are going to come up with a set of climate-change tropes, like three-eyed mutants and giant two-headed whatevers, because this is the threat of our epoch and it just becomes blatantly obvious to everybody. Everybody's going to pile on to the bandwagon and probably reduce the whole concept to kindling. That may be the actual solution to a genuine threat of Armageddon -- to talk about it so much that it becomes banal.

To me these late-Ballard pieces, these Shepperton pieces -- Cocaine Nights, Super-Cannes and so forth -- really seem like gentle chiding from somebody who's recognized that his civilisation really has gone mad. They're a series of repetitions that say, 'Look, we're heading for a world where consensus reality really is just plain unsustainable, and the ideas that the majority of our people hold in their heart of hearts are just not connected to reality'. I think that may be a very prophetic assessment on his part. I think we may in fact be in such a world right now -- where people have really just lost touch with the 'reality-based community' and are basically just living in self-generated fantasy echo chambers that have no more to do with the nature of geopolitical reality than Athanasius Kircher or Castaneda's Don Juan.

Kitty vs. International RFID Standardisation

Published September 27, 2005

So, I've just bought myself an RFID implant reader.

However, don't jump to conclusions -- it's not that I'm hoping that possession will put me on the right side of the New World Order 21st-century pervasive-RFID-tracking security infrastructure or anything -- it's for my cat. Here's why...

Many years ago, back in Ireland, we had an RFID chip implanted in our cat, as you do. Then 3 years ago, we entered the US, bringing the cat with us, and started looking into what we'd have to do to bring him back again.

Ireland and the UK are rabies-free, and have massive paranoia about pets that may harbour it; as a result, pets imported into those countries generally have to stay in a quarantine facility for 6 months. Obviously 6 months sans kitty is something that we want to avoid, and thankfully a recent innovation, the Pet Travel Scheme, makes that possible. It allows pets to be imported into the UK from the USA, once they pass a few bureaucratic conditions, and from there they can travel easily to Ireland legally. (BTW Matt, this still applies; we checked!)

One key condition is that the pet be first microchipped with an RFID chip, then tested for rabies, with those results annotated with the chip ID number. Once the animal arrives in the UK on the way back, the customs officials there verify his RFID implant chip's ID number against the number on the test result documentation, and (assuming they match and all is in order) he skips the 6 month sentence.

So far, it seems pretty simple; the cat's already chipped, we just have to go to the vet, get him titred, and all should proceed simply enough from there. Right? Wrong.

We spent a while going to various vets and animal shelters; unfortunately, almost everyone who works in a vet's office in California seems to be an incompetent grandmother who just works there because she likes giving doggies a bath, couldn't care less about funny foreign European microchips, and will pretty much say anything to shut you up. Tiring stuff, and unproductive; eventually, after many fruitless attempts to read the chip, I gave up on that angle and just researched online.

Despite what all the grannies claimed, as this page describes, the US doesn't actually use the ISO 11784/11785 standard for pet RFID chips. Instead it uses two alternative standards, one called FECAVA, and another FECAVA-based standard called AVID-Encrypted. They are, of course, entirely incompatible with ISO 11784/11785, although, to spread confusion, the FECAVA standard appears to be colloquially referred to in parts of the US vet industry as "European" or even "ISO standard". I think it was originally developed in Europe, and may have been partially ISO-11784-compliant, but the readers have proven entirely incompatible with the chip we had, which is referred to as "ISO" in the UK and Ireland at least. They don't even use the same frequencies; FECAVA/AVID are on 125 kHz, while ISO FDX-B is on 134.2 kHz.

(BTW, a useful point for others: you can also tell the difference at the data level; FECAVA/AVID use 10-digit ID numbers, while ISO numbers are 15-digit. Also, "FDX-B" seems to accurately describe the current Euro-compatible ISO-standard chip system.)
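As a rough illustration of that data-level check, here's a small Python sketch. It goes purely by the ID lengths mentioned above; the function names and the sample IDs are made up for the example, and a real reader may format its output differently.

```python
def classify_chip_id(chip_id):
    """Guess the chip family from the length of the ID a reader reports.

    Assumption: the ID arrives as a plain string; spaces are ignored.
    """
    cleaned = chip_id.replace(" ", "")
    if not cleaned.isalnum():
        return "unknown"
    if len(cleaned) == 15:
        return "ISO FDX-B"
    if len(cleaned) == 10:
        return "FECAVA/AVID"
    return "unknown"


def ids_match(scanned_id, documented_id):
    """Compare the scanned ID with the one written on the test paperwork."""
    return scanned_id.replace(" ", "") == documented_id.replace(" ", "")


# Made-up sample IDs, for illustration only.
print(classify_chip_id("985120012345678"))
print(classify_chip_id("0123456789"))
```

The second helper is the same comparison the customs officials do by eye: the number off the reader against the number on the rabies test documentation.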

Now, a few years back, it appears that one company attempted to introduce ISO-FDX-B-format readers and chips to the FECAVA-dominated marketplace, in the form of the Banfield 'Crystal Tag' chip and reader system.

That attempt foundered last year, thanks to what looks a lot like some MS-style dirty tricks -- patent infringement lawsuits and some 'your-doggy-is-in-danger' FUD:

what we have here is a different, foreign chip that's being brought in and it's caused a lot of confusion with pet owners, with shelters, and veterinarians.

(Note 'foreign' -- a little petty nationalism goes a long way.) The results can be seen in this press story on the product's withdrawal:

Although ISO FDX-B microchips are being used in some European countries and parts of Australia, acceptance of ISO FDX-B microchips is not universal and the standard on which they are based continues to generate controversy, in part due to concerns about ID code duplication.

FUD-bomb successful!

Anyway, this left us in a bad situation; our cat's chip was unreadable in the US, and possibly even illegal given the patent litigation ;) . We had two choices: either we got the cat re-chipped with a US chip, paying for that, or we could find our own ISO-compatible reader.

We sprang for the latter; although the re-chipping and re-registration would probably cost less than the $220 the reader would cost, we'd then need to buy a US reader as well, since the readers at London Heathrow airport are ISO readers, not FECAVA/AVID-compatible. On top of that, this way gives me a little more peace of mind about compatibility issues when we eventually get the cat to Heathrow; we now know that the cat's chip will definitely be readable there, instead of taking a risk on the obviously-quite-confusing nest of snakes that is international RFID standardisation.

Anyway, deciding to buy a reader wasn't the last hurdle. Apparently due to the patent infringement lawsuit noted above, no ISO/FDX-B-compatible readers were on sale in the US! A little research found an online vendor overseas, and with a few phone calls, we bought a reader of our very own.

This arrived this morning; with a little struggling from the implantee, we tried it out, and verified that his ID number was readable. Success!

PRNGs and Groove Theory

Published September 23, 2005

Urban Dead is a new browser-based MMORPG that's become popular recently. I'm not planning to talk about the game itself (at least not until I've played it a bit!), but there's something worth noting here -- a cheat called Groove Theory:

Groove Theory was a cheat for Urban Dead that claimed to exploit an apparent lack [sic] of a random number generator in the game, [so] that performing an action exactly eight seconds after a successful action would also be successful.

Kevan, the Urban Dead developer, confirmed that Groove Theory did indeed work, and made this comment after fixing the bug:

There is some pattern to the random numbers, playing around with them; "srand(time())" actually brings back some pretty terrible patterns, and an eight-second wait will catch some of these.

So -- here's my guess as to how this happened.

It appears that Urban Dead is implemented as a CGI script. I'll speculate that somewhere near the top of the script, there's a line of code along the lines of srand(time()), as Kevan mentioned. With a sufficiently fast network connection, and a sufficiently unloaded server, you can be reasonably sure that hitting "Refresh" will cause that srand call to be executed on the server within a fraction of a second of your button-press. In other words, through careful timing, the remote user can force the pseudo-random-number generator used to implement rand() into a desired set of states!

As this perl script demonstrates, the output from perl's rand() is perfectly periodic in its low bits on a modern Linux machine, if constantly reseeded using srand() -- the demo script's output decrements from 3 to 0 by 1 every 2 seconds, then repeats the cycle, indefinitely.

I don't know if Urban Dead is a perl script, PHP, or whatever; but whatever language it's written in, I'd guess that language uses the same PRNG implementation as perl is using on my Linux box.

As it turns out, this PRNG failing is pretty well-documented in the manual page for rand(3):

on older rand() implementations, and on current implementations on different systems, the lower-order bits are much less random than the higher-order bits. Do not use this function in applications intended to be portable when good randomness is needed.

That manual page also quotes Numerical Recipes in C: The Art of Scientific Computing (William H. Press, Brian P. Flannery, Saul A. Teukolsky, William T. Vetterling; New York: Cambridge University Press, 1992 (2nd ed., p. 277)) as noting:

"If you want to generate a random integer between 1 and 10, you should always do it by using high-order bits, as in

j=1+(int) (10.0*rand()/(RAND_MAX+1.0));

and never by anything resembling

j=1+(rand() % 10);

(which uses lower-order bits)."

I think Groove Theory demonstrates this nicely!
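To make the low-order-bits point concrete, here's a minimal Python sketch. It assumes a textbook linear congruential generator that hands back its whole state; the constants are the ones commonly quoted for example rand() implementations, and the class and function names are my own. It is not the code behind perl's rand() or Urban Dead.

```python
RAND_MAX = 2**31 - 1


class ClassicLCG:
    """Textbook LCG with a power-of-two modulus (an assumed stand-in)."""

    def __init__(self, seed):
        self.state = seed % 2**31

    def rand(self):
        # Multiplier and increment are both odd, so the lowest bit of the
        # state simply flips on every call.
        self.state = (self.state * 1103515245 + 12345) % 2**31
        return self.state


def pick_with_low_bits(gen, n):
    """The discouraged form: 1 + rand() % n."""
    return 1 + gen.rand() % n


def pick_with_high_bits(gen, n):
    """The recommended form: scale the whole value down to 1..n."""
    return 1 + int(n * gen.rand() / (RAND_MAX + 1.0))


gen = ClassicLCG(42)
print([gen.rand() % 2 for _ in range(8)])  # lowest bit just alternates
```

With this generator, anything decided by `rand() % 2` alternates between success and failure forever, which is about as groovy as a pattern gets.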

Update: I need to be clearer here.

Most of the Groove Theory issue is caused by the repeated use of srand(). If the script could be seeded once, instead of at every request, or if the seed data came from a secure, non-predictable source like /dev/random, things would be a lot safer.
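Here's a hedged sketch of what the safer approach might look like in Python. The function names are hypothetical, and I'm assuming nothing about how the game itself is structured. Note that under a plain CGI model every request is a fresh process, so "seed once" isn't really available there; pulling randomness from the operating system on each request is the practical option.

```python
import random
import secrets

# Seeded once, at start-up, from the OS entropy source rather than the clock.
# This only helps in a long-running server process.
_rng = random.Random(secrets.randbits(64))


def action_succeeds(chance):
    """Decide an action using the once-seeded generator."""
    return _rng.random() < chance


def action_succeeds_per_request(chance):
    """Decide an action straight from the OS entropy source.

    Suitable when each request runs in a new process, as with CGI.
    """
    return random.SystemRandom().random() < chance
```

Either way, the time of the request no longer decides the generator's state, so waiting eight seconds buys the player nothing.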

However, the behaviour of rand() is still an issue, due to how it's implemented. The classic UNIX rand() uses the srand() seed directly, to entirely replace the linear congruential PRNG's state; on top of that, the arithmetic used means that the low-order bits have an extremely obvious, repeating, simple pattern, mapping directly to that seed's value. This is what gives Groove Theory its practicability by a human, without computer aid; with a more complex algorithm, it'd still be guessable with the aid of a computer, but with the simple PRNG, it's guessable, unaided.
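A small sketch of that seed-to-output mapping, under stated assumptions: the generator below is a textbook LCG with commonly quoted example constants, standing in for a "classic" rand(), and the timestamp is made up. It is not perl's drand48(), so the exact pattern differs from my demo script's, but the shape of the problem is the same.

```python
def first_rand_after_srand(seed):
    """First value from an assumed classic LCG right after srand(seed)."""
    state = seed % 2**31  # the seed replaces the entire state
    return (state * 1103515245 + 12345) % 2**31


def low_two_bits_by_second(start_time, seconds):
    """Low two bits of the first rand() when reseeding with time() each second."""
    return [first_rand_after_srand(start_time + s) % 4 for s in range(seconds)]


# Made-up timestamp; each entry is one second later than the last.
print(low_two_bits_by_second(1000000000, 8))  # → [1, 2, 3, 0, 1, 2, 3, 0]
```

Both constants leave a remainder of 1 when divided by 4, so the low two bits of the first output are just (seed + 1) mod 4. Reseed from the clock on every request and those bits tick round like a metronome, which is exactly the kind of thing a patient human can exploit with a stopwatch.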

Update 2: as noted in a comment, Linux glibc's rand(3) is apparently quite good at producing decent numbers. However, perl's rand() function doesn't use that; it uses drand48(), which in glibc is still a linear congruential PRNG and displays the 'low randomness in low-order bits' behaviour.

Buying Music From iTMS in Linux

Published September 20, 2005

On Saturday, I spent a little time trying to work out how to give Steve Jobs my money; more accurately, I wanted to get some way to buy music from the iTunes Music Store from my Linux desktop, and this isn't as easy as it really should be, because the official iTMS is a mess of proprietary Mac- and Windows-only DRM-laden badness.

Here's a quick walkthrough of how this went:

install iTunes in my VMWare Windows install
sign up for iTMS, and give Apple all my personal info, including super-s3kr1t card verification codes, eek
buy a song
find the DRM'd file in the filesystem; it's an .m4p file, and xine doesn't seem to like it
do some googling for 'iTunes DRM remove linux'; that leads to Jon Lech Johansen's JusteTune
download and run JusteTune installer
get obscure hexadecimal error code dialog. hmm! what could that mean?
download and run .NET runtime, link on JusteTune page
rerun JusteTune -- it works this time
select Account -> Authorize, enter login info
drag and drop file -- it's decrypted!

So, that yields a decrypted AAC file, which I can play on Linux using xine. That's the hard part done!

However, I want to play my purchases in JuK, the very nice iTunes-style music player app for KDE.

While the gstreamer audio framework supports playback of AAC files with the gstreamer0.8-faad package ('sudo apt-get install gstreamer0.8-faad'), JuK itself can't find the file or read its metadata, so it doesn't show up in the music collection as playable. I don't want to go hacking code from CVS into my desktop's music player -- possibly the most essential app on the desktop -- so transcoding them to MP3 seems to be the best option.

Somebody's already been here before, though -- that's one of the benefits of being a late adopter! Here's a script to convert .m4a files to .mp3 using the 'faad' tool ('sudo apt-get install faad').
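The linked script isn't reproduced here, but as a rough idea of the shape of the job, here's a Python sketch that decodes with faad and then encodes the result to MP3. Using 'lame' for the encoding step is my assumption (the post above only names faad), the function names are made up, and nothing here copies tags across.

```python
import subprocess
from pathlib import Path


def transcode_commands(m4a_path):
    """Build the two commands: decode to WAV with faad, encode to MP3 with lame."""
    src = Path(m4a_path)
    wav = src.with_suffix(".wav")
    mp3 = src.with_suffix(".mp3")
    return [
        ["faad", "-o", str(wav), str(src)],  # AAC -> WAV
        ["lame", str(wav), str(mp3)],        # WAV -> MP3 (assumed encoder)
    ]


def transcode(m4a_path):
    """Run both steps, stopping if either tool reports an error."""
    for command in transcode_commands(m4a_path):
        subprocess.run(command, check=True)
```

Calling `transcode("song.m4a")` would leave `song.wav` and `song.mp3` beside the original; the intermediate WAV can be deleted afterwards.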

During this work, I came across Jon Lech Johansen's latest masterwork -- SharpMusique, a fully operational native Linux interface to the iTMS. Building on Ubuntu Hoary was a simple matter of tar xvfz, configure, make, sudo make install, and it works great -- and automatically de-DRMs the files on the fly as it downloads them! Now that's the way to enjoy the iTMS on Linux, at least until Apple's engineers break it again.

Update, May 2006: Apple's engineers broke it. Thanks Wilfredo ;)

End result: a brand new, complete, high-quality copy of Dengue Fever's new album, Escape From Dragon House. Previously I'd only had a couple of tracks off this, so I'm now a happy camper, music-wise.

BTW, I was also considering trying out the new Yahoo! Music Store, but it too uses fascist DRM tricks and is platform-limited, and I'm not sure how breakable it is. On top of that, the prospect of not being able to try it out before handing over credit-card details put me off. As far as I can see, I can't even look up the albums offered before subscribing. All combined, I'll stick with iTMS for now.

Don’t Dumb Me Down

Published September 20, 2005

A great Guardian 'Bad Science' column by Ben Goldacre, Don't dumb me down. An excellent article on how mainstream journalists fail miserably in their attempts to report science stories accurately, and how this fundamentally misrepresents science to society at large.

Being a geek (of the computing persuasion) who hangs out with other geeks (of various science persuasions), I wound up discussing this problem myself a month or two ago. This paragraph sums up where I think the failure lies:

There is one university PR department in London that I know fairly well - it's a small middle-class world after all - and I know that until recently, they had never employed a single science graduate. This is not uncommon. Science is done by scientists, who write it up. Then a press release is written by a non-scientist, who runs it by their non-scientist boss, who then sends it to journalists without a science education who try to convey difficult new ideas to an audience of either lay people, or more likely - since they'll be the ones interested in reading the stuff - people who know their way around a t-test a lot better than any of these intermediaries. Finally, it's edited by a whole team of people who don't understand it. You can be sure that at least one person in any given "science communication" chain is just juggling words about on a page, without having the first clue what they mean, pretending they've got a proper job, their pens all lined up neatly on the desk.

I'd throw in the extra step of a paper in Nature. Apart from that, in my opinion, he's spot on.

Other disciplines don't have this problem:

Because papers think you won't understand the "science bit", all stories involving science must be dumbed down, leaving pieces without enough content to stimulate the only people who are actually going to read them - that is, the people who know a bit about science. Compare this with the book review section, in any newspaper. The more obscure references to Russian novelists and French philosophers you can bang in, the better writer everyone thinks you are. Nobody dumbs down the finance pages. Imagine the fuss if I tried to stick the word "biophoton" on a science page without explaining what it meant. I can tell you, it would never get past the subs or the section editor. But use it on a complementary medicine page, incorrectly, and it sails through.

Statistics are what causes the most fear for reporters, and so they are usually just edited out, with interesting consequences. Because science isn't about something being true or not true: that's a humanities graduate parody. It's about the error bar, statistical significance, it's about how reliable and valid the experiment was, it's about coming to a verdict, about a hypothesis, on the back of lots of bits of evidence.

Fingerprinting and False Positives

Published September 20, 2005

New Scientist News - How far should fingerprints be trusted? (via jwz):

Evidence from qualified fingerprint examiners suggests a higher error rate. These are the results of proficiency tests cited by Cole in the Journal of Criminal Law & Criminology (vol 93, p 985). From these he estimates that false matches occurred at a rate of 0.8 per cent on average, and in one year were as high as 4.4 per cent. Even if the lower figure is correct, this would equate to 1900 mistaken fingerprint matches in the US in 2002 alone.

This is why I'm so unhappy about getting fingerprinted as part of US immigration's US-VISIT program and similar. My fingerprints have been collected on several occasions as part of that program, and as a result will now be shared throughout the US government, and internationally, and will be retained for 75 to 100 years, whether I like it or not.

As a result, with sufficient bad luck, I may become one of those false positives. Fingers crossed all those government and international partner agencies are competent enough to avoid that!
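For a sense of scale, here's the back-of-the-envelope arithmetic behind that quote, as a Python sketch. The article doesn't state the total number of matches, so the total below is only what the two quoted figures imply; applying the 4.4 per cent rate to the same volume is my own extrapolation, not something the article claims.

```python
from fractions import Fraction

LOW_RATE = Fraction(8, 1000)    # 0.8 per cent, the average rate quoted
HIGH_RATE = Fraction(44, 1000)  # 4.4 per cent, the worst year quoted
MISTAKEN_MATCHES_2002 = 1900    # the article's figure at the lower rate


def implied_match_count(mistaken, rate):
    """Total matches implied by a count of mistaken ones at a given error rate."""
    return mistaken / rate


def mistaken_at(total_matches, rate):
    """Mistaken matches expected for a given total at a given error rate."""
    return total_matches * rate


total = implied_match_count(MISTAKEN_MATCHES_2002, LOW_RATE)
worst_case = mistaken_at(total, HIGH_RATE)
print(int(total), int(worst_case))  # → 237500 10450
```

In other words, the quoted numbers imply a couple of hundred thousand matches a year, and the worse of the two error rates would mean five and a half times as many mistakes.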

Update: oh wow, this snippet from the New Scientist editorial clearly demonstrates one case where it all went horribly wrong:

Last year, an Oregon lawyer named Brandon Mayfield was held in connection with the Madrid bombings after his fingerprint was supposedly found on a bag in the Spanish capital. Only after several weeks did the Spanish police attribute the print to Ouhnane Daoud, an Algerian living in Spain.

eek! Coverage from the National Assoc of Criminal Defense Lawyers, and the Washington Post.

ToorCon

Published September 15, 2005

ToorCon this year looks good. I'm not going, but I wish I'd gotten it together. There's a couple of spam/phishing-related talks, and a data-visualisation talk by Christopher Abad; hopefully he might diverge into some of this phishing data he talks about in this First Monday paper. Dan Kaminsky's talk looks interesting, too --

Application-layer attacks against MD5

We will show how web pages and other executable environments can be manipulated to emit arbitrarily different content with identical MD5 hashes.

Sounds like fun!

TiVo Co-Opts Anti-Spam Terminology

Published September 15, 2005

This is pathetic. As noted in the link-blog a couple of days ago (as well as everywhere else), TiVo's new DRM features have been spotted 'in the wild', protecting the valuable Intellectual Property that is Family Guy and Simpsons reruns.

The icing on the cake is that TiVo have come up with a hilarious hand-wavy explanation -- apparently it was line noise. Marc Hedlund of O'Reilly and Cory Doctorow are having none of it, and rightly so; as a bonus, Cory asked a group of DRM experts, who 'burst into positive howls of disbelief' that line noise could corrupt the DRM bits and the corresponding checksums to match.

From my angle, though, there's another noteworthy factor:

"During the test process, we came across people who had false positives because of noisy analog signals. We actually delayed development (of the new TiVo software) to address those false positives." (-- Jim Denney, director of product marketing for TiVo)

Interesting use of the term 'false positive' there. Sounds more like a good old-fashioned bug if you ask me ;)

Anyway, I'm glad I went for the home-built option. It was pretty obvious that TiVo are in the cross-hairs, and their product is only going to get worse as the DRM industry push harder...

SpamAssassin 3.1.0

Published September 15, 2005

'ANNOUNCE: SpamAssassin 3.1.0 available!' - MARC

Phew. That took a while, but it's worth it ;)

Bart Simpson vs. Missing-Person Data Entry

Published September 13, 2005

Ka-Ping Yee notes that Google have launched Katrina People Search, using Ping's speedily-created People Finder Interchange Format. This is good, especially since there's still no sign of the FEMA/Microsoft effort.

In passing -- it's disappointing to note how many appearances are made by a Mr. Heywood J. Ablohmie...

DnsblAccuracy082005 – Spamassassin Wiki

Published September 11, 2005

Do you use anti-spam DNS blocklists? If so, you should probably go take a look at DnsblAccuracy082005 on the SpamAssassin wiki; I've collated the results from our recent mass-check rescoring runs for 3.1.0, to produce up-to-date measurements of the accuracy and hit-rates for most of the big DNS blocklists.

A few highlights:

highest hit-rate of an IP blocklist: the Distributed Sender Blackhole List, with 39.84% of spam hit
most accurate IP blocklist: the Spamhaus Exploits Block List, with a minuscule 0.04% false positive rate
highest hit-rate network test, overall: the OB SURBL list, 51.93% of spam hit

We don't have accurate figures for the new URIBL.COM lists, btw -- only the rulesets that are distributed with SpamAssassin were measured.
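In case the two measures aren't familiar, here's a minimal Python sketch of how a hit-rate and a false positive rate can be computed from hand-classified mail. The function name and the tiny corpus are invented purely for illustration; they are not the mass-check tooling or its data.

```python
def blocklist_rates(results):
    """Return (spam hit-rate %, false positive rate %) for one blocklist.

    results: iterable of (is_spam, was_listed) pairs, one per message.
    """
    spam_total = spam_hits = ham_total = ham_hits = 0
    for is_spam, was_listed in results:
        if is_spam:
            spam_total += 1
            spam_hits += bool(was_listed)
        else:
            ham_total += 1
            ham_hits += bool(was_listed)
    if spam_total == 0 or ham_total == 0:
        raise ValueError("need at least one spam and one non-spam message")
    return (100.0 * spam_hits / spam_total, 100.0 * ham_hits / ham_total)


# Invented toy corpus: 4 spam (2 listed), 4 non-spam (1 listed).
toy_corpus = [
    (True, True), (True, True), (True, False), (True, False),
    (False, False), (False, False), (False, False), (False, True),
]
print(blocklist_rates(toy_corpus))
```

The hit-rate is the share of spam the list catches; the false positive rate is the share of legitimate mail it wrongly flags, which is the number you want as close to zero as possible.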

Bogus Challenge-Response Bounces: I’ve Had Enough

Published September 11, 2005

I get quite a lot of spam. For one random day last month (Aug 21st), I got 48 low-scoring spam mails (between 5 and 10 points according to SpamAssassin), and 955 high-scorers (anything over 10). I don't know how much malware I get, since my virus filter blocks it outright, instead of delivering to a folder.

That's all well and good, because spam and viruses are now relatively easy to filter -- and if I recall correctly, they were all correctly filed, no FPs or FNs (well, I'm not sure about the malware, but fingers crossed ;).

The hard part is now 'bogus bounces' -- the bounces from 'good' mail systems, responding to the forged use of my addresses as the sender of malware/spam mails. There were 306 of those, that day.

Bogus bounces are hard to filter as spam, because they're not spam -- they're 'bad' traffic originating from 'good', but misguided, email systems. They're not malware, either. They're a whole new category of abusive mail traffic.

I say 'misguided', because a well-designed mail system shouldn't produce these. By only performing bounce rejection with a 4xx or 5xx response as part of the SMTP transaction, when the TCP/IP connection is open between the originator and the receiving MX MTA, you avoid most of the danger of 'spamming' a forged sender address. However, many mail systems were designed before spammers and malware writers started forging on a massive scale, and therefore haven't fixed this yet.
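As a sketch of the idea, here's a tiny Python function that decides the reply to give while the sending client is still connected, rather than accepting the mail and bouncing it later. The function name, the recipient set and the score threshold are all hypothetical; a real MTA makes these decisions at different stages of the transaction (recipient checks at RCPT TO, content checks after DATA).

```python
def smtp_time_verdict(recipient, valid_recipients, spam_score, threshold=10.0):
    """Return the (code, text) SMTP reply to send during the transaction.

    Assumption: spam_score comes from whatever filter is in use; the
    threshold of 10.0 is only an example value.
    """
    if recipient not in valid_recipients:
        # Refuse now; we never generate a bounce message ourselves.
        return (550, "5.1.1 No such user")
    if spam_score >= threshold:
        return (550, "5.7.1 Message rejected as spam")
    return (250, "2.0.0 OK")


print(smtp_time_verdict("nobody@example.com", {"me@example.com"}, 0.0))
```

The point of the design is that the refusal goes to whoever is on the other end of the connection. If that's a spam zombie, the message just dies there, and the forged sender never hears about it.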

I've been filtering these for a while using this SpamAssassin ruleset; it works reasonably well at filtering bounces in general, catching almost all of the bounces. (There is a downside, though, which is that it catches more than just bogus bounces -- it also catches real bounces, those in response to mails I sent. At this stage, though, I consider that to be functionality I'm willing to lose.)

The big remaining problem is challenge-response messages.

C-R is initially attractive. If you install it, your spam load will dwindle to zero (or virtually zero) immediately -- it'll appear to be working great. What you won't see, however, is what's happening behind the scenes:

your legitimate correspondents are getting challenges, will become annoyed (or confused), and may be unwilling or unable to get themselves whitelisted;
spam that fakes other, innocent third party addresses as the sender, will be causing C-R challenges to be sent to innocent, uninvolved parties.

The latter is the killer. In effect, you're creating spam, as part of your attempts to reduce your own spam load. C-R shifts the cost of spam-filtering from the recipient and their systems, to pretty much everyone else, and generates spam in the process. I'm not alone in this opinion.

That's all just background -- just establishing that we already know that C-R is abusive. But now, it's time for the next step for me -- I've had enough.

I initially didn't mind the bogus-bounce C-R challenges too much, but the levels have increased. Each day, I'm now getting a good 10 or so C-R challenges in response to mails I didn't send. Worse, these are the ones that get past the SpamAssassin ruleset I've written to block them, since they don't include an easy-to-filter signature signifying that they're C-R messages, such as Earthlink's 'spamblocker-challenge' SMTP sender address or UOL's 'AntiSpam UOL' From address. There seem to be hundreds of half-assed homegrown C-R filters out there!
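To show what an "easy-to-filter signature" amounts to, here's a rough Python sketch that checks for the two mentioned above. It isn't the SpamAssassin ruleset itself; the function name is made up, and I'm assuming the SMTP sender address has been recorded in a Return-Path header at delivery time.

```python
from email import message_from_string


def looks_like_cr_challenge(raw_message):
    """Check a message for the two well-known C-R signatures."""
    msg = message_from_string(raw_message)
    return_path = (msg.get("Return-Path") or "").lower()
    from_header = (msg.get("From") or "").lower()
    return (
        "spamblocker-challenge" in return_path
        or "antispam uol" in from_header
    )


# Made-up sample message, for illustration only.
sample = (
    "Return-Path: <spamblocker-challenge@example.invalid>\n"
    "From: someone@example.invalid\n"
    "Subject: Please confirm your message\n"
    "\n"
    "Click here to prove you are human.\n"
)
print(looks_like_cr_challenge(sample))
```

The homegrown filters are the problem precisely because they offer nothing this stable to match on.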

So now, when I get challenge-response messages in response to spam which forges one of my addresses as the 'From' address, and it doesn't get blocked by the ruleset, I'm going to jump through their hoops so the spam is delivered to the C-R-protected recipient. Consider it a form of protest; creating spam, in order to keep yourself spam-free, is simply not acceptable, and I've had enough.

And if you're using one of these C-R filters -- get a real spam filter. Sure they cost a bit of CPU time -- but they work, without pestering innocent third parties in the process.

Beardy Justin

Published September 10, 2005

Yes, I've been growing a beard. Strangely, it seems to be going quite well! Here's a good pic of beardy Justin, standing on a bridge over the Merced river in Yosemite:

Lots more pics from the holiday should be appearing here shortly, if you're curious.

Mosquitos, Snakes and a Bear

Published September 7, 2005

Well, I'm back... it appears that Google Maps link I posted wasn't too much use in deciphering where I was going; sorry about that. Myself and C spent a fun week and a bit, driving up to Kings Canyon and Yosemite, backpacking around for a few days, then driving back down along the 395 via Bishop, Mammoth Lakes, Lone Pine and so on.

Kings Canyon: Unfortunately, not so much fun; we had the bad luck of encountering what must be the tail end of the mosquito season, and spent most of our 2 days there running up and down the Woods Creek trail without a break, entirely surrounded by clouds of mozzies. Possibly this headlong dashing explains how we ran into so much other wildlife -- including a (harmless) California Mountain King Snake and, less enjoyably -- and despite wearing bear bells on our packs to avoid this -- a black bear...

We rounded a corner on the trail, and there it was, munching on elderberries. Once we all spotted each other, there were some audible sounds of surprise from both bear and humans, and the bear ran off in the opposite direction; the humans, however, did not. We were about 500 feet from our camp for the night, so we needed to get past where the bear had been, or face a long walk back.

Despite some fear (hey, this was our first bear encounter!), we stuck around, shouted, waved things, and took the various actions you take. It all went smoothly, the bear had probably long since departed, but we took it slow regardless, and had a very jittery night in our tent afterwards. After that, and the unceasing mozzie onslaught, we were in little hurry to carry on around the planned loop, so we cut short our Kings Canyon trip by a day and just returned down the trail to its base.

Yosemite: a much more successful trip. There were many reasons, primarily that the mosquito population was much, much lower, and that we discovered the Tuolumne Meadows Lodge -- comfortable tent cabins, excellent food, and fantastic company -- which provided a truly excellent base camp.

But I'd have to say that the incredible beauty of Tuolumne Meadows and the Vogelsang Pass really blew me away. I don't think I've seen any landscape quite like that, since trekking to Annapurna Base Camp in Nepal. I'm with John Muir -- Yosemite and its surrounds are a wonder of the world.

Lee Vining: had to pick up a sarnie at the world-famous Whoa Nellie Deli. Yum! After all the camping, we stayed in a hotel with TV, got some washing done, and watched scenes from a J.G. Ballard novel play out on NBC and CNN. Mind-boggling.

Mammoth Lakes: A quick kvetch. Mammoth is possibly the most pedestrian-hostile town I've ever visited. They have a hilarious section of 100 feet of sidewalk, where I encountered a fellow pedestrian using those ski-pole-style hiking walking sticks, and entirely in seriousness. Was the concept of walking so foreign in that town that long-distance walking accessories were required? I don't know, but it didn't make up for the other 90% of the streets where peds were shoved off onto the shoulder, in full-on 'sidewalk users aren't welcome here' Orange County style.

On top of that, the single pedestrian crossing in the main street spans five lanes of traffic, with no lighting, warning signs, or indeed any effective way for drivers to know whether peds were crossing or not. Unsurprisingly we nearly got run over when we tried using the damn thing. Best avoided.

I'm amazed -- it's like they designed the town to be ped-hostile. Surely allowing peds to get around your town is a bonus when you're a ski resort for half of the year? Meh.

Anyway, back again, a little refreshed. Once more into the fray...

Off on Holidays

Published August 27, 2005

I'm taking a week off to go hiking in some of the amazing back country that California has to offer. Assuming I don't get eaten by a bear, I'll see you all around Sep 6....

Faster string search alternative to Boyer-Moore: BloomAV

Published August 26, 2005

An interesting technique, from the ClamAV development list -- using Bloom filters to speed up string searching. This kind of thing works well when you've got 1 input stream, and a multitude of simple patterns that you want to match against the stream. Bloom filters are a hashing-based technique to perform extremely fast and memory-efficient, but false-positive-prone, binary lookups.

The mailing list posting ('Faster string search alternative to Boyer-Moore') gives some benchmarks from the developers' testing, along with the core (GPL-licensed) code:

Regular signatures (28,326) :

Extended Boyer-Moore: 11 MB/s

BloomAV 1-byte: 89 MB/s

BloomAV 4-bytes: 122 MB/s

Some implementation details:

the (implementation) we chose is a simple bit array of (256 K * 8) bits. The filter is at first initialized to all zeros. Then, for every virus signature we load, we take the first 7 bytes, and hash it with four very fast hash functions. The corresponding four bits in the bloom filter are set to 1s.

Our intuition is that if the filter is small enough to fit in the CPU cache, we should be able to avoid memory accesses that cost around 200 CPU cycles each.
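Here's a minimal Python sketch of the scheme as described: a 256 K-byte bit array, keyed on the first 7 bytes of each signature, with four hash functions. The salted CRC32 hashes are my stand-in for the "four very fast hash functions" (the post doesn't name them), and the exact-match confirmation step after a filter hit is also my assumption about how false positives would be weeded out.

```python
import zlib

FILTER_BYTES = 256 * 1024        # (256 K * 8) bits, as in the post
FILTER_BITS = FILTER_BYTES * 8
PREFIX_LEN = 7                   # first 7 bytes of each signature
NUM_HASHES = 4


def _bit_positions(chunk):
    # Assumption: four salted CRC32s stand in for the real hash functions.
    return [zlib.crc32(bytes([salt]) + chunk) % FILTER_BITS
            for salt in range(NUM_HASHES)]


class PrefixBloom:
    def __init__(self, signatures):
        self.bits = bytearray(FILTER_BYTES)  # initialised to all zeros
        self.by_prefix = {}
        for sig in signatures:
            if len(sig) < PREFIX_LEN:
                raise ValueError("signature shorter than the prefix length")
            prefix = sig[:PREFIX_LEN]
            self.by_prefix.setdefault(prefix, []).append(sig)
            for pos in _bit_positions(prefix):
                self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, chunk):
        """False means definitely absent; True means 'worth a closer look'."""
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in _bit_positions(chunk))

    def scan(self, data):
        """Slide a 7-byte window over data; confirm any filter hit exactly."""
        matches = []
        for i in range(len(data) - PREFIX_LEN + 1):
            window = data[i:i + PREFIX_LEN]
            if not self.might_contain(window):
                continue  # the common, cheap case
            for sig in self.by_prefix.get(window, []):
                if data.startswith(sig, i):
                    matches.append((i, sig))
        return matches
```

Something like `PrefixBloom([b"EVILSIGNATURE"]).scan(stream)` returns the offsets where a full signature really occurs. The speed-up comes from the fact that almost every window fails the cheap bit test, so the expensive comparison rarely runs.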

Also, in followup discussion, the following paper was mentioned: A paper describing hardware-level Bloom filters in the Snort IDS -- S. Dharmapurikar, P. Krishnamurthy, T. Sproull, and J. W. Lockwood, "Deep packet inspection using parallel Bloom filters," in Hot Interconnects, (Stanford, CA), pp. 44--51, Aug. 2003.

This system is dubbed 'BloomAV'. Pretty cool. It's unclear if the ClamAV developers were keen to incorporate it, but it does point at interesting new techniques for spam signatures.

Tech Camp Ireland

Published August 25, 2005

Irish techies, mark your calendars! Various Irish bloggers are proposing a Tech Camp Ireland geek get-together, similar to Bar Camp in approach, for Saturday October 15th.

Ed Byrne and James Corbett are both blogging up a storm already. I'd go, but it'd be a hell of a trip ;)

I would say it needs a little less blog, a little more code, and a little more open source, but it does look very exciting, and it's great to see the Bar Camp spirit hitting Ireland.

More on ‘Bluetooth As a Laptop Sensor’

Published August 23, 2005

Bluetooth As a Laptop Sensor in Cambridge, England.

I link-blogged this yesterday, where it got picked up by Waxy, and thence to Boing Boing -- where some readers are reportedly considering it doubtful. Craig also expressed some skepticism. However, I think it's for real.

Check out the comments section of Schneier's post -- there's a few notable points:

Some Bluetooth-equipped laptops will indeed wake from suspend to respond to BT signals.
Davi Ottenheimer reports that the current Bluetooth spec offers "always-on discoverability" as a feature. (Obviously the protocol designers let usability triumph over security on that count.)
Many cellphones are equipped with Bluetooth, and can therefore be used to detect other 'discoverable' BT devices in range.
Walking around a UK hotel car park, while pressing buttons on a mobile phone, would be likely to appear innocuous -- I know I've done it myself on several occasions. ;)

Finally -- this isn't the first time the problem has been noted. The same problem was reported at Disney World, in the US:

Here's the interesting part: every break-in in the past month (in the Disney parking lots) had involved a laptop with internal bluetooth. Apparently if you just suspend the laptop the bluetooth device will still acknowledge certain requests, allowing the thief to target only cars containing these laptops.

Mind you, perhaps this is a 'chinese whispers' case of the Disney World thefts being amplified. Perhaps it was noted as happening in Disney World, reported in an 'emerging threats' forum where the Cambridgeshire cop heard it, and he then picked it up as something worth warning the public about, without knowing for sure that it was happening locally.

Update: aha. An observant commenter on Bruce Schneier's post has hit on a possibly good reason why laptops implement wake-on-Bluetooth:

On my PowerBook, the default Bluetooth settings were "Discoverable" and "Wake-on-Bluetooth" -- the latter so that a Bluetooth keyboard or mouse can wake the computer up after it has gone to sleep.

Emergent Chaos: I’m a Spamateur

Published August 23, 2005

Emergent Chaos: I'm a Spamateur:

In private email to Justin "SpamAssassin" Mason, I commented about blog spam and "how to fix it," then realized that my comments were really dumb. In realizing my stupidity, I termed the word "spamateur," which is henceforth defined as someone inexperienced enough to think that any simple solution has a hope of fixing the problem.

I think this is my new favourite spam neologism ;)

How convenient does the ‘right thing’ have to be?

Published August 17, 2005

Environment: Kung Fu Monkey: Hybrids and Hypotheses. A great discussion of the Toyota Prius:

Kevin Drum recently quoted a study which re-iterated that there's no "real" advantage to buying a hybrid. It's only just as convenient -- so if you're driving a hybrid, you're doing it for some other reason than financial incentive.

That made me think: what a perfect example of just how fucking useless as a society we've become. We can't even bring ourselves to do the right thing when it's only JUST as convenient as doing the wrong thing. And that's not even considered odd. Even sadder.

Box Office Patents

Published August 15, 2005

Forbes: Box Office Patents.

It's the kind of plot twist that will send some critics screaming into the aisles: Why not let writers patent their screenplay ideas? The U.S. Patent and Trademark Office already approves patents for software, business methods -- remember Amazon.com's patent on 'one-click' Internet orders? -- even role-playing games. So why not let writers patent the intricate plot of the next cyberthriller?

So in other words, a law grad called Andrew Knight actually wants to see the world RMS described in his 'Patent Absurdity' article for the Guardian, where Les Miserables was unpublishable due to patent infringement. Incredible.

He himself plays the classic lines, familiar to those who followed the EU software patenting debate:

Knight agrees, up to a point. He won't reveal the exact details of the plots he's submitted to the Patent Office, other than to say they involve cyberspace. And he says patents would apply only to ideas that are unique and complex. But he worries that without patent protection, some Hollywood sharpies could change ideas like his around and pass them off as their own.

''I'm trying to address a person who comes up with a brand-new form of entertainment who may not be a Poe, may not be a Shakespeare, but still deserves to be paid for his work,'' Knight says. ''Otherwise, who will create anything?''

A perfect pro-patent hat trick!

Running on WordPress!

Published August 15, 2005

I've decided to try out the real deal -- a 'proper' weblogging platform, namely WordPress. Be sure to comment if you spot problems...

Grumpiness and Cigarettes

Published August 13, 2005

Meta: My apologies if you wound up running into me online at some stage this week -- I've been in a lousy mood.

I gave up smoking cigarettes at the end of May, and switched to patches. That went pretty well, dropping from 21mg patches, to 14mg, to 7mg. But this week I finally hit the end of the line, stopped applying a patch every morning, and became fully nicotine-free. Only, ouch -- it's not quite as easy as I thought!

Cigarette addiction is (apparently) composed of two conceptual lumps -- the physical addiction to nicotine, and the mental addiction to the 'idea' of smoking. Through the patches, I've successfully nailed the mental addiction, but I'm now facing the physical withdrawal. I'm sweating, dizzy, can't focus my eyes, can't concentrate, my skin is going crazy, and I'm INCREDIBLY grouchy. It's amazing how much havoc the act of withholding nicotine can cause, especially when you consider that it's not a required nutrient for the human body -- it's an 'optional extra' that I never should have gone near in the first place.

Weirdly, though, I don't want a cigarette. Instead, I want a patch ;)

Xen and UKUUG 2005

Published August 11, 2005

Linux: PingWales' round-up of UKUUG Linux 2005 Day 3 includes this snippet:

As well as running (Virtual Machines), Xen allows them to be migrated on the fly. If a physical system is overloaded, or showing signs of failure, a virtual machine can be migrated to a spare node. This process takes time, but causes very little interruption to service. The machine state is first copied in its entirety, then the changes are copied repeatedly until there are a small enough number than the machine can be stopped, the remaining changes copied and the new version started. This usually provides a service interruption of under 100ms - a small enough jitter that people playing Quake 3 on a server in a virtual machine did not notice when it was moved to a different node.

Now that is cool.
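
For the curious, the iterative "pre-copy" loop described in that snippet can be sketched as a toy simulation. This is not Xen's code; the function name, the page counts, and the assumption that the running VM dirties a fixed fraction of whatever was just copied are all mine, purely for illustration:

```python
import random

def precopy_migrate(num_pages, dirty_rate, stop_threshold, max_rounds=30, seed=0):
    """Toy simulation of pre-copy migration (all parameters are hypothetical).

    Returns (rounds, pages_copied_while_paused).
    """
    rng = random.Random(seed)
    # First pass: the whole machine state is copied while the VM keeps running.
    to_copy = set(range(num_pages))
    rounds = 0
    while len(to_copy) > stop_threshold and rounds < max_rounds:
        copied = len(to_copy)
        # Assumption: while those pages were in flight, the VM dirtied a
        # number of pages proportional to the amount just copied.
        dirtied = int(copied * dirty_rate)
        to_copy = set(rng.sample(range(num_pages), dirtied))
        rounds += 1
    # Stop-and-copy: the VM is paused and only the remaining pages are sent.
    return rounds, len(to_copy)
```

With a low dirty rate the set of changed pages shrinks every round, so only a handful of pages have to be sent while the machine is actually stopped, which is where the short interruption comes from. The `max_rounds` cap is there for the case where the VM dirties pages as fast as they can be copied.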

Jim Winstead’s A9 on foot

Published August 11, 2005

Images: Jim Winstead's walk up Broadway from a few days ago has already garnered a few interested parties, since he's Creative-Commons-licensed all the photos, and they're easily findable via Google and on Flickr.

I find this interesting; the collision between open source, photography and cartography is cool. The result is a version of maps.A9.com, where you can actually use the images legally in your own work. More people should do this for other cities.

Where the ‘cursor’ came from

Published August 9, 2005

Stuff: So C is a massive antiques nut, and got tickets for the Antiques Roadshow next month in LA. As a result, we've been shopping around for interesting stuff for her to bring along.

Here's what I found at the antiques market last weekend:

Click on the pic to check out my multiplication skills!

The Life of a SpamAssassin Rule

Published August 6, 2005

Spam: during a recent discussion on the SpamAssassin dev list, the question came up as to how long a rule could expect to maintain its effectiveness once it was public -- the rule secrecy issue.

In order to make a point -- that certain types of very successful rules can indeed last a long time -- I picked out one rule, MIME_BOUND_DD_DIGITS. Here's a smartened-up copy of what I found out.

This rule matches a certain format of MIME boundary, one observed in 17.4637% of our spam collection and with 0 nonspam hits. Since we have a massive collection of mails, received between Jan 2004 and May 2005, and a rule with a known history, we can then graph its effectiveness over time.

The rule's history was:

bug 3396: the initial contribution from Bob Menschel, May 15 2004
r10692: arrived in SVN: May 16 2004
r20178: promoted to 'MIME_BOUND_DD_DIGITS': May 20 2004 (funnily enough, with a note speculating about its lifetime from felicity!)
released in the SpamAssassin 3.0.0 release: mid-Sep 2004

So, we would expect to see a drop in its effectiveness against spam in late May 2004 and onwards, if the spammers were reacting to SVN changes; or post September 2004, if they react to what's released.

By graphing the number of hits on mails within each 2-hour window, we can get a good idea of its effectiveness over time:

The red bars are total spam mails in each time period; green bars, the number of spam mails that hit the rule in each period. May 15 2004 and Sep 20 2004 are marked; Jan 2004 is at the left, and May 2005 is at the right-most extreme of the graph. (There's a massive spike in spam volume at the right -- I think this is Sober.Q output, which disappears after a week or so.)
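
The bucketing behind a graph like this can be sketched roughly as follows. This is not the script I actually used; the function name and the input format (a received timestamp plus a flag for whether the rule hit) are assumptions made for the example:

```python
from collections import Counter
from datetime import datetime, timezone

WINDOW_SECONDS = 2 * 60 * 60  # 2-hour windows, as in the graph

def bucket_hits(messages):
    """messages: iterable of (received_datetime, hit_rule) pairs.

    Returns {window_start_epoch: (total_mails, rule_hits)}, ordered by window.
    """
    totals, hits = Counter(), Counter()
    for received, hit in messages:
        epoch = int(received.timestamp())
        # Round down to the start of the enclosing 2-hour window.
        window = epoch - (epoch % WINDOW_SECONDS)
        totals[window] += 1
        if hit:
            hits[window] += 1
    return {w: (totals[w], hits[w]) for w in sorted(totals)}
```

Each window's pair corresponds to one red bar (total) and one green bar (hits); dividing hits by total gives the rule's hit rate for that period.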

It appears that the rule remains about even in effectiveness in the 4 months it's in SVN, but unreleased; it declines a little more after it makes it into a SpamAssassin release. However, it trails off very slowly -- even in May 2005, it's still hitting a good portion of spam.

Given this, I suspect that most spammers are not changing structural aspects of their spam in response to SpamAssassin with any particular alacrity, or at least are not capable of doing so.

To speculate on the latter, I think many spammers are using pirated copies of the spamware apps, so cannot get their hands on updated versions through 'legitimate' channels.

Speculating on the former -- in my opinion there's a very good chance that SpamAssassin just isn't a particularly big target for them to evade, compared to the juicy pool of gullible targets behind AOL's filters, for example. ;)

‘Irish EFF’

Published August 6, 2005

Ireland: There's been some discussion about 'an Irish EFF' recently, reminding me of the old days of Electronic Frontier Ireland in the 1990s.

I was reminded of this by Danny O'Brien's article in The Guardian, where he notes an interesting point -- half of the effectiveness of the EFF in the US comes from having a few full-time people sitting in an office, answering phone calls. Essentially they act as a human PBX, being the go-to people connecting journalists to activists and experts.

Now that is something that could really work, and is needed in Ireland, which is in the same boat as the UK in this respect; the journalists don't know who to ask for a reliable opposing opinion when the BSA, ICT Ireland, or the IRMA put out incorrect statements. It has to be someone who's always available for a quote at the drop of a hat, over the phone. From experience, this takes dedication -- and without getting paid for it, it's hard to keep the motivation going.

IrelandOffline have done it pretty well for the telecoms issue; ICTE have done a brilliant job, the best I've seen in Europe IMO, of grabbing hold of the e-voting issue to the stage where they own it; but for online privacy, software patenting, and other high-tech-meets-society issues, there's nobody doing it that successfully.

(Update: added ICTE, slipped my mind! Sorry Colm!)

Happy Birthday to the RISKS Forum!

Published August 6, 2005

Tech: One of the first online periodicals I started reading regularly, when I first got access to USENET back in 1989 or so, was comp.risks -- Peter G. Neumann's RISKS Forum. Since then, I've been reading it religiously, in various formats over the years.

It appears that RISKS has just celebrated its 20th anniversary.

Every couple of weeks it provides a hefty dose of computing reality to counter the dreams of architecture astronauts and the more tech-worshipping members of our society, who fail to realise that just because something uses high technology, that doesn't necessarily make it safer.

I got to meet PGN a couple of weeks ago at CEAS, and I was happy to be able to give my thanks -- RISKS has been very influential on my code and my outlook on computing and technology.

Nowadays, with remote code execution exploits for e-voting machines floating about, and National Cyber-Security Czars, I'd say RISKS is needed more than ever. Long may it continue!

Stupid ‘Ph’ Neologisms Considered Harmful

Published August 6, 2005

Words: 'Pharming'. I recently came across this line in a discussion document:

'Wait, isn't this exactly the kind of attack pharmers mount?'

I was under the impression that 'pharming' was a transgenics term: 'In pharming, ... genetically modified (transgenic) animals are mostly used to make human proteins that have medicinal value. The protein encoded by the transgene is secreted into the animal's milk, eggs or blood, and then collected and purified. Livestock such as cattle, sheep, goats, chickens, rabbits and pigs have already been modified in this way to produce several useful proteins and drugs.'

Obviously this wasn't what was being referred to. So I got googling. It appears the sales and marketing communities of various security/filtering/etc. companies have been getting all het up about various phishing-related dangers.

The earliest article I could find was this -- GCN: Is a new ID theft scam in the wings? (2005-01-14):

''Pharming is a next-generation phishing attack,' said Scott Chasin, CTO of MX Logic. 'Pharming is a malicious Web redirect,' in which a person trying to reach a legitimate commercial site is sent to the phony site without his knowledge. 'We don't have any hard evidence that pharming is happening yet,' Chasin said. 'What we do know is that all the ingredients to make it happen are in place.'

Oooh scary! The article is short on technical detail (but long on scary), but I think he's talking about DNS cache poisoning, whereby an attacker implants incorrect data in the victim's DNS cache, to cause them to visit the wrong IP address when they resolve a name. This Wired article (2005-03-14) seems to confirm this.

But wait! Another meaning is offered by Green Armor Solutions, who use the term to talk about the Panix and Hushmail domain hijacks, where an attacker social-engineered domain transfers from their registrars. There's no date on the page, but it appears to be post-March 2005.

Finally, yet another meaning is offered in this article at CSO Online: How Can We Stop Phishing and Pharming Scams? (May 2005): 'The Computing Technology Industry Association has reported that pharming occurrences are up for the third straight year.' What?! Call Scott Chasin!

Steady on -- it appears that the 'pharming' CSO Online is talking about has devolved to the stage where it's simply a pop-up window that attempts to emulate a legit site's input -- no DNS trickery involved. (This trick has, indeed, been used in phish for years.)

So right there we have three different meanings for 'pharming', or four if you count the biotech one.

It may be impossible to get the marketeers to stop referring to 'pharming'. But please, if you're a techie, don't use that term; its lack of clarity renders it useless. Anyway, the biotech people were there first, by several years...

Stunning round-up of alleged election fraud in Ohio

Published August 5, 2005

Voting: None Dare Call It Stolen - Ohio, the Election, and America's Servile Press, by Mark Crispin Miller.

Miller and many others have obviously been putting a lot of work into chasing down each incident in Ohio since last November, and there are quite a lot of them. It's impressive the degree to which recounts were evaded, if these allegations are true. There are many more shocking cases alleged than I could really fit here -- but here are some of the lowest points:

On December 13, 2004, it was reported by Deputy Director of Hocking County Elections Sherole Eaton, that a Triad GSI employee had changed the computer that operated the tabulating machine, and had "advised election officials how to manipulate voting machinery to ensure that preliminary hand recount matched the machine count." This same Triad employee said he worked on machines in Lorain, Muskingum, Clark, Harrison, and Guernsey counties.
it strongly appears that Triad and its employees engaged in a course of behavior to provide "cheat sheets" to those counting the ballots. The cheat sheets told them how many votes they should find for each candidate, and how many over and under votes they should calculate to match the machine count. In that way, they could avoid doing a full county-wide hand recount mandated by state law.
In Union County, Triad replaced the hard drive on one tabulator. In Monroe County, "after the 3 percent hand count had twice failed to match the machine count, a Triad employee brought in a new machine and took away the old one. (That machine's count matched the hand count.)"

The willingness to throw away functioning, reliable election systems, and replace them with new, easy-to-subvert ones, is astounding. But on top of that, when concerned parties investigate and find danger signs, the findings are easily buried:

Miller emphasizes that, even after the National Election Data Archive Project, on March 31, 2005, "released its study demonstrating that the exit polls had probably been right, it made news only in the Akron Beacon-Journal," while "the thesis that the exit polls were flawed had been reported by the Associated Press, the Washington Post, the Chicago Tribune, USA Today, the San Francisco Chronicle, the Columbus Dispatch, CNN.com, MSNBC, and ABC."

Miller's conclusion: 'the press has unilaterally disarmed'.

Lean’s got a weblog

Published August 3, 2005

Friends: the ex-Iona readers, and those with an interest in urban design, might like to go take a look at citynoise.blogspot.com -- Lean Doody's new urban design weblog.

SpikeSource, Open Source, and Bongo

Published July 26, 2005

Open Source: so I was just looking at OSCON 2005's website, and I noticed that it listed Kim Polese, of SpikeSource, as a presenter.

I don't really pay any attention to what's happening in Java these days, but it appears that SpikeSource launched last year to provide 'enterprise support services for open-source software' with a Java/enterprise slant.

Funnily enough, my last encounter with a Kim-Polese-headed company did indeed have a big effect on me, open-source-wise.

That company was Marimba, and they made an excellent Java GUI builder called Bongo. In those days (nearly ten years ago!), I was working on a product for Iona as a developer, in Java and C++, and we needed to provide a GUI on a number of Java tools. I chose to use Bongo, as it had a great feature set and looked reliable.

Wow, was I wrong! The software was reliable -- sadly, the same couldn't be said about the vendor. What I hadn't considered was the possibility that the company might decide to discontinue the product, and not offer any migration help to its customers -- and that's exactly what happened. Sometime around 1998, Marimba decided that Bongo wasn't quite as important as their Castanet 'push' product, and dropped it. Despite calls from the Bongo-using community to release the code so that the community could maintain it and avoid code-rot, they never did, and as a result apps using Bongo had to be laboriously rewritten to remove the Bongo dependencies.

I learned an important lesson about writing software -- if at all possible, build your products on open source, instead of relying on a fickle commercial software vendor. It's a lot harder to have the rug pulled out from under you, that way.

Update: Well, it seems I was quite far off the mark about Marimba. Someone who worked at Marimba at the time read the blog entry, and got in touch via email:

I was an employee of Marimba in the early days, and was around when we developed Bongo, and still later, when we discontinued it, and still later, when Bongo *was* released to the open-source community (jm: appears to be around the start of 1999 I think). It was hosted on a site called freebongo.org and continued to be enhanced with new features and a lot of new and cool widgets. It was ultimately discontinued a few years later due to lack of interest.
It was hosted and primarily maintained in the open-source community by one of the original Bongo engineers. Here's a link from the Java Gazette from the days when it was called Free Bongo.
So don't go blaming Marimba. We did listen to our users and release the code!

Fair enough -- and they deserve a lot more credit than I'd initially assumed. I guess I must have missed this later development after leaving Iona. Apologies, ex-Marimbans!

Patents and Laches

Published July 25, 2005

Patents: This has come up twice recently in discussions of software patenting, so it's worth posting a blog entry as a note.

There's a common misconception that a patent holder does not necessarily need to enforce a patent in the courts for it to remain valid. This isn't true in the US at least, where there is the legal doctrine of 'laches', defined as follows in the Law.com dictionary:

Laches - the legal doctrine that a legal right or claim will not be enforced or allowed if a long delay in asserting the right or claim has prejudiced the adverse party (hurt the opponent) as a sort of 'legal ambush'.

The Bohan Mathers law firm have a good paragraph explaining this:

...the patent holder has an obligation to protect and defend the rights granted under patent law. Just as permitting the public to freely cross one's property may lead to the permanent establishment of a public right of way and the diminishment of one's property rights, so the knowing failure to enforce one's patent rights (one legal term for this is laches) against infringement by others may result in the forfeiture of some or all of the rights granted in a particular patent.

See also this and this page for discussion of cases where it was relevant. It seems by no means clear-cut, but the doctrine is there.

CEAS

Published July 25, 2005

Spam: back from CEAS. The schedule with links to full papers is up, so anyone can go along and check 'em out, if you're curious.

Overall, it was pretty good -- not as good as last year's, but still pretty worthwhile. I didn't find any of the talks to be quite up to the standards of last year's TCP damping or Chung-Kwei papers; but the 'hallway track' was unbeatable ;)

Here are my notes:

AOL's introductory talk had some good figures; a Pew study reported that 41% of people check email first thing in morning, 40% have checked in the middle of the night, and 26% don't go more than 2-3 days without checking mail. It also noted that URLs spimmed (spammed via IM) are not the same as URLs spammed -- but the obfuscation techniques are the same; and they're using 2 learning databases, per-user and global, and the 'Report as Spam' button feeds both.

Experiences with Greylisting: John Levine's talk had some useful data -- there are still senders that treat a 4xx SMTP response (temp fail) as 5xx (permanent fail), particularly after the end of the DATA phase of the transaction, such as an 'old version of Lotus Notes'; and there are some legit senders, such as Kodak's mail-out systems, which regenerate the body in full on each send, even after a temp fail, so the body will look different. He found that less than 4% of real mail from real MTAs is delayed, and overall, 17% of his mail traffic was temp-failed. For the nonspam that was delayed, the delays peaked at 400 and 900 seconds between first tempfail and eventual delivery.
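
For readers who haven't met greylisting before, here's a minimal sketch of the basic idea, not John's implementation. It assumes the common approach of keying on the (client IP, envelope sender, envelope recipient) triplet; the class name, the 300-second minimum delay and the exact response strings are illustrative assumptions:

```python
import time

class Greylist:
    """Temp-fail the first delivery attempt for each unseen triplet."""

    def __init__(self, min_delay=300, clock=time.time):
        self.min_delay = min_delay  # seconds a sender must wait (assumed value)
        self.clock = clock          # injectable so the behaviour can be tested
        self.first_seen = {}        # triplet -> time of first attempt

    def check(self, client_ip, mail_from, rcpt_to):
        key = (client_ip, mail_from.lower(), rcpt_to.lower())
        now = self.clock()
        first = self.first_seen.setdefault(key, now)
        if now - first < self.min_delay:
            # 4xx: a well-behaved MTA queues the mail and retries later.
            return "451 4.7.1 Greylisted, please try again later"
        return "250 OK"
```

A sender that wrongly treats the 451 as a permanent failure never retries, which is the Lotus Notes problem above. A scheme that temp-fails after DATA and also keys on the message body would presumably be the kind tripped up by senders that regenerate the body on each attempt.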

As usual, there were a variety of 'antispam via social networks' talks -- there always are. Richard Clayton had a great point about all that: paraphrasing, I trust my friends and relatives on some things, and they are in my social networks -- but I don't trust their judgement of what is and is not spam. (If you've ever talked to your mother about how she always considers mails from Amazon to be spam, you'll know what he means.)

Combating Spam through Legislation: A Comparative Analysis of US and European Approaches: the EU 'opt-in' directive is now transposed everywhere in the EU; when an EU citizen is spammed by someone in another EU country, the report should be sent to the antispam authority in the sender's country; and there's something called 'ECNSA', an EU contact network of spam authorities, which sounds interesting (although ungoogleable).

Searching For John Doe: Finding Spammers and Phishers: MS' antispam attorney, Aaron Kornblum, had a good talk discussing their recent court cases. Notably, he found one case where an Austrian domain owner had set up a redirector site which sounded like it was expressly set up for spam use -- news to me (and worrying).

A Game Theoretic Model of Spam E-Mailing: Ion Androutsopoulos gave a very interesting talk on a game theoretic approach to anti-spam -- it was a little too complex for the time allotted, but I'd say the paper is worth a read.

Understanding How Spammers Steal Your E-Mail Address: An Analysis of the First Six Months of Data from Project Honey Pot: Matthew Prince of Project Honey Pot had some excellent data in this talk; recommended. He's found that there's an exponential relationship between Google PageRank and spam received at scraped addresses, which matches with my theory of how scrapers work; and that only 3.2% of address-harvesting IPs are in proxy/zombie lists compared to 14% of spam SMTP delivery IPs. (BTW, my theory is that address scraping generally uses Google search results as a seed, which explains the former.)

Computers beat Humans at Single Character Recognition in Reading based Human Interaction Proofs (HIPs): this presented some great demonstrations of how a neural network can be used to solve HIPs (aka CAPTCHAs) automatically. However, I'm unsure how useful this data is, given that the NN required 90000 training characters to achieve the accuracy levels noted in the paper; unless the attacker has access to their own copy of the HIP implementation they can run themselves, they'd have to spend months performing HIPs to train it, before an attack is viable.

Throttling Outgoing SPAM for Webmail Services: cites Goodman in ACM E-Commerce 2004 as saying that ESP webmail services are a 'substantial source of spam', which was news to me! (less than 1% of spam corpora, I'd guess). It then discusses requiring the submitter of email via an ESP webmail system to perform a hashcash-style proof-of-work before their message is delivered. By using a Bayesian spam filter to classify submitted messages, the ESP can cause spammers to perform more work than non-spammers, thereby reducing their throughput. Didn't strike me as particularly useful -- Yahoo!'s Miles Libbey got right to the heart of the matter, asking if they'd considered a situation where spammers have access to more than one computer; they had not. A better paper for this situation would be Alan Judge's USENIX LISA 2003 one which discusses more industry-standard rate-limiting techniques.
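
To make the mechanism concrete, here's a rough sketch of a hashcash-style proof-of-work whose difficulty scales with a spam score. It is not the paper's scheme: the function names, the use of SHA-256 (classic hashcash used SHA-1), and the bit counts are all assumptions for illustration:

```python
import hashlib
from itertools import count

def required_bits(spam_probability, base_bits=8, max_extra_bits=12):
    """Map a classifier score in [0, 1] to a difficulty (values assumed)."""
    return base_bits + round(spam_probability * max_extra_bits)

def leading_zero_bits(digest):
    value = int.from_bytes(digest, "big")
    return len(digest) * 8 - value.bit_length()

def mint(message_id, bits):
    """Client side: search for a nonce whose hash has `bits` leading zero bits."""
    for nonce in count():
        digest = hashlib.sha256(f"{message_id}:{nonce}".encode()).digest()
        if leading_zero_bits(digest) >= bits:
            return nonce

def verify(message_id, nonce, bits):
    """Server side: a single hash is enough to check the client's work."""
    digest = hashlib.sha256(f"{message_id}:{nonce}".encode()).digest()
    return leading_zero_bits(digest) >= bits
```

Each extra bit doubles the expected work for the submitter, so a spammy-looking message costs far more CPU than a clean one. That is also exactly why Miles' question bites: the cost is per machine, and a spammer with many machines simply spreads it around.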

SMTP Path Analysis: IBM Research's anti-spam team discuss something very similar to several techniques used in SpamAssassin; our versions have been around for a while, such as the auto-whitelist (which tracks the submitter's IP address rounded to the nearest /16 boundary), since 2001 or 2002, and the Bayes tweaks we added from bug 2384, back in 2003.
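
As a small illustration of the /16 trick, here's a sketch using Python's `ipaddress` module. It is not the SpamAssassin code (which is Perl); the function name and the key format are made up for the example, and it only makes sense for IPv4 addresses:

```python
import ipaddress

def awl_key(sender, ip):
    """Build a lookup key from the sender and the /16 containing their IP."""
    # strict=False masks off the host bits instead of raising an error.
    network = ipaddress.ip_network(f"{ip}/16", strict=False)
    return f"{sender.lower()}|ip={network.network_address}"
```

So mail from the same sender arriving via 192.0.2.55 and 192.0.200.1 shares one history entry, while the same address seen from a completely different network gets a separate one.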

Naive Bayes Spam Filtering Using Word-Position-Based Attributes: an interesting tweak to Bayesian classification using a 'distance from start' metric for the tokens in a message. Worth trying out for Bayesian-style filters, I think.

Good Word Attacks on Statistical Spam Filters: not so exciting. A bit of a rehash of several other papers -- jgc's talk at the MIT conference on attacking a Bayesian-style spam filter, the previous year's CEAS paper on using a selection of good words from the SpamBayes guys, and it entirely missed something we found in our own tech report -- that effective attacks will result in poisoned training data, with a significant bias towards false positives. In my opinion, the latter is a big issue that needs more investigation.

Stopping Outgoing Spam by Examining Incoming Server Logs: Richard Clayton's talk. Well worth a read. It's an interesting technique for ISPs -- detecting outgoing spam by monitoring hits to your MX from your own dialup pools that use known ratware patterns.

Anonymous remailers being tampered with

Published July 14, 2005

Politics: EDRI-gram notes that the Firenze Linux User Group's server was tampered with last month at its ISP colo:

On Monday 27 June 2005, two members of FLUG (Firenze Linux User Group) visited the data centre of Dada S.p.a., in Milan, where the community server of the group is physically housed, in order to move it to another provider.
When the server was put out of the rack, however, it was discovered that the upper lid of the server case was half-opened. At a closer inspection, it was also discovered that the case lid was scratched, as if it had been put out and reinserted into the rack. Worse, the CD-ROM cable was missing, as were the screws that kept the hard disks in place.
What is particularly worrying is that the server hosted an anonymous remailer, whose keys and anonymity capabilities could have been compromised. Considering what happened to Autistici/Inventati server - which hosted another anonymous remailer - this possibility is not so far fetched. This begs the question whether a co-ordinated attempt at intercepting anonymous/private communications on the Internet has been ongoing in the past weeks and months.

Bizarre goings-on.

looking at the new DKIM draft

Published July 11, 2005

The combined DKIM standard, mixing Yahoo!'s DomainKeys and Cisco's IIM, has been submitted to the IETF as a candidate spec by the MASS 'pre-working group effort'. I like the idea behind both (a few years back, I, a few other SpamAssassin developers, and several others came up with the roots of a message-signature anti-forgery scheme we called 'porkhash', but never really went anywhere with it), so I'm glad to see this one progressing nicely.

Seeing as I never seem to write much about anti-spam here any more, I might as well remedy that now with some comments on the new DKIM draft. ;)

It's a very good synthesis of the two previous drafts, DomainKeys and IIM, more DK-ish, but taking the nice features from IIM.

The 'h=' tag is now listed as REQUIRED. This specifies the list of headers that are to be signed. If I recall correctly, this was added in IIM, modifies the behaviour of DK, and is a good feature -- it protects against in-transit corruption by (a) specifying an order of the headers, to protect against MTAs that reorder them; and (b) allowing sites to protect the 'important' headers (From, To, Subject etc.) and ignore possible additions by MTAs down the line (scanner additions, mailing list munging and additions, and so on).

A list of recommended headers to sign is included, with From as a MUST and Subject, Date, Content-Type and Content-Transfer-Encoding as a SHOULD.
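
Here's a rough sketch of how a verifier might use 'h=' to pick out the signed headers in the signer's order, ignoring anything added later. It is a simplification rather than the draft's algorithm: the function names are hypothetical, the tag parsing is naive, and repeated header fields are not handled:

```python
def parse_tags(signature_value):
    """Parse a 'tag=value; tag=value' list into a dict (naive sketch)."""
    tags = {}
    for part in signature_value.split(";"):
        if "=" in part:
            name, value = part.split("=", 1)
            # Drop all whitespace, including header folding, from the value.
            tags[name.strip()] = "".join(value.split())
    return tags

def signed_headers(signature_value, message_headers):
    """Return (name, value) pairs in the order listed in h=."""
    names = [n.lower() for n in parse_tags(signature_value)["h"].split(":")]
    if "from" not in names:
        raise ValueError("From must be among the signed headers")
    lookup = {name.lower(): value for name, value in message_headers}
    # Headers not named in h= (scanner additions and so on) are ignored.
    return [(n, lookup[n]) for n in names if n in lookup]
```

Because the order comes from 'h=' rather than from the message as received, an MTA that reorders headers, or a scanner that adds its own, doesn't change what gets verified.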

Forwarding is, of course, just fine. This one doesn't suffer from the SPF failure mode, whereby a forwarder will break a signature if it doesn't rewrite the SMTP MAIL FROM sender address. (Of course, it now has its own new failure modes -- the message must be forwarded in a nearly-pristine state.)

The message length to sign can be specified with 'l='. This may be useful to protect against the issue where mailing list managers add a footer to a signed message. It recommends that verifiers remove text after the 'l' length, if it appears, since that offers a way for spammers to reuse existing signatures. I still have to think about this, but I suspect SpamAssassin could give points for additional text beyond the 'l=' point that doesn't match mailing list footer profiles.
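
A sketch of what 'l=' buys you, under simplifying assumptions: the body is already canonicalized bytes, and the SHA-256 digest below merely stands in for "whatever the signature covers" (it is not the draft's algorithm). The function names are hypothetical:

```python
import hashlib

def split_at_signed_length(body, length):
    """Split a body into (signed_part, trailing_part) at the l= length."""
    if length is None:
        return body, b""
    if length > len(body):
        raise ValueError("l= is longer than the body received")
    return body[:length], body[length:]

def digest_of_signed_part(body, length=None):
    """Stand-in digest over just the portion of the body covered by l=."""
    signed, _ = split_at_signed_length(body, length)
    return hashlib.sha256(signed).hexdigest()
```

A footer appended by a mailing list lands in the trailing part, so the signed portion still verifies. That trailing part is also what a filter could inspect: it's either a plausible list footer, or something a spammer has bolted onto a legitimately signed message.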

The IIM HTTP-based public-key infrastructure is gone; it's all DNS, as it was in DK.

The 'z=' field, which contains copies of the original headers, is a great feature for filters -- we can now pragmatically detect 'acceptable' header rewriting if necessary, and handle recovery at the receiver end.

Multiple signatures, unfortunately, couldn't be supported. I can see why, though; it's a very hard problem.

The 'Security Considerations' section is excellent -- 9.1.2 uses a very clever HTML attack.

Looks like development of DKIM-Milter, and an associated library, libdkim, is underway.

Given all that, it looks good. It's not clear how much we can do with DK, and now DKIM, in SpamAssassin, however -- it's very important in these schemes that the message be entirely unmunged, and in most SpamAssassin installs, the filter doesn't get to see the message until after the delivering MTA, or the MDA (Message Delivery Agent), has performed some rewriting. This would cause FPs if we're not very, very careful.

I hope, though, that we can find a useful way to trust DKIM results. It appears likely that they'd make an excellent way to provide trustworthy whitelisting -- 'whitelist_from_dkim' rules, similar to our new whitelist_from_spf support. (In fact, we could probably just merge both into some new 'whitelist_from_authenticated' setting.)

OpenWRT vs Netgear MR814: no contest

Published July 8, 2005

Hardware: After a few weeks running OpenWRT on a Linksys WRT54G, here's a status report.

Things that the new WRT54G running OpenWRT does a whole lot better than the Netgear MR814:

Baseline: obviously it doesn't DDoS the University of Wisconsin, and it doesn't lose the internet connection regularly, as noted in that prior post. I knew that, so those are not really new wins, though.
It's quite noticeably faster. I've seen it spike to double the old throughput rates, and it's solid, too; less deviation in those rates.
It doesn't break my work VPN. I wasn't sure if it was the MR814 that was doing this, requiring an average of about 20 reconnects per day -- now, I know it for a fact. I've had to reconnect my VPN connection about 4 times over the past week.
It doesn't break the Gigafast UIC-741 USB wifi dongle I'm using on the MythTV box. Previously that would periodically disappear from the HAN. Again, I had this pegged as an issue with the driver for that; removing the MR814 from the equation has solved it, too, and it's now running with 100% uptime so far.
It does traffic shaping with Wondershaper, so I can use interactive SSH, VNC, or remote desktop while downloading, even if it's another machine on the HAN doing the download.
It's running Linux -- ssh'ing in, using ifconfig, and vi'ing shell scripts on my router is very, very nice.

Man, that MR814 was a piece of crud. ;) I can't recommend OpenWRT enough...

EU software patents directive is history

Published July 7, 2005

Patents: A great outcome! The proposed Directive has been dropped, in the face of massive opposition. Coverage: /., FFII, FT.com, VNUnet, FSFE.

Unfortunately, Rocard's proposed amendments, which would have turned this directive into a major win for us, didn't pass -- but it's still a good win. Software patents are not explicitly legal throughout Europe; although some jurisdictions do permit them, they're in a legal grey area, and prosecution is therefore hard (and very expensive for patent holders). This is a much better situation than if the directive as proposed by the Council had passed, since that would have explicitly legalised them throughout the EU.

On top of this win, what I find significant is that we've now brought the issue from where it was a few years ago, as a minor concern known only to a few uber-geeks, to a major political issue that made headlines around the globe. Even my local NPR affiliate reported on this decision! That's a far cry from the mid-90s, when I had a hard time explaining the point of the League for Programming Freedom to my hacker friends in the TCD Maths Department.

A great quote from the VNUnet article:

'This represents a clear victory for open source,' said Simon Phipps, chief open source officer at Sun Microsystems. 'It expresses Parliament's clear desire to provide a balanced competitive market for software.'

Yes, that's Sun saying that less software patenting is a good thing. Believe me, that's a great leap forward. Or check out Irish MEP Kathy Sinnott's amazing comments. She hits the nail right on the head; I'm very impressed by that speech.

McCreevy seeing anti-globalisation protesters everywhere

Published July 5, 2005

Patents: I'm just back from a fantastic holiday weekend, totally offline, hiking through Catalina Island. I'm a little bit sunburnt, my nose is peeling, but it was great fun. I got a fantastic picture of the sun setting over hundreds of boats bobbing at their moorings in Two Harbors, which I must upload at some stage.

Anyway, it seems that over the weekend, the EU software-patents debate has swung back heavily towards the anti-swpat side. Fingers crossed -- the vote is this week.

Also, today, EUpolitix.com has an interview with Charlie McCreevy, quoting him as saying:

'The theme, or the background music, to both of these particular directives (the CII and Services Directives) you could see as part of, anti-globalisation, anti-Americanism, anti-big business protests -- in lots of senses, anti-the opening up of markets'

This is standard practice for the Irish government -- they did exactly the same thing with the e-voting issue, painting the ICTE as 'linked to the anti-globalisation movement'. (I have a feeling they think that any group organised online must be 'anti-globalisation', at this stage.)

Of course, with these accusations of being anti-free-market, it's important to remember that a patent is a government-issued monopoly on an invention (or in the software field, on an idea), in a particular local jurisdiction. If anything, being against software patenting is a pro-free-market position, one shared by prominent US libertarians; and nothing gets more pro-free-market than those guys. ;)

CEAS coming up soon…

Published July 2, 2005

Spam: if you work in anti-spam, especially in filtering, or even just in working with email in general, it's well worth going to CEAS 2005, the Conference on Email and Anti-Spam, on Thursday July 21st and Friday 22nd in Stanford:

The organizers of the Conference on Email and Anti-Spam invite you to participate in its second annual meeting. This forum brings together academic and industrial researchers to present new work in all aspects of email, messaging and spam -- with papers this year covering fields as diverse as text classification, clustering and visualization of email, social network analysis applied to both email and spam, spam filtering methods including text classification and systems approaches, game theory, data analysis, Human Interactive Proofs, and legal studies, among others. The conference will feature 26 paper presentations, a banquet, and two invited speakers. See http://www.ceas.cc for details of the current program, as well as on-line registration.

Registration runs out on July 10th.

I went last year, and it was excellent -- several very interesting papers were presented. I'm going this year, too, along with quite a few SpamAssassin committers, and I'm looking forward to it.

Hackability as a selling point

Published June 29, 2005

Hardware: On my home network, I recently replaced my NetGear MR814 with a brand new Linksys WRT54G.

My top criteria for what hardware to buy for this job weren't price, form factor, how pretty the hardware is, or even what features it had -- instead, I bought it because it's an extremely hackable router/NAT/AP platform. Thanks to a few dedicated reverse engineers, the WRT hardware can now be easily reflashed with a wide variety of alternative firmware distributions, including OpenWRT, a fully open-source distro that offers no UI beyond a command-line.

Initially, I considered a few prettier UIs -- HyperWRT, for example -- since I didn't want to have to spend days hacking on my router, of all things, looking stuff up in manuals, HOWTOs and in Google. Finally I decided to give OpenWRT a spin first. I'm glad I did -- it turned out to be a great decision.

(There was one setup glitch, btw -- OpenWRT now sets up WPA by default, but the documentation claims that the default is still no crypto, as it was previously.)

The flexibility is amazing; I can log in over SSH and run the iftop tool to see what's going on on the network, which internal IPs are using how much bandwidth, how much bandwidth I'm really seeing going out the pipe, and get all sorts of low-level facts out of the device that I'd never see otherwise. I could even run a range of small servers directly on the router, if I wanted.

Bonus: it's rock solid. My NetGear device had a tendency to hang frequently, requiring a power cycle to fix; this bug had been going on for nearly a year and a half without a fix from NetGear, who had long since moved on to the next rev of cheapo home equipment and weren't really bothering to support the MR814. I know this is cheap home equipment -- which is why I was still muddling along with it -- but that's just ridiculous. None of that crap with the (similarly low-cost) WRT. OpenWRT also doesn't contain code to DDoS NTP servers at the University of Wisconsin, which is a bonus, too. ;)

Sadly, I don't think Cisco/Linksys realise how this hackability is making their market for them. They've been plugging the security holes used to gain access to reflash the firmware in recent revisions of the product (amazingly, you have to launch a remote command execution attack through an insecure CGI script!), turning off the ability to boot via TFTP, and gradually removing the ways to reflash the hardware. If they succeed, it appears the hackability market will have to find another low-cost router manufacturer to give our money to. (update, June 2006: they since split the product line into a reflashable Linux-based "L" model and a less hackable "S" model, so it appears they get this 100%. great!)

Given that, it's interesting to read this interview with Jack Kelliher of pcHDTV, a company making HDTV video capture cards:

Our market isn't really the mass market. We were always targeting early adopters: videophiles, hobbyists, and students. Those groups already use Linux, and those are our customers.
Matthew Gast: The sort of people who buy Linksys APs to hack on the firmware?
Jack Kelliher: Exactly. The funny thing is that we completely underestimated the size of the market. When we were starting up the company, we went to the local Linux LUG and found out how many people were interested in video capture. Only about 2 percent were interested in video on Linux, so we thought we could sell 2,000 cards. (Laughs.) We've moved way beyond that!

Well worth a read. There's some good stuff about ulterior motives for video card manufacturers to build MPEG decoding into their hardware, too:

The broadcast flag rules are conceptually simple. After the digital signal is demodulated, the video stream must be encrypted before it goes across a user accessible bus. User accessible is defined in an interesting way. Essentially, it's any bus that a competent user with a soldering iron can get the data from. Video streams can only be decrypted right before the MPEG decode and playback to the monitor.
To support the broadcast flag, the video capture must have an encryptor, and the display card must have a decryptor. Because you can't send the video stream across a user accessible bus, the display card needs to be a full MPEG decoder as well, so that unencrypted video never has to leave the card.
Matthew Gast: So the MPEG acceleration in most new video cards really isn't really for my benefit? Is it to help the vendors comply with the broadcast flag?
Jack Kelliher: Not quite yet. Most video cards don't have a full decoder, so they can't really implement the broadcast flag. ATI and nVidia don't have full decoders yet. They depend on some software support from the operating system, so they can't really implement the broadcast flag. Via has a chipset with a full decoder, so it would be relatively easy for them to build the broadcast flag into that chipset.

Aha.

Project management, deadlines etc.

Published June 28, 2005

Work: I took a look over at Edd Dumbill's weblog recently, and came across this posting on planning programming projects. He links to another article and mentions:

My recent return to managing a team of people has highlighted for me the difficulties of the arbitrary deadline approach to project management. Unfortunately, it's also the default management approach applied by a lot of people, because the concept is easy to grasp.
The arbitrary deadline method is troublesome because of the difficulty of estimation. As John's post elaborates, you can never foresee all of the problems you'll meet along the way. The distressing inevitability of 90% of the effort being required by 2% of the deliverable is frequently inexplicable to developers themselves. Never mind the managers remote from the development!

I've been considering why my experience of working with open source seems generally preferable to commercial work, and this may be one of the key elements. Commercial software development is deadline-driven, whereas most open source development has not been, in my experience; 'it's ready when it's ready'.

Edd suggests that using a trouble-ticket-based system for progress tracking and management is superior. I'm inclined to agree.

Irish SME associations quiet on patenting

Published June 23, 2005

Patents: yes, I keep rattling on about this -- the vote is coming up on July 6th. I promise I'll shut up after that ;)

UEAPME has issued a statement regarding the directive which is strongly critical of its current wording (UEAPME is the European small and medium-sized business trade association, comprising 11 million SMEs). Quote:

'The failure to clearly remove software from the scope of the directive is a setback for small businesses throughout Europe. UEAPME is now calling on the European Parliament to reverse yesterday's decision at plenary session next month and send a strong message that an EU software patent is not an option,' Hans-Werner Müller, UEAPME Secretary General, stated.
'There is growing agreement among all actors that software should not be patented, so providing an unequivocal definition in the directive that guarantees this is clearly in the general interest. We are calling on the Parliament to support the amendments that would ensure this,' said Mr Müller.
'The cacophony of misinformation and misleading spin from the large industry lobby in the run up to this vote has obscured the general consensus on preventing the patenting of pure software.'

That's all well and good. So presumably the Irish members of UEAPME, ISME and the SFA, are agreeing, right? Sadly, neither of these has issued any press releases on the subject, as far as I can see, and approaches by members of IFSO have been totally fruitless.

Since both have made recent press noting that Irish small businesses face difficulties with the rising costs of doing business, this would seem to be a no-brainer -- legalising software patents would immediately open Irish SMEs up to the costs associated with them: licensing fees, fighting spurious infringement litigation from 'patent troll' companies, the 'chilling effects' on investors noted by Laura Creighton, and of course the high price of retaining patent lawyers to file patents on your own innovations. One wonders why they aren't concerned about these costs...

Happy Midwinter’s Day!

Published June 22, 2005

Antarctic: Happy Midwinter's Day!

I've just finished reading Big Dead Place, Nicholas Johnson's book about life at McMurdo Base and the US South Pole Station, with anecdotes from his time there in the early years of this decade.

It's a fantastic book -- very illustrative of how life really goes on on a distant research base, once you get beyond romantic notions of exploration of the wild frontiers. (Like many geek kids, I spent my childhood dreaming of space exploration, and Antarctica is the nearest thing you can get to that right now.) A bonus: it's hilarious, too.

Unfortunately it's far from all good -- as one review notes, it's like 'M*A*S*H on ice, a bleak, black comedy.' There's story after story of moronic bureaucratic edicts emailed from comparatively-sub-tropical Denver, Colorado, ass-covering emails from management on a massive scale, and injuries and asbestos exposures covered up to avoid spoiling 'metrics'.

Here's a sample of such absurdity, from an interview with Norwegian world-record breaking Antarctic explorer, Eirik Sønneland:

BDP: I was working at McMurdo when you arrived in 2001. I remember it well because we were commanded by NSF not to accommodate you in any way, and were forbidden to invite you to our rooms or into any buildings. We were told not to send mail for you, nor to send email messages for you. While you were in the area, NSF was keeping a close eye on you. What did the managers say to you when you arrived?
They asked us what plans we had for getting home. The manager at Scott Base (jm: the New Zealand base) was calm and listened to what we had to say. I must be honest and say that this was not the way we were treated by the U.S. manager. It was like an interrogation. Very unpleasant. He acted arrogant. However, it seemed like he started to realize after a couple of days that we didn't try to fool anybody. He probably got his orders from people that were not in Antarctica at the time. And, to be honest, today I don't have bad feelings toward anyone in McMurdo. Bottom line, what did hurt us was that people could not think without using bureaucracy. If people could only try to listen to what we said and stop looking up paragraphs in some kind of standard operating procedures for a short while, a lot could have been solved in a shorter time.
One example: our home office, together with Steven McLachlan and Klaus Pettersen in New Zealand, got a green light from the captain of the cargo ship that would deliver cargo (beer, etc.) to McMurdo, who said he would let us travel for free back to New Zealand if it was okay with his company. At first the company was agreeable, but then NSF told them that the ship would be under their rent until it left McMurdo and was 27 km away. Reason for the 27 km? The cargo ship needed support from the Coast Guard icebreaker to get through the ice. Since, technically, the contract with NSF did not cease until the ship left the ice, NSF could stop us from going on the ship. At which point NSF offered to fly us from McMurdo for US$50,000 each.

Johnson also maintains an excellent website at BigDeadPlace.com, so go there for an idea of the writing. BTW, it appears the UK also maintains an Antarctic base. Here's hoping they keep the bureaucracy at a saner level over there.

The meaning of the term ‘technical’ in software patenting

Published June 22, 2005

Patents: One of the key arguments in favour of the new EU software patenting directive as it's currently worded, from the 'pro' side, is that it doesn't 'allow software patents as such', since it requires a 'technical' inventive step for a patent to be considered valid.

Various MEPs have tried to clarify the meaning of this vague phrase, but without luck so far.

Coverage has mostly noted this as meaning that 'pure software' patents are not permissible, for example this Washington Post article, FT.com, and InformationWeek.

But is this really the case, in pragmatic terms? What does a 'technical inventive step' mean to the European Patent Office?

Well, it doesn't look at all promising, according to this report from the Boards of Appeal of the European Patent Office from 21 April 2004, dealing with a Hitachi business method patent on an 'automatic auction method'. The claims of that patent application (97 306 722.6) covered the algorithm of performing an auction over a computer network using client-server technology. The actual nature of this patent isn't important, anyway -- but what is important is how the Boards of Appeal judge its 'technical' characteristics.

The key section is 3.7, where the Board writes:

For these reasons the Board holds that, contrary to the examining division's assessment, the apparatus of claim 3 is an invention within the meaning of Article 52(1) EPC since it comprises clearly technical features such as a "server computer", "client computers" and a "network".

So in other words, if the idea of a computer network is involved in the claims of a patent, it 'includes technical aspects'. It then goes on to discuss other technical characteristics that may appear in patents:

The Board is aware that its comparatively broad interpretation of the term "invention" in Article 52(1) EPC will include activities which are so familiar that their technical character tends to be overlooked, such as the act of writing using pen and paper.

So even writing with a pen and paper has technical character!

It's a cop-out, designed to fool MEPs and citizens into thinking that a reasonable limitation is being placed on what can be patented, when in reality there are effectively no limits, if there's any kind of equipment involved beyond counting on your fingers.

The only way to be sure is to ensure the directive as it eventually passes is crystal clear on this point, with the help of the amendments that the pro-patent side are so keen to throw out.

(BTW, I found this link via RMS' great article in the Guardian where he discusses software patenting using literature as an analogy. Recommended reading!)

Latest Script Hack: utf8lint

Published June 18, 2005

Perl: double-encoding is a frequent problem when dealing with UTF-8 text, where a UTF-8 string is treated as (typically) ISO Latin-1, and is re-encoded.

utf8lint is a quick hack script which uses perl's Encode module to detect this. Feed it your data on STDIN, and it'll flag lines that contain text which may be doubly-encoded UTF-8, in a lintish way.
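To make the double-encoding check concrete, here is a minimal sketch in Python rather than Perl. It is not the utf8lint source; the function names `looks_double_encoded` and `lint` are made up for illustration. The assumption is that a line is suspicious if it decodes as UTF-8, and the resulting characters, squeezed back into ISO Latin-1 bytes, decode as valid UTF-8 a second time.

```python
def looks_double_encoded(raw: bytes) -> bool:
    """Heuristic: True if raw may be UTF-8 that was misread as Latin-1 and re-encoded."""
    try:
        once = raw.decode("utf-8")
    except UnicodeDecodeError:
        return False  # not valid UTF-8 at all, so a different problem
    if once.isascii():
        return False  # pure ASCII is identical in both encodings
    try:
        inner = once.encode("latin-1")  # undo the suspected Latin-1 misreading
    except UnicodeEncodeError:
        return False  # has characters outside Latin-1, so not this kind of damage
    try:
        inner.decode("utf-8")  # a second successful decode is the warning sign
    except UnicodeDecodeError:
        return False
    return True


def lint(lines):
    """Yield (line_number, raw_line) for each suspicious line, lint-style."""
    for number, raw in enumerate(lines, start=1):
        if looks_double_encoded(raw):
            yield number, raw
```

Passing `sys.stdin.buffer` to `lint` would mimic the feed-it-on-STDIN usage. This is only a heuristic: genuine Latin-1-range text that happens to form valid UTF-8 byte sequences would be flagged too, and text misread as Windows-1252 rather than Latin-1 is not covered.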

BSA Spams Patent Holders

Published June 17, 2005

Patents: An anonymous contributor writes:

'I just received this letter and these pre-addressed postcards in the post this morning. I was surprised when I saw the envelope, because I'd never received anything from the BSA before. It turned out that they had extracted my name and address from the European Patents database, because I registered a software patent once. So a lot of these letters have probably been sent out.

According to the letter, from Francisco Mingorance, the draft directive is being turned around to 'rob small businesses of their intellectual property assets'.

I find it hard to see how that could be true. However the BSA's letter has an important message you should heed - it is critical to contact your European representatives (your MEP and your country's Commissioner) within the next two weeks. Let them know that the European Union should curtail software patents for once and for all.

Get out your best stationery and write to your MEP at the address given on this page.

Make sure your message is short and clear. SME's don't benefit from patents. Few patents are held by SME's and the cost of applying for, maintaining and defending them is crippling.'

jm: I would suggest noting that you support the position of rapporteur Michel Rocard MEP, and/or the FFII -- details here. Please do write!

BTW, the contributor also offers: 'if anyone is interested in doctoring up the BSA postcards, I can provide the hi-res scans.' ;)

Amazing article series on Climate Change

Published June 16, 2005

Science: in April and May, the New Yorker printed an amazing series of articles on climate change by Elizabeth Kolbert, full of outstanding research and interviews with the key players.

Unlike much coverage, it includes the expected results of climate change in the US:

Different climate models offer very different predictions about future water availability; in the paper, Rind applied the criteria used in the Palmer index to GISS's model and also to a model operated by NOAA's Geophysical Fluid Dynamics Laboratory. He found that as carbon-dioxide levels rose the world began to experience more and more serious water shortages, starting near the equator and then spreading toward the poles. When he applied the index to the GISS model for doubled CO2, it showed most of the continental United States to be suffering under severe drought conditions. When he applied the index to the G.F.D.L. model, the results were even more dire. Rind created two maps to illustrate these findings. Yellow represented a forty-to-sixty-per-cent chance of summertime drought, ochre a sixty-to-eighty-per-cent chance, and brown an eighty-to-a-hundred-per-cent chance. In the first map, showing the GISS results, the Northeast was yellow, the Midwest was ochre, and the Rocky Mountain states and California were brown. In the second, showing the G.F.D.L. results, brown covered practically the entire country.
'I gave a talk based on these drought indices out in California to water-resource managers,' Rind told me. 'And they said, 'Well, if that happens, forget it.' There's just no way they could deal with that.'
He went on, 'Obviously, if you get drought indices like these, there's no adaptation that's possible. But let's say it's not that severe. What adaptation are we talking about? Adaptation in 2020? Adaptation in 2040? Adaptation in 2060? Because the way the models project this, as global warming gets going, once you've adapted to one decade you're going to have to change everything the next decade.'

And how the anti-climate-change side are attempting to control US public opinion:

The pollster Frank Luntz prepared a strategy memo for Republican members of Congress, coaching them on how to deal with a variety of environmental issues. (Luntz, who first made a name for himself by helping to craft Newt Gingrich's 'Contract with America,' has been described as 'a political consultant viewed by Republicans as King Arthur viewed Merlin.') Under the heading 'Winning the Global Warming Debate,' Luntz wrote, 'The scientific debate is closing (against us) but not yet closed. There is still a window of opportunity to challenge the science.' He warned, 'Voters believe that there is no consensus about global warming in the scientific community. Should the public come to believe that the scientific issues are settled, their views about global warming will change accordingly.'

They're a great synthesis. Go read the articles -- part 1 ('Disappearing islands, thawing permafrost, melting polar ice. How the earth is changing'), part 2 ('The curse of Akkad'), and part 3 ('What can be done?'). They're long, but if you're still on the fence about this one, they'll wake you up.

Bayesian learning animation

Published June 16, 2005

Spam: via John Graham-Cumming's excellent anti-spam newsletter this month, comes a very cool animation of the dbacl Bayesian anti-spam filter being trained to classify a mail corpus. Here's the animation:

And Laird's explanation:

dbacl computes two scores for each document, a ham score and a spam score. Technically, each score is a kind of distance, and the best category for a document is the lowest scoring one. One way to define the spamminess is to take the numerical difference of these scores.
Each point in the picture is one document, with the ham score on the x-axis and the spam score on the y-axis. If a point falls on the diagonal y=x, then its scores are identical and both categories are equally likely. If the point is below the diagonal, then the classifier must mark it as spam, and above the diagonal it marks it as ham.
The points are colour coded. When a document is learned we draw a square (blue for ham, red for spam). The picture shows the current scores of both the training documents, and the as yet unknown documents in the SA corpus. The unknown documents are either cyan (we know it's ham but the classifier doesn't), magenta (spam), or black. Black means that at the current state of learning, the document would be misclassified, because it falls on the wrong side of the diagonal. We don't distinguish the types of errors. Only we know the point is black, the classifier doesn't.
At time zero, when nothing has been learned, all the points are on the diagonal, because the two categories are symmetric.
Over time, the points move because the classifier's probabilities change a little every time training occurs, and the clouds of points give an overall picture of what dbacl thinks of the unknown points. Of course, the more documents are learned, the fewer unknown points are left.

This is an excellent visualisation of the process, and demonstrates nicely what happens when you train a Bayesian spam-filter. You can clearly see the 'unsure' classifications becoming more reliable as the training corpus size increases. Very nice work!

It's interesting to note the effects of an unbalanced corpus early on; a lot of spam training and little ham training results in a noticeable bias towards the classifier returning a spam classification.
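The decision rule in Laird's explanation can be sketched in a few lines of Python. This is not dbacl's code; the `Scored` type and the function names are hypothetical. It assumes only what the quote states: each score is a kind of distance, the lowest score wins, and the diagonal is a tie. The sign convention used for spamminess (ham score minus spam score) is an assumption, since the quote only says to take the difference.

```python
from dataclasses import dataclass


@dataclass
class Scored:
    ham_score: float   # x-axis in the animation
    spam_score: float  # y-axis in the animation


def classify(doc: Scored) -> str:
    """Pick the category with the lowest score; below the diagonal means spam."""
    if doc.spam_score < doc.ham_score:
        return "spam"
    if doc.ham_score < doc.spam_score:
        return "ham"
    return "tie"  # exactly on the diagonal y = x


def spamminess(doc: Scored) -> float:
    """Assumed convention: positive means the document sits nearer to spam."""
    return doc.ham_score - doc.spam_score


def is_misclassified(doc: Scored, true_label: str) -> bool:
    """A 'black' point: the classifier's pick disagrees with the known label."""
    predicted = classify(doc)
    return predicted != "tie" and predicted != true_label
```

For example, a document with a ham score of 10.0 and a spam score of 4.0 lands below the diagonal, so `classify` returns "spam"; if it is really ham, `is_misclassified` reports it as one of the black points.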

Flickr as a ‘TypePad service for groups’

Published June 13, 2005

Web: a while back, I posted some musings about a web service to help authenticate users as members of a private group, similarly to how TypeKey authenticates users in general.

Well, Flickr have just posted this draft authentication API which does this very nicely -- it now allows third-party web apps to authenticate against Flickr, TypeKey-style, and perform a limited subset of actions on the user's behalf.

This means that using Flickr as a group authentication web service is now doable, as far as I can see...

DVD annoyances

Published June 13, 2005

Hardware: I've been needing a decent backup solution, since I've got 60GB of crud on my hard disk that isn't being rsynced offsite yet. So I bought myself a nifty DVD writer from woot.com a week ago, supporting DVD+RW, DVD+R, DVD-RW, and DVD-R, and a spindle of 20 DVD+Rs from Target. Little did I realise the world of pain I was entering.

Did you know there are no less than 6 barely-compatible DVD formats? Prerecorded DVD, DVD-RAM, DVD-R, and DVD-RW, from the DVD Forum, and DVD+RW and DVD+R, from the 'DVD+RW Alliance'. Interoperability is, needless to say, a total mess, even with the Sony 4-format drive I picked up.

I eventually managed to burn myself a DVD+R backup of bits of my home dir, making several coasters in the process (DVD+Rs apparently do not support simulated-write dry-runs, at least not with growisofs). So, great!

Next thing to do was try it out on my laptop's internal CD/DVD drive to make sure it worked. Needless to say, it didn't.

Apparently, single-session, single-track DVD+Rs are virtually identical to DVD-ROMs, which most generic DVD-reader drives support. However, Sony drives do not support setting the 'book type' bits, which is the trick that turns a DVD+R 'into' a DVD-ROM-compatible disc. Guess why (hint: it's Sony). Yep, that's right, paranoia about piracy. Well, thanks a bunch, Sony -- my backups are now of decidedly limited usefulness, since I don't know if I'm ever going to be able to read them again! (more info from the OSTA.) I think I now see why Woot were flogging them cheap.

I'm not sure where to go with this -- do I have a spindle of 17 shiny frisbees? I have a very nasty feeling I'm heading into dead media territory here. What a mess...

Aaaanyway. Here's some possibly-useful bookmarks.

OTOH, I got to watch the BBC's new documentary, The Power of Nightmares, a fantastic history of the two parallel ideological worlds of al-Qaeda and the US neo-conservatives. Mind-boggling, but highly recommended.

European swpat update letter

Published June 10, 2005

Patents: Ian Clarke copied the FSFE-IE mailing list with a good mail he sent to Mairead McGuinness MEP, detailing the current state of proposed fixes to the European software patenting directive. He discusses a comment from an Ericsson employee asking for software patentability:

It may be the case that this employee was concerned about Ericsson's ability to compete against smaller competitors if Ericsson cannot use software patents against them. I would argue that it is not the responsibility of any EU institution to protect Ericsson against legitimate competition from other companies, indeed competition must be encouraged. Software patents will have a stifling effect on competition in Europe, and this is why some large companies like Ericsson are strong advocates for this directive.

And a brief overview of the amendments we want:

The Foundation for a Free Information Infrastructure, an organisation whose line we endorse, has prepared an analysis of the amendments, indicating which will help to ensure that software patents do not become patentable, and which will not. This document may be downloaded here.
In particular, we support the position and amendments of Piia Noora Kauppi MEP, who has taken a strong position against the introduction of software patents within the EPP group, and also the position of Michel Rocard MEP who is the rapporteur for this Directive.

The only other thing it misses, in my opinion, is a paragraph discussing the 'as such' loophole that has been heavily relied upon by most pro-swpat politicians recently -- the trick of saying 'this directive does not permit software patenting, as such'.

Indeed, it does not permit patenting of all software techniques, but instead permits the patenting of software techniques as long as they are of 'a technical nature' -- without defining what that means. Given that it's clearly arguable that all software is technical, and since patent offices earn money based on the patents they accept, rather than those they reject, this is a loophole the size of a bus. Many of the desired amendments concern cleaning up this obvious omission.

Anyway, here's the full text of Ian's mail from the list archive.

Remote playback from laptop to MythTV box

Published June 10, 2005

Linux: the MythTV hacking continues (infrequently). Here's the latest -- a way to play music from my laptop, with sound output via the Mythbox.

Dot-coms and geographical insularity

Published June 8, 2005

Web: I caught sight of (8 June 2005, Interconnected), on the geographical insularity of the dot-com boom. A good read:

The huge influx of cash at the turn of the millennium led to the whole Web being built in the image of the Bay area. The website patterns that started there and - just by coincidence - happened to scale to other environments, those were the ones that survived.

Lots to think about. He's spot on, of course -- many of the web's big commercial success stories are almost shamelessly US-oriented, and if they work outside that, it's purely by accident.

I'd love to see more web businesses that work well for other parts of the world, but that'll take money -- and from what I saw in Dublin, the money either (a) just isn't there, or (b) frequently goes to the companies that talk the talk, but then piddle it away on ludicrous 'e-business architectures' and get nothing useful out the other end.

On both counts, Silicon Valley has an ace up its sleeve. The VCs are smart and well-funded, and the developers have experience, and know which tools are right for the job.

I'd be curious to hear how other high-tech hotspots in the US (Boston, for example) find this.

IBM patents web transcoding proxies

Published June 2, 2005

Web: I link-blogged this, but it's generated some email already, so it deserves a proper posting.

One thing you quickly learn about IBM where software patents are concerned, is that if IBM Research is making noise about a new software technique, they've probably patented it already. A few years ago, IBM was keen on HTTP transcoding -- rewriting web content in a proxy, to be more suitable for display and access from less-capable devices, like PDAs and mobile phones.

So I probably should not have been surprised today when I came across USPTO patent 6,886,013, which is an IBM patent on a 'HTTP caching proxy to filter and control display of data in a web browser'. It was applied for on Sep 11 1997, and finally granted on Apr 26 of this year.

The first claim covers:

A method of controlling presentation on a client of a Web document formatted according to a markup language and supported on a server, the client including a browser and connectable to the server via a computer network, the method comprising the steps of:
as the Web document is received on the client, parsing the Web document to identify formatting information;
altering the formatting information to modify at least one display characteristic of the Web document; and
passing the Web document to the browser for display.

Notice that there's actually no mention of a HTTP proxy there -- in other words, an in-browser rewriting element, such as Greasemonkey or Trixie may be covered by that claim. However, the claim does indicate that the document is passed from the 'client' to the 'browser', so perhaps having the 'client' inside the 'browser' evades that.

It appears this really wasn't original research even when the patent was applied for -- there's probable prior art, even if the patent itself doesn't cite it. For example, WWW4 in 1995 included Application-Specific Proxy Servers as HTTP Stream Transducers, which discusses 'transduction' of the HTTP traffic and gives an example of 'A ``rewriting'' OreO (transducer element) that encapsulates each anchor inside the Netscape Blink extension, making anchors easier to spot on monochrome displays'. On top of that, Craig Hughes notes that his 'senior project at Stanford in 1992 was an implementation of a content-modifying HTTP proxy. It re-worked HTML in http streams to add some markup to enable full navigability through touch screen or voice control, for screen-only kiosks.'

Add this to the ever-growing list of over-broad software patents.

Getting JuK to output sound via ALSA

Published June 2, 2005

Linux: Linux sound is still a mess. Due to the ever-changing 'sound server of the week' system used to decide how an app should output sound, it's perfectly possible to have 3 apps on your desktop happily making noise at the same time, while another app complains about requiring exclusive access to /dev/dsp -- or worse, hangs silently while it attempts to grab an exclusive lock on the device.

This page gives a reasonably good guide to getting software mixing working across (virtually) all apps, using ALSA software mixing and esd.

However, some cases are still very kludgy -- in particular, JuK, the excellent KDE mp3 jukebox app, has a tendency to play poorly with others, requiring playback via no less than two sound servers -- artsd and esd -- to work correctly in the above setup. In addition, the support for mp3 files in artsd is buggy -- it's frequently unable to open certain mp3s, depending on how they were encoded.

Well, good news -- the current release of JuK now supports direct playback from GStreamer via ALSA. Here's how. By adding these lines:

[GStreamerPlayer]
SinkName=alsasink

to ~/.kde/share/config/jukrc, you can skip sending JuK mp3 playback via 2 sound servers, and just play directly to the hardware from the mp3 player. An improvement! Not quite optimal, and certainly not user-friendly -- but getting there...

Patents come to computer gaming

Published June 2, 2005

Patents: in a recent discussion about games and patents, it emerged that these common elements are patented:

streamed loading
minigames during a loading screen
ghost cars in racing games

Looks like software patenting is coming to computer games in a big way. I'm not sure how any game on a modern platform can avoid the 'streamed loading' patent.

Naturally, I can remember playing games on the Commodore 64 in the 1980s that included these...

Yet another non-smoking weblog

Published June 1, 2005

Life: seeing as yesterday was World No Tobacco Day, it's worth noting that I gave up smoking last Thursday.

This is the first time I've taken the step of quitting with any seriousness. I've been smoking since I was 18 or 19, without any real attempts to quit before now. It was a gradual process, but imagining a smoker's future, with the diseases and reduced life expectancy it involves, makes it quite sensible in the end. So far, it's going pretty well -- lots of occasional pangs, but nothing I can't say no to... especially with the aid of Liquorice Altoids. Wish me luck!

Irish Oireachtas take care of their own

Published May 27, 2005

Net: Fergus Cassidy reports that 'bandwidth-starved TDs and Senators' in the Oireachtas will be taking a shortcut around Ireland's woeful consumer broadband situation, especially in terms of deployment outside of the main urban areas.

There's a tender up to implement 'an enhanced remote access system, which will improve access from Members' homes or constituency Offices to data and services on servers in Leinster House'.

No similar luck for their constituents, of course. That really takes the biscuit...

Backscatter X-ray ‘naked scanners’ in the news

Published May 27, 2005

Security: the use of backscatter x-ray scanners has hit the US press now that the TSA are taking an interest.

These are interesting devices; unlike normal X-rays, they effectively render clothes invisible. That's obviously got big privacy implications.

Quite a few of the press stories include images that have been blurred or obscured, presumably to render them printable. However, this image seems closer to the real results (not work-safe).

They were trialled in Heathrow's Terminal 4 last year. One slashdotter's experience:

Every Nth person in the line had to go through. They take you to a seperate are which is blocked off, make you lift up your arms and then move, facing three different directions. There was one operator and the screen was blocked off. The operator is always the gender of the person being scanned. Still I felt very offended for two reasons. First, even though it was enclosed it still made me feel exposed and my personal space violated, second, any questions I asked the operator with regards to their data storage, or if I could see the images that had been made were met with ignorance and my questions were ignored. However, turning down a scan you would probably get a strip search which would be even worse. I disliked airplane security checks before, but now it is incredibly annoying.

The Times has some passengers' reactions to images from their scans:

'I was quite shocked by what I saw,' said Gary Cook, 40, a graphic designer from Shaftesbury, Dorset. 'I felt a bit embarrassed looking at the image.'
A female passenger, who did not want to be named, said: 'It was really horrible. It doesn't leave much to the imagination because you're virtually naked, but I guess it's less intrusive than being hand searched.'

If these are installed more widely, I wonder how long it'll take before we start seeing backscatter images of supermodels being saved to floppy by unscrupulous staff, and leaked?

Also, SpyBlog notes that images of children scanned with this device would constitute 'making, distributing or possessing child pornography' in the UK, presuming the machine stores them internally in electronic form. Oops!

I’m a new ASF Member!

Published May 27, 2005

Apache: It seems I've been elected as a member of the Apache Software Foundation! That's a nice surprise ;)

Massive US bank breaches, and Europe

Published May 25, 2005

Security: Adam Shostack has been tracking the immense volume of recent bank disclosures of compromised customer data. Bruce Schneier has also commented, and an interesting question arose in his posting's comments -- why are there seemingly no similar problems with European banks?

One responder points to a WSJ article which discusses the additional layers of security imposed by European banks above the usual username/password combo. This is true -- Eurobanks generally have higher security at the 'front gate'; for example, I recall Bank of Ireland even issued SecurID-type tokens in its earliest online banking system. However, that ignores the 'insider' attack, as in the most recent case of these 676,000 accounts, so I think the article broadly misses the point.

Bruce Schneier's take:

Personal data is 1) not collected as widely, and 2) much less valuable as a tool to commit fraud. The second reason is far more important.

I think he's partially right. Access to new and existing accounts in the US often requires little more than an SSN or similar trivial, easily-discoverable, data which is used in common across multiple institutions, and can be performed online; whereas in Europe, one requires documentary proof of address, ID, and the act must be performed in person at a bank branch. (This is often exceedingly annoying, of course. ;) In general, identity theft seems to be at a greater level in the US, and this is one reason why, I'd guess.

Adam Shostack has another take: these disclosures have all arrived on the heels of California's SB 1386. It's very unlikely that these kinds of breaches never occurred before this, and suddenly began recently -- it's more likely that they've always gone on, but are unreported in Europe (and of course were unreported in the US, pre-SB 1386).

I'd add another point -- the US has a large population of targets, with banks sharing financial systems across the entire country. Europe, by contrast, has many individual countries which each have their own set of banks and banking systems, and less interoperability and cross-state data flow. The potential return from ID theft fraud is increased by the larger pool of candidate victims in the US, compared to what an attacker could achieve in each individual European country. This means both that (a) an attack will affect a smaller number of victims in Europe than the US, and (b) widening the scale of an attack becomes significantly harder when the attacker must deal with new systems. It's the 'security monoculture' issue again, applied to banking instead of operating systems.

The Nokia 770 Internet Tablet

Published May 25, 2005

Hardware: Slashdot: Nokia's Linux Handheld. It's to be called the Nokia 770 Internet Tablet, and runs on an open source development platform called Maemo.

This looks really nifty. ARM processor, 800x480 pixel resolution, GTK+, 2.6 kernel, wifi, 3 hours of active battery life, and a clever panning system to get around the clunkiness of scrollbars on a touchscreen.

I note particularly that they seem to have planned to include an RSS reader based on Liferea.

The Maemo site looks interesting, in that it's clearly a bunch of switched-on, open-source-comprehending developers who set it up; it's built using Apache Forrest, they use Bugzilla for issue tracking, Mailman for lists, the terms of use for user contributions explicitly call out OSI-approved licenses as a requirement, there's plentiful references to Debian's apt as the preferred means of installing developer platform software, and Maemo apps are distributed as Debian packages.

There's clearly been quite a lot of work going on behind the scenes. There's already some third-party apps out there, such as those on INdT's Maemo apps page, and the SDK tutorial contains copious detail, suggesting it's been seeing some use.

That SDK tutorial is full of tantalizing glimpses into Maemo's operation.

It all looks very promising, and nicely hackable! I'm looking forward to a closer look at one of these. It's especially good to see such a solid comprehension of the open source model by such a major company. (If only they could have a word with their patents department ;)

Update: They've ported WebCore to GTK+. Mobile Gazette has more info, too, including this worrying line:

And although Nokia hold several patents for (the Maemo development platform), they intent to open up access to their intellectual property to aid development.

(My emphasis.) That line is not encouraging, seeing as it seems to be a pretty typical cross-compilation platform as seen in embedded systems development. But hey, let's see the patents first.

Threadless RSS

Published May 25, 2005

Clothing: I love Threadless. Unfortunately, they don't have an RSS feed for new T-shirts. So I wrote a quick scraper:

Threadless New Tees (RSS)

with pictures, naturally. This is not going to help my Threadless habit. ;)

Here's a preview of what the feed looks like:

Del.icio.us ranking systems

Published May 25, 2005

Weblogs: there's been a few attempts to mine 'trend' data from del.icio.us:

trendalicious, a 'near real-time view of website popularity trends as reflected by the del.icio.us social bookmarking service'
grafolicious, which is similar
hublog: gatherers of the month lists the 'trendsetters' who link to popular sites first

However, none consider how many links a user generates. A user who links to every single page on the web would quickly gain a good 'trendsetting' rating, and would also skew the website trends upwards, without actually providing useful data to others.

A look at the hublog top posters does seem to indicate they're linking prolifically to any old crap that looks likely to be popular, which is a more humanly-possible way to do that. ;)
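
To illustrate the objection, here's a rough sketch of a 'trendsetter' score that divides by how much each user posts. It's purely hypothetical: the function name, the (user, url) input format and the sample data are my own assumptions, not anything those sites actually do, and it ignores the question of who linked first.

```python
from collections import Counter


def trendsetter_scores(posts, popular_urls, min_posts=1):
    """Score each user by the fraction of their links that became popular.

    posts        -- iterable of (user, url) pairs (assumed input format)
    popular_urls -- set of URLs that later turned out to be popular
    min_posts    -- ignore users with fewer posts than this
    """
    total = Counter()  # links posted per user
    hits = Counter()   # links per user that are in popular_urls
    for user, url in posts:
        total[user] += 1
        if url in popular_urls:
            hits[user] += 1
    # Dividing by the user's own volume penalises link-everything posters.
    return {
        user: hits[user] / count
        for user, count in total.items()
        if count >= min_posts
    }
```

With a raw count, a user who posts six links and hits one popular site ties with a user who posts one link and hits it; with this ratio, the selective poster scores higher.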

However, populicious new links is quite cool -- popular sites that are new in the last 24 hours. Especially handy to find out where one could download Daily Show torrents these days. ;)

There's also the venerable Hot Links, which unfortunately tracks a very small population, but still gets interesting stuff.

Lexis-Nexis hacked through spam

Published May 20, 2005

Spam: WashPost: Computers Seized in Data-Theft Probe:

According to an account provided by the teenaged member of the hacker group -- and confirmed by the law enforcement source who insisted on anonymity -- the LexisNexis break-in was set in motion by a blast of junk e-mail. Sometime in February a small group of hackers ... sent out hundreds of e-mails with a message urging recipients to open an attached file to view pornographic child images. The attachments had nothing to do with child porn; rather, the files harbored a virus (sic) that allowed the group's members to record anything a recipient typed on his or her computer keyboard.
According to the teenage source, a police officer in Florida was among those who opened the infected e-mail message. Not long after his computer was infected with the keystroke-capturing virus, the officer logged on to his police department's account at Accurint, a LexisNexis service provided by Florida-based subsidiary Seisint Inc. ...
The young hacker said the group members then created a series of sub-accounts using the police department's name and billing information. Over several days, the hacker said the group looked up thousands of names in the database, including friends and celebrities. The law enforcement source said the group eventually began selling Social Security numbers and other sensitive consumer information to a ring of identity thieves in California.

Justice Bradley on patent law

Published May 17, 2005

Mr. Justice Bradley, discussing US patent law in 1882:

The design of the patent laws is to reward those who make some substantial discovery or invention, which adds to our knowledge and makes a step in advance in the useful arts. Such inventors are worthy of all favor. It was never the object of those laws to grant a monopoly for every trifling device, every shadow of a shade of an idea, which would naturally and spontaneously occur to any skilled mechanic or operator in the ordinary progress of manufactures.
Such an indiscriminate creation of exclusive privileges tends rather to obstruct than to stimulate invention. It creates a class of speculative schemers who make it their business to watch the advancing wave of improvement, and gather its foam in the form of patented monopolies, which enable them to lay a heavy tax upon the industry of the country, without contributing anything to the real advancement of the arts. It embarrasses the honest pursuit of business with fears and apprehensions of concealed liens and unknown liabilities to lawsuits and vexatious accountings for profits made in good faith.

Well said that man! (via)

Virtualisation is good for the environment

Published May 16, 2005

Computing: mentioned in a Slashdot thread about green server farms -- a page extolling the OpenVPS virtual-server software's environmental benefits:

OpenVPS is good for the environment: a low-end server these days consumes no less than 200W. Given that typical servers run 24/7/365 this amounts (to) 1752 KWh per year. And because every joule of energy consumed by a server is transformed to heat, you need to at least double this to consider the air conditioning costs, which brings us to 3504 KWh per year. ...
At some point this becomes an ethical question: If my CPU is 99.9% idle, is it environmentally (not to mention fiscally!) responsible of me to keep this server running?
Virtualization technologies such Linux VServer used by OpenVPS offer a very viable alternative. If the server acts and feels like a dedicated server, what difference does it really make if it's actually virtual? Yet consolidating 30 physical servers into 30 OpenVPS accounts running on one (albeit power hungry) server would save over 100000 kWh per year. That's as much energy as is consumed on average by 10 houses!

What an excellent point! The OpenVPS dev's Slashdot comment reveals another good demo of this --

  # cat /proc/uptime
  16000520.62 9482790.31

The first number is seconds of uptime, the second number is seconds spent in a CPU-idle state. So the server for taint.org, going by those numbers, has spent 59% of its time in a CPU-idle state -- and converting fossil fuels to waste heat in the process...
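
For anyone who wants to repeat the sums, here's a small sketch of both calculations. The function names and the 'cpus' parameter are my own; the latter is there because on a multi-core Linux box the idle figure is summed over all cores, so it needs dividing by the core count. The energy figures are simply the ones quoted above (200W per server, doubled for air conditioning).

```python
def idle_fraction(uptime_line, cpus=1):
    """Return the fraction of time spent idle, given a /proc/uptime line.

    uptime_line -- text such as "16000520.62 9482790.31"
    cpus        -- number of cores; the idle figure is summed across cores
    """
    uptime, idle = (float(field) for field in uptime_line.split()[:2])
    return idle / (uptime * cpus)


def annual_kwh(watts, cooling_factor=2.0):
    """Yearly energy use in kWh for a box drawing `watts` around the clock.

    cooling_factor=2.0 follows the quoted assumption that air conditioning
    doubles the total.
    """
    return watts * 24 * 365 / 1000 * cooling_factor


def read_uptime(path="/proc/uptime"):
    """Read the raw line from the kernel (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

idle_fraction("16000520.62 9482790.31") comes out at roughly 0.59, matching the figure above, and 29 retired servers at annual_kwh(200) each is a little over the 100000 kWh quoted.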

UBE, not UCE

Published May 16, 2005

Spam: About this time last year, German neo-nazis launched a massive worldwide spam run with the aid of the Sober.H worm.

Well, it looks like they're planning to make this a regular occurrence, because it's on again, spamming nazi opinions linking to stories on reputable news sites, as well as pages on less reputable right-wing sites. Joe Wein has posted some samples. I've already received nearly a thousand since last night.

The good news -- here's a SpamAssassin ruleset that catches these nicely. Thanks, Raymond!

Using sound as a dead man’s switch

Published May 13, 2005

Software: a nifty trick in this Slashdot comment:

... This reminds me of an old trick we developed to use on the Amiga on a public-access cable channel. The software was under development and crashed occasionally, so rather than having a flashing guru meditation up on a local TV channel until it was rebooted the next day, we came up with a plan, that would probably work on a Windows machine as well (or just about any other system)
The idea was that while the software application was running, it drove a continuous 1khz tone out the audio port that kept a relay energized (that kept the signal on-air). When the system crashed, the audio output stopped, which meant the relay was no longer energized = video signal switched back to a stock SMPTE bars signal from a test generator.

Nowadays, I'd probably pay the money for a hardware watchdog timer. But this is a good, cheap way to implement a dead man's switch. Very clever!
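
The software half of that trick might look something like the sketch below: the application's main loop produces a short burst of 1kHz tone on every pass and hands it to the sound device, so if the loop stops, so does the tone. This is only my guess at the shape of it, not the original Amiga code; the function name, the 8kHz sample rate and the 16-bit PCM format are assumptions, and actually playing the bytes is left to whatever audio API is to hand.

```python
import math
import struct


def tone_chunk(freq_hz=1000, sample_rate=8000, seconds=0.1, amplitude=0.8):
    """Return `seconds` of a sine tone as 16-bit little-endian mono PCM bytes."""
    count = int(sample_rate * seconds)
    peak = int(32767 * amplitude)
    samples = [
        int(peak * math.sin(2 * math.pi * freq_hz * i / sample_rate))
        for i in range(count)
    ]
    return struct.pack("<%dh" % count, *samples)
```

The important design point is that the tone is generated from inside the main loop rather than by a separate player process; a separately-running player would carry on happily after the application had crashed, and the relay would never drop.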

The Stag’s new owner: Louis Fitzgerald

Published May 12, 2005

Dublin: Sorry to the non-Dublin readership, I'm sure you all are getting quite bored of this by now. But anyway...

According to jd on the discussion page, the new owner of the Stag's Head is Louis Fitzgerald, who picked it up for EUR 5.8 million.

Reportedly, he's 'the biggest publican in Dublin' (sic), and owns The Quays in Temple Bar, The Palmerstown House in Palmerstown, The Big Tree on Dorset Street and The Poitin Stil in Rathcoole -- and Kehoe's on South Anne Street. Quite an empire.

I'll have to leave the speculation on Fitzgerald's pros and cons to more recent residents of Dublin, but I agree with jd's comment: 'hope he does half a good as job as the Shaffrys, and the bicycles are left outside rather than on the ceiling,' Amen to that.

The Bayh-Dole Act and publicly-funded research

Published May 11, 2005

Science: in passing -- this came up elsewhere, and it's worth copying here, too (for reference).

The question was: how much should publicly-funded researchers be required to disclose - should they be allowed to generate 'closed-source' solutions at the taxpayers' expense?

In the US and world-wide, there used to be a tradition that government-funded research should be made open to all, since if it was funded from public taxation, the fruits of that taxation should go back to the public. However, 25 years ago, the US enacted the Bayh-Dole Act, in which:

Universities were encouraged to collaborate with commercial concerns to promote the utilization of inventions arising from federal funding.
It was clearly stated that universities may elect to retain title to inventions developed through government funding.
Universities must file patents on inventions they elect to own.

So in other words, the government has dictated since 1980 that government-funded research should not produce open-source or public-domain solutions, necessarily, as the results of research are to be considered private-sector profit-generating centers for the host universities. Naturally, cash-strapped universities have imposed internal regulations to maximise revenue from their research staff.

The implications for whatever 'the next BSD TCP/IP stack' may be are obvious.

Stag’s on the block today

Published May 11, 2005

Dublin: Lean forwards on this story from today's Irish Times. Sadly, it's behind their subscription firewall, so I'll just snip out a few choice quotes from Philip Shaffry, the current owner:

'(The Stag's Head) has been part of my life for three decades and I've been running it for 10 years,' he says. 'I've two small children and I'm living 10 miles out of town, so I'm hoping to find a pub a bit out of the city centre. But of course I'll miss this place. I have got really attached to the clientele and the crowd that comes in.'
Looking around at the Victorian bar, opulently decorated with mahogany panelling and a red Connemara marble bar counter, Shaffry is confident there will be no changes to the building.
'They won't be able to touch it. This is the crème de la crème, the jewel in the crown, of Dublin pubs. It has been here since 1760, although it was completely refurbished in 1895. This is a grade-one listed building.'

But the bad news?

There are no State laws regulating some aspects of the pub, namely his family's refusal to allow music - live or otherwise - or television in the bar. Any new owner could change this tradition, says Shaffry, which is a source of concern for some regulars. (....)
A spokesman for CBRE Gunne, which will auction the pub this afternoon, says there had been 'enormous interest' in the premises from Irish and international buyers.

Eeek! The guide price is 5 million Euros, if you fancy a shot.

Thanks to Philip for his excellent stewardship -- here's hoping any new buyer will keep his approach. That approach made the Stag's what it is today -- the best pub in Dublin. (In my opinion, at least ;)

PVR Build Log

Published May 11, 2005

TV: I've taken a little time to throw up my PVR build log.

If you're hacking on one yourself, or curious about what it takes, or just like reading cut-and-pasted UNIX command lines -- go take a look!

Tip: secure SSH tunneling for cron jobs

Published May 11, 2005

UNIX: a quick recap of a good tip combo picked up from ILUG recently. To paraphrase Conor Wynne's original question:

What's the best way to set up a secure connection between two hosts, possibly over the internet, using SSH, suitable for use from cron so that it can run via crontab without entering authentication manually?

Barry O'Donovan replied:

I suggested ssh keys without passphrases ... in
http://www.barryodonovan.com/publications/lg/104/ and it includes instructions. ... You can invoke rsync over ssh and specify a specific key with:
rsync -a -e 'ssh -i /home/username/.ssh/id_rsa-serverbackup'

Colm MacCárthaigh followed up with:

You can restrict what commands an ssh account can run in the ssh public key. This is how some of our more important projects (like Debian, FreshRPMS, and a few more) push us updates. The key looks like (jm: all on one line, no space between 'no-pty,' and 'command'):
no-port-forwarding,no-X11-forwarding,no-agent-forwarding,no-pty, command="/home/ximian/rsync-ximian-nolog &"
ssh-dss keydata username@blah
So, create a passwordless public key like so, and just change the command to whatver rsync runs.

Combined, that's a useful tip -- I knew about the ssh command restriction technique, but being able to use a specific single-purpose key from the ssh client is very useful.

(updated: mbp mailed to note some missing quotes in Barry's command above; they'd been eaten by WebMake. drat.)
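
Putting the two halves together, here's a rough sketch of a helper that builds both the restricted authorized_keys entry for the server and the rsync command line for the client. The function names are hypothetical, as are the paths and hostnames in any call you make to it; the options themselves are the ones from Colm's example.

```python
import shlex

# Restrictions taken from the authorized_keys example above.
RESTRICTIONS = (
    "no-port-forwarding",
    "no-X11-forwarding",
    "no-agent-forwarding",
    "no-pty",
)


def authorized_keys_entry(forced_command, key_type, key_data, comment):
    """Build a single-line authorized_keys entry that forces one command."""
    if '"' in forced_command:
        # Keep the sketch simple: no quoting of embedded double quotes.
        raise ValueError("forced_command must not contain double quotes")
    options = ",".join(RESTRICTIONS + ('command="%s"' % forced_command,))
    return "%s %s %s %s" % (options, key_type, key_data, comment)


def rsync_argv(key_path, source, destination):
    """Build the client-side rsync argument list, using one specific key."""
    ssh_command = "ssh -i %s" % shlex.quote(key_path)
    return ["rsync", "-a", "-e", ssh_command, source, destination]
```

A script run from cron could then pass the result of rsync_argv() to subprocess.run(..., check=True). Since the server-side entry forces its own command, whatever the client asks to run is ignored, which is the whole point of the restriction.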

Tip: expand a bash commandline as you type it

Published May 4, 2005

UNIX: another useful tip. Bash supports a wide variety of command line editing tricks; you have the usual GUIish editing (backspace, insert new characters, delete, blah blah) through the GNU Readline library, and in addition to that you have the traditional csh-style history expansion (like '!!' to refer to the previous command typed).

The latter are great, but they won't actually be expanded until you hit Enter and run the command line. That can be inconvenient, resulting in the user being forced to reach for the rodent for some cut'n'paste instead.

Here's a handy trick -- add this line to ~/.inputrc (creating the file if necessary):

Control-x: shell-expand-line

Start a new bash shell. Now, if you type CTRL-X during command line entry, any history references, aliases and other shell expansions on the current command line will be expanded in place. For example:

% echo Hello world
Hello world

% echo Hi !$       (press CTRL-X)
           (current command line expands to:)
% echo Hi world

There's a few more commands supported, but none of them are really quite as useful as shell-expand-line.

Update: 'Smylers' wrote to point me at this UKUUG talk from 2003 which discusses .inputrc expansions, and provides some insanely useful tips.

In particular, Magic Space clearly knocks this tip into a cocked hat, by performing the expansion on the fly as you type the command, with no additional keypresses -- amazing! Bonus: it works if you use Emacs-mode line editing as well as Vi-mode.

I strongly recommend reading that paper -- lots of other good tips there.

Sony coins new name for vapour

Published May 2, 2005

Patents: New Scientist: Sony patent takes first step towards real-life Matrix:

IMAGINE movies and computer games in which you get to smell, taste and perhaps even feel things. That's the tantalising prospect raised by a patent on a device for transmitting sensory data directly into the human brain - granted to none other than the entertainment giant Sony.

It's a very lame 'first step' though -- Sony has done no research and development on this invention whatsoever, it's just a patent form of the old 'in the future, we'll wear tinfoil suits! And here's how they'll probably work!' speculation. Sony's comment:

Elizabeth Boukis, spokeswoman for Sony Electronics, says the work is speculative. 'There were not any experiments done,' she says. 'This particular patent was a prophetic invention. It was based on an inspiration that this may someday be the direction that technology will take us.'

That's nice; I'm sure they have some in the pipeline for flying cars, too.

It's good to know that if an inventor does eventually come up with an ultrasound-based human-computer brain interface, they'll have to pay license fees to Sony so they can use their 'prophecy' in their invention. The USPTO's high standards are being maintained, as usual...

Forfás Intellectual Property Lecture Series

Published April 27, 2005

Ireland: Worth watching for European software-patent watchers: Forfás, Ireland's 'national policy advisory board on enterprise, trade, science, technology and innovation', is running a series of monthly seminars on 'Intellectual Property' in association with Licensing Executives Society Britain and Ireland.

This one looks quite interesting -- 10 June: 'Patenting Software - The Current State of Play', Author Barry Moore, of Hanna Moore & Curley, patent attorneys.

Interested parties can attend with pre-registration, or wait to download the mp3 at Forfás' website, apparently, along with the rest of the lecture series. (No sign what the license is on those files, though ;)