
Justin's Linklog Posts

Vim hanging while running VMWare: fixed

I’ve just fixed a bug on my Linux desktop which had been annoying me for a while. Since there seems to be little written about it online, here’s a blog post to help future Googlers.

Here are the symptoms: while you’re running VMWare, your Vim editing sessions freeze up for 20 seconds or so, roughly every 5 minutes. The editor is entirely hung.

If you strace -p the process ID before the hang occurs, you’ll see something like this:

select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
_llseek(7, 4096, [4096], SEEK_SET)      = 0
write(7, "tp\21\0\377\0\0\0\2\0\0\0|\0\0\0\1\0\0\0\1\0\0\0\6\0\0"..., 4096) = 4096
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost -isig -icanon -echo ...}) = 0
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
_llseek(7, 20480, [20480], SEEK_SET)    = 0
write(7, "ad\0\0\245\4\0\0\341\5\0\0\0\20\0\0J\0\0\0\250\17\0\0\247"..., 4096) = 4096
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost -isig -icanon -echo ...}) = 0
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
fsync(

In other words, the hung process is sitting in an fsync() call, attempting to flush changed data for the current file to disk.

Investigation threw up the following: a kerneltrap thread about disk activity, poor responsiveness with Firefox 3.0b3 on Linux, and a Vim bug report regarding this feature interfering with laptop-mode and spun-down hard disks.

VMWare must be issuing lots of unsynced I/O, so when Vim issues its fsync() or sync() call, it needs to wait for the VMWare I/O to complete before it can return — even though the machine is otherwise idle. A bit of a Linux kernel (or specifically, ext3) misfeature, it seems.

Synthesising details from those threads comes up with this fix: edit your ~/.vimrc and add the following lines —

set swapsync=
set nofsync

This will inhibit use of both fsync() and sync() by Vim, and the problem is avoided nicely.
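To check whether fsync() is what stalls on a given machine, the call can be timed directly. The following is a minimal sketch in Python, not something taken from the threads above; the function name `time_fsync` and its parameters are made up for illustration. It assumes the scratch file lives on the same filesystem as the files being edited, so pass a `directory` on that filesystem if the default temp directory is on a RAM-backed filesystem.

```python
import os
import tempfile
import time


def time_fsync(directory=None, size=4096):
    """Write `size` bytes to a scratch file and return the seconds spent in fsync().

    `directory` is a hypothetical parameter: point it at the filesystem you
    want to measure. By default the system temp directory is used.
    """
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, b"\0" * size)
        start = time.monotonic()
        os.fsync(fd)  # blocks until the kernel reports the file's data as flushed
        return time.monotonic() - start
    finally:
        os.close(fd)
        os.unlink(path)


if __name__ == "__main__":
    print(f"fsync took {time_fsync():.3f}s")
```

Running this repeatedly while VMWare is busy should show whether the flush itself takes multiple seconds, which would match the fsync( line at the end of the strace output.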

Update: one of the Firefox developers discusses how this affects FF 3.0.

Irish ISPs in record company crosshairs

RTE reports that 4 record companies, EMI, Sony BMG, Universal Music and Warner Music, have brought a High Court action to compel Eircom — Ireland’s largest ISP — to prevent its networks being used for the illegal downloading of music:

Willie Kavanagh, Managing Director of EMI Ireland and chairman of IRMA, said because of illegal downloading and other factors, the Irish music industry was experiencing a “dramatic and accelerating decline” in income. He said sales in the Irish market dropped 30% in the six years up to 2007.

EMI and the other companies are challenging Eircom’s refusal to use filtering technology or other measures to voluntarily block or filter illegally downloaded material. Last October Eircom told the companies it was not in a position to use the filtering software.

(I wonder if those dropping sales in the Irish market comprise only CDs sold by Irish shops? 2001 to 2007 is also the time period when physical sales have given way to online shopping on a gigantic scale, especially for music.)

The Irish Times coverage includes another interesting factoid, which appears in a lot of press regarding this case:

Latest figures available, for 2006, indicate that 20 billion music files were illegally downloaded worldwide that year. The music industry estimates that for every single legal download, there are 20 illegal ones.

A little research reveals that that figure comes from the IFPI Digital Music Report 2008. I’d have a totally different take on it, however. In my opinion, the figure is probably correct, but not for the reasons the IFPI want it to be. There are a number of factors:

There’s more commentary on the 20-to-1 figure here.

The IFPI Digital Music Report 2008 also notes:

“2007 was the year ISP responsibility started to become an accepted principle. 2008 must be the year it becomes reality”

Governments are starting to accept that Internet Service Providers (ISPs) should take a far bigger role in protecting music on the internet, but urgent action is needed to translate this into reality, a new report from the international music industry says today.

ISP cooperation, via systematic disconnection of infringers and the use of filtering technologies, is the most effective way copyright theft can be controlled. Independent estimates say up to 80 per cent of ISP traffic comprises distribution of copyright-infringing files.

The IFPI Digital Music Report 2008 points to French President Sarkozy’s November 2007 plan for ISP cooperation in fighting piracy as a groundbreaking example internationally. Momentum is also gathering in the UK, Sweden and Belgium. The report calls for legislative action by the European Union and other governments where existing discussions between the music industry and record companies fail to progress.

So it seems Ireland is the vanguard of an international effort by IFPI members to force ISPs to install filtering, worldwide. It seems the same happened in Belgium last year — and I reckon there’ll be similar cases elsewhere soon.

Either way, I doubt this will be good for Irish internet users.

(PS: while I’m talking about buying MP3s online — a quick plug for 7digital. Last time I used them, I had a pretty crappy experience, but the situation is a lot better nowadays. They now have a great website that works perfectly in Firefox on Linux; they sell brand new releases like the Hercules and Love Affair album as 320kbps DRM-free MP3s; they support PayPal payments; and downloads are fast and simple — right click, “Save As”. hooray!)

Some other blog coverage: Lex Ferenda with some details about the legal situation, and Jim Carroll.

Update: EMI Ireland seem to be singing from a different hymn-sheet than their head office… interesting.

Update 2: I’ve taken a look at the Copysense filtering technology, and how it can be evaded.

Announcing IrishPulse

As I previously threatened, I’ve gone ahead and created a “Microplanet” for Irish twitterers, similar to Portland’s Pulse of PDX — an aggregator of the “stream of consciousness” that comes out of our local Twitter community: IrishPulse.

Here’s what you can do:

Add yourself: if you’re an Irish Twitter user, follow the user ‘irishpulse’. This will add you to the sources list.

Publicise it: feel free to pass on the URL to other Irish Twitter users, and blog about it.

Read it: bookmark and take a look now and again!

In terms of implementation, it’s just a (slightly patched) copy of Venus and a Perl script using Net::Twitter to generate an OPML file of the Twitter followers. Here’s the source. I’d love to see more “Pulse” sites using this…
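For anyone curious what the OPML-generation step looks like, here is a rough sketch in Python rather than Perl. It is not the linked source: it assumes the follower usernames have already been fetched somehow, and the function name `followers_to_opml` and the feed URL template are placeholders invented for illustration.

```python
import xml.etree.ElementTree as ET


def followers_to_opml(usernames, feed_url_template, title="Pulse sources"):
    """Build an OPML document with one feed outline per (de-duplicated) username.

    `feed_url_template` is a hypothetical pattern containing "{user}"; the real
    per-user feed URL depends on the service being aggregated.
    """
    opml = ET.Element("opml", version="1.1")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for name in sorted(set(usernames)):
        ET.SubElement(
            body,
            "outline",
            type="rss",
            text=name,
            xmlUrl=feed_url_template.format(user=name),
        )
    return ET.tostring(opml, encoding="unicode")


if __name__ == "__main__":
    print(followers_to_opml(["alice", "bob"], "https://example.invalid/feeds/{user}.rss"))
```

The resulting file is the kind of subscription list a Planet-style aggregator can be pointed at.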

Google’s CAPTCHA – not entirely broken after all?

A couple of weeks ago, WebSense posted this article with details of a spammer’s attack on Google’s CAPTCHA puzzle, using web services running on two centralized servers:

[…] It is observed that two separate hosts active on same domain are contacted during the entire process. These two hosts work collaboratively during the CAPTCHA break process. […]

Why [use 2 hosts]? Because of variations included in the Google CAPTCHA image, chances are that host 1 may fail breaking the code. Hence, the spammers have a backup or second CAPTCHA-learning host 2 that tries to learn and break the CAPTCHA code. However, it is possible that spammers also use these two hosts to check the efficiency and accuracy of both hosts involved in breaking one CAPTCHA code at a time, with the ultimate goal of having a successful CAPTCHA breaking process.

To be specific, host 1 has a similar concept that was used to attack Live mail CAPTCHA. This involved extracting an image from a victim’s machine in the form of a bitmap file, bearing BM.. file headers and breaking the code. Host 2 uses an entirely different concept wherein the CAPTCHA image is broken into segments and then sent as a portable image / graphic file bearing PV..X file headers as requests. […]

While it doesn’t say as such, some have read the post to mean that Google’s CAPTCHA has been solved algorithmically. I’m pretty sure this isn’t the case. Here’s why.

Firstly, the FAQ text that appears on “host 1” (thanks Alex for the improved translation!):


FAQ

If you cannot recognize the image or if it doesn’t load (a black or empty image gets displayed), just press Enter.

Whatever happens, do not enter random characters!!!

If there is a delay in loading images, exit from your account, refresh the page, and log in again.

The system was tested in the following browsers: Internet Explorer Mozilla Firefox

Before each payment, recognized images are checked by the admin. We pay only for correctly recognized images!!!

Payment is made once per 24 hours. The minimum payment amount is $3. To request payment, send your request to the admin by ICQ. If the admin is free, your request will be processed within 10-15 minutes, and if he is busy, it will be processed as soon as possible.

If you have any problems (questions), ICQ the admin.

That reads to me a lot like instructions to human “CAPTCHA farmers”, working as a distributed team via a web interface.

Secondly, take a look at the timestamps in this packet trace:

[image: packet trace]

The interesting point is that there’s a 40-second gap between the invocation on “Captcha breaking host 1” and the invocation on “Captcha breaking host 2”. There is then a short gap of 5 seconds before the invocations occur on the Gmail websites.

Here’s my theory: “host 1” is a web service gateway, proxying for a farm of human CAPTCHA solvers. “host 2”, however, is an algorithm-driven server, with no humans involved. A human may take 40 seconds to solve a CAPTCHA, but pure code should be a lot speedier.

Interesting to note that they’re running both systems in parallel, on the same data. By doing this, the attackers can

  1. collect training data for a machine-learning algorithm (this is implied by the ‘do not enter random characters!’ warning from the FAQ — they don’t want useless training data)

  2. collect test cases for test-driven development of improvements to the algorithm

  3. measure success/failure rates of their algorithms, “live”, as the attack progresses

Worth noting this, too:

Observation*: On average, only 1 in every 5 CAPTCHA breaking requests are successfully including both algorithms used by the bot, approximating a success rate of 20%. The second algorithm (segmentation) has very poor performance that sometimes totally fails and returns garbage or incorrect answers.

So their algorithm is unreliable, and hasn’t yet caught up with the human farmers. Good news for Google — and for the CAPTCHA farmers of Romania ;)

Update: here’s the NYTimes’ take, with broadly agreeing comments from Brad Taylor of Google. (The Register coverage is off-base, however.)

On the effects of lowering your SpamAssassin threshold

So I was chatting to Danny O’Brien a few days ago. He noted that he’d reduced his SpamAssassin “this is spam” threshold from the default 5.0 points to 3.7, and was wondering what that meant:

I know what it means in raw technical terms — spamassassin now marks anything >3.7 as spam, as opposed to the default of five. But given the genetic algorithm way that SA calculates the rule scoring, what does lowering the score mean? That I’m more confident that stuff marked ham is stuffed marked ham than the average person? That my bayesian scoring is now really good?

Do people usually do this without harmful side-effects? What does it mean about them if they do it?

Does it make me a good person? Will I smell of ham? These are the things that keep me awake at night.

It’s a good question! Here’s what I responded with — it occurs to me that this is probably quite widely speculated about, so let’s blog it here, too.

As you tweak the threshold, it gets more or less aggressive.

By default, we target a false positive rate of less than 0.1% — that means 1 FP, a ham marked as spam incorrectly, per 1000 ham messages. Last time the scores were generated, we ran our usual accuracy estimation tests, and got a false positive rate of 0.06% (1 in 1667 hams) and a false negative rate of 1.49% (1 in 67 spams) for the default threshold of 5.0 points. That’s assuming you’re using network tests (you should be) and have Bayes training (this is generally the case after running for a few weeks with autolearning on).

If you lower the threshold, then, that trades off the false negatives (reducing them — less spam getting past) in exchange for more false positives (hams getting caught). In those tests, here’s some figures for other thresholds:

SUMMARY for threshold 3.0: False positives: 290 0.43% False negatives: 313 0.26%

SUMMARY for threshold 4.0: False positives: 104 0.15% False negatives: 1084 0.91%

SUMMARY for threshold 4.5: False positives: 68 0.10% False negatives: 1345 1.13%

so you can see FPs rise quite quickly as the threshold drops. At 4.0 points, the nearest to 3.7, about 1 in 667 ham messages (0.15%) will be marked incorrectly as spam. That’s 2.5 times as many FPs as the default setting’s value (0.06%). On the other hand, only about 1 in 110 spams (0.91%) will be mis-filed, compared with 1 in 67 at the default.
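The arithmetic behind those figures is easy to reproduce. This is a minimal sketch, not the SpamAssassin accuracy-testing scripts themselves: the function names are made up, and it assumes you have a list of final scores for known-ham and known-spam messages, with a message counted as spam when its score reaches the threshold.

```python
def error_rates(ham_scores, spam_scores, threshold):
    """Return (false_positive_percent, false_negative_percent) at `threshold`."""
    # Ham scoring at or above the threshold is wrongly marked as spam.
    false_pos = sum(1 for s in ham_scores if s >= threshold)
    # Spam scoring below the threshold slips through.
    false_neg = sum(1 for s in spam_scores if s < threshold)
    return (100.0 * false_pos / len(ham_scores),
            100.0 * false_neg / len(spam_scores))


def one_in(percent):
    """Convert a percentage rate into the N of "1 in N", rounded to the nearest integer."""
    return round(100.0 / percent)


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    ham = [0.1, 1.0, 4.2, 5.5]
    spam = [3.0, 4.5, 9.0, 12.0]
    for t in (4.0, 5.0):
        fp, fn = error_rates(ham, spam, t)
        print(f"threshold {t}: FP {fp:.1f}%  FN {fn:.1f}%")
```

Lowering the threshold in the toy data moves one message from the false-negative column to the false-positive column, which is the same trade-off the SUMMARY lines show at scale.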

Here’s the reports from the last release, with all those figures for different thresholds — should be useful for figuring out the likelihoods!

In fact, let’s get some graphs from that report. Here is a graph of false positives (in orange) vs false negatives (in blue) as the threshold changes…

and, to illustrate the details a little better, zoom in to the area between 0% and 1%…

You can see that the default threshold of 5.0 isn’t where the FP% and FN% rates meet; instead, it’s got a much lower FP% rate than FN%. This is because we consider FPs to be much more dangerous than missed spams, so we try to avoid them to a higher degree.

An alternative, more standardized way to display this info is as a Receiver Operating Characteristic curve, which is basically a plot of the true positive rate vs the false positive rate, each on a scale from 0 to 1.
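As a rough illustration of how the points on such a curve are derived, here is a small Python sketch. It is not the code used to produce the SpamAssassin graphs; `roc_points` is a made-up name, and it assumes per-message scores for known ham and known spam are available. Each point keeps its threshold alongside the two rates.

```python
def roc_points(ham_scores, spam_scores, thresholds):
    """Return (threshold, false_positive_rate, true_positive_rate) tuples.

    Rates are fractions between 0 and 1. A message counts as "marked spam"
    when its score is at or above the threshold.
    """
    points = []
    for t in thresholds:
        fpr = sum(1 for s in ham_scores if s >= t) / len(ham_scores)
        tpr = sum(1 for s in spam_scores if s >= t) / len(spam_scores)
        points.append((t, fpr, tpr))
    return points


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    ham = [0.1, 1.0, 4.2, 5.5]
    spam = [3.0, 4.5, 9.0, 12.0]
    for t, fpr, tpr in roc_points(ham, spam, [3.0, 4.0, 5.0, 6.0]):
        print(f"{t:4.1f}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

A perfect filter would produce a point at FPR 0 and TPR 1, the top-left corner of the plot.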

Here’s the SpamAssassin ROC curve:

More usefully, here’s the ROC curve zoomed in nearer the “perfect accuracy” top-left corner:

Unfortunately, this type of graph isn’t much use for picking a SpamAssassin threshold. GNUplot doesn’t allow individual points to be marked with the value from a certain column, otherwise this would be much more useful, since we’d be able to tell which threshold value corresponds to each point. C’est la vie!

Update: this is possible with GNUplot 4.2 onwards, it seems. Great news! Hat tip to Philipp K Janert for the advice. Here are updated graphs using this feature:

(GNUplot commands to render these graphs are here.)

Update again: much better interactive Flash graphs here.

Microplanets

Intriguing! Via Glyn Moody comes an interesting new site, Pulse of Open Source:

To highlight open source activity on Twitter, I have launched a new web application today called The Pulse of Open Source. This is the stream of collective consciousness from the open source community on Twitter. You can follow this stream by simply bookmarking the site and visiting regularly or by adding the RSS feed to your feed reader. You can also create a Twitter account and add the individuals you’d like to follow to your own Twitter friends list if you’d prefer. There is also a mobile version of the site for on-the-go viewing.

I’m not entirely convinced it makes sense — the “open source community” is a pretty wide and amorphous concept, covering “enterprisey” types like Iona, to conference organisers, to web standards guys to GNOME developers. That’s a wide range.

However, that site links to the original, and a version which resonates better: PulseOfPDX.com, ‘the stream of Portland’s collective consciousness’. Basically, this is a local syndication site, with microblogging from a community of local Twitterers. It’s similar to the “Planet” concept, which aggregates posts from multiple weblogs into a new ‘river of news’ combined feed (as seen on Planet Antispam, Planet Perl and Planet.journals.ie), but for off-the-cuff Twitter microblog comments. It’s a microplanet, to coin a phrase.

I think I might set up one of these for Ireland… what a great idea!

Update: Ted Leung posted about this today as well, I see, linking to this call for an “out-of-the-box” Twitter aggregator:

In theory, this whole pulse idea could be packaged up to be as easily deployable as “planet” sites. Here, “pulse” is the operational brand-name of aggregating Twitter accounts, where as “planet” is the tried and true operational brand-name of aggregating blogs.

I think I still prefer “microplanet” ;)

Update 2: check out IrishPulse!

Bea picture of the week


Another super-cute photo of Bea, from the latest batch. Nowadays, my photos are all-Bea, all of the time…


Plug plug

It’s been a while since I’ve posted about good shopping experiences I’ve had. Here’s a couple:

SoleTrader.co.uk: I’m a terrible shopper. I hate shops, I always wind up having to visit them at their busiest times on the weekend, and the last time I tried to go shopping for a new pair of shoes, I got caught in torrential rain, fell over and broke my thumb instead. seriously. So feck that.

Instead, I resolved to buy them online, and that I did — from SoleTrader. They had a great range of trainers, I found what I was after, the price was grand, and delivery on time. Shoes are always the same size — their sizes are standardised, after all — so naturally they fit fine. All in all, it worked out great.

Be Organic: these guys operate in North Dublin, delivering bags of organic fruit and vegetables to your door, weekly. We get the Essential Fruit Bag and the Mini Box, with a bi-weekly bag of spuds on top, for EUR 32 per week. The quality of the food is absolutely fantastic, there’s never any spoilage or wilting, and it’s always fresh and delicious. Compared to supermarket fare, it’s leagues ahead. They’ve also been grand and flexible when we need to tweak the order slightly — for example we have a veto on celery, and that’s not an issue at all. The only problem would be that they’ve recently increased their prices… but unfortunately that seems to be a general problem in Ireland these days!

vote for Dustin on Saturday

A friend of a friend writes:

Unless you are pretty good at avoiding the media, you will be aware that Dustin the Turkey has been chosen as one of six finalists for RTE’s Eurosong, the winner of which will go on to represent Ireland in the Eurovision Song Contest in Serbia in May.

What you may not be aware of is that I wrote and recorded the song with him and need your votes to help get me to Serbia!!!

The TV show will be broadcast live on RTE this Saturday Feb 23rd, at 7pm. It is a televote (a la X-factor format), so get your mobile phones ready. The results are at 9:45pm.

The song, Irlande Douze Points, is a parody on the current types of songs, acts and block-voting in the Eurovision. It may make your ears bleed a bit, you may ask yourself why, but what the hell, send someone you know to the final!!!

Apparently, Dustin urges the contest judges to “give douze points to Ireland, for its lowlands and its highlands, for Terry Wogan’s wig and Bono’s leather pants. We brought you Guinness and Westlife, 800-years of war and strife, but we all apologise for Riverdance.”

Check out the outraged reactions from Ireland’s past Eurovision “winners”:

Frank McNamara, who wrote two of the Irish Eurovision winners, asked whether RTE, the state broadcaster that selected the six acts, was “giving two fingers” to Irish ‘song’writers. “I think it is absolutely disgraceful.”

Shay Healy, who wrote Johnny Logan’s Eurovision hit What’s Another Year?, wondered “how any bunch of grown-ups could come up with this as a solution”

Phil Coulter thought that Eurovision was going “down the tubes”.

The choice on Saturday is between a turkey puppet taking the piss in a Northside accent, and such po-faced “serious pop” mawkfests as ‘“Double Cross My Heart” performed by Donal Skehan’ and ‘“Time to Rise” performed by Maya’. snore. You know it’s got to be the turkey.

Here’s the official Bebo page, and the Facebook group — and here’s the song itself:

Update: actually, here’s another, higher quality clip — with an entirely different song! Let’s hope this is the one…

Update 2: he won. Dana and the other professional Eurovision types have been chewing wasps, it’s hilarious!

A historical DailyWTF moment

Today, in work, we wound up discussing this classic DailyWTF.com article — “Remember, the enterprisocity of an application is directly proportionate to the number of constants defined”:

public class SqlWords
{
  public const string SELECT = " SELECT ";
  public const string TOP = " TOP ";
  public const string DISTINCT = " DISTINCT ";
  /* etc. */
}

public class SqlQueries
{
  public const string SELECT_ACTIVE_PRODCUTS =
    SqlWords.SELECT +
    SqlWords.STAR +
    SqlWords.FROM +
    SqlTables.PRODUCTS +
    SqlWords.WHERE +
    SqlColumns.PRODUCTS_ISACTIVE +
    SqlWords.EQUALS +
    SqlMisc.NUMBERS_ONE;
  /* etc. */
}

This made me recall the legendary source code for the original Bourne shell, in Version 7 Unix. As this article notes:

Steve Bourne, at Bell Labs, worked on his version of shell starting from 1974 and this shell was released in 1978 as Bourne shell. Steve previously was involved with the development of Algol-68 compiler and he transferred general approach and some syntax sugar to his new project.

“Some syntax sugar” is an understatement. Here’s an example, from cmd.c:

LOCAL REGPTR    syncase(esym)
        REG INT esym;
{
        skipnl();
        IF wdval==esym
        THEN    return(0);
        ELSE    REG REGPTR      r=getstak(REGTYPE);
                r->regptr=0;
                LOOP wdarg->argnxt=r->regptr;
                     r->regptr=wdarg;
                     IF wdval ORF ( word()!=')' ANDF wdval!='|' )
                     THEN synbad();
                     FI
                     IF wdval=='|'
                     THEN word();
                     ELSE break;
                     FI
                POOL
                r->regcom=cmd(0,NLFLG|MTFLG);
                IF wdval==ECSYM
                THEN    r->regnxt=syncase(esym);
                ELSE    chksym(esym);
                        r->regnxt=0;
                FI
                return(r);
        FI
}

Here are the #define macros Bourne used to “Algolify” the C compiler, in mac.h:

/*
 *      UNIX shell
 *
 *      S. R. Bourne
 *      Bell Telephone Laboratories
 *
 */

#define LOCAL   static
#define PROC    extern
#define TYPE    typedef
#define STRUCT  TYPE struct
#define UNION   TYPE union
#define REG     register

#define IF      if(
#define THEN    ){
#define ELSE    } else {
#define ELIF    } else if (
#define FI      ;}

#define BEGIN   {
#define END     }
#define SWITCH  switch(
#define IN      ){
#define ENDSW   }
#define FOR     for(
#define WHILE   while(
#define DO      ){
#define OD      ;}
#define REP     do{
#define PER     }while(
#define DONE    );
#define LOOP    for(;;){
#define POOL    }


#define SKIP    ;
#define DIV     /
#define REM     %
#define NEQ     ^
#define ANDF    &&
#define ORF     ||

#define TRUE    (-1)
#define FALSE   0
#define LOBYTE  0377
#define STRIP   0177
#define QUOTE   0200

#define EOF     0
#define NL      '\n'
#define SP      ' '
#define LQ      '`'
#define RQ      '\''
#define MINUS   '-'
#define COLON   ':'

#define MAX(a,b)        ((a)>(b)?(a):(b))
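To see what the C compiler ends up with after preprocessing, here is a toy expander in Python. It is only a sketch: it handles a handful of the object-like macros copied from the mac.h listing above, does plain whole-word substitution, and (unlike a real C preprocessor) makes no attempt to skip string or character literals. The name `expand` is made up for illustration.

```python
import re

# A subset of the definitions from mac.h, copied from the listing above.
MACROS = {
    "IF": "if(",
    "THEN": "){",
    "ELSE": "} else {",
    "FI": ";}",
    "LOOP": "for(;;){",
    "POOL": "}",
    "ANDF": "&&",
    "ORF": "||",
}

# Match any macro name as a whole word, so ELIF is not mistaken for IF.
_PATTERN = re.compile(r"\b(" + "|".join(MACROS) + r")\b")


def expand(source):
    """Replace each macro name in `source` with its body (naive, single pass)."""
    return _PATTERN.sub(lambda m: MACROS[m.group(1)], source)


if __name__ == "__main__":
    print(expand("IF wdval==esym THEN return(0); FI"))
```

Feeding it the first IF from syncase() gives `if( wdval==esym ){ return(0); ;}`: ordinary C, with a stray empty statement courtesy of the FI macro.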

Having said all that, the Bourne shell was an awesome achievement; many of the coding constructs we still use in modern Bash scripts, 30 years later, are identical to the original design.

Technorati bloginfo API weirdness

For the benefit of other Technorati API users…

In a comment on this entry, Padraig Brady mentioned that his blog had mysteriously disappeared from the Irish Blogs Top 100 list.

I investigated, and found something odd: it seems Technorati has changed their bloginfo API. Some weblogs are now listed with their ‘rank’ but with important metadata such as ‘inboundblogs’ and ‘inboundlinks’ left empty, and with a ‘lastupdate’ time set to the epoch (1970-01-01 00:00:00 GMT). Here’s an example:

<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
                 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://www.pixelbeat.org</url>
                    <weblog>
                <name>Pádraig Brady</name>
                <url>http://www.pixelbeat.org</url>
                <rssurl></rssurl>
                <atomurl></atomurl>
                <inboundblogs></inboundblogs>
                <inboundlinks></inboundlinks>
                <lastupdate>1970-01-01 00:00:00 GMT</lastupdate>
                <rank>74830</rank>
            </weblog>
                            </result>
</document>
</tapi>

Compare that with this lookup result, on my own blog:

<?xml version="1.0" encoding="utf-8"?>
<!-- generator="Technorati API version 1.0" -->
<!DOCTYPE tapi PUBLIC "-//Technorati, Inc.//DTD TAPI 0.02//EN"
                 "http://api.technorati.com/dtd/tapi-002.xml">
<tapi version="1.0">
<document>
    <result>
        <url>http://taint.org</url>
                    <weblog>
                <name>taint.org: Justin Mason’s Weblog</name>
                <url>http://taint.org</url>
                <rssurl>http://taint.org/feed</rssurl>
                <atomurl>http://taint.org/feed/atom</atomurl>
                <inboundblogs>143</inboundblogs>
                <inboundlinks>227</inboundlinks>
                <lastupdate>2008-02-12 11:48:10 GMT</lastupdate>
                <rank>43404</rank>
            </weblog>
                            <inboundblogs>143</inboundblogs>
                            <inboundlinks>227</inboundlinks>
            </result>
</document>
</tapi>

This bug had caused a number of blogs to be dropped from the list, since I was using “inboundblogs and inboundlinks == 0” as an indication that a blog was not registered with Technorati.

It’s now worked around in my code, although a side-effect is that blogs affected by this will appear with question-marks in the ‘inboundblogs’ and ‘inboundlinks’ columns, and will perform poorly in the ‘ranked by inbound link count’ table (unsurprisingly).
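For other API users hitting the same thing, the key is to treat an empty element as “unknown” rather than as zero. The sketch below shows one way that might look in Python; it is not the code behind the Top 100 list, the function name `parse_bloginfo` is made up, and the sample is a trimmed version of the response shown above.

```python
import xml.etree.ElementTree as ET

# Trimmed from the degraded response shown above.
SAMPLE_DEGRADED = """<tapi version="1.0"><document><result><weblog>
<inboundblogs></inboundblogs>
<inboundlinks></inboundlinks>
<lastupdate>1970-01-01 00:00:00 GMT</lastupdate>
<rank>74830</rank>
</weblog></result></document></tapi>"""


def parse_bloginfo(xml_text):
    """Extract rank and inbound counts; empty fields become None, not 0.

    Returns None when there is no <weblog> element at all, which is the
    case worth treating as "not registered".
    """
    weblog = ET.fromstring(xml_text).find("./document/result/weblog")
    if weblog is None:
        return None

    def int_or_none(tag):
        text = (weblog.findtext(tag) or "").strip()
        return int(text) if text else None

    return {
        "rank": int_or_none("rank"),
        "inboundblogs": int_or_none("inboundblogs"),
        "inboundlinks": int_or_none("inboundlinks"),
    }


if __name__ == "__main__":
    print(parse_bloginfo(SAMPLE_DEGRADED))
```

A caller can then render None as a question mark and still keep the blog in the list on the strength of its rank.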

I’ve posted a query to the support forum — let’s see what the story is.

Interesting Irish Blog Awards shortlistee

This year’s Irish Blog Awards shortlists were posted yesterday. I maintain the Irish Blogs Technorati Top 100 list, so good sources of Irish blog URLs are always welcome; I took the shortlisted blogs and added them all.

Interestingly, straight in at number 2 went towleroad.com (warning: not worksafe!). It has a staggering Technorati rank of 1074 — way ahead of Donncha’s 5831 or Mulley’s 8678. I was pretty curious as to how an Irish blog could hit those heights without me having heard of it, so I took a look.

Let’s just say the content isn’t quite what you’d expect to find in a blog shortlisted for ‘Best News/Current Affairs Blog’ — a little bit short on Irish news, but heavy on pictures of naked guys getting off with each other. ;)

I took a quick glance, and I couldn’t spot any Irish content. WHOIS says the publisher is LA-based, so I’m curious as to what qualified it as an “Irish blog”…

(by the way, I tried to leave a comment on the blog entry, but it appears Akismet is marking my comments as spam on a number of Wordpress-based blogs at the moment. Yes, I am aware of the irony. No, if SpamAssassin was a blog-spam filter, it wouldn’t do that ;)

Update: it’s sorted — they’re now gone. Also, it appears I’ve been removed from the Akismet blacklist, yay.

More on the Trend Micro patent

Dutch free knowledge and culture advocacy group ScriptumLibre.org has called for a worldwide boycott of Trend Micro products. Their chairman, Wiebe van der Worp, claims Trend Micro’s aggressive use of litigation is “well beyond the borders of decency”.

Also, this Linux.com feature has a great quote from Jim Zemlin, the executive director of the Linux Foundation:

“A company that files a patent claim against code coming from a widely adopted open source project vastly underestimates the self-inflicted damage to its customer and community relationships. In today’s world, all of our customers in the software industry are enjoying the benefits of a wide variety of open source projects that provide stability and vendor-neutral solutions to the most basic of their computing needs. I talk to those customers every day. They consider these claims short-sighted and those that assert them to be fearful of their ability to compete in today’s economy.”

Well said.

Plug: Lenovo service still rocks

I needed to buy a new laptop for work a few months back, and after a little agonizing between the MacBook Pro and a Thinkpad T61p, I plumped for the latter. As I noted at the time, one of the major selling points was the quality of IBM/Lenovo’s after-sales warranty service, compared to the atrocious stories I’d heard about AppleCare in Europe. I was, however, taking a leap of faith — I had used IBM service to great effect in the US, but had never actually tried it out in Ireland.

Sadly, I had to put this to the test today, after the hard disk started producing these warnings:

/var/log/messages:Feb  7 11:21:13 wall kernel: [2075890.116000] end_request: I/O error, dev sda, sector 116189461
/var/log/messages:Feb  7 11:21:38 wall kernel: [2075914.824000] end_request: I/O error, dev sda, sector 116189460
/var/log/messages:Feb  7 11:24:18 wall kernel: [2076075.072000] end_request: I/O error, dev sda, sector 116189462
/var/log/messages:Feb  7 11:25:05 wall kernel: [2076121.932000] end_request: I/O error, dev sda, sector 116189463

It’s a brand new machine, and a Hitachi TravelStar 7K100 drive, with a good reputation for reliability — but these things do happen. :(

Interestingly, I thought this was a case of the Bathtub curve in action — but this comprehensive CMU study of hard drive reliability notes that the ‘infant mortality’ concept doesn’t seem to apply to current hard-drive technology:

Replacement rates [of hard drives in a cluster] are rising significantly over the years, even during early years in the lifecycle. Replacement rates in HPC1 nearly double from year 1 to 2, or from year 2 to 3. This observation suggests that wear-out may start much earlier than expected, leading to steadily increasing replacement rates during most of a system’s useful life. This is an interesting observation because it does not agree with the common assumption that after the first year of operation, failure rates reach a steady state for a few years, forming the “bottom of the bathtub”.

Anyway, I digress.

I ran the BIOS hard disk self-test, got the expected failure, then rang up Lenovo’s International Warranty line for Ireland. I got through immediately to a helpful guy in India, and gave him my details and the BIOS error message; he had no tricky questions, no guff about me using Linux rather than Windows, and there were no attempts to sting me for shipping.

There’s now a replacement HD (and a set of spare recovery disks, bonus!) winging their way via 2-day shipping, expected on Tuesday; I’m to hand over the broken HD to the courier once it arrives. Fantastic stuff!

Assuming the courier doesn’t screw up, this is yet another major win for IBM/Lenovo support, and I feel vindicated. ;)

Update: the HD arrived this morning at 10am — a day early. Very impressive!

CEAS needs your ham

CEAS 2008 is doing another Spam Challenge test of various spam-filters, and as part of this, they need samples of ham mail messages.

As part of the data collection effort, we have set up a website through which it is possible to donate non-sensitive legitimate email, to be used in the evaluation. Any kind of email that the recipient considers legitimate is welcome, including computer generated (non-spam) messages.

After the CEAS evaluation, the benchmark data will be made publicly available to facilitate future research and development in the field of spam prevention.

Here is the collection site; they accept UNIX mbox format, and tar.gz or zip files of same, with an 8MB upload limit.

Remote sound playback through a Nokia 770

For a while now, I’ve been using various hacks to play music from my Linux laptop, holding my main music collection, to client systems which drive the speakers.

Previously, I used this setup to play via my MythTV box. Nowadays, however, my TV isn’t in the room where I want to listen to music. Instead, I have my Nokia 770 hooked up to the speakers; this plays the BBC Radio 4 RealAudio streams nicely, and also the laptop’s MP3 collection using a uPnP AV MediaServer.

I specifically use TwonkyMedia right now, playing back via the N770’s Media Streamer app. (That works pretty well — uPnP AV is one of those standards plagued with incompatibilities, but TwonkyMedia and Media Streamer seem to be a reliable combination.)

However, TwonkyMedia sometimes fails to notice updates of the library, and nothing has quite as good a music-player user interface as JuK, the KDE music player and organiser app, so a way to play directly from the laptop instead of via uPnP would be nice…

A weekend’s hacking reveals that this is pretty easily done nowadays, thanks to some cool features in pulseaudio, the current standard sound server on Ubuntu gutsy, and the Esound server running on the N770.

Unfortunately, the N770 doesn’t (yet) support pulseaudio directly, otherwise we could use its seriously cool support for RTP multicast streams. Still, we can hack something up using the venerable “esd” protocol (again!) Here’s how to set it up…

On the N770:

You need to fix the N770’s “esd” sound server to allow public connections. Set up your wifi network’s DHCP server to give the N770 a static IP address. Log in over SSH, or fire up an xterm. Run the following:

mv /usr/bin/esd /usr/bin/esd.real

cat > /usr/bin/esd <<'EOM'
#!/bin/sh
# Wrapper: start the real esd listening on TCP port 5678 for remote clients
exec /usr/bin/esd.real -tcp -public -promiscuous -port 5678 "$@"
EOM

chmod 755 /usr/bin/esd
/etc/init.d/esd restart

On the server:

Download this file, and save it as n770.pa. Edit it, and change server=n770:5678 on the fourth line to use the IP address or hostname of your Nokia 770 instead of n770. Then run:

cp n770.pa ~/.n770.pa

mkdir -p ~/bin

cat > ~/bin/sound_n770 <<'EOM'
#!/bin/sh
# Stop the running daemon, then restart it using only the N770 config
pulseaudio -k; pulseaudio -nF "$HOME/.n770.pa" &
EOM

cat > ~/bin/sound_here <<EOM
#!/bin/sh
pulseaudio -k; pulseaudio &
EOM

chmod 755 ~/bin/sound_here ~/bin/sound_n770

Now you just need to run ‘~/bin/sound_n770’ to redirect sound playback to the N770, and ‘~/bin/sound_here’ to reset back to laptop speaker output, for the entire desktop environment. Nifty!
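For readers who can't get at the linked file, here is a rough sketch of what a minimal startup script of this kind might look like. It is not the actual n770.pa, just an illustration built from two standard PulseAudio modules; the function name n770_pa_sketch and the default server value are made up for the example.

```sh
#!/bin/sh
# Print a minimal PulseAudio startup script that sends output to a remote
# esound server. Illustrative sketch only, NOT the n770.pa linked above.
n770_pa_sketch() {
    server=${1:-n770:5678}   # host:port of the N770's esd (assumed default)
    cat <<EOF
# Let local applications connect to this pulseaudio instance
load-module module-native-protocol-unix
# Forward all playback to the remote esd server
load-module module-esound-sink server=$server
EOF
}

# Show the generated config for a hypothetical N770 address
n770_pa_sketch 192.168.1.50:5678
```

Redirecting that output to a file and passing it to pulseaudio -nF should behave much like the scripts above, though the real file may well load more modules than this.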

Update: it appears that things may work more reliably if you add “rate=22050” at the end of the “load-module module-esound-sink” line — this halves the bitrate of the network stream, which copes better with harsh wifi network conditions. The n770.pa file above now includes this.
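To put rough numbers on that, here's a quick back-of-envelope calculation. It assumes the stream is uncompressed 16-bit stereo PCM with a default rate of 44100 Hz, which is my assumption about the esd defaults rather than something I measured; pcm_bitrate is just a helper name for the example.

```sh
#!/bin/sh
# Raw PCM bitrate in bits per second:
#   sample rate (Hz) x channels x bits per sample
pcm_bitrate() {
    echo $(( $1 * $2 * $3 ))
}

full=$(pcm_bitrate 44100 2 16)   # assumed default stream format
half=$(pcm_bitrate 22050 2 16)   # the same stream with rate=22050

echo "44100 Hz: $full bit/s"     # prints "44100 Hz: 1411200 bit/s"
echo "22050 Hz: $half bit/s"     # prints "22050 Hz: 705600 bit/s"
```

So under those assumptions the wifi link has to sustain roughly 0.7 Mbit/s instead of 1.4 Mbit/s, before any protocol overhead.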

Irish crumblies don’t trust blogs

It appears a public relations firm, Edelman’s, recently performed a phone survey which concluded that bloggers are the “least trusted” source of information in Ireland. This has been widely reported:

on Edelman Dublin’s blog:

When we consider who we trust the most as a spokesperson in Ireland, the most trusted sources of information include, financial or industry analysts at 62%, followed by a doctor or healthcare specialist at 57%, an NGO representative at 57% and academics at 53%. Bloggers are the least trusted at 7%.

at Silicon Republic:

Bloggers have emerged as the “least trusted” group in the country.

and on ElectricNews.net:

“What has been interesting to note in this year’s findings is the apparent low standings of bloggers and social media in general,” said [Mark Cahalane, managing director of Edelman Dublin]. “One interpretation of the survey would be that bloggers have now entered the mainstream and people no longer distinguish between blogs and ordinary websites. This is also reflected by the fact that numerous high profile bloggers are widely quoted in the media.”

However, as Damien noted, Piaras Kelly raised a very significant point about this — ‘the people surveyed for the research had to fit a certain demographic, including having to be aged between 35-64.’ […] ‘A Generational gap is evident.’ This press release corroborates that. Sure enough, most blog readers (and writers) would tend to be of the younger generation — a pretty key point, one would assume, but one that most of the non-blogger coverage has omitted ;)

(Update: the term “authority figure” wasn’t quite correct; replaced with what Edelman themselves use, “source of information”.)

Trend Micro’s attack on open source

Trend Micro are demanding that Barracuda Networks pay licensing fees, alleging that they infringe U.S. Patent No. 5,623,600 with their use of the open-source anti-virus tool ClamAV. Here’s a Barracuda press release, and here’s some details from Barracuda:

Trend Micro alleges that Barracuda Networks and ClamAV infringe on Trend Micro’s U.S. Patent No. 5,623,600. Barracuda Networks believes that the patent is invalid due to prior art and further believes that neither its products nor the ClamAV software infringe the patent.

On Sept. 21, 2006, Trend Micro sent Barracuda Networks a letter regarding a license to Trend Micro’s ‘600 patent. After several discussions on paying a license for the patent, Trend Micro demanded Barracuda Networks either remove ClamAV from its products or pay a patent license fee. Barracuda Networks felt it had no choice other than to file for a declaratory judgment in early 2007 in U.S. Federal Court to invalidate Trend Micro’s ‘600 patent and end continued legal threats against Barracuda Networks for use of the free and open source ClamAV software.

Trend Micro subsequently responded to that declaratory action and more recently, Trend Micro filed a claim with the International Trade Commission (ITC). The ITC voted to investigate the claim in December 2007. Trend Micro’s ITC claim alleges that Barracuda Networks infringes on Trend Micro’s ‘600 patent, but effectively implies that anyone using the free and open source ClamAV software at the gateway infringes the patent.

The interesting aspects of this case, from my point of view, are twofold — the patent is a classic bad software patent, very broad and totally obvious both now and at the time it was issued; and it hinges on Barracuda’s use of the free software antivirus product, ClamAV. Given Apache SpamAssassin‘s prevalence in many anti-spam mail filtering appliances (including Barracuda!), this is a very worrying precedent for us — our product could be next, for some other patent troll company’s extortion scheme.

For what it’s worth, it appears this patent has long been a licensing moneyspinner for Trend. In 1997, once the patent was issued, Trend went on a spree; McAfee, Symantec and Integralis were sued, eventually buying licenses, as did Electric Mail Company. 2 years ago, Fortinet were sued and settled in their case.

I happily gave Barracuda a quote for their press release on this:

“Trend Micro’s actions are clearly an attack on free and open source software and its users, as well as on Barracuda Networks. The ‘600 patent covers a trivial method, one which was obvious to anyone skilled in the art at the time the patent was written, and should be rendered invalid as soon as possible. I hope that Barracuda Networks is successful in its attempts to defend all users from this patent shakedown.”

If you know of prior art for this patent, please head over to Barracuda’s site and provide details — helping to fend off this protection racket would be good for all of us. Barracuda say:

People should look for art dated prior to Trend Micro’s filing date of September 26, 1995. The ‘600 patent is entitled “Virus Detection And Removal Apparatus For Computer Networks.” We are interested in all material, including software, code, publications or papers, patents, communications, other media or Web sites that relate to the technology described prior to the filing date.

In particular, this prior art should show antivirus scanning on a firewall or gateway. However, many of the claims do not require virus detection at a gateway. So any material that illustrates virus scanning on a file server is also of interest.

We also believe that a product called MIMESweeper 1.0 from a company called Clearswift, Authentium, or Integralis anticipates several claims of the ‘600 patent. We have yet to locate a copy of this product and would appreciate anyone who has a copy sending it our way.

Some more coverage:

  • Don Marti at LinuxWorld: ‘Regardless of the decision in this case, software patent trolls will continue to be a problem for all software companies, Eben Moglen says. “Getting them to [not operate] in your neighborhood is the best you can do.”‘

  • Matt Asay at C|Net: ‘Antivirus and antispam innovation has tended to come from open source, not the large proprietary vendors. Trend Micro’s lawsuit is designed to put cash in its pocket but will end up hurting the consumer.’ (Matt led with my quote ;)

  • GrokLaw: ‘Anyone using ClamAV, should Trend Micro be successful, is potentially a target.’

  • Ars Technica: ‘The patent is very clearly without merit, but that hasn’t stopped Trend Micro from using it to threaten ClamAV and extort money from several companies. Situations like this demonstrate a very urgent need for patent reform and illuminate the risks posed by broad software patents, particularly in the area of security.’

Interview with two phish-scene infiltrators

/. posted a link to this interview with Nitesh Dhanjani and Billy Rios, two guys who have infiltrated the “phishing underground”.

It’s a good article — lots of detail on the current toolset of a typical phisher, and some details on the community itself:

I had always thought that most phishers were clever hackers evading authorities using the latest evasion techniques and tools. The reality of the matter is most of the phishers we tracked were sloppy and unsophisticated. The tools they used were rarely created by the phisher deploying the actual scam, and for the most part it seemed the phisher merely downloaded kits and tools from some place and reused over and over and over again. It also seemed that many phishers don’t even really understand how the phishing kits they’ve deployed work! We also came across many phishing kits and tools that had simple backdoors written into the source code (essentially, phishers phishing phishers). These backdoors are easily spotted by anyone who has even a basic idea of how the source code flow worked, yet was undetected by many phishers. Maybe a few phishers out there are skilled, but the majority are clueless.

Here’s something I’ve noted about spammers, too — there’s no honour among thieves:

The number of backdoors we saw was staggering. The servers serving the phishing sites had backdoors, the code used in the phishing kits had backdoors, the tools used by phishers had backdoors. Phishers aren’t afraid to steal from regulars people and they are also not afraid to steal from other phishers. Some of the backdoors were meant to keep control over a compromised server, while other simply stole information that had been stolen by other phishers! We came across several forums where phishers, scammers, and carders basically identified other phishers, scammers, and carders that had scammed them. These shady characters may work with each other but they sure don’t trust each other, that’s for sure.

And this is a very important point about blacklists:

Phishers are likely to abuse the blacklists published for [anti-phishing] plugins for their own benefit. The blacklists are a list of known phishing sites that the plugins consume in order to identify what websites are fraudulent. These blacklists therefore contain IP addresses and host names of servers hosting phishing sites. Since phishing sites are commonly installed on servers that have been compromised, and phishers don’t bother to patch systems they have installed their kits on, this list translates to a ‘list of easily compromisable hosts’ for other phishers.

On the latter point, this is one of the key benefits of DNS blocklists, compared to the downloaded, text-based style that Google initially used for its anti-phishing toolbar. To query a DNSBL, you need to know the address you’re looking for first of all; but with a text file, you can read the lists in their entirety, without knowing the address in advance. (Google is now apparently tending to use the enchash format, which fixes this.)
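To make the DNSBL side of that concrete, here's a small sketch of how a lookup name is usually built for an IPv4 address: reverse the octets and append the list's zone. The zone dnsbl.example.org is a placeholder, not a real list, and dnsbl_query_name is just a name I picked for the example.

```sh
#!/bin/sh
# Build the DNS name a client would look up to test one IPv4 address
# against a DNSBL zone: octets reversed, then the zone appended.
# The body runs in a subshell so the IFS change stays local.
dnsbl_query_name() (
    ip=$1
    zone=$2
    IFS=.
    set -- $ip                # split the dotted quad into $1..$4
    printf '%s.%s.%s.%s.%s\n' "$4" "$3" "$2" "$1" "$zone"
)

# Placeholder address and zone, for illustration only
dnsbl_query_name 192.0.2.1 dnsbl.example.org
# prints: 1.2.0.192.dnsbl.example.org
```

Feeding the result to something like host "$(dnsbl_query_name 192.0.2.1 dnsbl.example.org)" would perform the actual query; by convention a listed address resolves to something in 127.0.0.0/8, and an unlisted one gets no answer. The point is that you can only ask about an address you already have in hand, whereas a downloaded text list can be read from top to bottom.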

And a final word:

For the next few years, we are going to continue to apply band-aids around the problem of data leakage, and continue to play whack-a-mole with the phishers without solving the actual problem at hand. In order to make any significant progress, we must come up with a brand new system that does away with depending on static identifiers. We will know we’ve accomplished this when we will be able to publish our credit reports publicly without fearing for our identities.

(I’d place more importance on the liability of the financial institutions, myself — I think they get away with placing too much blame on the victims of fraud and identity theft.)

Good interview — worth reading.

Insane Dell.ie markup

A good deal came up on a mailing list I’m on: SAMSUNG 245BW Black High Glossy 24″ 5ms DVI Widescreen LCD Monitor for $459.99, or $409.99 after rebate, via Newegg.

A follow-up from a German poster: he’d just picked up a Dell 2407WFP-HC ‘for the low, low price of 659 EUR’.

We marvelled at the price difference, then I looked up Dell.ie for comparison. I thought 659 EUR was bad, but Dell.ie is asking for 1,117.74 Euros inc VAT for the same product: insane!!

What possible excuse could there be for that? EUR 458.74 worth of shipping maybe? Do they encase it in platinum? That’s nearly three times the price of the Newegg monitor.

Update: Duh. I’m an idiot. That’s a 2707WFP, not a 2407WFP; it’s 3″ bigger and quite a bit fancier. It appears Dell.ie is no longer selling the 2407WFP.

Bad law in North Dakota

This is very bad news for North Dakota-based anti-spammers — a guy called David Ritz is being sued there by alleged porn spammer Jerry Reynolds, for performing DNS lookups, a DNS zone transfer and a Whois lookup. It appears the judge has found Ritz guilty.

This is astonishingly bad lawmaking by the judge. These are entirely innocuous tools, part of every network administrator’s toolkit for debugging and examining internet traffic legitimately. There’s nothing remotely criminal or malicious in their use, and the judge has allowed himself to be misled.

North Dakota Judge Gets it Wrong:

‘Ritz’s behavior in conducting a zone transfer was unauthorized within the meaning of the North Dakota Computer Crime Law. A zone transfer is simply asking a DNS server for all the particular public info it provides about a given domain. This is a common task performed by system administrators for many purposes. The judge is saying that DNS zone transfers are now illegal in North Dakota.’

More details from Ed Falk

David’s legal defense fund

My Commodore 64 demos

I recently came across my record at the Commodore Scene Database, and was happy to find that someone had found and uploaded two demos I had written, back in my days as a member of the C=64 demo scene between 1988 and 1990:

(I was a member of the groups ‘Excess’ and ‘Thundertronix’ / ‘TNT’, going by the handle of ‘Mantis’.)

With the help of CBA, I was overjoyed to track down another long-lost demo, my crowning achievement on the platform:

If you’re curious, feel free to go read those wiki pages or download the .d64’s — they run fine in VICE, the Commodore emulator (amazingly). If you’ve only got time to check one, check Rhaphanadosis; it’s much better than the others.

I’m very impressed with VICE. As far as I can tell, it’s perfectly bug-for-bug compatible with the real hardware, playing all of the demos perfectly (apart from a little additional speed due to differing hardware performance). If you haven’t already got VICE set up, bear in mind that after installing it, you’ll need a copy of the C=64’s ROM images; here’s a local set.

Also, the Commodore Scene Database is pretty awesome — it’s a full-scale IMDB-style setup, tracking the history of the Commodore demo scene in massive detail. Nice work guys!

The demos were written 100% in 6502/6510 assembly. I developed them using an Action Replay cartridge’s built-in monitor; it had an assembler, but one which didn’t support symbolic addressing. In other words, every piece of assembly used hand-computed branch offsets, and every variable and subroutine was tracked — on paper — by memory location, rather than using symbolic labels. If you want to know what the monitor was like, the VICE built-in monitor is almost identical!

I wrote these when I was 16; part 4 of Rhaphanadosis notes the date as being 20 May 1989.

It’s interesting reading the scrollers, and doing web and CSDB searches in follow-up to see what happened next — one of the other Excess members, Raistlin is now Robert Troughton, a successful game developer in the UK with several major titles under his belt.

A Google search for Thundertronix finds a copy of “sex’n’crime” zine, issue 17, July 1990, which notes:

one of the new groups formed in 1990 (jm: slightly off, I think) is THUNDERTRONIX, better known as TNT. they are based in ireland and are doing very well for themselves. they have, in my mind, one of the best coders in the uk, namely MANTIS. he is currently coding a game with many new routines, etc… hopefully he should get some demos out soon!

woo! Er, unfortunately that game never went anywhere. ah well. ;)

BTW, it’s funny reading my scrollers in those demos. At the time, I was convinced that the c=64 was a dead platform — yet here we are in 2008, and there’s still a thriving demo scene on the Commodore. Incredible!

Vincent Browne on RTE’s coke habit

Before Christmas, it seemed you could hardly read a newspaper, listen to the radio or watch TV in Ireland without being bombarded with stories about how the country was awash in cocaine.

It’s an attractive story, tying in nicely with the death of lingerie model Katy French, hand-wringing over Ireland’s recent ‘celtic tiger’ wealth, a supposed loss of our traditions, etc. etc. RTE, our national broadcaster, made a tabloid series called ‘High Society’, which cashed in on the issue in a particularly crass way — crappy “reconstructions” of actors chopping lines with voiceovers, dodgy-looking men handing over money to ominous music, that kind of thing.

Well, just before Christmas, Vincent Browne wrote a fantastic op-ed in the Irish Times regarding this. I have to quote this particularly perceptive passage:

Cocaine abuse is a social problem, but the thrust of much of RTE’s coverage of the phenomenon is to suggest that it is a widespread, pervasive problem. There are no recent statistics available on the prevalence of cocaine consumption in Ireland – the last survey was done four years ago. The National Advisory Committee on Drugs (NACD) will be publishing a prevalence report next month and we will know then the size of the phenomenon.

But we have some indicators about the scale of cocaine use. The European drug agency EMCDDA estimates that 3 per cent of all adults in Europe aged between 15 and 64 have used cocaine at least once in their lives.

A third of these took cocaine during the previous year and half of these took cocaine during the previous month. This means that about 0.5 per cent of the adult population took cocaine over the previous month. And the data suggests that, for at least two-thirds of those who have ever taken cocaine, the drug is not a problem for them.

In the US the statistics are higher. Almost 15 per cent of the population aged between 12 and 64 have taken cocaine in their lives and 2.5 per cent took cocaine over the previous year. Again, this is suggestive that cocaine use for most people is not a problem, otherwise the number of people who took cocaine during the previous year as a proportion of the number of people who ever took cocaine would be far higher.

The figures for Ireland are likely to be that about 4 per cent of the adult population have taken cocaine in their lifetime, with about 1 per cent having taken cocaine in the previous year and 0.5 per cent having taken cocaine in the previous month.

It would be better if people did not take cocaine, but the prevalent contention that the consumption of cocaine at all is necessarily harmful and addictive is obviously false.

It would also be better if people did not drink here, for the problems related to the consumption of alcohol are far, far greater than in the case of cocaine.

Instead of presenting a balanced picture of the cocaine phenomenon, RTE has greatly exaggerated the issue, in a way more typically associated with tabloid journalism.

Well said!

Spambots stealing GMail and Hotmail passwords?

I just received this mail from a friend:

Dear friend

Welcome to stwoxy.com ! We are one of the largest electronic distributors and wholesalers in Beijing China. We offer qualified digital products: Motorcycles?TVs, Notebooks, phones. PSP, projectors, GPS, DVD, DV, DC, MP3/4 and so on, which are of world famous brands, such as Sony, IBM, PHILIPS, NOKIA, DELL and so on. All our items are brand new from the manufactures and they come with 1-3 years’ after service. These days we are expanding our overseas market, and every item is sold in extremely low price. Such chances should never be missed, ladies and gentlemen, do come to stwoxy.com! you will surely have a big surprise! We are looking forward to hearing from you!

It was sent from a HTTP connection into GMail, and was delivered from there using valid DKIM, Domain Keys and SPF signatures. In addition, it was sent to all the addresses in his address book. In other words, this was no run-of-the-mill impersonation spam — for this one, the spammer obtained my friend’s username and password somehow, logged into GMail, scraped the address book, and then sent spam via GMail that way.

My friend says he didn’t access GMail using a desktop mail client, but did have his Google password saved in his web browser (a pretty typical configuration). My theory is that some virus/malware has infected his desktop machine, captured the saved-passwords file from the web browser configuration, and used that to log into GMail. Alternatively, it could also be a guessable username and password which was picked up via dictionary attack, I guess…

This is the first case I’ve heard of where spammers are actively stealing user account authentication tokens, in order to take over the accounts for spamming. (We’d long predicted it, of course, since it’s a natural response to “pay for mail” schemes… but since there’s no widely-used pay-for-mail system available yet, it’s premature!)

It seems this is not just a GMail thing, btw. Here’s a report of the same thing happening to some French guy via HotMail last month (or in english). I don’t speak Dutch, but this forum post looks like it might be the same situation.

If you’re curious, here’s a copy of the spam, delivered to a Yahoo! group; it appears these spammers aren’t too sophisticated in terms of the text they’re sending, since they haven’t morphed that text, HTML, or even the domain in the link yet. It’s just the malware that’s sophisticated, at this stage.

GNOME, Google and the UNIX user interface

Recently, after a flurry of annoying user interface issues, I’ve switched my RSS reader from Liferea to Google Reader. Interestingly, I’ve found that Google Reader actually fits the traditional UNIX user interface concept better.

What triggered this was an upgrade from Liferea 1.0.x to 1.4.4 as part of Ubuntu Gutsy; this brought with it a lot of changed behaviours, such as ‘drag-and-drop of feed URL to HTML view no longer subscribes’, and one crucial UI issue, ‘”Skim through articles” only works with ctrl+space’.

I’ve been a long-time UNIX user, dating back to the days where curses-based interfaces were the norm. As such, I tend to drive commonly-used applications using keyboard commands where possible. (This isn’t a purely UNIX thing; Windows has the phenomenon of the keyboard-wielding “power user”, too.)

Liferea was attractive, since it offered the ability to skim through articles quickly by just pressing the “Space” key; simply press space to page down, or to skip to the next unread article if at the end of the current one. Unfortunately, Liferea 1.4.x breaks this, and it wasn’t going to be fixed, since apparently a GNOME app shouldn’t behave this way:

GTK explicitely does implement <Space> as a key binding for several of it’s widgets. Rebinding means to break the default behaviour for such widgets (tree views, buttons, input fields). [….] Liferea as a web-browsing application should behave like any other web browser and like every other GNOME/GTK application as much as possible.

Now, I don’t know if it’s GNOME’s fault, or what, but for a UNIX desktop app to break with UNIX UI conventions, that’s a bad move in my opinion. I gave it a bit of argument in the bug tracker, but eventually gave up as I clearly wasn’t getting anywhere. :(

Instead, based on recommendation from friends, I gave Google Reader a try, and quickly figured out its extensive collection of keyboard shortcuts. Now, I’m skimming through my feeds in even less time than it took with Liferea, simply by hitting “ga” to go to my “all unread items” list, then “j”, “j”, “j” to skip through the postings one by one. Sweet!

It’s interesting to note that other Google web apps use the same concepts; Gmail also has a hefty set, and can be driven using them in a manner very reminiscent of the classic UNIX mailreader, Mutt. So, despite being designed with end-users in mind by extremely clever professional user experience designers, these apps still find space for power-user keyboard operation. Take note, GNOME.

Anyway, I’m not too bothered. Google Reader brings other benefits, such as fixing this bug: ‘please add ability to go to previous entry in Unread feed’, avoiding ‘constant memory leak requires daily restarts’, and, of course, the utility of being able to track the same set of feeds and keep track of which items I’ve read in two places (work and home).

If only it was open source ;)

Planet Antispam update

A brief update on Planet Antispam

I’ve just added MailChannels’ Anti-Spam Blog. Now — in the interests of disclosure — I’m a member of MailChannels’ Technical Advisory Board. However, that didn’t affect this — their blog has had consistently good, interesting posts dealing with anti-spam-related topics, and without too much plugging of their own products. ;)

Also added recently:

If you know of any other good email anti-spam-related blogs, drop a line in the comments here. (Note that I’m trying to keep it email-related, however, so we’re not covering web-spam.)

Spammers “giving up” according to Google

According to this Wired story, Google reckons spammers are giving up on spam:

a remarkable trend is underfoot, according to Brad Taylor, a staff software engineer at Google: The number of spam attempts — that is, the number of junk messages sent out by spammers — is flat, and may even be declining for the first time in years.

Actually, this is a wilful misunderstanding of what the Googler in question really said, which was that ‘attempts to spam Gmail users have been leveling off over the last year and more recently, even declining slightly’. In other words, they didn’t make an observation about the state of the spam problem on an internet-wide basis — just about the “local” situation as it pertains to Gmail. Bad reporting there, Wired.

But, in passing…

David Berlind at ZDNet recently blogged a rather grumpy response to InfoWorld coverage of CEAS 2007. He raised a very important point:

If I could say something to the author of that story, it would be that so long as any anti-spam solution is not deployed universally throughout the Internet’s e-mail system (in other words, so long as some anti-spam tech is not a standard), that anti-spam solution actually makes the spam problem worse. You read that right. Worse. Proprietary anti-spam solutions make the global spam problem worse. They are digging us deeper into the hole that the Internet is already in because everyone who makes those solutions is under the false belief that “s/he who is finally successful at filtering out all spam while allowing the legitimate mail in wins.”

Google’s blog post is a case in point: ‘we’re keeping more spam out of your inbox than ever before, so more and more, you can use Gmail for things you enjoy without even realizing that the spam filter is there most of the time.’

That’s great — but it doesn’t help anyone except Gmail. It’s a myopic view of the spam problem, and David’s point stands.

(I disagree with his later conclusion that the only way forward is for Google, MS, AOL and Yahoo! to get together and ‘commit to jointly supporting the same technical solutions’ — when the usual BigCos get together, they tend to focus on their own priorities. Take what happened back in 2005 with nofollow for blog-spam — while it helped the search giants with their own overriding priority, which was to tweak their algorithms to filter out the spam on the search results page, it did nothing to slow the spam flood itself, which has continued unabated.)

We need more open-source, and open-data, anti-spam work.

Informed

This should be in the running for “least informative dialog ever”.

(The information in question was that Firefox had been upgraded by the Ubuntu Gutsy Update Manager app, if you’re curious…)

Working around O2 Ireland

I’m pretty conservative with my mobile phones — until recently, my mobiles were all cheap, low-end, super-lightweight Nokias with long battery life and low “worry factor” (ie. not a big deal if they were lost or stolen). Very sensible.

I’ve finally started catching up with the gadgetorati, though — my current phone is now a Sony Ericsson K550i, which is still small and light, but has nice features like a 2 megapixel camera, a decent amount of onboard flash space, and a good implementation of Java, hence support for GMail and Google Maps. (Thanks to Joe for the recommendation!)

The only downside is that it came from my operator, O2 Ireland, with some broken configuration settings. (This shouldn’t be surprising, of course — I don’t think I’ve ever heard of a phone arriving with working data connectivity, from any operator, anywhere in the world.)

Anyway, here’s what I’ve done so far to fix it. Hopefully this might be helpful for random google searchers.

1. “Failed to resolve hostname” when publishing photos:

Generally, when I’d try to publish a photo using its Blogger support, I’d get a “failed to resolve hostname” error message. Investigating further, I found that the “O2 WAP” service used a proxy server — turning that off fixed the problem nicely. Nice reliable proxy you’ve got there, O2 ;)

Here’s how to do that. Open the menu, then select Settings -> Connectivity -> Internet settings -> Internet Profiles. Select O2 WAP and hit More -> Settings. Select Use proxy and change it to No, then hit Save. Problem solved.

2. Cannot send email from the device:

O2’s default mail server has a tendency to refuse to accept outbound mail from the phone. Switching to GMail for outbound SMTP works fine. Notice a trend here?

Open the menu, Messaging -> Email -> Settings -> New account. Set the Account name to “gmail”. Scroll down to Email address, set it to “yourname@gmail.com”. Connection type is “POP3”, Username and Password are whatever your GMail account uses. Outgoing server is “smtp.gmail.com”. Enter Advanced settings, and set Encryption to “TLS/SSL”. Set Outgoing port to “25”. Press the back button, then select the “gmail” account’s tickbox to make it active, before pressing back again to exit the configuration screen.

3. The “side” buttons go online:

By default, if you hit the “globe” button or the “open window” button on the side of the phone, to the left and right of the main joystick, the phone opens various URLs at www.o2.ie. These buttons are prime UI real estate, and easy to hit by accident; I don’t want to go online (and possibly incur a charge) if they’re pressed.

Easily fixed. Open the menu, then select Settings -> Connectivity -> Internet settings -> Internet Profiles. Select O2 WAP and hit More -> Advanced, then Change homepage and enter “file:///” under Address and hit Save. It’ll now issue an ugly warning if you press those buttons, but at least it won’t go online. (It’d be nice to get a nicer fix for this.)

I’m sure there’s plenty more; if you’ve got this phone and have any tips to share, feel free to drop a comment below.

In particular, I’d love to know how to further “de-O2ify” the UI; the top 3 buttons on the menu screen are taken up with worthless operator spam (“O2 Music Store”, “O2 Menu” and “Entertainment”, all of which go to various URLs at www.o2.ie), while the useful Applications and Alarm screens, which I use all the time, are hidden in a submenu. ugh.

Investing in real estate

Screen real estate, that is — 3600×1050 pixels of it:

(That’s a Samsung SyncMaster 226bw connected to a Thinkpad T61p running Ubuntu Gutsy, if you’re curious.)

‘Dead spammer’ story: yep, spam

Remember the ‘Russian ‘make penis fast’ spammer murdered’ fake blog posting I wrote about last month? I was right — the site has now become a spammer link farm.

There’s now a new category in the right-hand sidebar of the fake blog post. See if you can spot the odd one out:

  • Programming
  • Personal
  • Web 2.0
  • Python
  • Penis exercises
  • Uncategorized

Sure enough, “Penis exercises” is the only valid outlink from the page (all the others lead to the ‘sorry, closed due to too much traffic’ page). It leads to a page discussing the usual ‘make penis fast’ topics, with a batch more links to more pages along the same lines. If you follow the links a little, the whole thing appears to be hawking some device called “Size Genetics”. Totally spammy.

New job!

So, as I’ve hinted previously, I’ve left Vast to work full-time at a new gig: PutPlace.

I’ll be working on more EC2/S3/SQS-related large-scale cluster stuff, and on their open-source plans… looking forward to that. They’re a great team — lots of familiar faces from the Iona days — and it finally gets me out of telecommuting from home, back into an office again after 5 years ;)

Joe has put up a nice blog post welcoming me. Cheers Joe!

Now to get to grips with Python. (I still love Perl though. ;)

Fedex Ireland and unfair duty charges

I’ve been on vacation for a week, introducing Bea to the many joys of the bogs of Connemara. I think she liked it.

While I was away, I appeared in Ireland’s newspaper of record, the Irish Times, specifically in Conor Pope’s ‘Pricewatch’ consumer-affairs column, under the headline “Shopped to the taxman”. Here’s a cut-and-paste of some relevant snippets:

Justin Mason [hey, that’s me] contacted Pricewatch after being hit with just such a charge. In August, he and his wife, who were expecting a baby, received a package from friends in the US [thanks Nishad and Janet!] containing amongst other things, some hats, socks and a little hoodie for their baby.

“It was shipped via FedEx, got here in good time and was very cute,” he says. The couple were delighted, until a couple of weeks later, when they received an invoice from FedEx looking for EUR 34.47, made up of EUR 2.49 duty, EUR 19.88 VAT and EUR 10 in “administration fees”, plus an additional EUR 2.10 VAT on the “administration fee”.

“This strikes me as pretty unfair, maybe there’s duty payable, but I’ve never had to pay VAT on a gift I’ve received before? On top of that, being charged one-third of the price as an administrative fee? Ouch!”

The couple disputed the fee and were told if they didn’t pay, the invoice would be sent to a debt collection agency and non-payment would affect their credit rating. A couple of weeks later, another gift arrived from the US, followed by another invoice looking for EUR 7.84 in duty, plus the EUR 10 administration fee and EUR 2.10 VAT on that fee. Mason disputed the charge and was eventually told it would be waived as it had a value of less than $50 (EUR 34.70) and was clearly labelled as a gift. There is tax relief called Small Parcel Standard Relief on goods purchased from outside the EU, which is EUR 22 for bought goods and EUR 45 for gifts, so the tax should never have been applied by FedEx.

We contacted FedEx and UPS, highlighting our readers’ concerns. A spokesman for FedEx said the administration charge has always been in place in Ireland and was applied “to ensure customers receive their packages quickly”.

He said that if it did not pay the VAT and duty, “packages would not be cleared through customs until the customer has paid them, thus adding severe delays to the delivery process”.

So, to be honest, I’m not impressed at all with Fedex’ response here. I was hoping they’d be more helpful, especially once it hit the most significant consumer-affairs column in the country — but not at all :(
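As a sanity check on the first invoice quoted above, here’s a small Python sketch that adds up the line items and works out how much of the bill is the “administration fee”. The figures are the ones from the article; the variable names are just mine, and the VAT rate is derived from the invoice itself rather than looked up anywhere.

```python
from decimal import Decimal

# Line items from the first FedEx invoice quoted above, in euro.
charges = {
    "duty": Decimal("2.49"),
    "VAT on goods": Decimal("19.88"),
    "administration fee": Decimal("10.00"),
    "VAT on administration fee": Decimal("2.10"),
}

# Invoice total: the sum of all four line items.
total = sum(charges.values(), Decimal("0"))

# The fee plus the VAT charged on the fee, and its share of the whole invoice.
admin_with_vat = charges["administration fee"] + charges["VAT on administration fee"]
admin_share = admin_with_vat / total

# VAT rate implied by the VAT charged on the fee.
implied_vat_rate = charges["VAT on administration fee"] / charges["administration fee"]

print(f"total: EUR {total}")
print(f"admin fee incl. VAT: EUR {admin_with_vat} ({admin_share:.0%} of the invoice)")
print(f"VAT rate implied on the fee: {implied_vat_rate:.0%}")
```

The items do add up to the EUR 34.47 on the invoice, and the fee plus its VAT comes to EUR 12.10, or about 35% of the total, which is where the “one-third” complaint comes from.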

To recap — since Conor didn’t mention it — here are my problems with the charges:

  • the packages were both genuine, unsolicited gifts. Surely duty should not apply to a gift; having to pay it certainly makes receiving one a particularly unpleasant experience!

  • the first package contained baby clothes, which are VAT-free in Irish tax law anyway.

  • we cannot seem to get contact details for someone at Customs and Excise to talk to about this, and Fedex have failed to get back to us since then.

Not sure what the next step is…

There’s also a little follow-on discussion at Conor’s blog.

Update: good news. A couple of days ago, a letter arrived from Fedex UK, containing 2 credit notes; both invoices had been reduced to EUR 0.00, citing “incorrect application of duty” for one, and “customer satisfaction policy” for the other. Hooray!

Surprise smash hit in the Irish Blogs Top 100

Damien posted an interesting suggestion for the Irish Blogs Top 100 the other day — during discussion of which, it emerged that there were a few overlooked Irish blogs which hadn’t yet shown up on the planet.journals.ie Irish blogs aggregator, and therefore were not appearing in the Top 100. These were:

Anyway, they’re in now. When I first spun up the script and checked the results, though, I was a bit shocked and had to do a double-take: at number 1, far beyond Damien’s number 2, was InPhotos.org, with a Technorati Rank of 1 and 102,857 inbound links from 88,772 blogs, compared to Damien’s Rank of 7946 with 1,606 links from 519 blogs.

Insane! I guess being in the default WordPress install makes a bit of difference there ;)

Interestingly, InPhotos.org, with a Technorati Authority of 88,434, is far beyond the most popular blog listed on the Technorati Popular Blogs page. It seems that page is a hand-tweaked set of blogs, and not just a “Technorati global Top 100”, then, despite what one might naively assume…

PS: Damien’s original suggestion, btw, was to measure blog popularity using Google Reader and Feedburner’s audience stats. However, I can’t do that without a public API I’m allowed to scrape. Does anyone know of one?

Also worth noting that I recently added del.icio.us bookmarks as a metric of popularity, to go with the Technorati stuff. It’s interesting to see how those rankings differ — bloggers and bookmarkers don’t always agree, with bookmarkers preferring MP3s, Second Life, and politics I reckon.

the Ron Paul spam scandal

A US presidential candidate called Ron Paul has been advertised in spam. There’s currently a massive shitstorm raging about the true source of the spam — it was delivered via an infected consumer broadband machine, so the source is of course untraceable from the email alone.

Of course, being spam, I received a copy ;) Here’s a spample, if you’re curious.

The unusual “Content-Type” header format (matching the STOX_REPLY_TYPE SpamAssassin rule) has been seen in a lot of pump-and-dump stock spam recently. (It’s also shown up in Storm output, but this isn’t from Storm.) It’s been around for at least 6 months, so it’s probably a built-in behaviour of a downloaded spamware app, rather than a frequently-updated web-hosted spamware site.

My guess — I’d say the spam was sent using the same spamware application that one of the larger, recent pump-and-dump spammers has been using — so a reasonably sophisticated app, and not just an ancient copy of DarkMailer or whatever.

It’ll be interesting to see how this pans out…

Changes to the Irish learner driver system

The Irish Road Safety Authority have just revised Irish law as it relates to ‘learner drivers’, the 15% of drivers who haven’t yet passed a driving test. (This includes me — my US driving license doesn’t allow me to drive a manual-transmission car in Ireland, so I’m still a learner over here!)

They helpfully released the details as a rather broad PDF entitled ‘Road Safety Strategy 2007-2012’, which covers the changes along with other plans and statistics; and a more focused document, ‘Learner Permit and Changes to the Driver Licensing System’, dealing with just the learner-permit system.

Unfortunately, the latter was released as an MS Word document. Given the problems this raises — lack of searchability, integration with the web, etc. — I thought it’d be helpful for searchers if I put up the text in full here, so here it is.

Introduction of Learner Permit and Changes to the Driver Licensing System – Changes to the Driver Licensing System announced on 25 October 2007

In this document you will find information about changes to the driver licensing regime. These changes affect learner drivers and recognise the fact that learner drivers are a vulnerable group of road users. The changes also serve to emphasise the importance of the learning phase for drivers, one element of this is the replacement of provisional licences with learner permits. The changes also highlight the important role played by the driver who accompanies a learner driver.

Over time the intention is to expand the range of conditions applying to a learner permit and to develop a graduated licensing system where there will be a number of different restrictions/conditions applying at different stages. These restrictions will apply while driving with a learner permit and in the initial years of driving with a full driving licence.

Specific details about each of the current changes together with questions and answers on the impact of each change are set out below.

Provisional licences are being replaced by learner permits to emphasise the fact that the holder is a probationary driver and is learning to drive. Existing provisional licences will continue in force until their expiry date. On renewal the person will be issued with a learner permit.

Q: When will learner permits start to issue?

A: Learner permits will issue as and from 30 October 2007.

Q: Does the learner permit system apply to all driving licence categories?

A: Yes, the learner permit system will apply to all licence categories.

Q: Is there any change to the period of validity or the fee for a learner permit compared to that for a provisional licence?

A: No, the duration and fee remain the same as applied to provisional licences.

Q: Are there any changes to apply under the learner permit system?

A: A number of changes detailed below are being introduced for drivers with a learner permit. These are also being applied to drivers with a current provisional licence.

The holder of category B (Car) learner permit (provisional licence) must be accompanied by and under the supervision of a qualified person at all times. This change removes an exemption that, up to now, allowed a person on a second provisional licence to drive unaccompanied. To drive unaccompanied will be a penal offence and the person will be subject to prosecution.

Q: When does this new rule come into effect?

A: This is coming into effect as and from 30 October 2007.

Q: I am currently on a second (provisional licence) learner permit for driving a car, and was not required to be accompanied heretofore with this (provisional licence) learner permit. Must I now be accompanied?

A: Yes, you must be accompanied at all times when driving with a (provisional licence) learner permit for a car.

Q: I have passed the driving test in a vehicle with an automatic transmission and now hold a (provisional licence) learner permit for driving a car with a manual transmission, can I drive this car unaccompanied?

A: No, you must be accompanied by a qualified person until such time as you pass the driving test for a manual transmission car.

Q: In respect of which licence categories is a person who holds a (provisional licence) learner permit required to be accompanied by a qualified person?

A: Drivers with a (provisional licence) learner permit for vehicles of category B, C1, C, D1, D, EB, EC1, EC, ED1 or ED, (Cars, Trucks, Buses and Articulated Vehicles) must be accompanied by and under the supervision of a qualified person.

An accompanying qualified person must hold a full driving licence for the vehicle category for at least two years. It will be a penal offence for the driver not to be accompanied by a qualified person so licenced to drive.

Q. When is this change coming into effect?

A. This change will apply as and from 30 October 2007.

Q: If I am a learner driver driving a car and the accompanying person has held a driving licence for two years in respect of a motorcycle, or a tractor/work vehicle, can this person act as an accompanying qualified person?

A: No, the accompanying qualified person must hold a driving licence for two years for the category of vehicle you are driving.

Q: If a person has passed a driving test to drive the vehicle category, can this person act as an accompanying qualified person?

A: No.

Q: If a person has held a full driving licence for an automatic vehicle for two years, may this person act as the accompanying person?

A: Yes, but only if the learner driver is driving an automatic transmission vehicle in the same category. If s/he is driving a manual transmission vehicle, the accompanying qualified person has to hold a full driving licence for at least two years for a manual transmission vehicle.

Q: If I have a learner permit (provisional licence) in category C1 (small truck) can I be accompanied by a person who holds a full driving licence for category B for two years and for category C1 for one year?

A: No, the accompanying qualified person must hold a full driving licence for two years in respect of the vehicle category which you wish to drive, in this case category C1.

Q: If the accompanying driver has held his/her driving licence since six years ago but has been disqualified for 2 of the last 3 years, may he/she act as an accompanying driver?

A: No, the accompanying qualified person, at the time you are driving, must hold a full driving licence for two years in respect of the vehicle category which you wish to drive. He/she must not have been disqualified for any period of the previous two years.

The carrying of a passenger by a motorcyclist with a (provisional licence) learner permit is a penal offence.

Q. When is this change coming into effect?

A. This change will apply as and from 30 October 2007.

Q: Can I carry a passenger on any motorcycle category for which I hold a learner permit (provisional licence)?

A: No, you must have a full driving licence for the motorcycle in order to be able to carry a passenger.

Q: Can I carry a passenger on a category A motorcycle for which I hold a learner permit/ provisional licence if I have a full driving licence for category A1?

A: No.

Q: If I pass the motorcycle driving test, can I carry a passenger?

A: No, you must first exchange your certificate of competency (driving test pass certificate) for a full driving licence to be able to carry a passenger.

It is a penal offence for a holder of a category W (Tractor/Works vehicle) learner permit (provisional licence) to carry a passenger unless the vehicle is constructed or adapted to carry a passenger and the passenger is a qualified person, ie. a person who holds a full driving licence for the vehicle category for at least two years.

Q. When is this change coming into effect?

A. This change will apply as and from 30 October 2007.

Q: When can I carry a passenger?

A: When the passenger holds a driving licence for the vehicle category for at least two years, and where the vehicle is constructed or adapted to carry a passenger.

Q: Can I carry a passenger who is a qualified person if there is no passenger seat?

A: No, the vehicle must be constructed/ adapted for the carriage of a passenger.

It is a penal offence for the holder of a learner permit (provisional licence) in respect of any licence category to carry in the vehicle any passenger for reward.

Q. When is this change coming into effect?

A. This change will apply as and from 30 October 2007.

Q: Can I carry a passenger for reward in the course of my employment?

A: No, you may not do so while driving under a learner permit (provisional licence).

Q: If I have a category D1 learner permit (provisional licence) to drive a minibus, can I carry a passenger for reward?

A: No, you may not do so while driving under a learner permit (provisional licence).

It is a penal offence for the holder of a learner permit (provisional licence) for vehicles of category B, C1, C, D1, D, EB, EC1, EC, ED1 or ED, to drive such a vehicle unless there are displayed on the vehicle rectangular plates or signs bearing the letter ‘L’ not less than 15 centimetres high in red on a white ground, in clearly visible vertical positions to the front and rear of the vehicle.

Q. When is this change coming into effect?

A. This change will apply as and from 30 October 2007.

Q: If I have a category B full driving licence and a learner permit for category C (truck) or category D1 (minibus) must I display L plates?

A: Yes, you must display L plates on the truck or minibus if driving on a learner permit.

It will be a penal offence for the holder of a learner permit (provisional licence) for vehicles of category B, C1, C, D1 or D, to drive such a vehicle while the vehicle is drawing a trailer.

Q: If I have a category B driving licence and a learner permit for category C1 (small truck) can I draw a trailer?

A: No, you may not drive a truck while drawing a trailer if you hold a learner permit (provisional licence) for a truck. You must have the trailer entitlement for the category on the learner permit (provisional licence) in order to draw a trailer.

Learner Motorcyclist to display ‘L’ plates on a high visibility tabard.

Q: From what date will motorcyclists have to display L plates on a high visibility tabard?

A: It takes effect as and from 1 December 2007.

Q: Which learner motorcyclists are required to display L plates on a high visibility tabard?

A: All persons with a learner permit (provisional licence) for category A, A1, or M, must when driving such a vehicle display a yellow fluorescent tabard bearing the letter ‘L’ not less than 15 centimetres high in red on a white ground, in clearly visible vertical positions worn over the chest clothing. The ‘L’ plates are to be to the front and rear of the person’s torso. It will be a penal offence not to so display L plates.

A person who is a first time holder of a learner permit (provisional licence) cannot take a driving test for a six month period after the commencement date of the permit (provisional licence).

Q. When is this change coming into effect?

A. This change will apply to driving test applicants with an appointment date for a test on or after 1 December 2007 and who hold a learner permit (provisional licence) for less than six months. At this point driving tests are scheduled up to this date and the change will not affect existing appointment holders.

Q: Does the change apply to all licence categories?

A: Yes, it applies to all licence categories.

Q: Why is the six month limitation being applied?

A: The purpose of the provisional licence/learner permit is to allow a learner driver to gain experience of driving. Research shows that the longer a learner is supervised while driving, the less likely s/he is to be involved in an accident. For this reason the six months limitation is being applied.

Q: I hold a first learner permit (provisional licence) for less than six months. I have an appointment already arranged for a driving test. Can I take the test?

A: Yes, the change is being introduced with effect from 1 December 2007 and should not affect existing appointments for driving tests.
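The rules for the accompanying “qualified person” in the document above are fiddly enough that it may help to see them as code. Here’s a minimal Python sketch of how I read them: a full licence for the same category, held for at least two years, no disqualification during those two years, and an automatic-only licence only counting if the learner is driving an automatic. The class, the function names and the way disqualification periods are represented are all my own invention for illustration, and requiring an exact category match is my assumption; this is not an official statement of the law.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Tuple


def years_before(day: date, years: int) -> date:
    """Same calendar day `years` earlier (29 Feb falls back to 28 Feb)."""
    try:
        return day.replace(year=day.year - years)
    except ValueError:
        return day.replace(year=day.year - years, day=28)


@dataclass
class FullLicence:
    category: str                  # e.g. "B", "C1"
    held_since: date               # date the full licence was first held
    automatic_only: bool = False   # True if the test was passed in an automatic
    # Hypothetical representation: (start, end) dates of any disqualifications.
    disqualified: List[Tuple[date, date]] = field(default_factory=list)


def can_accompany(licence: FullLicence, learner_category: str,
                  learner_vehicle_automatic: bool, today: date) -> bool:
    """Return True if the licence holder may act as the accompanying qualified person."""
    cutoff = years_before(today, 2)
    # Must hold a full licence for the category the learner is driving
    # (assumption: exact match only).
    if licence.category != learner_category:
        return False
    # Must have held it for at least two years.
    if licence.held_since > cutoff:
        return False
    # An automatic-only licence only covers a learner in an automatic vehicle.
    if licence.automatic_only and not learner_vehicle_automatic:
        return False
    # Must not have been disqualified for any period of the previous two years.
    for start, end in licence.disqualified:
        if end > cutoff and start <= today:
            return False
    return True


if __name__ == "__main__":
    today = date(2007, 10, 30)
    friend = FullLicence("B", held_since=date(2001, 10, 30))
    print(can_accompany(friend, "B", learner_vehicle_automatic=False, today=today))
```

Run against the Q&A examples above, a holder with one year of category C1 is rejected for a C1 learner, as is someone who was disqualified during the last two years, which matches the answers given in the document.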

Upcoming Mike Culver talk about AWS

Mike Culver, Amazon’s “Web Services Evangelist”, will be in Dublin next week to evangelize about the goodness that is Amazon S3, EC2, SQS and so on. It seems he’ll be talking at the following locations:

  • in the Auditorium of the Digital Exchange, Crane Street, Dublin 8 on Tuesday October 30th, 3-5pm; here’s a flyer the Amazonites have been passing around. (upcoming.org page)

  • according to Damien, later that evening, he’s in the Westin Hotel on Westmoreland St., D2, starting at 7pm; note, it seems you need to book places at this, see Damien’s post.

  • and again at the Irish Linux Users’ Group on Thursday November 1st at 19:30 in the Irish Computer Society in Dublin (map).

I guess these are all going to be the same talk, bar the Q&A ;)

There was some kind of an ICTE get-together mooted for Friday 2nd.

Also, the ILUG annual general meeting is scheduled on the following Saturday, 3rd November, also at the ICS. Gareth Eason notes ‘we’re hoping to start at 3pm sharp, with talks from Dave Wilson (HEAnet), Frank Duignan, John Looney (Google), and others, followed by a relaxing wind-down in the Schoolhouse pub later on.’ (upcoming.org page)

Hopefully I’ll get to at least one of the AWS talks (probably the Digital Exchange one) and the ILUG AGM… busy week!

BBC’s iPlayer — what a mess

I haven’t paid a whole lot of attention to the BBC’s “iPlayer” project, since, as a non-UK resident, I’m not allowed to use it anyway. But this interview at Groklaw with Mark Taylor, President of the UK Open Source Consortium, was really quite eye-opening. Here are some choice snippets.

On the management team’s Microsoft links:

The iPlayer is not what it claimed to be, it is built top-to-bottom on a Microsoft-only stack. The BBC management team who are responsible for the iPlayer are a checklist of senior employees from Microsoft who were involved with Windows Media. A gentleman called Erik Huggers who’s responsible for the iPlayer project in the BBC, his immediately previous job was director at Microsoft for Europe, Middle East & Africa responsible for Windows Media. He presided over the division of Windows Media when it was the subject of the European Commission’s antitrust case. He was the senior director responsible. He’s now shown up responsible for the iPlayer project.

On their attempts to bullshit the BBC Trust on the cross-platform issue:

In the consultations that the BBC Trust made, there were 10,000 responses from the public. And the overwhelming majority of them, over 80% — which is an unheard-of figure in these kind of things — said, we don’t like the platform. We don’t like it being single-platform. So it’s a big issue. And the BBC Trust said to us, “Why the vehemence? Why have people reacted this way?” And I explained the ‘Auntie’ analogy. It’s people don’t expect that from the BBC. It’s got this huge history of integrity, doing the right thing, standing up to bullies. (laughter) They’ve done this for a very long time. And people find that it’s surprising. And they said, “Yeah, but,” you know, the BBC guys said, “Well, trust us. This is going to be cross-platform.” And we said, “Well, how? It’s completely single-platform.” They say that, but we haven’t been able to find anyone who’s been able to explain how they’re going to achieve that at the moment, even though they’re entirely locked into one single platform.

(Aside: MS did this at one point with Internet Explorer. Remember, there was some mystery team in Germany that supposedly had IE ported to Solaris, hence it qualified as ‘cross-platform’.)

On the architecture of the product:

Q: it’s a Verisign Kontiki architecture, it’s peer-to-peer, and in fact one of the more worrying aspects is that you have no control over your node. It loads at boot time under Windows, the BBC can use as much of your bandwidth as they please (laughter), in fact I think OFCOM … made some kind of estimate as to how many hundreds of millions of pounds that would cost everyone […]. There is a hidden directory called “My Deliveries” which pre-caches large preview files, it phones home to the Microsoft DRM servers of course, it logs all the iPlayer activity and errors with identifiers in an unencrypted file. Now, does this assessment agree with what you’ve looked at?

Mark Taylor: Yes.

Q: What are the privacy implications for an implementation like this?

Mark Taylor: Well, just briefly going back to the assessment thing, yes it does log precisely RSS and stuff like that and more importantly, anyone technically informed who’s had a look at it — even more importantly, the user’s assessment as well and — frankly horrified if you go and spend some time in the BBC iPlayer forums, it’s eye-opening to see the sheer horror of the users, some of them technically not — you know, relatively early-stage users — but when it gets explained to them by some of the longer-using users of it, it’s concentrated misery. (laughter)

[…]

it’s a remarkable thing with them as well, there’s a lot of pain going on in the user forums, and some of the main technical support questions in there are “how do I remove Kontiki from my computer?” See, it’s not just while iPlayer is running that Kontiki is going, it’s booted up. When the machine boots up, it runs in the background, and it’s eating people’s bandwidth all the time. (laughter) In the UK we still have massive amounts of people who’ve got bandwidth capping from their ISPs and we’ve got poor users on the online forums saying, “Well, my internet connection has just finished, my ISP tells me I’ve used up all of my bandwidth.”

Q: It uses up their quota, but they can’t throttle it, they can’t reduce it —

Mark Taylor: No, they can’t throttle it. […] It’s malware as well as spyware.

And to top this off, there’s a (frankly insane) budget of UKP 130,000,000 to build this — that’s $266,000,000 — for something that could be built better by just hiring the guys behind UKNova and simply negotiating with the rights-holders directly.

Holy crap. Talk about a technical disaster masquerading as a solution to a business problem…