Justin's Linklog Posts

Links for 2011-04-21

Links for 2011-04-20

  • demerphq on “perl’s regexps are slow” : His classic response to the Russ Cox DFA-over-NFA regular expressions paper. ‘A general purpose regex engine like that required for perl has to be able to do a lot, and has to balance considerations ranging from memory footprint of a compiled object, construction time, flexibility, rich feature-sets, the ability to accomodate huge character sets, and of course most importantly matching performance. And it turns out that while DFA engines have a very good worst case match time, they dont actually have too many other redeeming features. Construction can be extremely slow, the memory footprint vast, all kinds of trickery is involved to do unicode or capturing properly and they aren’t suitable for patterns with backreferences.’ — Also interesting to note that he mentions an approach I’ve used in several SpamAssassin speedup add-ons, too ;)
    (tags: performance perl regular-expressions perlmonks demerphq regexps dfa nfa state-machines)

temporary Hackerspace at MindField

This sounds very cool! Nice one, hackerspace ppl.

Ireland’s Hackerspaces and Makerspaces (091 Labs – Galway, Belfast Hackerspace, MilkLabs – Limerick, Nexus Cork and TOG – Dublin) have been asked to build and man a temporary hackerspace during the MindField – International Festival of Ideas (http://www.mindfield.ie/). MindField will take place over the weekend of 29 April – 1 May in Merrion Square.

During MindField our temporary hackerspace will provide a range of events where festival participants can learn about diybio, 3D printing, basic electronics and micro controllers, electronic fashion/crafting and open data. These events are included in the festival schedule (http://mindfield.ie/festival-schedul/).

In parallel with these events we have an opportunity to run a Hardware Hacking Challenge. In this challenge we will try to engage a group of willing hackers, makers and festival participants to create or construct interesting or innovative projects out of recycled hardware. We are trying to source interesting materials, electronic devices or equipment that can be used to base projects on or as sources of components.

We are particularly interested in devices that contain various types of transducers which can then be hooked up to micro controllers and computers. We’re not looking for normal computer equipment or servers (we’ve got lots of that), but for more unusual stuff that people have lying around.

If you think you’ve got something they might like, contact Robert Fitzsimons.

Links for 2011-04-19

Links for 2011-03-29

Links for 2011-03-24

Links for 2011-03-23

  • Detecting Certificate Authority compromises and web browser collusion | The Tor Blog : ‘If I had to make a bet, I’d wager that an attacker was able to issue high value [SSL] certificates, probably by compromising [the USERTRUST SSL certificate authority] in some manner, this was discovered sometime before the revocation date, each certificate was revoked, the vendors notified, the patches were written, and binary builds kicked off – end users are probably still updating and thus many people are vulnerable to the failure that is the CRL and OCSP method for revocation.’ It seems addons.mozilla.org was one of the bogus certs acquired. Major ouch. Thanks to EFF/Tor et al for investigating this — SSL cert revocation is a shambles
    (tags: security ssl tls certificates ca revocation crypto exploits eff tor comodo usertrust)

My Problem With Norris

I’m uncomfortable voting for David Norris for President. Here’s why.

In November last year, he was a key voice in a Senate debate on the topic of "Protection of Intellectual Property Rights", where he quoted heavily from the flawed judgement by Mr. Justice Peter Charleton in the Warner, Universal, Sony BMG and EMI vs UPC case. (There are allegations that he called the debate after speaking to Paul McGuinness (U2’s manager) and Niall Stokes (of Hot Press).)

In the debate, Norris quotes Mr Justice Charleton, saying:

‘In failing to provide legislative provision for blocking, diverting and interrupting internet copyright theft, Ireland is not yet fully in compliance with its obligations under European law.’ Norris then says: ‘Irish law could be brought into alignment with the intention of the European directive through a simple statutory instrument.’ [1]

Now, let me clarify my position — I’m in favour of some means of resolving the level of piracy of music and movies which is widespread nowadays, and I believe there’s a mutually agreeable way to do this. But what Norris and Mr Justice Charleton propose is not it. Here are the problems as I see them.

It Lets The Internet Filtering Genie Out Of The Bottle

The big one.

The problem is that any infrastructure for ‘blocking, diverting and interrupting internet copyright theft’ is effectively infrastructure for ‘blocking, diverting and interrupting’ any communication on the net. We have to be very careful about how this is permitted, as it’ll very quickly suffer "feature creep" and become a general-purpose censorship system — the Great Firewall Of Ireland. As Damien Mulley put it:

‘first they’ll start with the Pirate Bay. Then comes Mininova, IsoHunt, then comes YouTube (they have dodgy stuff, right?), how long before we have Boards.ie because someone quoted a newspaper article or a section of a book? And don’t think they’ll stop there too, any site that links to The Pirate Bay and the others on the hate list will probably be added to the list too…’

In Australia, the anti-child-porn filtering system was quickly used to block gambling websites, gay and straight porn sites, political parties, Wikipedia entries, Christian sites, Wikileaks, and a dentist; in Thailand, a similar system was used to block criticism of the royal family.

Will It Help? I Don’t Think So

Norris:

‘As long as Irish law is deficient, Mr. Justice Charleton has found that all creative Irish industries are losing money.’

This is quite a hilariously overblown and sweeping statement. ALL creative Irish industries? What qualifies as a ‘creative’ industry? I suspect some in this country have been involved in industrial acts of creation that made money. ;)

While they’re not Irish, the well-known indie label Beggar’s Banquet has gone on the record as stating the opposite where the current music situation is concerned —

"There’s fewer gatekeepers now. We don’t have to knock on a TV station’s door or a radio station’s door and it’s made us far more competitive. […] There’s a wide highway in front of us we can go speeding down, and it wasn’t there even two years ago. It means the majors are looking at a world where only 35 Gold Albums a year are certified compared to ten times that recently. But going above Gold in the US is not a problem for us."

So it appears a ‘creative’ industry (albeit in the UK) is finding things not quite so bad.

Norris again:

‘the facts were established in the judgment of Mr. Justice Charleton in which he stated: “Between 2005 and 2009 the recording companies experienced a reduction of 40% in the Irish market for the legal sale of recorded music.” That is a devastating blow. […] He went on to state: “Some 675,000 people are likely to be engaged in some form of illegal downloading from time to time.”’

Without quite lining up one statement with the other, this reinforces the impression that the only reason the recording companies have seen these drops in revenues is due to internet-borne piracy. However, quoting the brilliant Mumblin’ Deaf Ro on the topic of lies, damn lies, and music biz statistics:

‘The drop in the value of Irish retail music sales was 11.7% between 2008 and 2009, which is significantly less than the 18% overall drop in retail sales for the economy that year. Digital album sales have increased by 30% since 2007 both in terms of volume and market value.’

So in other words, between 2008 and 2009, Irish retail music sales outperformed the retail sales economy as a whole!

In addition, Ro provides the following BPI figures for UK market volumes over the 2005-2009 period:

    Year  Albums  Singles
    2005  159.0m   47.9m
    2006  154.7m   66.9m
    2007  138.1m   86.6m
    2008  133.6m  115.1m
    2009  128.9m  152.7m

It’s clear that singles sales went through the roof, more than tripling (from 47.9m to 152.7m). Album sales did drop, but by about 19% (from 159.0m to 128.9m), nowhere near 40%, and this coincided with the general downturn in the global economy around that time. He also notes that UK digital sales rose sharply across a number of metrics in 2009.
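For anyone who wants to check that arithmetic, here is a small Python sketch that works it out from the BPI figures in the table above. The `percent_change` helper is just a name made up for this example.

```python
# BPI figures for UK sales volumes (millions of units), from the table above.
albums = {2005: 159.0, 2009: 128.9}
singles = {2005: 47.9, 2009: 152.7}


def percent_change(start, end):
    """Percentage change from start to end (negative means a fall)."""
    return (end - start) / start * 100


album_change = percent_change(albums[2005], albums[2009])
singles_ratio = singles[2009] / singles[2005]

print(f"Albums, 2005 to 2009: {album_change:.1f}%")
print(f"Singles, 2009 relative to 2005: {singles_ratio:.2f}x")
```

This gives a fall of roughly 19% for albums and a bit over a threefold rise for singles over the same period.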

While this does not provide figures for the Irish market, I’m at a loss as to how it could be radically different — Irish and UK consumers have pretty similar musical tastes and consumption habits, I would guess.

Here’s a theory: perhaps the issue could be that "Irish" music sales are associated with bricks-and-mortar music shops selling the physical product, whereas digital music sales are associated with online services based outside Ireland, and an Irish buyer buying an album at 7digital.co.uk, or on iTunes, isn’t counted as an "Irish retail sale"? Could the problem be that we don’t have any significant Irish shops selling music online, I wonder?

Bricks-and-mortar music shops, such as ex-Senator Donie Cassidy’s "Celtic Note" (Cassidy, coincidentally, was quite vociferous in that Seanad debate), are indeed hurting in this new model of music consumption, and that’s a problem. But given that good, working digital music sales systems are in operation, these figures don’t suggest that the cause is massive volumes of internet-borne piracy.

Essentially, internet piracy is a convenient bogeyman, especially for the technophobic old guard, but may have little bearing on the current woes of the Irish record industry and bricks-and-mortar music shops.

(Update: a couple of days after this was posted, a pair of economists at the LSE have said basically the same thing.)

Audible Magic Won’t Work For Long Anyway

Audible Magic, which Norris suggests is IRMA’s favoured filtering system, received the following verdict from the EFF back in 2004:

‘Should Audible Magic’s technology be widely adopted, it is likely that P2P file-sharing applications would be revised to implement encryption. Accordingly, network administrators will want to ask Audible Magic tough questions before investing in the company’s technology, lest the investment be rendered worthless by the next P2P "upgrade."’

Naturally, encryption is widespread nowadays, so this may already be the case.

Internet Censorship Harms Our Global Image

As Adrian Weckler points out:

‘do we really want to send out the message that, digitally, we’re the new France? Come to think of it, do we want to tell Google, Facebook, Apple and Twitter that, digitally, we’re the new Britain?’

Right now, more than ever, we need to put out an image that we’re ready to do business on our end of the internet. Mandatory censorship systems don’t exactly support this.

In Summary

So in summary, I would hope to see a more balanced approach to the issue from Norris. Most of the problematic statements in his speech were directly sourced from Mr. Justice Charleton’s flawed judgement, but some critical thinking would be vital, I would have thought. The fact that this was lacking, particularly given the allegations of heavy music-biz lobbying beforehand, leaves me feeling less inclined to vote for him than I would have been before, particularly since I haven’t heard any clarification on these issues.

([1]: Funnily enough, an SI similar to this was nearly sneaked through a couple of weeks ago, according to reports.)

Links for 2011-03-14

Links for 2011-03-03

Links for 2011-03-02

Links for 2011-02-28

Against The Use Of Programming Languages in Configuration Files

It’s pretty common for apps to require "configuration" — external files which can contain settings to customise their behaviour. Ideally, apps shouldn’t require configuration, and this is always a good aim. But in some situations, it’s unavoidable.

In the abstract, it may seem attractive to use a fully-fledged programming language as the language to express configuration in. However, I think this is not a good idea. Here are some reasons why configuration files should not be expressed in a programming language (and yes, I include "Ruby without parentheses" in that bucket):

Provability

If a configuration language is not Turing-complete, configuration files written in it can be validated "offline", i.e. without executing the program they configure. General-purpose programming languages are almost always Turing-complete, which means that, in general, the only way to be sure a configuration written in one is valid is to execute it in full.

Offline validation is a useful feature for operational usability, as we’ve found with "spamassassin --lint".
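As a rough illustration of what offline validation can look like, here is a minimal Python sketch of a lint pass over a declarative "key value" file. The setting names, the `SCHEMA` table and the `lint` function are all hypothetical, invented for this example; this is not how "spamassassin --lint" is implemented.

```python
# Hypothetical schema: setting name -> function that checks its value.
# None of these settings belong to a real application.
SCHEMA = {
    "listen_port": lambda v: v.isdigit() and 0 < int(v) < 65536,
    "log_level": lambda v: v in ("debug", "info", "warn", "error"),
    "max_connections": lambda v: v.isdigit(),
}


def lint(text):
    """Return a list of (line_number, message) problems.

    Only the text of the configuration is inspected; the application
    that would consume it is never started.
    """
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        key = parts[0]
        value = parts[1] if len(parts) > 1 else ""
        if key not in SCHEMA:
            problems.append((number, f"unknown setting '{key}'"))
        elif not SCHEMA[key](value):
            problems.append((number, f"bad value for '{key}': '{value}'"))
    return problems
```

Because every line is either a known setting with a checkable value or an error, the whole file can be judged without running anything. That guarantee is what gets lost once the file is arbitrary code.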

Security

Some configuration settings may be insecure in certain circumstances; for example, in SpamAssassin, we allow certain classes of settings like whitelists and blacklists to be set in a user’s ~/.spamassassin/user_prefs file, while disallowing rule definitions (which can cause poor performance if poorly written).

If your configuration file is simply an evaluated chunk of code, it becomes more difficult to protect against an attacker introspecting the interpreter and overriding the security limitations. It’s not impossible, since you can, for instance, use a sandboxed interpreter, but this is typically not particularly easy to implement.
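By contrast, when the configuration is plain data, the loader can decide per source which settings it will honour. The following Python sketch shows the general idea only; the setting names and the `load_settings` function are hypothetical and are not taken from SpamAssassin’s code.

```python
# Hypothetical setting classes, invented for this example.
USER_ALLOWED = {"whitelist", "blacklist"}
ADMIN_ONLY = {"rule"}


def load_settings(text, trusted):
    """Parse 'key value' lines, refusing settings the source may not use.

    Returns (accepted, rejected): accepted is a list of (key, value)
    pairs, rejected is a list of keys that were refused or unknown.
    """
    allowed = (USER_ALLOWED | ADMIN_ONLY) if trusted else USER_ALLOWED
    accepted, rejected = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(" ")
        if key in allowed:
            accepted.append((key, value.strip()))
        else:
            rejected.append(key)
    return accepted, rejected
```

A per-user file would be loaded with `trusted=False`, so a `rule` line in it is simply refused. Nothing in the file gets evaluated, so there is no interpreter for an attacker to poke at.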

Usability

Here’s a rather hairy configuration file I’ve concocted.

    #! /usr/bin/somelanguage
    !$ app.status load html
    !c = []
    ;c['sources'] = < >
    ;c['sources'].append(
        NewConfigurationThingy("foo_bar",
            baz="flargle"))
    ;c['builders'] = < >
    ;c['bots'] = < >
    !$ app.steps load source, shell
    ;bf_mc_generic = factory.SomethingFactory( <
        woo(source.SVN, svnurl="http://example.com/foo/bar"),
        woo(shell.Configure, command="/bar/baz start"),
        woo(shell.Test, command="/bar/baz test"),
        woo(shell.Configure, command="/bar/baz stop")
        > );
    ;b1 = < "name": "mc-fast", "slavename": "mc-fast",
                 "builddir": "mc-fast", "factory": ;bf_mc_generic >
    ;c['builders'].append(;b1)
    ;SomethingOrOther = ;c

This isn’t entirely concocted from thin air: it’s bits of our BuildBot configuration file, from before we switched to using Hudson. I’ve replaced the familiar Python syntax with deliberately-unfamiliar made-up syntax, to emulate the user experience I had attempting to configure BuildBot with no pre-existing Python knowledge. ;)

Compare with this re-stating of the same configuration data in a simplified, "configuration-oriented" imaginary DSL:

    add_source NewConfigurationThingy foo_bar baz=flargle

    buildfactory bf_mc_generic source.SVN http://example.com/foo/bar
    buildfactory bf_mc_generic shell.Configure /bar/baz start
    buildfactory bf_mc_generic shell.Test /bar/baz test
    buildfactory bf_mc_generic shell.Configure /bar/baz stop

    add_builder name=mc-fast slavename=mc-fast
         builddir=mc-fast factory=bf_mc_generic

Essentially, I’ve extracted the useful configuration data from the hairy example, discarded the symbology used to indicate types, function calls and data structure construction, and let the configuration domain knowledge imply what’s necessary. Not only is this easier to comprehend for the casual reader, it also reduces the risk of syntax errors, by simply minimising the number of syntactical components.

See Also

The Wikipedia page on DSLs is quite good on the topic, with a succinct list of pros and cons.

This StackOverflow thread has some good comments — I particularly like this point:

When you need your application to be very "configurable" in ways that you cannot imagine today, then what you really need is a plugins system. You need to develop your application in a way that someone else can code a new plugin and hook it into your application in the future.

+1.
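For what it’s worth, a plugin system in that sense can start out very small. This Python sketch is one hypothetical shape for it (the `register` and `run_hook` names and the `format_subject` hook are invented for illustration): the application defines named hook points, and third-party code attaches functions to them.

```python
# Hypothetical plugin registry: hook name -> list of registered functions.
_PLUGINS = {}


def register(hook):
    """Decorator that attaches a function to a named hook."""
    def decorator(func):
        _PLUGINS.setdefault(hook, []).append(func)
        return func
    return decorator


def run_hook(hook, value):
    """Pass value through every plugin registered for hook, in order."""
    for func in _PLUGINS.get(hook, []):
        value = func(value)
    return value


# A plugin author would write something like this in their own module.
@register("format_subject")
def tag_subject(subject):
    return "[tagged] " + subject
```

The behaviour that can’t be anticipated lives in code that is clearly code, while the configuration file itself can stay declarative.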

This seems to be a controversial topic — as you can see, that page has people on both sides of the issue. Maybe it fundamentally comes down to a matter of taste. Anyway — my $.02.

Update: discussions elsewhere: HackerNews

Another Update, 2012-04-06: Robey Pointer wrote a post called Why Config?, in which he describes a Scala-based configuration language in use at Twitter, which uses Scala’s runtime code evaluation, and a Scala trait, to express configuration succinctly in a Scala source file and load it at runtime. The downside? It’s a Scala source file, executed at runtime, containing configuration. :(

However, this comment in the comments section is worth a read:

At Netli (now part of Akamai) we had a configuration framework very similar in spirit and appearance to Configgy. It was in the early 2000s; we have open sourced it since (http://ncnf.sourceforge.net/). It would provide on-the-fly reload for the C-based programs (the ncnf is a C library). It also had some perks like attribute inheritance and a concept of block references. Most importantly though, it contained a separate schema language and a validator to allow configuration to be checked before pushing to production. At Netli we used it to configure 1200 services on over 400 hardware boxes, the configuration becoming about 20+mb in length (assembled from several pieces by the CPP, then M4 templating library).

Naturally, it wasn’t Netli’s first attempt at doing configuration. One of the first attempts failed since it was Turing-complete. That approach was to specify the configuration as a Perl data specification. In a very short time the lure of unused expressiveness of such Turing-complete environment prevailed and people started to write for-loops around data pieces and doing other tricks to remove redundancy from the configuration. It turned out to be a disaster in the end, with configuration becoming unmaintainable and flaky.

One principle I got out of that exercise is that configuration shall not be Turing-complete. We’ve got burned specifically by that property far too many times. Yet I do agree with you that a validation facility is a must-have, which is something not usually part of the simple text-based frameworks. C-based NCNF had it almost from the very beginning though, and it proved to be a very useful harness.

+1. There’s lots more info on that system at this post at lionet.livejournal.com.

Another Update, 2017-05-09: casio_juarez on Twitter:

Also related: The Configuration Complexity Clock.

(Image credit: Turn The Dial by VERY URGENT Photography)

Links for 2011-02-16

Links for 2011-02-09

Irish Times “Most Read” Article Feed

If you visit the Irish Times at all frequently, you’ll probably have noticed a nifty "wisdom of crowds" feature in the right sidebar: the list of "most read" articles. It’s quite good, since they’re often very interesting articles. Unfortunately, there’s no RSS feed for this feature.

Well, now there is:

Links for 2011-02-04

Links for 2011-02-02

Links for 2011-01-17