Justin's Linklog Posts

Against The Use Of Programming Languages in Configuration Files

It’s pretty common for apps to require "configuration" — external files which can contain settings to customise their behaviour. Ideally, apps shouldn’t require configuration, and this is always a good aim. But in some situations, it’s unavoidable.

In the abstract, it may seem attractive to use a fully-fledged programming language as the language to express configuration in. However, I think this is not a good idea. Here are some reasons why configuration files should not be expressed in a programming language (and yes, I include "Ruby without parentheses" in that bucket):

Provability

If a configuration language is Turing-incomplete, configuration files written in it can be validated "offline", i.e. without executing the program it configures. General-purpose programming languages are Turing-complete, so in general the only way to find out what a configuration file written in one will produce is to run it. In practice, that means the program must be executed before its configuration can be considered valid.

Offline validation is a useful feature for operational usability, as we’ve found with "spamassassin --lint".
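To make the idea of offline validation concrete, here is a minimal sketch of a lint pass over a simple "name value" configuration format. It is not how SpamAssassin implements its lint mode; the setting names in `SCHEMA` and the `lint` function are made up for illustration. The point is only that a declarative file can be checked line by line without starting the application it configures.

```python
# Hypothetical schema: setting name -> converter that raises ValueError
# when the value is not acceptable. These names are invented for the example.
SCHEMA = {
    "max_connections": int,
    "timeout_seconds": float,
    "log_file": str,
}


def lint(text):
    """Return a list of (line_number, message) problems.

    Nothing here executes the configured application; the file is only
    parsed and checked against SCHEMA.
    """
    problems = []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        parts = line.split(None, 1)
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if name not in SCHEMA:
            problems.append((lineno, f"unknown setting: {name}"))
        elif not value:
            problems.append((lineno, f"missing value for {name}"))
        else:
            try:
                SCHEMA[name](value)
            except ValueError:
                problems.append((lineno, f"bad value for {name}: {value!r}"))
    return problems


if __name__ == "__main__":
    sample = "max_connections ten\nmax_conections 5\ntimeout_seconds 2.5\n"
    for lineno, message in lint(sample):
        print(f"line {lineno}: {message}")
```

Because every line is either a known setting with a checkable value or an error, the whole file can be accepted or rejected before anything is deployed.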

Security

Some configuration settings may be insecure in certain circumstances; for example, in SpamAssassin, we allow certain classes of settings like whitelist/blacklists to be set in a user’s ~/.spamassassin/user_prefs file, while disallowing rule definitions (which can cause poor performance if poorly written).

If your configuration file is simply an evaluated chunk of code, it becomes more difficult to protect against an attacker introspecting the interpreter and overriding the security limitations. It’s not impossible, since you can, for instance, use a sandboxed interpreter, but this is typically not particularly easy to implement.
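By contrast, when the configuration is plain data, restricting what an unprivileged file may set is a simple lookup. The following is a rough sketch of that idea, not SpamAssassin's actual implementation: the directive names, the two privilege sets, and the `load` function are all assumptions chosen for the example.

```python
# Hypothetical directive names and privilege sets, for illustration only.
USER_ALLOWED = {"whitelist_from", "blacklist_from"}
ADMIN_ONLY = {"rule"}  # stands in for rule definitions


def load(text, privileged):
    """Parse "name value" lines, enforcing the privilege level of the file.

    Returns (settings, rejected): accepted (name, value) pairs, and
    (line_number, name) pairs for directives this file may not use.
    """
    allowed = USER_ALLOWED | ADMIN_ONLY if privileged else USER_ALLOWED
    settings, rejected = [], []
    for lineno, raw in enumerate(text.splitlines(), start=1):
        parts = raw.split("#", 1)[0].split(None, 1)
        if not parts:
            continue  # blank line or comment
        name = parts[0]
        value = parts[1].strip() if len(parts) > 1 else ""
        if name in allowed:
            settings.append((name, value))
        else:
            rejected.append((lineno, name))
    return settings, rejected


if __name__ == "__main__":
    user_prefs = "whitelist_from friend@example.com\nrule EXPENSIVE /(a+)+$/\n"
    print(load(user_prefs, privileged=False))
    print(load(user_prefs, privileged=True))
```

There is no interpreter for an attacker to introspect here: a directive outside the allowed set is never acted on, whatever else the file contains.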

Usability

Here’s a rather hairy configuration file I’ve concocted.

    #! /usr/bin/somelanguage
    !$ app.status load html
    !c = []
    ;c['sources'] = < >
    ;c['sources'].append(
        NewConfigurationThingy("foo_bar",
            baz="flargle"))
    ;c['builders'] = < >
    ;c['bots'] = < >
    !$ app.steps load source, shell
    ;bf_mc_generic = factory.SomethingFactory( <
        woo(source.SVN, svnurl="http://example.com/foo/bar"),
        woo(shell.Configure, command="/bar/baz start"),
        woo(shell.Test, command="/bar/baz test"),
        woo(shell.Configure, command="/bar/baz stop")
        > );
    ;b1 = < "name": "mc-fast", "slavename": "mc-fast",
                 "builddir": "mc-fast", "factory": ;bf_mc_generic >
    ;c['builders'].append(;b1)
    ;SomethingOrOther = ;c

This isn’t entirely concocted from thin air: it’s actually bits of our BuildBot configuration file, from before we switched to using Hudson. I’ve replaced the familiar Python syntax with deliberately-unfamiliar made-up syntax, to emulate the user experience I had attempting to configure BuildBot with no pre-existing Python knowledge. ;)

Compare with this re-stating of the same configuration data in a simplified, "configuration-oriented" imaginary DSL:

    add_source NewConfigurationThingy foo_bar baz=flargle

    buildfactory bf_mc_generic source.SVN http://example.com/foo/bar
    buildfactory bf_mc_generic shell.Configure /bar/baz start
    buildfactory bf_mc_generic shell.Test /bar/baz test
    buildfactory bf_mc_generic shell.Configure /bar/baz stop

    add_builder name=mc-fast slavename=mc-fast
         builddir=mc-fast factory=bf_mc_generic

Essentially, I’ve extracted the useful configuration data from the hairy example, discarded the symbology used to indicate types, function calls, and data structure construction, and let the configuration domain knowledge imply what’s necessary. Not only is this easier for the casual reader to comprehend, it also reduces the risk of syntax errors, simply by minimising the number of syntactical components.
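A format like this is also cheap to parse. As a sketch only, since the DSL is imaginary and has no real grammar, here is one way it could be read in Python. I'm assuming three rules that the example merely suggests: the first word of a line is the directive, `key=value` words are options, and an indented line continues the previous directive. The `parse` function is hypothetical.

```python
def parse(text):
    """Parse the imaginary DSL into a list of (directive, args, options).

    Assumed rules: the first word of an unindented line is the directive,
    "key=value" words become options, other words become positional args,
    and an indented line continues the previous directive.
    """
    entries = []
    for raw in text.splitlines():
        if not raw.strip():
            continue  # skip blank lines
        words = raw.split()
        if raw[0].isspace() and entries:
            # Continuation line: keep filling in the previous entry.
            directive, args, options = entries[-1]
        else:
            directive, args, options = words.pop(0), [], {}
            entries.append((directive, args, options))
        for word in words:
            key, sep, value = word.partition("=")
            if sep:
                options[key] = value
            else:
                args.append(word)
    return entries


if __name__ == "__main__":
    config = (
        "add_source NewConfigurationThingy foo_bar baz=flargle\n"
        "\n"
        "buildfactory bf_mc_generic shell.Test /bar/baz test\n"
        "\n"
        "add_builder name=mc-fast slavename=mc-fast\n"
        "     builddir=mc-fast factory=bf_mc_generic\n"
    )
    for entry in parse(config):
        print(entry)
```

The parser knows nothing about builders or factories; the application would attach meaning to each directive afterwards, which is also the natural place to reject anything it doesn't recognise.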

See Also

The Wikipedia page on DSLs is quite good on the topic, with a succinct list of pros and cons.

This StackOverflow thread has some good comments — I particularly like this point:

When you need your application to be very "configurable" in ways that you cannot imagine today, then what you really need is a plugins system. You need to develop your application in a way that someone else can code a new plugin and hook it into your application in the future.

+1.

This seems to be a controversial topic — as you can see, that page has people on both sides of the issue. Maybe it fundamentally comes down to a matter of taste. Anyway — my $.02.

Update: discussions elsewhere: HackerNews

Another Update, 2012-04-06: Robey Pointer wrote a post called Why Config?, in which he describes a Scala-based configuration language in use at Twitter, which uses Scala’s runtime code evaluation, and a Scala trait, to express configuration succinctly in a Scala source file and load it at runtime. The downside? It’s a Scala source file, executed at runtime, containing configuration. :(

However, this comment in the comments section is worth a read:

At Netli (now part of Akamai) we had a configuration framework very similar in spirit and appearance to Configgy. It was in early 2000-s, we open sourced it since. (http://ncnf.sourceforge.net/). It would provide on-the-fly reload for the C-based programs (the ncnf is a C library). It also had some perks like attribute inheritance and a concept of block references. Most importantly though, it contained a separate schema language and a validator to allow configuration to be checked before pushing in production. At Netli we used it to configure 1200 services on over 400 hardware boxes, the configuration becoming about 20+mb in length (assembled from several pieces by the CPP, then M4 templating library).

Naturally, it wasn’t Netli’s first attempt at doing configuration. One of the first attempts failed since it was Turing-complete. That approach was to specify the configuration as a Perl data specification. In a very short time the lure of unused expressiveness of such Turing-complete environment prevailed and people started to write for-loops around data pieces and doing other tricks to remove redundancy from the configuration. It turned out to be a disaster in the end, with configuration becoming unmaintainable and flaky.

One principle I got out of that exercise is that configuration shall not be Turing-complete. We’ve got burned specifically by that property far too many times. Yet I do agree with you that a validation facility is a must-have, which is something not usually part of the simple text-based frameworks. C-based NCNF had it almost from the very beginning though, and it proved to be a very useful harness.

+1. There’s lots more info on that system at this post at lionet.livejournal.com.

Another Update, 2017-05-09: casio_juarez on Twitter:

Also related: The Configuration Complexity Clock.

(Image credit: Turn The Dial by VERY URGENT Photography)

Irish Times “Most Read” Article Feed

If you visit the Irish Times at all frequently, you’ll probably have noticed a nifty "wisdom of crowds" feature in the right sidebar: the list of "most read" articles. It’s quite good, since they’re often very interesting articles. Unfortunately, there’s no RSS feed for this feature.

Well, now there is:

Links for 2010-12-16

  • opendata.ie : ‘to help citizens access to high value, machine readable datasets generated by the Irish Government and public sector authorities; to improve access to the Irish Government data and to establish an innovative platform that can demonstrate to government how and why they should share data’
    (tags: open data ireland open-data open-source free datasets)

  • RunwayFinder shut down by patent trolls : “While we appreciate your offer to shut down the website to stop future infringement, we notice that your website is still operation. And without further information from you, our only means to assess the potential damages is the observation that your website had 22,256 unique visitors in July 2010. Each visit represents a potential lost sale of our client’s patented invention at $149 per sale. This damage calculation exceeds $3.2 million per month in lost revenue.”
    (tags: patents swpats patent-trolls flightprep runwayfinder aviation web law)

  • The Background Dope on DHS Recent Seizure of Domains : according to this, the US Dept of Homeland Security is “seizing” domains through a back-channel to Verisign, since they directly control the .com TLD’s nameservers. Expect to see dodgy sites start using non-US TLDs, names in multiple TLDs a la Pirate Bay, and eventually IPs instead of DNS records
    (tags: tlds dns security dhs seizure domains cctlds filesharing icann immixgroup)

Links for 2010-12-13

  • Accentuate.us : ‘We are proud to announce the free and open-source Accentuate.us, a new method of input for over 100 languages that uses statistical reasoning so that users can type effortlessly in plain ASCII while ultimately producing accurate text. This allows Vietnamese users, for example, to simply type “Moi nguoi deu co quyen tu do ngon luan va bay to quan diem,” which will be automatically corrected to “Mọi người đều có quyền tự do ngôn luận và bầy tỏ quan điểm” after Accentuation. To date, we support four clients: Mozilla Firefox, Perl, Python, and Vim, with more to be added shortly.’ cool
    (tags: accents language web-services typing text-entry ascii unicode characters)

  • The Day MAME Saved My Ass : ‘Publishers would have people believe that MAME and the emulation scene is the root of all evil, that it promotes piracy and ultimately hurts the poor, starving developers slaving away on the game. Not only is this claim patently false, it ignores the fact that many developers use things like MAME, mod chips, and homebrew development utilities to help us overcome the day-to-day frustrations caused by the people behind the real problems in our industry.’
    (tags: mame games coding legal spy-hunter emulation rips takedowns)

  • Digital Socket Awards : ‘We’d like you to nominate the longlist of best music of 2010 on www.digitalsocketawards.com. From this, 26 blogger judges from towns and cities all over Ireland will each score their top choices to reach a shortlist of three finalists in each category. The winners will be announced on 3 February 2011 at a live event in Dublin’s Grand Social.’
    (tags: blogs blogging irishblogs music mp3 mp3blogs ireland awards)

I made a sled

Facing yet another day of being snowed in, with Dublin’s icy roads and footpaths driving us all stir crazy, I came up with this:

More pics, vid — fun!

Links for 2010-12-01

  • Barry Eichengreen on the Irish bailout : ‘The Irish “program” solves exactly nothing – it simply kicks the can down the road. A public debt that will now top out at around 130 per cent of GDP has not been reduced by a single cent. The interest payments that the Irish sovereign will have to make have not been reduced by a single cent, given the rate of 5.8% on the international loan. After a couple of years, not just interest but also principal is supposed to begin to be repaid. Ireland will be transferring nearly 10 per cent of its national income as reparations to the bondholders, year after painful year. This is not politically sustainable, as anyone who remembers Germany’s own experience with World War I reparations should know. A populist backlash is inevitable.’
    (tags: ireland economy bailout eu euro)

  • Video: Robots Explain The Irish Economic Crisis : Pretty good explanation, actually
    (tags: news ireland robots youtube debt eu politics economy)

Science Gallery Xmas Cards

The Dublin Science Gallery Greeting Cards are excellent!

Get ’em here, or pick up one of the great gadgets and gifts they have in stock.

(disclaimer: I am mates with the designer and the guy who runs the shop — but I still think they’re great work, regardless ;)