Justin's Linklog Posts

Links for 2012-12-03

  • Scoop! The inside story of the news website that saved the BBC

    The Register’s take on the early days of www.bbc.co.uk. Lots of politics, unsurprisingly.

    Fifteen years ago this month the BBC launched its News Online website. Developed internally with a skeleton team, the web service rapidly became the face of the BBC on the internet, and its biggest success story – winning four successive BAFTA awards. Remarkably, it operated at a third of the cost of rival commercial online news operations – unheard of in public-sector IT projects. Devised before there were really any content management systems, the technical architecture became a template for all major news systems, and one that’s still in use today. The team endured some furious internal politicking and sabotage to survive.

    (tags: bbc news history web uk the-register)

  • Irish mobile phone companies: still spammy

    ‘Pro tip: if you’re going to spam, try not to spam the DPC’s Director of Investigations.’ — lolz

    (tags: funny oh-dear three hutchinson ireland mobile spam dpc law)

  • Hamming weight

    Wikipedia page.

    The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used. It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a string of bits, this is the number of 1’s in the string. In this binary case, it is also called the population count, popcount or sideways sum. It is the digit sum of the binary representation of a given number.
    Contains an efficient algorithm to compute this for a given long value, by ‘adding counts in a tree pattern.’
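
    A minimal sketch of that tree-pattern idea for a 64-bit value, in Python rather than the C on the Wikipedia page. The function name and the masking of the input to 64 bits are assumptions made for this example. Each step adds neighbouring counts together, doubling the field width until one field holds the total.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def popcount64(x: int) -> int:
    """Count the 1 bits in a 64-bit word by adding counts in a tree pattern."""
    x &= MASK64  # treat the input as an unsigned 64-bit word
    x = (x & 0x5555555555555555) + ((x >> 1) & 0x5555555555555555)    # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)    # 4-bit sums
    x = (x & 0x0F0F0F0F0F0F0F0F) + ((x >> 4) & 0x0F0F0F0F0F0F0F0F)    # 8-bit sums
    x = (x & 0x00FF00FF00FF00FF) + ((x >> 8) & 0x00FF00FF00FF00FF)    # 16-bit sums
    x = (x & 0x0000FFFF0000FFFF) + ((x >> 16) & 0x0000FFFF0000FFFF)   # 32-bit sums
    x = (x & 0x00000000FFFFFFFF) + ((x >> 32) & 0x00000000FFFFFFFF)   # final total
    return x
```

    For checking, `bin(x).count("1")` gives the same answer for any non-negative value below 2**64.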

    (tags: algorithms hamming-distance bits hamming weight binary)

  • Efficient concurrent long set and map

    An ordered set and map data structure and algorithm for long keys and values, supporting concurrent reads by multiple threads and updates by a single thread.
    Some good stuff in the linked blog posts about Clojure’s PersistentHashMap and PersistentVector data structures, too.

    (tags: arrays java tries data-structures persistent clojure concurrent set map)

Links for 2012-11-28

  • The Rise And Fall Of The Obscure Music Download Blog: A Roundtable

    One internet music “sharing” trend largely unnoticed by the powers that sue was the niche explosion of obscure music download blogs, lasting roughly from 2004-2008. Using free filesharing services like Rapidshare and Mediafire, and setting up sites on Blogspot and similar providers, these internet hubs stayed hidden in the open by catering to more discerning kleptomaniac audiophiles. Their specialty: parceling out ripped recordings — many of them copyrighted — from the more collectible and unknown corners of music’s oddball, anomalous past. While the RIAA was suing dead people for downloading Michael Jackson songs (and Madonna was using Soulseek to curse at teenagers), obscure music blogs racked up millions of hits, ripping and sharing 80s Japanese noise, 70s German prog, 60s San Francisco hippie freak-outs, 50s John Cage bootlegs, 30s gramophone oddities, Norwegian death metal, cold wave cassettes made by kids in their garages, and the like. It was the mid aughts, and the advent of digitization had inadvertently put the value of the music industry’s “Top Ten” commercial product in peril. That same process transformed the value of old, collectible music as well. If one smart record collector was able to share the entire contents—music, artwork and all—of one vinyl LP on his blog, for free, and upload another item from his 1,000+ collection the next day, for weeks and years, and others like him did the same, competing with each other about who could upload the rarest and most sought-after record, and anyone who downloaded it could then share it again and again… Suddenly everyone in the world had the coolest record collection in the world; and soon, nobody in the world had the coolest record collection in the world. Obscure music download blogs weren’t shut down like Napster or Megaupload were (though they were indirectly affected by that crackdown); they just, mysteriously, seemed to burn out on their own sometime around 2008. 
    While some are still around, their number represents only a fraction of that mid-00s heyday. Was this because obscure music blogs had overshared the underexposed and blown the whole thing into oblivion? Is the fact that a guy in Japan will no longer pay $500 on eBay for a first pressing of the No New York compilation because he can find it for free on the internet good for the world? Was the commodity-lost but the knowledge-gained an even exchange? To explore what was going on then, I assembled this email roundtable discussion between creators of some of the most popular blogs of the time: Eric Lumbleau of Mutant Sounds, Liam Elms of 8 Days in April, Frank of Systems of Romance and Brian Turner, Music Director of WFMU.
    (via Loreana Rushe)

    (tags: music mp3 blogs obscure via-loreana-rushe history 2000s)

Links for 2012-11-26

  • Conor’s 2012 Raspberry Pi Christmas Gift Guide

    Ah, memories! Wish my kiddies were old enough for one of these…

    I really think this Christmas could be a lovely replay of 1982 for a lot of people, like me, who got their first home computer that year. You could have so much fun on Christmas Day messing with the RPi rather than falling asleep in front of the fire. Just don’t fight over who gets the telly when Doctor Who is on. Whilst the bare-bones nature of the Raspberry Pi is wonderful, it is unusable out of the box unless you are a house with smartphones, digital cameras and existing PCs already that you can raid for components. What you want to avoid is a repeat of me that December in 1982 with my brand-new 16K ZX Spectrum which didn’t work on our Nordmende TV until two weeks later when the RTV Rentals guy came and replaced the TV Tuner. Two weeks typing Beep 1,2 to make sure it wasn’t broken.

    (tags: raspberry-pi gifts computers kids hacking education gadgets christmas)

  • Nintendo’s work on Miiverse Penis Drawing Detection

    ‘The unique feature of the Miiverse is being able to send drawings, not just text. But since the advent of the internet, there have always been those who have used it for unsavory purposes.’
    ‘Motoyama: we never had such a problem with our Hatena services. But, when we brought Hatena Flipnote to the West, we were caught off-guard by the amount of penises drawn by people.
    Kurisu: So the team and I had to come up with a way to create a system that auto-detects those types of pictures. […]’
    ‘Motoyama: After a week, we made very good progress on the system. Then we tested the system with Nintendo of America and told them to start drawing. It went horribly.
    Kurisu: What we learned is that people enjoy drawing penises. Multiple ones. (laughs) The system was not prepared to handle that.’
    See also the “time-to-penis” metric in MMO games: http://www.joystiq.com/2009/03/24/overheard-gdc09-ttp-time-to-penis/

    (tags: nintendo image-detection ttp metrics games gaming mmo miiverse drawing)

  • The trench talk that is now entrenched in the English language

    ‘From cushy to crummy and blind spot to binge drink, a new study reveals the impact the First World War had on the English language and the words it introduced.’ Incredible comments, too…

    (tags: english etymology history wwi great-war via:sinead-gleeson words language)

  • Special encoding of small aggregate data types in Redis

    Nice performance trick in Redis on hash storage: ‘In theory in order to guarantee that we perform lookups in constant time (also known as O(1) in big O notation) there is the need to use a data structure with a constant time complexity in the average case, like a hash table. But many times hashes contain just a few fields. When hashes are small we can instead just encode them in an O(N) data structure, like a linear array with length-prefixed key value pairs. Since we do this only when N is small, the amortized time for HGET and HSET commands is still O(1): the hash will be converted into a real hash table as soon as the number of elements it contains grows too much (you can configure the limit in redis.conf). This works well not just from the point of view of time complexity, but also from the point of view of constant times, since a linear array of key value pairs happens to play very well with the CPU cache (it has a better cache locality than a hash table).’
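
    A rough sketch of the idea in Python, not Redis’s actual code. The class name, method names and the `max_flat_entries` threshold are all made up for illustration, and Redis’s real small encoding is a packed byte buffer, which a Python list only approximates.

```python
class SmallHash:
    """Hypothetical sketch: a flat list of pairs while small, a dict once it grows."""

    def __init__(self, max_flat_entries: int = 4):
        self.max_flat_entries = max_flat_entries  # illustrative limit, not a Redis default
        self._flat = []      # [(key, value), ...], scanned linearly
        self._table = None   # becomes a dict after conversion

    @property
    def is_flat(self) -> bool:
        return self._table is None

    def hset(self, key, value) -> None:
        if self._table is not None:
            self._table[key] = value
            return
        for i, (existing_key, _) in enumerate(self._flat):
            if existing_key == key:
                self._flat[i] = (key, value)  # overwrite in place
                return
        self._flat.append((key, value))
        if len(self._flat) > self.max_flat_entries:
            # Too many fields for a linear scan: convert once to a real hash table.
            self._table = dict(self._flat)
            self._flat = []

    def hget(self, key, default=None):
        if self._table is not None:
            return self._table.get(key, default)
        for existing_key, value in self._flat:  # O(N), but N is small here
            if existing_key == key:
                return value
        return default
```

    Callers only ever see `hset` and `hget`; the switch of representation happens behind them, once, when the field count passes the limit.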

    (tags: memory redis performance big-o hash-tables storage coding cache arrays)

  • HTTP Error 403: The service you requested is restricted – Vodafone Community

    Looks like Vodafone Ireland are failing to scale their censorware; clients on their network are reporting “HTTP Error 403: The service you requested is restricted”. According to a third-party site, this error is produced by the censorship software they use when it’s insufficiently scaled for demand:

    “When you try to use HTTP, Vodafone route a request to their authentication server to see if your account is allowed to connect to the site. By default they block a list of adult/premium web sites (this is a service you can have switched on or off with your account). The problem is that at busy times this validation service is overloaded, so their systems get no response as to whether the site is allowed, assume the site you asked for is restricted and give the 403 error. Once this happens you seem to have to make a new 3G data connection (reset the phone, move cell or let the connection time out) to get it to try again.”
    Sample: http://pic.twitter.com/N1lAwBjW
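
    The failure mode described there is a “fail closed” default: no answer from the validation service is treated as “restricted”. A tiny sketch of that decision follows. The function names and the `fail_open` switch are invented for illustration and say nothing about how Vodafone’s systems are actually built.

```python
from typing import Callable, Optional


def status_for_request(url: str,
                       check_allowed: Callable[[str], Optional[bool]],
                       fail_open: bool = False) -> int:
    """Return the HTTP status a filtering proxy would give for url."""
    verdict = check_allowed(url)  # True, False, or None if the check never answered
    if verdict is None:
        # No answer from the validation service: fall back to the configured default.
        verdict = fail_open
    return 200 if verdict else 403


def overloaded_service(url: str) -> Optional[bool]:
    """Stand-in for a validation service that times out under load."""
    return None
```

    With the default `fail_open=False`, every request made while the service is overloaded gets a 403, which matches the reported symptom.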

    (tags: scaling ireland vodafone fail censorware scalability customer-service)

Links for 2012-11-24

Links for 2012-11-23

  • IBM insider: How I caught my wife while bug-hunting on OS/2 • The Register

    Wow, working for IBM in the ’80s was truly shitty.

    ‘IBM HR came up with a plan that summed up the department’s view of tech staff: a dinner dance. In Southsea. For our non-British readers this is not a glamorous location. As a scumbag contractor I wasn’t invited, but since I was dating one of the seven women on the project, I went anyway and was impressed by the way IBM had tried so very hard to make the inside of a municipal leisure centre look like Hawaii. This is so crap that the integrity checks I’ve installed to watch myself for incipient senility keep flagging it as a false memory. The only way I can force myself to believe the idea that the richest corporation on the planet behaved that way is that the girl who took me is now a reassuringly expensive lawyer who was kind enough to marry me and so we have photographic evidence. (I wish to make it clear that I’m not saying IBM had the worst HR of any firm in the world, merely that my 28 years in technology and banking have never exposed a worse one to me.)’
    And indeed, so were MS:
    ‘We, on the other hand, were regarded as hopelessly bureaucratic. After Microsoft lost the source code for the actual build of OS/2 we shipped, I reported a bug triggered when you double-clicked on Chkdsk twice: the program would fire up twice and both would try to fix the disk at the same time, causing corruption. I noted that this “may not be consistent with the user’s goals as he sees them at this time”. This was labelled a user error, and some guy called Ballmer questioned why I had this “obsession” with perfect code.’
    (thanks, Conor!)

    (tags: via:conor-delaney os2 ibm microsoft work 1980s pc uk steve-ballmer)

Links for 2012-11-21

Links for 2012-11-19

  • drip

    Unlike other tools intended to solve the JVM startup problem (e.g. Nailgun, Cake), Drip does not use a persistent JVM. There are many pitfalls to using a persistent JVM, which we discovered while working on the Cake build tool for Clojure. The main problem is that the state of the persistent JVM gets dirty over time, producing strange errors and requiring liberal use of cake kill whenever any error is encountered, just in case dirty state is the cause. Instead of going down this road, Drip uses a different strategy. It keeps a fresh JVM spun up in reserve with the correct classpath and other JVM options so you can quickly connect and use it when needed, then throw it away. Drip hashes the JVM options and stores information about how to connect to the JVM in a directory with the hash value as its name.
    (via HN)
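
    The last step, a directory named after the hash of the JVM options, could look something like this. It is a sketch only: the quote doesn’t say which hash function Drip uses, so SHA-1 is an assumption, and `reserve_dir` and the base path are made-up names.

```python
import hashlib
from pathlib import Path
from typing import Sequence


def reserve_dir(base: Path, jvm_args: Sequence[str]) -> Path:
    """Map a list of JVM options to a directory named after their hash."""
    # Join with NUL so ["-cp a", "b"] and ["-cp", "a b"] hash differently.
    joined = "\0".join(jvm_args).encode("utf-8")
    digest = hashlib.sha1(joined).hexdigest()  # SHA-1 is an assumption, see above
    return base / digest


# Hypothetical base directory, for illustration only.
print(reserve_dir(Path("/tmp/jvm-reserve"), ["-cp", "app.jar", "-Xmx512m"]))
```

    Two invocations with identical options land on the same directory and can share a reserved JVM; any change in the options gives a different directory and therefore a fresh JVM.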

    (tags: java command-line tools startup speed)

Links for 2012-11-14

Links for 2012-11-08

Links for 2012-10-31

  • The Future of Markdown

    ‘I’d really prefer not to fork the language; I’d much rather collectively help carry the banner of Markdown forward into the future, with the blessing of John Gruber and in collaboration with other popular sites that use Markdown. So… who’s with me?’

    (tags: markdown markup html web standards)

Links for 2012-10-28

  • SipHash: a fast short-input PRF

    a family of pseudorandom functions optimized for short inputs. Target applications include network traffic authentication and hash-table lookups protected against hash-flooding denials-of-service attacks. SipHash is simpler than MACs based on universal hashing, and faster on short inputs. Compared to dedicated designs for hash-table lookup, SipHash has well-defined security goals and competitive performance. For example, SipHash processes a 16-byte input with a fresh key in 140 cycles on an AMD FX-8150 processor, which is much faster than state-of-the-art MACs.
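
    For the curious, here is a plain-Python sketch of SipHash-2-4 (2 compression rounds per 8-byte word, 4 finalization rounds), written from the published algorithm description. It is for reading, not for production use: it is slow, and the function names are my own.

```python
import struct

MASK = 0xFFFFFFFFFFFFFFFF


def _rotl(x: int, b: int) -> int:
    """Rotate a 64-bit value left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK


def _sipround(v0, v1, v2, v3):
    v0 = (v0 + v1) & MASK
    v1 = _rotl(v1, 13) ^ v0
    v0 = _rotl(v0, 32)
    v2 = (v2 + v3) & MASK
    v3 = _rotl(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK
    v3 = _rotl(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK
    v1 = _rotl(v1, 17) ^ v2
    v2 = _rotl(v2, 32)
    return v0, v1, v2, v3


def siphash24(key: bytes, data: bytes) -> int:
    """SipHash-2-4 of data under a 16-byte key, returned as a 64-bit integer."""
    if len(key) != 16:
        raise ValueError("key must be exactly 16 bytes")
    k0, k1 = struct.unpack("<QQ", key)  # two little-endian 64-bit halves
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    full = len(data) - len(data) % 8
    words = [int.from_bytes(data[i:i + 8], "little") for i in range(0, full, 8)]
    # Final word: leftover bytes, with the message length (mod 256) in the top byte.
    words.append(int.from_bytes(data[full:], "little") | ((len(data) & 0xFF) << 56))

    for m in words:
        v3 ^= m
        for _ in range(2):
            v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
        v0 ^= m

    v2 ^= 0xFF
    for _ in range(4):
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3
```

    The key is what protects a hash table against flooding: an attacker who doesn’t know it can’t precompute a set of colliding inputs. Any implementation like this one should be checked against the test vector in the paper’s appendix before being trusted.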

    (tags: hashing siphash djb security algorithms)

Links for 2012-10-27

Links for 2012-10-26

Flood of posts

Sorry for the flood of recent posts — turns out my cron job to gateway from Pinboard had stopped running due to cron fail. (I should really set up some monitoring someday ;)

Links for 2012-10-25

Links for 2012-10-24

Links for 2012-10-12

  • ElementCostInDataStructures

    “The cost per element in major data structures offered by Java and Guava (r11).” A very useful reference!

    Ever wondered what’s the cost of adding each entry to a HashMap? Or one new element in a TreeSet? Here are the answers: the cost per-entry for each well-known structure in Java and Guava. You can use this to estimate the cost of a structure, like this: if the per-entry cost of a structure is 32 bytes, and your structure contains 1024 elements, the structure’s footprint will be around 32 kilobytes. Note that non-tree mutable structures are amortized (adding an element might trigger a resize, and be expensive, otherwise it would be cheap), making the measurement of the “average per element cost” hard, but you can expect that the real answers are close to what is reported below.
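
    The estimate in that quote is just a multiplication. Spelled out in Python, with a made-up function name, and with the 32-byte figure taken from the quote’s own example rather than from any measured structure:

```python
def estimated_footprint_bytes(per_entry_bytes: int, entries: int) -> int:
    """Back-of-the-envelope footprint: per-entry cost times number of entries."""
    return per_entry_bytes * entries


size = estimated_footprint_bytes(32, 1024)
print(f"{size} bytes = {size // 1024} KiB")  # prints "32768 bytes = 32 KiB"
```

    Plug in the per-entry figure from the linked table for the structure you actually use; this ignores the fixed overhead of the structure itself and the size of the keys and values it points to.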

    (tags: java coding guava reference memory cost performance data-structures)

Links for 2012-10-11

Links for 2012-10-08

  • Trident: a high-level abstraction for realtime computation

    built on Storm:

    Trident is a new high-level abstraction for doing realtime computing on top of Twitter Storm, available in Storm 0.8.0. It allows you to seamlessly mix high throughput (millions of messages per second), stateful stream processing with low latency distributed querying. If you’re familiar with high level batch processing tools like Pig or Cascading, the concepts of Trident will be very familiar – Trident has joins, aggregations, grouping, functions, and filters. In addition to these, Trident adds primitives for doing stateful, incremental processing on top of any database or persistence store. Trident has consistent, exactly-once semantics, so it is easy to reason about Trident topologies.

    (tags: distributed realtime twitter storm trident distcomp stream-processing low-latency nathan-marz)

Links for 2012-10-05

  • Cliff Click’s 2008 JavaOne talk about the NonBlockingHashTable

    I’m a bit late to this data structure: highly scalable, nearly lock-free, benchmarks very well (except with the G1 GC): http://edwwang.com/blog/2012/02/10/concurrent-hashmap-benchmark/ . Having said that, it doesn’t cope well with frequently-changing unique keys: http://sourceforge.net/tracker/?func=detail&aid=3563980&group_id=194172&atid=948362 . More background at: http://www.azulsystems.com/blog/cliff/2007-03-26-non-blocking-hashtable and http://www.azulsystems.com/blog/cliff/2007-04-01-non-blocking-hashtable-part-2 . This was used in Cassandra for a while, although I think the above bug may have caused its removal?

    (tags: nonblockinghashtable data-structures hashmap concurrency scaling java jvm)

Links for 2012-10-01

  • Ingenious Dublin

    Excellent stuff, by Mary Mulvihill:

    ‘Where in Dublin can you see a Victorian diving bell? What about the skeleton of Tommy, the prince’s elephant? The site of the world’s first earthquake experiment? Or the world’s sports pirate radio broadcast? Our new e-book Ingenious Dublin has all these fascinating stories and more. It is packed with information, places to visit, and lots of illustrations, and covers the city and county, from Skerries windmills to Ballybetagh’s fossil deer.’
    EUR 4.99 for the Kindle e-book. I’ll buy that!

    (tags: kindle reading books mary-mulvihill science facts dublin ireland history)

Links for 2012-09-28

Links for 2012-09-20

  • Facebook monitoring cache with Claspin

    reasonably nice heatmap viz for large-scale instance monitoring. I like the “snake” pattern for racks

    (tags: facebook monitoring dataviz heatmaps claspin cache memcached ui)

  • The Oireachtas great leap backwards: it’s not just about KildareStreet.com

    ‘it appears that the Oireachtas has decided to save time and money by eliminating entirely the stage in their workflow that parsed raw debates records into XML. This stage has been replaced with a (presumably automated) process that generates web pages from Lotus Notes. It’s easy to see how somebody with little appreciation of the value of providing open public data in a structured format could have viewed this stage as a costly luxury, and its elimination as a simple and obvious “efficiency”. It’s particularly disappointing, however, that nobody in the decision-making process seemed to be aware of how much of a backward step this “efficiency” would represent. As John Handelaar of KildareStreet.com told The Irish Times, “We are replacing 2012 with 1995 overnight”.’

    (tags: kildare-street open-data opengov ireland data oireachtas)

Links for 2012-09-18

  • PCRE Performance Project

    Excellent stuff. Using “sljit”, a stackless platform-independent JIT compiler, this compiles Perl-compatible regular expressions to machine code on ARM, x86, MIPS and PowerPC platforms, resulting in ‘similar matching speed to DFA based engines (like re2) on common patterns’ with Perl compatibility. ‘This work has been released as part of PCRE 8.20 and above. Now (PCRE 8.31), nearly all PCRE features are supported including UTF-8/16 and partial matching.’

    (tags: pcre regexps regex performance optimization jit compilation dfa re2 via:akohli)

Links for 2012-09-15

  • Spanner: Google’s Globally-Distributed Database [PDF]

    Abstract: Spanner is Google’s scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. This API and its implementation are critical to supporting external consistency and a variety of powerful features: non-blocking reads in the past, lock-free read-only transactions, and atomic schema changes, across all of Spanner. To appear in: OSDI’12: Tenth Symposium on Operating System Design and Implementation, Hollywood, CA, October, 2012.

    (tags: database distributed google papers toread pdf scalability distcomp transactions cap consistency)

  • NCBI ROFL: Probably the most horrifying scientific lecture ever

    In 1983, at the Urodynamics Society meeting in Las Vegas, Professor G.S. Brindley first announced to the world his experiments on self-injection with papaverine to induce a penile erection. This was the first time that an effective medical therapy for erectile dysfunction (ED) was described, and was a historic development in the management of ED. The way in which this information was first reported was completely unique and memorable, and provides an interesting context for the development of therapies for ED. I was present at this extraordinary lecture, and the details are worth sharing. Although this lecture was given more than 20 years ago, the details have remained fresh in my mind, for reasons which will become obvious.
    Go on, guess.

    (tags: medicine science funny erectile-dysfunction omgwtf conferences)

  • Yuri Suzuki: London Underground circuit map radio

    Japanese designer yuri suzuki has sent designboom images of his ‘london underground circuit maps’ project developed as part of the designers in residence program at the london design museum, on show until january 13th, 2013. responding to ‘thrift’ as a theme, suzuki’s work explores communication systems in consumer electronics. a printed circuit board (PCB) is used as a precedent for developing a electrical circuit influenced by harry beck’s iconic london underground map diagrams. by strategically positioning certain speaker, resistor and battery components throughout the map, users can visually understand the complex networks associated with electricity and how power is generated within a radio.
    Beautifully done (via jwz.)

    (tags: electronics london art design underground travel yuri-suzuki circuitry)

Links for 2012-09-14

  • The meanings and origins of ‘feck’

    It’s a “minced oath”, apparently:

    ‘Feck is a popular minced oath in Ireland, occupying ground between the ultra-mild expletive flip and the often taboo (but also popular) fuck. It’s strongly associated with Irish speech, and serves a broad range of linguistic purposes that I’ll address briefly in this post.’
    It doesn’t derive from the obvious source:
    So where does the curse, the not-quite-rude word, come from? It’s commonly assumed to stem from its coarser cousin fuck, the simple vowel change undercutting its power and making it more suitable for public expression. But Julian Walker, an educator at the British Library, offers a more roundabout route: “In faith” becomes the improbable “in faith’s kin” shortened to “i’fackins”, which gradually shrinks to “fac” and “feck”.

    (tags: feck swearing ireland irish hiberno-english father-ted etymology cursing)

Links for 2012-09-11

  • Chip and Skim: cloning EMV cards with the pre-play attack

    Worrying stuff from the Light Blue Touchpaper (LBT) team. ATM RNGs are predictable, and can be spoofed by intermediate parties:

    ‘So far we have performed more than 1000 transactions at more than 20 ATMs and a number of POS terminals, and are collating a data set for statistical analysis. We have developed a passive transaction logger which can be integrated into the substrate of a real bank card, which records up to 100 unpredictable numbers in its EEPROM. Our analysis is ongoing but so far we have established non-uniformity of unpredictable numbers in half of the ATMs we have looked at. First, there is an easier attack than predicting the RNG. Since the unpredictable number is generated by the terminal but the relying party is the issuing bank, any intermediate party – from POS terminal software, to payment switches, or a middleman on the phone line – can intercept and superimpose their own choice of UN. Attacks such as those of Nohl and Roth, and MWR Labs show that POS terminals can be remotely hacked simply by inserting a sabotaged smartcard into the terminal.’

    (tags: atm banking security attack prngs spoofing banks chip-and-pin emv smartcards)

Links for 2012-09-07

  • New UK Conservative Party Co-Chair Grant Shapps Founded Google Spamming Business

    Wow. Scummy stuff.

    Shapps founded HowToCorp in 2005, a site that, among other products, pitches the TrafficPaymaster software. The software apparently “scrapes” or copies content from all over the web, from RSS feeds to even sets of search results, to automatically generate pages that probably make little sense to the human visitor but which may pick up some traffic from Google and, in turn, generate clicks on Google AdSense or other ads.
    Google are not happy: On Sunday sources at Google confirmed TrafficPaymaster was in “violation” of its policies and that its search engine’s algorithms had been equipped to drop the ranking of any webpages created using HowToCorp’s software. Officially, Google said it does not comment on individual cases. “We have strict policies in place to ensure web users are presented with useful ads when browsing sites in our content network and to ensure our advertisers reach an engaged audience. If we are alerted to a site which breaks our AdSense policies, we will review it and can remove it from our network.”

    (tags: grant-shapps uk politics tories spammers spamming spinning adsense google spam trafficpaymaster)

  • NunatsiaqOnline 2012-09-06: The First Non-Inuk on the Moon

    No, I am not a conspiracy theorist who believes that Armstrong’s moon landing was faked at some mysterious location in the Nevada desert. Armstrong reached the moon. But his accolades are undeserved because he was not first. All right-thinking Nunavummiut know this, because we know that Inuit regularly visited the moon for centuries. David Iqaqrialu said as much in a heated exchange in the Nunavut legislature on May 6, 2002. We know it was heated because he prefaced his remarks by telling the Speaker, “I am starting to get hot under the collar…” He then went on to say, as reported in Hansard, “…it is not really related to the question that I posed, but this is background material. Inuit had reached the moon quite some time ago during the shamanistic ages, prior to the Americans reaching it with their machines and finding out it wasn’t what they thought it was.”
    (via Dave Walsh)

    (tags: inuit via:daev shaman nunavut neil-armstrong moon space exploration)

Links for 2012-09-06

  • Dublin City contact numbers for potholes, dangerous drivers, illegal parking etc.

    I’m sure these are about as useful as a chocolate teapot, but what the hey

    (tags: dublin parking cycling roads safety potholes reporting)

  • Knots on Mars! (and a few thoughts on NASA’s knots)

    amazing post from the International Guild of Knot Tyers Forum:

    While a few of the folks here are no doubt aware, it might surprise most people to learn that knots tied in cords and thin ribbons have probably traveled on every interplanetary mission ever flown. If human civilization ends tomorrow, interplanetary landers, orbiters, and deep space probes will preserve evidence of both the oldest and newest of human technologies for millions of years. Knots are still used in this high-tech arena because cable lacing has long been the preferred cable management technique in aerospace applications. That it remains so to this day is a testament to the effectiveness of properly chosen knots tied by skilled craftspeople. It also no doubt has a bit to do with the conservative nature of aerospace design and engineering practices. Proven technologies are rarely cast aside unless they no longer fulfill requirements or there is something substantially better available. While the knots used for cable lacing in general can be quite varied — in some cases even a bit idiosyncratic — NASA has in-house standards for the knots and methods used on their spacecraft. These are specified in NASA Technical Standard NASA-STD-8739.4 — Crimping, Interconnecting Cables, Harnesses, and Wiring. As far as I’ve been able to identify in the rover images below, all of the lacings shown are one of two of the several patterns specified in the standard. The above illustration shows the so-called “Spot Tie”. It is a clove hitch topped by two half-knots in the form of a reef (square) knot. In addition to its pure binding role, it is also used to affix cable bundles to tie-down point.
    Some amazing scholarship on knot technology in this post — lots to learn! (via Tony Finch, iirc)

    (tags: via:fanf mars nasa science knots tying rope cables cabling geek aerospace standards)

Links for 2012-09-05

  • Estonia introduces coding classes to 8-year-olds

    ‘ProgreTiiger education will start with students in the first grade, which starts around the age of 7 or 8 for Estonians. The compsci education will continue through a student’s final years of public school, around age 16. Teachers are being trained on the new skills, and private sector IT companies are also getting involved, which makes sense, given that these entities will likely end up being the long-term beneficiaries of a technologically literate populace. The ProgreTiiger program is launching at a few pilot schools and will soon be rolling out to all general education schools in Estonia.’

    (tags: estonia education coding programming kids children students learning school)

  • Avoiding Hash Lookups in a Ruby Implementation

    ‘If I were to sum up the past 6 years I’ve spent optimizing JRuby it would be with the following phrase: Get Rid Of Hash Lookups.’ This has been a particular theme of some recent optimization hacks I’ve been working on. Hashes may be O(1) to read, on average, but that doesn’t necessarily mean they’re the right tool for performance… (via Declan McGrath)
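
    A toy illustration of the general trick, in Python rather than anything taken from JRuby: resolve a name to an array slot once, outside the hot path, then use plain indexed reads. `Shape`, `slot_of` and both functions are hypothetical names for this sketch, and it makes no claim about how much time this saves on any particular runtime.

```python
class Shape:
    """Maps field names to list slots; consulted once, outside the hot loop."""

    def __init__(self, field_names):
        self.slot_of = {name: i for i, name in enumerate(field_names)}


def sum_field_by_hash(rows, name):
    # rows are dicts: one hash lookup per row
    return sum(row[name] for row in rows)


def sum_field_by_slot(rows, shape, name):
    slot = shape.slot_of[name]               # the only hash lookup
    return sum(row[slot] for row in rows)    # indexed reads from here on


shape = Shape(["x", "y"])
dict_rows = [{"x": i, "y": 2 * i} for i in range(5)]
list_rows = [[i, 2 * i] for i in range(5)]
assert sum_field_by_hash(dict_rows, "y") == sum_field_by_slot(list_rows, shape, "y")
```

    Both versions return the same total; the second has simply moved the name lookup out of the loop.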

    (tags: via:declanmcgrath hash optimization ruby performance jruby hashing data-structures big-o optimisation)