How to Store/Load Wii Games via USB Hard Drive : nifty! uses the Wii Homebrew Channel (ie the Twilight Hack savefile hack). apparently quite doable
(tags: wii hacks homebrew twilight-hack games backup)
Justin's Linklog Posts
review of the MySQL Tokutek storage engine : ‘fractal tree indexes’ instead of B-trees. new to me
(tags: fractal-tree-indexes b-trees fractals algorithms data-structures mysql performance tokutek tokudb databases)
Haystack design notes : pretty exhaustive walkthrough of Facebook’s new photo storage backend, running on XFS. nice setup for a very specific use-case
(tags: storage scaling netapp facebook scalability images nfs haystack)
Party Cat : “I just feel lately your PARTIES have not been up to PAR.” “…ty”
(tags: party-cat parties comics funny via:fp cats)
REST worst practices : good advice on things to avoid in providing a REST API from a Django app
(tags: rest django web http webdev web-services antipatterns best-practices)
Consistent hashing vs order-preserving partitioning in distributed databases : ‘An order-preserving partitioner, where keys are distributed to nodes in their natural order, has huge advantages over consistent hashing, particularly the ability to do range queries across the keys in the system’
(tags: consistent-hashing order-preserving-partitioning partitioning sharding distcomp networking distributed databases k-v-stores cassandra)
How to use JetS3t with Eucalyptus : wow, impressive i14y; also Eucalyptus now includes an S3-like service
(tags: ec2 eucalyptus jets3t s3 storage open-source java)
Psych Ward episode 2 : vote for my mate Luke’s latest TV programme. it’s great
(tags: rte psychward voting tv luke)
Here’s a great example of numerical illiteracy spotted by my mate Tom:
some classic reporting in the Irish Examiner today…
“Department staff clocked up 20,000 sick days in the three years” is the headline. Closer examination of the article reveals there are 5,000 people in the department. Do the maths (which the paper doesn’t – I wonder why) and that’s a SHOCKING 1.3 sick days a year.
Even better is this quote: “Department of Agriculture staff clocked up 3,095 uncertified sick days last year – 653 of these on a Monday”
So that would be about a fifth of the sick days being taken on one of the five working days in the week. DISGRACE!
Let’s hear it for old media’s commitment to quality journalism!
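For anyone who wants to check Tom's sums, here is a quick sketch of the arithmetic in Python. It uses only the figures quoted above; the variable names are mine, and the "expected" Monday share simply assumes sick days are spread evenly over a five-day working week.

```python
# Figures as quoted from the Irish Examiner article above.
sick_days_total = 20_000   # sick days over the whole period
staff = 5_000              # people in the department
years = 3

# Sick days per person per year.
per_person_per_year = sick_days_total / staff / years

# Share of uncertified sick days that fell on a Monday.
uncertified_last_year = 3_095
taken_on_monday = 653
monday_share = taken_on_monday / uncertified_last_year

# Assumption: an even spread over five working days would give 1/5.
expected_share = 1 / 5

print(f"{per_person_per_year:.2f} sick days per person per year")
print(f"Monday share: {monday_share:.1%} (even spread: {expected_share:.0%})")
```

So Mondays account for roughly the share you would expect from any one working day, which is Tom's point.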
Dear Fellow Rubyists « Dyepot, Teapot : good follow-up post regarding the shitstorm that erupted in the Ruby community after a talk entitled “CouchDB + Ruby: Perform Like a Pr0n Star” (with content much as you’d imagine). to be honest, I can’t understand why the Rubyists are being so obtuse about this teenager-level stupidity
(tags: community conferences porn sex culture couchdb opensource)Eucalyptus devs forming commercial company : Eucalyptus Systems to provide “commercial support, integration, and development services for Eucalyptus users while continuing to develop the core code base under an open source license.” hopefully they won’t do a Xen and kill the goose
(tags: eucalyptus ec2 linux ubuntu xen opensource cloud-computing)
Kanban : a new agile software-dev methodology. hmm
(tags: software work agile kanban process)Home Office ‘colluded with Phorm’ : holy shit. ‘In an e-mail dated 22 January 2008, a Home Office official wrote again to Phorm and said: “I should be grateful if you would review the attached document, and let me know what you think.” In January 2008 the Home Office thanks Phorm for comments and changes to its draft paper, which show the company making deletions and changes to the document.’
(tags: phorm uk home-office politics interception advertising dpi networking internet web isps regulation)
lots more details on the “marblecake” 4chan Time poll-stuffing : including an attempted poisoning of Recaptcha, to which the author claims it was immune, and a final manual-CAPTCHA data-entry process towards the end
(tags: recaptcha captchas moot time 4chan via:waxy security web poll 2009 anonymous)
The full story behind Little Edvin Tables : ‘As the names are so similar, searches for our company in the official Norwegian registry of just-about-anything (Brønnøysundregistrene) often resulted in potential customers looking up the wrong company. To prevent this confusion we recently changed the name of the old (non-LLC) company, and figured we’d use the opportunity for some harmless – or so we thought – fun.’
(tags: little-bobby-tables sql injection xss via:mikkohypponen norway sysedata security)“Carne Asada is not a crime” tee-shirts : WANT
(tags: carne-asada food mexican fashion tshirts tacos trucks taco-trucks la california)Tesco brand in Ireland “almost exclusively” associated with a Paddy Tax rip-off : ‘Consumers, media and government associate Tesco Ireland almost exclusively with price differentials between Northern Ireland and Ireland.’ Talk about a massive PR fail!
(tags: pr fail disaster tesco paddy-tax rip-off-ireland rip-offs surveys northern-ireland ireland)great neologism: meatcloud : ie. server-deployment sysadmin teams. ‘If you want to participate in this ‘as a Service’ brave new world, and your plan to bring up new servers involves a meatcloud ssh’ing their little hearts out, you might as well give up now’
(tags: sysadmin meatcloud funny puppet agile neologism infrastructure words saas cloud-computing ec2 deployment)
Ending BioShock : a much better ending than the real one
(tags: bioshock gaming videogames narrative plot)
Little Bobby Tables’ Norwegian cousin : “Navn/foretaksnavn: ';UPDATE TAXRATE SET RATE = 0 WHERE NAME = 'EDVIN SYSE' ” ahahaha!
(tags: lol sql haxx0ring xkcd funny security via:simonw norway little-bobby-tables xss escaping)
OAuth Session Fixation Attack : the reason why Twitter, Y! (and others) shut down their OAuth services recently; a massive hole in the OAuth authorization protocol. this will be tricky to fix
(tags: oauth security twitter flickr holes yahoo google)
Top Tips : some of the worst “top tip” sidebars collected from lowbrow UK mags. even shittier than the made-up Viz ones
(tags: top-tips viz funny advice idiotic omgwtf)
Performance comparison: key/value stores for language model counts : useful benchmarks, and another plug for Tokyo Cabinet; over 4x as fast as writes to an on-disk BerkeleyDB via its Python bindings
(tags: tokyo-cabinet benchmarks db storage berkeley-db k-v-stores)
John Handelaar goes public with KildareStreet.com : TheyWorkForYou ported to the Irish Oireachtas — yay John!
(tags: politics ireland oireachtas john-handelaar kildarestreet)Fun with YouTube’s Audio Content ID System : awesome black-box analysis of what it takes to evade the Content-ID system deployed by YouTube to block use of copyrighted music in third-party videos, using Audible Magic’s acoustic fingerprinting. easy workaround: skip the first 30 seconds of the track or resample by 5%
(tags: via:ninnx drm hacking youtube audio analysis content fingerprint identification watermarking algorithm)RedMonk’s Stephen O’Grady on the Oracle/Sun acquisition : great analysis, particularly where it affects ZFS and their open-source products
(tags: redmonk analysis mergers m&a sun oracle via:segphault)‘The Emergency’ now blogging : brilliant Irish political satire
(tags: the-emergency comedy funny ireland politics satire blogs)
Abaca’s radical anti-spam tech wins at Yahoo! : claimed 99.997% catch rate, FP rate of 1 in a million, supposedly. sounds like a major leap forward if true. wonder how it works…
(tags: abaca anti-spam via:richi yahoo)Study finds pirates 10 times more likely to buy music : great stat (via Tony Finch)
(tags: via:fanf filesharing p2p mp3 piracy copyright piratebay downloads file-sharing)RTMPE : ‘Encrypted Real Time Messaging Protocol (RTMPE or RTMPTE) is a proprietary protocol created by Macromedia used for streaming video and DRM.’ apparently used by RTE’s streaming video
(tags: rte drm security rtmp rtmpe macromedia flash video streaming)Some Notes on Distributed Key Stores : great investigation from Leonard Lin; Tokyo Tyrant gets a strong thumbs-up. also: ‘based on the maturity of projects out there, you could write your own in less than a day. It’ll perform as well and at least when it breaks, you’ll be more fond of it. Alternatively, you could go on the conference circuit and talk about how awesome your half-baked distributed keystore is.’ ha!
(tags: scaling storage distcomp k-v-stores tokyocabinet tokyotyrant voldemort mysql databases cassandra)Schooner Appliance for Memcached : you really know you’ve made it as open-source infrastructure when third parties are building custom off-the-shelf hardware platforms for your code. crazy stuff, though; isn’t half of the idea of memcached that you can run it on COTS hardware?
(tags: appliances memcached hardware caching web)pubsubhubbub : aka. PSHB. ‘open, web-hook-based pubsub (publish/subscribe) protocol. Includes a [python] open source reference implementation’, from a mainly-Google-based team incl Brad Fitzpatrick. note: server-to-server only; there’s no NAT or COMET support
(tags: pshb web gae webhooks syndication xmpp pubsub pubsubhubbub google http atom feeds)
Mike Cardwell attempts to opt out of Phorm interception : I did just the same thing myself last week
(tags: phorm interception http privacy dpi advertising bt webwise org)RTÉ ‘gets it wrong’ with new music downloads which don’t work on iPods : ‘Launched recently at a cost of €230,000, listeners can buy tracks heard on the station’. the tracks are DRM-laden WMA files, so don’t work on iPods or any other MP3 player. sounds like the record labels browbeat RTE on this one, resulting in just another useless DRM store that nobody will use. great way to spend my license fee :(
(tags: rte waste fail mp3 wma music 2fm via:unarocks)recording what’s playing on PulseAudio : every sink (output) also provides a built-in “monitor” source. This script records the currently-playing audio to WAV
(tags: linux audio recording pulseaudio stream drm sox wav)Collectl : _very_ comprehensive Linux system monitoring tool; looks nifty! ‘Collectl tries to do it all. You can choose to monitor any of a broad set of subsystems which currently include buddyinfo, cpu, disk, inodes, infiniband, lustre, memory, network, nfs, processes, quadrics, slabs, sockets and tcp.’
(tags: collectl linux tools sar processes disk cpu io monitoring sysadmin network)Cooliris For Linux : ‘a browser extension that leverages the GPU to allow users to visually navigate photos, videos, games, and news stories from their favorite sites on a full screen 3D wall’. sounds nifty, must give this a try
(tags: cooliris linux 3d vizualisation photos firefox)JG Ballard dead : of cancer at the age of 78. one less genius alive
(tags: jg-ballard ballard dystopia sf fiction future literature authors)
fantastic LED “faceless” watch : ‘Part of apertures of metal band became digital display screen. Metal band and digital figures mingle together in proportion naturally. Without the face of “timepiece”, it displays figures only when needed but also quite vague existence, “time”‘
(tags: led designer watches want wishlist design cool nifty)Metric counts its iTunes success – Los Angeles Times : ‘”Talking gross numbers that come directly to the band, we have made more money already than we have on the last record in four years,” said [Metric]’s co-manager. “Without any intermediary, we’re making 77 cents on the dollar for every record we sell” on iTunes. Under a label deal […] Metric would have earned closer to 22 cents.’
(tags: metric bands music music-industry future itunes mp3 itms)
Don’t forget — next Monday, the Heritage Society of Engineers Ireland, in association with The Irish Computer Society, and the ICT and Electronic and Electrical Divisions of Engineers Ireland, will be hosting an evening lecture entitled "Reminiscences of Early days of Computing in Ireland", by Gordon Clarke (M.A., CEng., F.B.C.S., C.I.T.P., F.I.C.S). Sounds like it’ll be great. More details.
Update: it starts at 8pm; useful info! Also, the event’s flyer can be found on this page, which notes:
For those new to using our webcast facility, please see www.engineersireland.ie/webcast for information on how to set-up and access our webcasts. To view the event, please log onto the url below: https://engineersireland.webex.com/engineersireland/onstage/g.php?t=a&d=841959965 The password: computer
Chino Otsuka: “Imagine Finding Me” : the artist’s childhood photos, digitally manipulated to feature the artist as an adult alongside. fantastic (via Waxy)
(tags: via:waxy art chino-otsuka photography photoshop history memories self-portraits)Echo vision: The man who sees with sound : amazing first-person report of echolocation in humans: the author calls it “FlashSonar”, and teaches other blind people how to use it
(tags: echolocation via:eoin flashsonar sonar new-scientist blind acoustics echo perception neuroscience)notes on “A Canticle for Leibowitz” : reading notes for the 50-year-old Hugo-Award-winning SF classic, dealing with theology, science, and Cold War terror of a nuclear armageddon
(tags: theology science nuclear-war cold-war 1950s science-fiction reading books a-canticle-for-leibowitz religion)
A while back, I linkblogged about "iotop", a very useful top-like UNIX utility that shows which processes are using the most I/O bandwidth.
Teodor Milkov left a comment which is well worth noting, though:
Definitely iotop is a step in the right direction.
Unfortunately it’s still hard to tell who’s wasting most disk IO in too many situations.
Suppose you have two processes – dd and mysqld.
dd is doing massive linear IO and its throughput is 10MB/s. Let’s say dd reads from a slow USB drive and it’s limited to 10MB/s because of the slow reads from the USB.
At the same time MySQL is doing a lot of very small but random IO. A modern SATA 7200 rpm disk drive is only capable of about 90 IO operations per second (IOPS).
So ultimately most of the disk time would be occupied by the mysqld. Still iotop would show dd as the bigger IO user.
He goes into more detail on his blog. Fundamentally, iotop works from what the Linux kernel offers for per-process I/O accounting, which is bytes transferred per second, not I/O operations per second. Most contemporary storage in desktops and low-end server equipment is IOPS-bound (as Teodor notes, a 7200 rpm SATA drive manages only about 90 IOPS), so bandwidth alone can point at the wrong process. Good point! Here’s hoping a future change to the Linux per-process I/O API allows measurement of IOPS as well…
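To put rough numbers on Teodor's example, here is a back-of-the-envelope sketch in Python. The 10MB/s and 90 IOPS figures are from his comment; the request sizes (1 MB sequential reads for dd, 4 KB random requests for mysqld) and the idea that mysqld is issuing requests right at the drive's limit are my assumptions, purely for illustration.

```python
from dataclasses import dataclass

# From the comment: a 7200 rpm SATA drive manages about 90 random IOPS.
MAX_RANDOM_IOPS = 90


@dataclass
class Proc:
    name: str
    ops_per_sec: float   # I/O operations issued per second
    bytes_per_op: int    # assumed request size, not from the original comment

    @property
    def bandwidth(self) -> float:
        """Bytes per second, which is the figure iotop ranks by."""
        return self.ops_per_sec * self.bytes_per_op


procs = [
    # dd: 10 MB/s of linear I/O, assumed to be issued as 1 MB reads.
    Proc("dd", ops_per_sec=10, bytes_per_op=1_000_000),
    # mysqld: assumed 4 KB random requests at the drive's IOPS limit.
    Proc("mysqld", ops_per_sec=90, bytes_per_op=4096),
]

by_bandwidth = max(procs, key=lambda p: p.bandwidth).name
by_iops = max(procs, key=lambda p: p.ops_per_sec).name

# Fraction of the SATA drive's random-I/O capacity that mysqld consumes.
mysqld_disk_busy = procs[1].ops_per_sec / MAX_RANDOM_IOPS

for p in procs:
    print(f"{p.name:8s} {p.bandwidth / 1e6:6.2f} MB/s  {p.ops_per_sec:5.0f} ops/s")
print("top by bandwidth:", by_bandwidth, "| top by IOPS:", by_iops)
```

Under these assumptions dd tops a bandwidth ranking by a wide margin, while mysqld, moving well under 1 MB/s, is the process keeping its disk fully busy.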
Under the Covers of Google App Engine Datastore : via James Hamilton. some details on BigTable
(tags: bigtable google appengine notes implementation storage)French National Assembly reject HADOPI law : ‘On Friday the French National Assembly rejected the HADOPI law, which would impose the toughest “three strikes” copyright enforcement law in the world on French Internet users.’ phew
(tags: hadopi sarkozy france censorship privacy law eu)UPC block out D-Boxes : Irish cable-TV company UPC have rolled out Nagravision 2 encryption, finally breaking the dodgy “D-Box” decoder boxes sold on a massive scale throughout Ireland for several years now. can’t see it staying hacked for long though. NTL’s comment: http://url.ie/1g0q
(tags: nagravision tv cable-tv encryption security ireland upc ntl d-box)hatful of hollow – Visualising Sorting Algorithms : another dataviz of sorting algorithms, avoiding animation and instead coming up with a nice line-based viz. interesting, but wtf no merge sort ;)
(tags: via:simonw sorting algorithms visualization dataviz cairo coding python)Bank of Ireland Credit Card Security: FAIL : if BoI need to verify a transaction out-of-band, they send an SMS to the cardholder asking them to call an unpublished number which diverts to a UK number before demanding all their card details; exactly the modus operandi of a phish. wtf are they thinking?
(tags: omgwtfbbq banking boi ireland credit-cards verification security sms via:mulley)
We have an extremely open-plan layout in work — no partitions, just long benches of keyboards and monitors. It looks a bit like this, but with less designer furniture and more Office Depot:
Aman pointed out that this is a new trend in workplace design, which Workalicious (http://www.workalicious.org/big_table_desking/) calls "Big Table Desking":
I’m still not sure what to make of the frequent instances of Big Table Desking. While this kind of workstation arrangement is no doubt a new trend, the no-privacy work place is a throwback to the 1950s office pool, a line up of identical desks classroom style. Is it the peer to peer seating position that overcomes this? How would it? By building community? As opposed the pilot and passenger 747, catholic church model of everybody facing "forward". Does the Big Table Desk break down this heirarchy by facing people towards one another, sharing a big desk instead of staking out territory? Is the big table desk a microcosm, a representation of a healthy organizational structure?
No comment ;)
It seems to be popular with designers, presumably due to their collaborative working needs.
Mind you, it also looks a bit like a Taylorist workplace layout from 1904, of which Wired says:
American engineer Frederick Taylor was obsessed with efficiency and oversight and is credited as one of the first people to actually design an office space. Taylor crowded workers together in a completely open environment while bosses looked on from private offices, much like on a factory floor.
So, after spending an hour or two attempting to figure out where the hell UPC had moved Channel 4 to, I eventually found out that it was now being broadcast on 543 MHz. I also found out that this wasn’t part of the standard list of A1 to A30 channels in the "pal-ireland" range. :(
Thankfully, I then found this Frequency to MythTV channel converter page; here are the correct values to use on the MythWeb channels page:
- Freqid = 30
- Finetune = -4
TopatoCo: Time Traveler Essentials Shirt : ‘Go back in time wearing this and you’ll invent heavier-than-air flight! YOU’LL discover penicillin. YOU’LL be the first to isolate aluminum. Did you know aluminum used to be more valuable than gold? YOU’RE GONNA BE RICH.’
(tags: funny history science design clothing t-shirts awesome topato)
EU to require internet filtering? : essentially mandating IWF-style (ie. half-assed and broken) filtering in all EU countries, I would imagine
(tags: iwf eu europe filtering censorship privacy isps ireland ec)Sorting Algorithm Animations : very nice visualizations of insertion, selection, bubble, shell, merge, heap, quick and quick3 sorts
(tags: javascript algorithms coding visualization sorting demo via:reddit)blekko’s ambient cluster health visualization : nice, custom sysadmin dataviz, via Rich Skrenta
(tags: sysadmin data monitoring visualization dataviz operations charts nagios)The reality behind Area 51 : A top-secret 1960’s spy plane project called OXCART. ‘The shape of OXCART was unprecedented, with its wide, disk-like fuselage designed to carry vast quantities of fuel. Commercial pilots cruising over Nevada at dusk would look up and see the bottom of OXCART whiz by at 2,000-plus mph. The aircraft’s titanium body, moving as fast as a bullet, would reflect the sun’s rays in a way that could make anyone think, “UFO”.’ but then — isn’t that what they’d _want_ you to think? ;)
(tags: area51 ufo debunking fortean cold-war spy-planes oxcart u-2 nevada history)
SpamAssassin benchmarked on LLVM : similar to Google’s “Unladen Swallow” port of Python. results aren’t stellar — yet — but there’s plenty of room — and possible contracts
(tags: via:matt unladen-swallow google llvm perl porting benchmarks spamassassin speed optimization)
Scheduled Tasks With Cron on Google App Engine : much needed. ‘The App Engine Cron Service allows you to configure regularly scheduled tasks that operate at defined times or regular intervals.’
(tags: cron async python google appengine gae background)aws : comprehensive all-in-one perl script giving easy command-line access to Amazon EC2 and S3; very nicely packaged — installs with a single “curl” command! brilliant
(tags: via:mattb ec2 s3 aws perl scripts command-line)
downsides of the Akamai IP Application Accelerator : certainly not all roses. (via Tony Finch)
(tags: via:fanf akamai networking tcp-ip private-networks routing)Facebook lose a RAID group : on one of their legacy NetApps? hmm
(tags: netapp facebook raid raid5 backup storage data-loss hard-disks)Facebook’s Haystack photo storage backend : ditching NetApp and Akamai, rolling their own massive-blob storage cloud
(tags: storage scaling facebook web http scalability photos infrastructure distributed haystack cdn cloud netapp)The Game Industry – Push cx : ‘Looking in, it’s clear that the [computer] game industry is broken and not getting fixed anytime soon. I will not be joining the game industry. I’m interested in building a profitable business making fun games in a good working environment, and that’s simply not what it does.’ +1; a lot of people, including myself, have also come to that conclusion, over the years
(tags: games coding work business management game-industry ea igda quality-of-life crunch-mode mismanagement scheduling)
Message Queue evaluation notes from Second Life : fantastic research notes; they’ve identified a lot of niggles and problems with the existing queueing systems out there
(tags: messaging scalability queueing rabbitmq mq amqp jms queue secondlife via:proggit)Full data export from discogs.com : awesome! full artist/album/track data for decades of dance music releases, released to the public domain
(tags: discogs music dance-music public-domain open-data data tracklistings)comment from RabbitMQ dev regarding the Twitter/Scala/Kestrel drama : ‘Writing messaging systems that work under any combination of flows, on any number of machines, and in multiple different reliability scenarios … is a more interesting problem. Page-to-disk is a way to make RabbitMQ better and address more scenarios.’
(tags: rabbitmq disk persistence queueing messaging async twitter scala kestrel)
Oh man, this Twitter Ruby-vs-Scala language spat is hilarious; talk about handbags at dawn. I loved this exchange in the comments to this post in particular:
I’m mostly surprised that a guy who wrote the book on Scala comes out and says that Scala is better than everything else and someone actually listened and took him seriously. He has a vested interest in saying that Scala is the next big thing and I’ve yet to see any evidence that Kestrel is better (at anything) than RabbitMQ.
And frankly, I still get fail whales at Twitter on a daily basis, so, what exactly are they so proud about over there?
Kestrel pages queues to disk: if you get more messages than you have memory, it’s fine. If RabbitMQ gets more messages than memory, it crashes. We talked to them extensively about this problem and they’re going to address it. We were hoping we’d be able to use RabbitMQ or another message queue. We didn’t want to be in the message queue business. At this point, given that we know the code and it’s performance inside and out, it makes sense to continue using and developing it.
I don’t feel like arguing with you but your logic isn’t clear to me. It would make sense that if you don’t want to be in the message queue business, you’d submit patches against an established message queue to make it work in your situation instead of writing your own message queue, twice. This is overlooking the fact that twitter is basically a massive message queue and you are, in fact, in the message queue business.
Zing!
Amazon Removes Delivery Restrictions To Ireland : great news! we can buy electronics on Amazon again
(tags: amazon ireland delivery via:mneylon shopping e-commerce electronics)Watch out Broughton! Street View fans plan to descend on ‘privacy’ village for photo fest : ‘it has raised the ire of Internet users, who are now campaigning for Street View enthusiasts from across the UK to descend on the village to snap their own perfectly legal photographs.’ ha!
(tags: privacy google street-view broughton yokels)
A good post from Joshua Schachter about URL shortening services.
For what it’s worth, I ran into the unwanted-interstitial risk (http://news.ycombinator.com/item?id=508132). At one stage, before I’d bothered registering jmason.org, sitescooper.taint.org or my other domains, I used a URL-shortening service to provide a memorable, short URL for an open-source application I wrote: http://zap.to/snarfnews/.
At some point a few years down the line, the forwarding process started accreting ads; eventually they became soft-porn in content, and I was forced to apologise to users for the forwarding I could no longer control!
By now, 10 years down the line, it seems to hijack the page entirely, returning a page in Cyrillic I can’t even read :( (apparently it’s a page of Flash games; thanks, Alexandr Ciornii, for the interpretation!)
Anyway, lesson learned.
Damien Katz: Moving To California : another developer moves! a lot of people doing it recently, which worries me; will there be any top expertise left outside of the Bay Area at this rate? we need diversity
(tags: diversity bay-area california living work)Angry villagers run Google Street View out of town : fetch the pitchforks! Street View bin worryin’ my sheep! Buckinghamshire yokels fear change
(tags: funny privacy uk google street-view buckinghamshire yokels crime paranoia)COBOL ON COGS : ‘COBOL ON COGS SUPPORTS STANDARD TERMINALS (VT100 AND IBM 3200) IN THE MOST USEFUL SCREEN CONFIGURATIONS SUCH AS 80X20 AND 40X16’ (via Nishad)
(tags: via:nishad funny web coding ruby rails retro webdev cobol)
Twitter has this "Trending Topics" sidebar now, which lists the following topics:
Trending Topics
- TGIF
- National Cleavage
- G20
- Easter
- #grammarsongs
- France
- #rp09
- French
- Grand National
- Report Says Deal
Now, I’m not going to go into the topic of National Cleavage right now. ‘Report Says Deal’ is intriguing because it makes no sense, until you click through to see:
Real-time results for “Report Says Deal”
- dlloydsecret Google to Buy Twitter? Report Says Deal is in the Works http://bit.ly/Wt1Wb
- dlloydthemlmpro Google to Buy Twitter? Report Says Deal is in the Works http://bit.ly/Wt1Wb
- techupdates [PCWrld] Google to Buy Twitter? Report Says Deal is in the Works http://tinyurl.com/c63ont
- icidade Google to Buy Twitter? Report Says Deal is in the Works. http://is.gd/quu9
- chrisgraves Retweeting @CinWomenBlogger: Retweeting @ays: Google to Buy Twitter? Report Says Deal is in the Works – PC World http://bitly.com/LhT4
So I’d say that Twitter’s "Trending Topics" uses N-grams of between 1 and 3 "words" for topic identification. In this case, rather than "Report Says Deal", a better topic string would be something like:
Google to Buy Twitter? Report Says Deal is in the Works – PC World
or even:
Google to Buy Twitter? Report Says Deal is in the Works – PC World http://bitly.com/LhT4
Funnily enough this is exactly the issue I ran into while developing this algorithm. The trick at this point is to apply a variant of the BLAST pattern-discovery algorithm, expanding the patterns sideways while they still match the same subsets of the corpus until they’re maximal.
Twitter folks, if you can read Perl, "assemble_regexps()" in seek-phrases-in-log in SpamAssassin SVN does this pretty nicely, and reasonably efficiently, and is licensed under the ASL 2.0. ;)
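For the non-Perl-readers, here is a minimal Python sketch of the "expand sideways until maximal" idea. To be clear, this is not the SpamAssassin code and not BLAST proper; it is a greedy simplification, and the function names and the whitespace tokenizer (which strips trailing full stops and commas) are my own assumptions. It grows a seed N-gram one token at a time, left or right, for as long as the longer phrase still matches exactly the same set of messages.

```python
def tokenize(text):
    """Whitespace tokens with trailing '.' or ',' stripped (a simplification)."""
    return [t for t in (w.rstrip(".,") for w in text.split()) if t]


def occurrences(tokens, phrase):
    """Start offsets at which the token tuple `phrase` occurs in `tokens`."""
    n = len(phrase)
    return [i for i in range(len(tokens) - n + 1)
            if tuple(tokens[i:i + n]) == phrase]


def matching_docs(docs, phrase):
    """Indexes of the documents that contain `phrase`."""
    return frozenset(i for i, toks in enumerate(docs)
                     if occurrences(toks, phrase))


def expand_phrase(docs, seed):
    """Grow `seed` sideways while it still matches the same documents."""
    phrase = tuple(seed)
    target = matching_docs(docs, phrase)
    if not target:
        return phrase
    ref = docs[min(target)]  # any matching document can supply candidates
    grew = True
    while grew:
        grew = False
        for pos in occurrences(ref, phrase):
            end = pos + len(phrase)
            # Try extending one token to the right, then one to the left.
            if end < len(ref) and matching_docs(docs, phrase + (ref[end],)) == target:
                phrase, grew = phrase + (ref[end],), True
                break
            if pos > 0 and matching_docs(docs, (ref[pos - 1],) + phrase) == target:
                phrase, grew = (ref[pos - 1],) + phrase, True
                break
    return phrase


tweets = [
    "Google to Buy Twitter? Report Says Deal is in the Works http://bit.ly/Wt1Wb",
    "[PCWrld] Google to Buy Twitter? Report Says Deal is in the Works http://tinyurl.com/c63ont",
    "Google to Buy Twitter? Report Says Deal is in the Works. http://is.gd/quu9",
]
docs = [tokenize(t) for t in tweets]
topic = " ".join(expand_phrase(docs, ("Report", "Says", "Deal")))
print(topic)
```

On these three tweets the seed trigram grows out to the full headline and stops at the differing short URLs, since extending any further would drop some of the matching messages.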
Warren Ellis » The Conclusion Of BATTLESTAR GALACTICA (Condensed Version) : hahaha, spot on
(tags: bob-dylan bsg funny religion deus-ex-machina warren-ellis via:fp)Easy AI with Python – PyCon 2009 : ‘several basic AI techniques implemented with short, open-source Python code recipes … For each technique, learn the basic operating principle, discuss an approach using Python, and review a worked out-example. We’ll cover database mining using neural nets, automated categorization with a naive Bayesian classifier, solving popular puzzles with depth-first and breath-first [sic] searches, solving more complex puzzles with constraint propagation, and playing a popular game using a probing search strategy.’ video: http://pycon.blip.tv/file/1947373/
(tags: python problem-solving games puzzles ai search constraint-propagation depth-first breadth-first)Amazon Elastic MapReduce : excellent! run Hadoop jobs on EC2, with data hosted on S3. essentially, AWS have integrated a Hadoop dashboard to provide a great web-based and command-line UI
(tags: hadoop mapreduce scalability ec2 s3 aws amazon)Arthur Kade meets Angelina Jolie : best blog ever. Narcissistic meathead ‘actor/model’ type waxes lyrical on how Angelina Jolie is ‘“mother hot”, rather than “stripper hot”’: ‘I would probably rate her an 8.5-9 on my looks scale. I am not that sure that I would even feel the need to come up and initiate a conversation with her if I met her out somewhere.’ ‘I couldn’t really say that she would stick out for me if I saw her at a hot club like 1Oak or Rosebar.’ The entire blog is solid gold idiocy; I’d swear it was fake, but apparently not
(tags: wtf funny arthur-kade blogs narcissism angelina-jolie via:gerry stripper-hot beauty)Google uncloaks once-secret server : GOOG’s servers include built-in 12V batteries (and of course lots of velcro). video: http://www.youtube.com/watch?v=xgRWURIxgbU (via wmf)
(tags: via:wmf google hardware servers infrastructure batteries electricity energy data-center efficiency pue)
Wrong Tomorrow – pundits vs. time : great idea from Maciej Ceglowski
(tags: pundits windbags bullshit journalism trends futurism future prediction experts wrongtomorrow forecasting futurology predictions)Stable URLs in Mailman mailing list archives : hooray. I requested this ages ago, it’s now being implemented
(tags: archival mailing-lists web uris mailman urls addressing permanence mail discussion)Not-so-open Cloud Manifesto rains on interoperability parade : ‘The controversy surrounding the Open Cloud Manifesto demonstrates the risk of trying to build interoperability behind closed doors and through exclusionary practices. Such environments are not conducive to building consensus, which is one of the key ingredients of successful standards.’
(tags: collaboration cloud-computing sun open-source open standards politics vendors)Continuous deployment in 5 easy steps : more on IMVU’s continuous-deployment concept. interesting that they halt SVN commits on CI build failure, that seems extreme
(tags: imvu deployment software coding sysadmin testing automation build ci process agile continuousdeployment)faceboards.ie : Boards.ie a la Facebook, for April 1. thing is, I think I prefer this UI
(tags: boards.ie community ireland facebook web forums)
The Snooping Dragon : awesome, if terrifying research from Shishir Nagaraja and Ross Anderson on Chinese cyber-surveillance of the Tibetan movement. ‘we described how agents of the Chinese government compromised the computing infrastructure of the Office of His Holiness the Dalai Lama. They used social phishing to install rootkits on a number of machines and then downloaded sensitive data. People in Tibet may have died as a result.’
(tags: phishing social-phishing dalai-lama security surveillance privacy law china ross-anderson research papers windows microsoft)
German Police Raid Homes of Wikileaks.de Domain Owner : “what the Australian government’s secret ACMA internet censorship blacklist has to do with Germany is a mystery. This case is a prime example of multiple governments collaborating in support of censorship.” worrying.
(tags: censorship germany legal police wikileaks brbfbi privacy)
Fast polling using C, memcached, nginx and libevent : well-written worked-through example of a classic memcached-backed libevent front-end caching system
(tags: http memcached caching optimization scalability plurk libevent nginx polling c)
“The Powers That Be Want Action Taken” : ‘Gardai were in the [Today FM] offices yesterday looking for email communications between the team and the artist. According to D’Arcy the team were told [..] that “the powers that be want action taken”.’ ffs! how’s about taking action against the fraudsters who’ve bankrupted our country instead? appalling diversionary tactics
(tags: diversions gardai picturegate brian-cowan art pranks today-fm ray-darcy censorship)
AWS Toolkit for Eclipse : ‘Eclipse extensions automatically configure remote debugger connections for diagnosing problems and debugging software run in the cloud’; ie. you can set a breakpoint on code running remotely, at EC2. that’s pretty awesome (via Steve Loughran)
(tags: via:steveloughran aws ec2 programming java plugins development eclipse cloudcomputing tomcat)
Ask a Flowchart: Which Blowhard Am I? : YES
(tags: blowhards funny internet web2.0 magazines wired flowcharts dave-winer)
Zooko laid off by AllMyData.com : looks like AllMyData are facing a money crunch (“focussed on keeping costs down”). hopefully this isn’t bad news for Tahoe, the fault-tolerant open-source distributed filesystem, or indeed for Zooko himself
(tags: allmydata zooko money tahoe filesystems storage fault-tolerance funding open-source distributed scalability)
RTE Apologise to Brian Cowen for Nudie Pics Report : the national broadcaster apologises, on air, for a news story covering the ‘paintings of an Taoiseach in the nude’ prank. wtf!
(tags: rte television freedom-of-speech censorship satire wtf apologies soft weakness)
Australian ISP abandons blocking : “We are not able to reconcile participation in the trial with our corporate social responsibility, our customer service objectives and our public position on censorship,” iiNet managing director Michael Malone said. “It became increasingly clear that the trial was not simply about restricting child pornography or other such illegal material, but a much wider range of issues including what the Government simply describes as ‘unwanted material’ without an explanation of what that includes.”
(tags: australia freedom censorship iinet blocking filtering acma)
Akamai have developed a parallel internet : and, most surprising of all, it _works_. holy crap. (thanks Antoin!)
(tags: ip-application-accelerator akamai internet routing speed network networking ip latency joelonsoftware copilot via:antoin)
Guerilla artist hangs nude Cowen paintings : some prankster put up rather disturbing paintings of Ireland’s taoiseach in the National Gallery and Royal Hibernian Academy. “‘It’s reasonably well painted. It’s not the worst thing I’ve ever seen,’ conceded James O’Halloran of Adam’s Fine Art Auctioneers & Valuers.”
(tags: painting pranks ireland galleries brian-cowan politics funny)