Travel: Toorcon was great fun! Lots of interesting conversations.
Unfortunately they had a cruddy internet connection, so I'm majorly backlogged, and can't write about any of it just yet ;)
(Things I found interesting recently.)
Spam: SpamAssassin 3.0.0 is now released! w00t! Only 4 months late this time ;) Announcement, techie details, Slashdot. New logo too:

(Note: if you're running SpamAssassin 2.x and plan to upgrade, this is a new major release cycle -- so we've taken the chance to break some backwards compatibility. Be sure to read the UPGRADE doc!)
Politics: So I was listening to that this morning. Did I hear correctly? Did Bush really say that one of the good side-effects of Iraq's invasion was that there would now hopefully be fewer attacks inside other countries? Sure looks like it:
'Coalition forces now serving in Iraq are confronting the terrorists and foreign fighters so peaceful nations around the world will never have to face them within our own borders.'
I'm sure the Iraqi civilians will love that. 'Hey guys, sorry about all the missing limbs, but you're doing a really good job of being flypaper so we don't get hurt. Cheers! Have a 15% corporate tax rate!'
Conferences: Hey -- I'm talking at ToorCon 2004 down in San Diego this weekend! Come along and check it out, if you can.
I'd better hurry up and file my presentation slides pronto ;) The topic is:
Spam Forensics: Reverse-Engineering Spammer Tactics
In this talk, I'll discuss how the SpamAssassin project has identified reliable signatures indicating that a message is spam, by reverse-engineering spammer tactics from the spam mails themselves. I'll also discuss several specific features that we have identified, how we found them, and why the spammers add them.
Patents: Now that the summer break is over, software patents are back on the EU's agenda. The FFII (via EDRI-gram) reports:
On 24 September 2004, the European Council will probably meet to rubber-stamp the 'political agreement' achieved on 18 May 2004 on the highly controversial software patents directive (2002/0047 COM-COD).
According to the FFII, the text was designed to mislead ministers about its real effects. 'It consists of many sentences of the form 'software is ... unpatentable, unless ... [condition, which, upon closer scrutiny, turns to be always true]'.' And, states FFII, 'It can be said with certainty that only a minority of governments really agrees with what was negotiated, but several governments were misrepresented by their negotiators, who broke intra-ministerial agreements or even violated instructions from their superiors.'
More info:
Web: Flickr's latest trend -- using just an eye (or similar minimalist face part) as your avatar pic:
Life: Luke writes:
Lean and I were joined by Lara at 6.10pm on Saturday 28th September. Lara is a little (8lb 7oz, so not _that_ little) girl. And she is gorgeous. Of course.
Congrats! I'll be dropping in on the three of them next week, looking forward to it...
Funny: The Daily Show's GWB reelection film: 'George W. Bush -- Because He Says So' (Quicktime MOV, 6MB). This is the funniest thing I've seen in ages.
Remember -- don't listen to the facts -- listen to the words!
(thanks to anaxamander for the file. This URL is cached through CoralCDN, so pass it on!)
Web: My Nearly-Live Planetary Desktop Backgrounds site is now using NYU's Coral Content Distribution Network instead of FreeCache.org. (FreeCache wasn't caching the files, because they were too small. drat.)
Coral is a 'decentralized, self-organizing, peer-to-peer web-content distribution network', using a distributed sloppy hash table and peer-to-peer DNS redirection infrastructure.
At least, apparently. ;) I haven't read the papers yet, but what I do know is that so far, it seems to be working perfectly -- each file is requested exactly once by the CDN servers:
193.10.133.129 - - [31/Aug/2004:16:50:31 +0100] "GET /xplanet/tmp/200408311455.399750/day_clouds_800x600.png HTTP/1.1" 200 706936 "-" "CoralWebPrx/0.1 (See http://www.scs.cs.nyu.edu/coral/)"
and never requested again. That's a big saving... nifty!
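For anyone wanting to check the "requested exactly once" claim against their own logs, here's a rough sketch. It assumes the combined log format shown above and that Coral's proxies identify themselves with a user-agent starting with 'CoralWebPrx'; the function names are made up for illustration.

```python
import re
from collections import Counter

# Combined log format:
# host ident user [time] "METHOD path proto" status bytes "referer" "agent"
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def coral_fetch_counts(lines, agent_prefix="CoralWebPrx"):
    """Count how many times each path was fetched by a Coral proxy."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("agent").startswith(agent_prefix):
            counts[match.group("path")] += 1
    return counts


def refetched(counts):
    """Return only the paths the CDN asked for more than once."""
    return {path: n for path, n in counts.items() if n > 1}
```

Feed it the lines of an access log; if `refetched()` comes back empty, the CDN really is fetching each file only once.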
Linux: Everyone who's used a non-MS system will have learned -- typically the hard way -- that not all hardware is equal. Not just in terms of specs, flexibility and power, but also in terms of whether or not it can be used at all.
Most hardware vendors consider their specification and interface documentation to be their crown jewels; giving access to these without a signed NDA is impossible. On the other hand, for free software developers, signing an NDA makes life quite difficult -- a driver can be written that way, but nobody else can help maintain it without signing an NDA too, the resulting code may 'disclose' too much of the 'IP', and so on. In a lot of cases, the vendor isn't interested in giving access to the specs, even with an NDA -- it's their IP, and why isn't the customer just using Windows?
The end result: lots of hardware with crappy support on non-MS operating systems.
Things aren't as bad as they used to be, though -- nowadays the high-end hardware is more likely to support standards, and Linux is a top choice on embedded hardware (set-top boxes, for example), so it has a much higher profile. But cheap, end-user oriented PCs still wind up with components from vendors who couldn't be bothered with non-Windows customers, and that can mean using a hacked-up, reverse-engineered driver and hoping it works. (That's not to denigrate reverse-engineered drivers; some of them work great. But fundamentally, the vendors are making a mistake here.)
So it's pretty impressive to see that LaCie are now sponsoring development of k3b, the CD/DVD burning application for KDE!
Good timing too, I was about to buy a DVD burner ;)
Linux: Here's a patch that adds support for ALSA/Artsd/ESD output from kgst, the KDE gstreamer middleware used by JuK.
Background: JuK is a great music player app for KDE. However, it hogs
the sound device while running, which means that nothing else gets access
to play sound until the app is shut down. This is suboptimal.
It does this because it plays sound via this chain of
components: juk -> kgst -> gstreamer -> sink. Unfortunately, the
kgst component doesn't allow control over what output sink to use,
instead hard-coding the string 'osssink' -- the OSS drivers, for
traditional Linux /dev/dsp sound. My laptop doesn't support mixing in
the sound hardware, which means I need to use a software mixer.
'osssink' doesn't support software mixing, instead giving the caller
exclusive access to the sound card, and other apps will just have to wait
for it to finish.
(As to why JuK doesn't just play mp3s by running 'mpg321 name-of-file.mp3', and let us specify the '-o' switch to use, I wish I knew. ObOldbieGripe: component-based architectures are full of this kind of needless over-complexity ;)
Anyway, the patch in the bug above lets the user provide an environment
variable for a string for kgst to use instead of 'osssink'.
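The idea behind the patch is just "environment variable with a hard-coded fallback". Here's a minimal sketch of that pattern; it is not the patch itself (kgst isn't written in Python), and the variable name KGST_SINK is a made-up placeholder, not necessarily what the patch uses.

```python
import os

DEFAULT_SINK = "osssink"     # the value kgst hard-codes today
SINK_ENV_VAR = "KGST_SINK"   # hypothetical name, for illustration only


def choose_sink(environ=os.environ):
    """Return the sink named in the environment, or the old default."""
    value = environ.get(SINK_ENV_VAR, "").strip()
    return value or DEFAULT_SINK
```

With nothing set, behaviour is unchanged; setting the variable to something like 'alsasink' before starting the player would route output through a sink that can share the sound device.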
Tech: This is the second entry talking about 'Life Hacks'. Possibly the best tip I came away from the talk with, is this one:
All geeks have a todo.txt file. They use texteditors (Word, BBEdit, Emacs, Notepad) not Outlook or whathaveyou.
What we keep in our todo is the stuff we want to forget. Geeks say they remember details well, but they forget their spouses' birthdays and the dry-cleaning. Because it's not interesting.
It's the 10-second rule: if you can't file something in 10 seconds, you won't do it. Todo.txt involves cut-and-paste, the simplest interface we can imagine.
It's also the simplest way to find information. EMACS, Moz and Panther have incremental search: when you type a "t" it goes to the first mention of "t", add "to" and you jump to the first instance of "to", etc.
Power-users don't trust complicated apps. Every time power-geeks has had a crash, s/he moves away from it. You can't trust software unless you've written it -- and then you're just more forgiving. Text files are portable (except for CRLF issues) between mac and win and *nix. Geeks will try the Brain, etc, but they want to stay in text.
I was already doing this, having learned the latter lesson ;), but I was making one mistake -- I was trying to keep the TODO.txt file small by clearing out old stuff, done stuff, and cut-and-paste snippets of command lines, and by moving things into files in 'storage' directories.
That doesn't work. You think you'll be able to grep for it later, but you'll have forgotten what to grep for. You'll even have forgotten what storage directory you used. The solution is to keep it all in one big file, and use i-search. That really does work.
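To make the i-search behaviour concrete, here's a tiny sketch of what the editor does on each keystroke: grow the query by one character and jump to the first match. The function name is invented, and real editors search from the cursor rather than the top of the file; this is only the core idea.

```python
def isearch(text, keystrokes):
    """Yield (query, offset) after each keystroke, like incremental search.

    offset is the position of the first case-insensitive match, or -1.
    """
    query = ""
    lowered = text.lower()
    for ch in keystrokes:
        query += ch
        yield query, lowered.find(query.lower())
```

The point is that you never have to know the whole search term up front; a couple of half-remembered letters are enough to start narrowing down a big file.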
In fairness, I actually have two files of this type. One is the "real" TODO.txt. But the other is a GPG-encrypted file containing usernames, URLs, passwords, nameservers, VPN settings, etc. I have a feeling this is another common Life Hack idiom, too...
Another great tip in the same vein, from JWZ -- make an /etc/LOG:
Every machine I admin has a file called /etc/LOG where I keep a script of every system-level change I make (installing software, etc.) I rsync these LOG files around (keeping redundant copies of all of them in several places) so that if/when I need to re-build a server from scratch, it's just a matter of following the script.
This has been working out great (when I remember to do it. Discipline! ;)
Tech: So Danny O'Brien's 'Life Hacks' talk is one of the most worthwhile reflections on productivity (and productivity technology) I've heard. (Cory Doctorow's transcript from NotCon 2004, video from ETCon.)
There's a couple of things I wanted to write about it, so I'll do them in separate blog entries.
(First off, I'd love to see Ward Cunningham's 'cluster files by time' hack, it sounds very useful. But that's not what I wanted to write about ;)
People don't extract stuff from big complex apps using OLE and so on; it's brittle, and undocumented. Instead they write little command-line scriptlets. Sometimes they do little bits of 'open this URL in a new window' OLE-type stuff to use in a pipeline, but that's about it. And fundamentally, they pipe.
This ties into the post that reminded me to write about it -- Diego Doval's atomflow, which is essentially a small set of command-line apps for Atom storage. Diego notes:
Now, here's what's interesting. I have of course been using pipes for years. And yet the power and simplicity of this approach had simply not occurred to me at all. I have been so focused on end-user products for so long that my thoughts naturally move to complex uber-systems that do everything in an integrated way. But that is overkill in this case.
Exactly! He's not the only one to get that recently -- MS and Google are two very high-profile organisations that have picked up the insight; it's the UNIX way.
There's fundamentally a breakage point where shrink-wrapped GUI apps cannot do everything you want done, and you have to start developing code yourself -- and the best API for that, after 30 years, is still the command-line and pipe metaphor.
(Also, complex uber-apps are what people think is needed -- however, that's just a UI scheme that's prevailing at the moment. Bear in mind that anyone using the web today uses a command line every day. A command line will not necessarily confuse users.)
Tying back into the Life Hacks stuff -- one thing that hasn't yet been
done properly as a command-line-and-pipe tool, though, is web-scraping.
Right now, if you scrape, you've got to do either (a) lots of munging in a
single big fat script of your own devising, if you're lucky using
something like WWW::Mechanize (which
is excellent!); (b) use a scraping app like sitescooper; or (c) get hacky with a
shell script that runs wget and greps bits of output out in a really
brittle way.
I've been considering a 'next-generation sitescooper' a little bit occasionally over the past year, and I think the best way to do it is to split its functionality up into individual scripts/perl modules:
Tie those into HTML Tidy and XMLStarlet, and you have an excellent command-line scraping framework.
Still haven't got any time to do all that though. :(
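As a taste of what one such small tool might look like, here's a rough sketch of a filter that reads HTML on stdin and writes one link per line on stdout, so it can sit in the middle of a pipeline. It uses only Python's standard library; the names are invented for illustration and it isn't part of sitescooper or any existing tool.

```python
import sys
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url=""):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


def main(argv=None):
    """Filter mode: HTML in on stdin, one URL per line out on stdout."""
    argv = sys.argv if argv is None else argv
    base_url = argv[1] if len(argv) > 1 else ""
    for link in extract_links(sys.stdin.read(), base_url):
        print(link)
```

Saved as a script with a final `main()` call, it would slot in between a fetcher like wget and whatever grep/sort/uniq steps come next, which is the whole point of doing it pipe-style.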
Spam: I'm quoted in New Scientist! w00t!
Slashdot picked it up pretty quickly. One comment there misses the point, though:
This is interesting and promising technology. But like all antispam techniques, spammers will find a way around it. Once spammers get a copy of the software, they can create and test countermeasures in the comfort of their own sleazy lairs.
It's worth talking about this. Newsflash: spammers have no difficulty testing their spam against closed-source spam filters, even when they can't 'get a copy' and test them in 'their sleazy lairs'.
How do they do it? Easy -- just set up an account at a site that uses that filter (for AOL, Yahoo!, Hotmail, and GMail, it's pretty obvious how to do that; for other closed-source filters, find an ISP that uses it). Then send 'test mails' repeatedly to that account, and apply trial and error to see what gets past the filter and what doesn't. Eventually, they figure out what works for that filter, and what doesn't.
How did I figure this out? Well, I came across the manual for the Send-Safe ratware on-line. It noted that the 'hashbuster' randomisation technique, which we in the SpamAssassin team had long assumed was intended to block hash matches by DCC, Pyzor and Razor, was in fact intended to block AOL's implementation of that system. The open source ones weren't even mentioned.
Update: found it -- from their FAQ:
Mime Encoded content
If you want to get into AOL... use it.
MIME encoders allow you to send documents written within a specific application through email without causing readability or formatting problems. For example, you can send a letter created in MSWord with and be certain that it arrives at its destination in the same format by encoding it with MIME first. The recipient then decodes it back into the original MSWord format.
That isn't why we use it though.
We use it to cause 'uniqueness'.
When you put a rotate tag at the beginning of a MIME encoded email, it causes everything after that point (including checksums) to be 'different' in every message.
Why is that that important?
Because it throws off filters that look for many copies of the same message to nuke.
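From the filter's side, here's a small sketch of why that works against exact-match checksums. It uses SHA-1 over the base64 body as a stand-in; that is an assumption for illustration, not how AOL's system (or DCC, Pyzor or Razor) actually computes its signatures, and the function names are invented.

```python
import base64
import hashlib
import secrets


def encode_body(body, prefix=""):
    """Base64-encode a message body, optionally preceded by a throwaway token."""
    return base64.b64encode((prefix + body).encode("utf-8")).decode("ascii")


def exact_checksum(encoded_body):
    """Stand-in for a naive 'have I seen this exact body before?' signature."""
    return hashlib.sha1(encoded_body.encode("ascii")).hexdigest()


def random_token():
    """A per-message random string, playing the role of the 'rotate tag'."""
    return secrets.token_hex(4)
```

Two copies of the same body give the same checksum, but put a different token in front of each copy and the checksums no longer match, so a counter keyed on exact checksums never sees a bulk mailing. That fragility is why content-based rules are still needed alongside hash-sharing systems.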
TV: A choice quote from NBC's Olympics coverage: 'This girl (one of the US beach volleyball team) reads a book a week!' (delivered in shocked tones.)
Web: My Dad runs a couple of websites -- his architectural photography business, and Andalucia Photo Gallery, a side project selling some lovely photos from the Andalusia region of Spain.
Needless to say, as the family geek, guess who coded all that up? Using WebMake, naturally ;) This was the main reason I wrote the 'thumbnail_tag' plugin.
You'll note, however, that the image to the right is watermarked, quite small, and encoded with a low quality setting. It turned out, after a couple of years of operation, that the images were being downloaded and used in print all over the place -- from both sites!
It seems photo piracy is rampant. Even with terms of use clearly linked on the sites, it's still commonplace for print publications to swipe the images -- and not just the little guys, either -- some big commercial names have apparently used the images without asking (or paying licensing fees).
The Andalucia gallery site was a favourite; being a good hit for 'travel photos spain' meant lots of images being used for holiday pages in magazines, newspapers, and so on.
Needless to say, digital watermarking software doesn't work -- it's trivial to load an image into Photoshop, resize or crop, and resave, apparently. Even if PS did respect the watermarks, netpbm doesn't, and a watermarked image isn't identifiable as such once it appears in print anyway! So we went for the blunt-tool approach, adding visible watermarks to the images.
It's pretty easy -- pamcomp allows you to overlay one image on top of another, using a third as an 'alpha mask' to control transparency. The results are pretty nice and not too intrusive.
It's a shame it has to be done, though... :(
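For the curious, the alpha-mask overlay boils down to a per-pixel weighted average. Here's a minimal greyscale sketch of that arithmetic in plain Python. It isn't pamcomp's code, and it assumes a mask value of maxval (white) means "show the overlay fully" and 0 (black) means "show the underlying image".

```python
def composite(base, overlay, alpha, maxval=255):
    """Blend overlay onto base, pixel by pixel, weighted by the alpha mask.

    All three arguments are equal-sized lists of rows of integers in
    the range 0..maxval. Results are rounded to the nearest integer.
    """
    out = []
    for base_row, over_row, alpha_row in zip(base, overlay, alpha):
        row = []
        for b, o, a in zip(base_row, over_row, alpha_row):
            # a/maxval of the overlay plus (1 - a/maxval) of the base
            row.append((o * a + b * (maxval - a) + maxval // 2) // maxval)
        out.append(row)
    return out
```

A mostly-dark mask with the watermark text in mid-grey gives the 'visible but not too intrusive' effect: the text lightens the photo underneath without blotting it out.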
Security: It looks like the security people are starting to take a look at RFID, and it's not pretty.
I link-blogged this the other day -- RFDump is a tool to display and modify data in RFID tags -- including deployed ones, at least in some cases. (Think rewriting the price tags in a shop, scrambling the tracking numbers on a warehouse full of goods, or corrupting frequent-shopper data on a card.)
It looks like this was also discussed at USENIX Security '04 in an RSA presentation (those notes are swarming with typos, but the content's there ;)
That talk has some interesting stuff -- 'blocker' tags which spoof readers with gibberish data, or crash the collision-detection network protocol; while that's being discussed as a security tool here, if the protocol is that hackable, and the hardware is available, I could see that having additional interesting effects in a supermarket. Of course, range is an issue -- but that hasn't stopped Bluetooth hacking, wardriving, etc.
If you ask me, it looks an awful lot like RFID is chock-full of security holes, and the features that make it so attractive (low power use, low cost, tiny size) will be the very features that militate against adding security. We could be in for interesting times here...
Spam: oh man, Spamusement started off well, and has just been getting better and better; * HEATH WARNING * had me laughing out loud, and the idea of linking the entries since August 8 as a series is genius.
Perl: So, I wrote a new CPAN module recently -- IPC::DirQueue. It implements a nifty design pattern for slightly larger systems, ones where multiple processes, possibly on multiple machines, must collaborate to deal with incoming task submissions. To quote the POD:
This module implements a FIFO queueing infrastructure, using a directory as the communications and storage media. No daemon process is required to manage the queue; all communication takes place via the filesystem.
A common UNIX system design pattern is to use a tool like lpr as a task queueing system; for example, this article describes the use of lpr as an MP3 jukebox.
However, lpr isn't as efficient as it could be. When used in this way, you have to restart each task processor for every new task. If you have a lot of startup overhead, this can be very inefficient. With IPC::DirQueue, a processing server can run persistently and cache data needed across multiple tasks efficiently; it will not be restarted unless you restart it.
Multiple enqueueing and dequeueing processes on multiple hosts (NFS-safe locking is used) can run simultaneously, and safely, on the same queue.
Since multiple dequeuers can run simultaneously, this provides a good way to process a variable level of incoming tasks using a pre-defined number of worker processes.
If you need more CPU power working on a queue, you can simply start another dequeuer to help out. If you need less, kill off a few dequeuers.
If you need to take down the server to perform some maintenance or upgrades, just kill the dequeuer processes, perform the work, and start up new ones. Since there's no 'socket' or similar point of failure aside from the directory itself, the queue will just quietly fill with waiting jobs until the new dequeuer is ready.
Arbitrary 'name = value' metadata pairs can be transferred alongside data files. In fact, in some cases, you may find it easier to send unused and empty data files, and just use the 'metadata' fields to transfer the details of what will be worked on.
Sound interesting? Here's the tarball.
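To illustrate the general directory-as-queue pattern (not IPC::DirQueue's actual API or on-disk layout, which this doesn't attempt to reproduce), here's a rough Python sketch. It assumes a local filesystem where rename within one filesystem is atomic, and it skips the NFS-safe locking and metadata support the real module provides. The class name and the tmp/queue/active subdirectory names are invented.

```python
import itertools
import os
import time
import uuid


class DirQueueSketch:
    """A toy FIFO queue that lives entirely in a directory."""

    def __init__(self, root):
        self.tmp = os.path.join(root, "tmp")        # files still being written
        self.queue = os.path.join(root, "queue")    # jobs waiting for a worker
        self.active = os.path.join(root, "active")  # jobs claimed by a worker
        for directory in (self.tmp, self.queue, self.active):
            os.makedirs(directory, exist_ok=True)
        self._counter = itertools.count()

    def enqueue(self, data):
        """Write the job to tmp/, then rename it into queue/ in one step."""
        name = "%020d.%06d.%s" % (
            time.time_ns(), next(self._counter), uuid.uuid4().hex)
        tmp_path = os.path.join(self.tmp, name)
        with open(tmp_path, "wb") as handle:
            handle.write(data)
        os.rename(tmp_path, os.path.join(self.queue, name))
        return name

    def dequeue(self):
        """Claim the oldest waiting job; return its path, or None if empty."""
        for name in sorted(os.listdir(self.queue)):
            claimed = os.path.join(self.active, name)
            try:
                os.rename(os.path.join(self.queue, name), claimed)
            except FileNotFoundError:
                continue  # another dequeuer claimed this job first
            return claimed
        return None

    def finish(self, claimed_path):
        """Remove a job once the worker has completed it."""
        os.remove(claimed_path)
```

The rename is what lets several dequeuers share one queue without a daemon: only one of them can win the move into active/, and the losers simply try the next file.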
Spam: So, CEAS was great fun, and very educational:
My highlight papers:
IBM's Chung-Kwei pattern-discovery system -- the one which Mark dug up. Very interesting stuff; it turns out that bioinformatics is full of large corpora of data (genomes) which you then need to find patterns in. Funnily enough, so is SpamAssassin: s/genomes/spam/, s/patterns/regular expressions/. The more advanced pattern-discovery algorithms even allow complex patterns to contain alternative blocks, 'don't-cares' and similar regular-expression-like features.
The really good bit of Chung-Kwei is the Teiresias algorithm (more pages, online demo). Of course, being IBM research, it's probably patented to the hilt, and may be tricky to license; but it's certainly pointed us in a whole new interesting direction -- anyone know any bioinformaticians?
IBM is really gearing up on anti-spam research. 4 of the 6 papers listed on their website were presented this year, at CEAS.
Another good paper was On Attacking Statistical Spam Filters, by Gregory L. Wittel and S. Felix Wu, which (similarly to Henry Stern's submission, which I helped a little with) dealt with an attack on Bayesian filters.
This is interesting stuff; we're pretty sure it's not as serious as it could possibly be, in SpamAssassin's implementation, but it's still a serious attack.
I took copious notes on the SpamAssassin wiki, if anyone's curious.
E-Voting: Paul Krugman: Fear of Fraud:
It's election night, and early returns suggest trouble for the incumbent. Then, mysteriously, the vote count stops and observers from the challenger's campaign see employees of a voting-machine company, one wearing a badge that identifies him as a county official, typing instructions at computers with access to the vote-tabulating software.
When the count resumes, the incumbent pulls ahead. The challenger demands an investigation. But there are no ballots to recount, and election officials allied with the incumbent refuse to release data that could shed light on whether there was tampering with the electronic records.
This isn't a paranoid fantasy. It's a true account of a recent election in Riverside County, Calif., reported by Andrew Gumbel of the British newspaper The Independent.
Here is Gumbel's account. It's quite simply crazy:
On March 4, Floyd and Cassel saw the second Sequoia employee, Eddie Campbell, return to the registrar's office and watched him pop into his pocket what looked like a PCMCIA card similar to those used to store votes on individual touchscreen machines. The Sequoia AVC Edge machines do not make a paper record of individual votes, and any record of total votes for a potential recount -- vital in a race separated only by 45 votes -- would only be stored on that kind of card.
Floyd shouted out: 'Where are you going with that?' But he received no answer.
Incredible.
Reading: Both jim winstead and Nelson Minar have praised Earth Abides, a 1949 post-apocalyptic novel where 'all but a handful of people die from a mystery disease', and the ensuing narrative 'follows one man's attempt to rebuild something like a society.' It seems a tip from original happy mutant Mark Frauenfelder was the pointer for both of 'em.
I'm a huge fan of the genre; I think it's something about our age group, growing up in the shadow of Reagan's 'Evil Empire' speeches, Threads and (much less terrifying) The Day After.
Given that, it looks like Earth Abides goes straight into the wishlist. However, I should make another couple of reading tips while I'm at it, in the same genre:
First off, Jack London's short story The Scarlet Plague (1912) is a clear antecedent to Earth Abides. In this story, too, a plague hits the planet and wipes out most of civilization; an old man talks to children who've known nothing but the post-apocalypse period. It's pretty short and well worth a read.
But my main recommendation is Kim Stanley Robinson's The Wild Shore (1984), first book of his Three Californias trilogy, and his debut novel.
It takes place in 2047, 60 years after a massive nuclear attack on the US, by Russian infiltrators (pretty dated, eh ;). The narrator is a teenager in a primitive agrarian community on the coast of southern Orange County. His group are farmers, living far away from the previously built-up areas; the people who live amongst those ruins are shunned, and the different tribes meet only occasionally to trade. Disposable butane lighters are a treasured commodity.
He gradually discovers that the US was once a superpower, and that they are now being kept in a virtually stone-age state by outside powers. The interesting factor here is that most sci-fi authors, at this point, would embark on a jingoistic, militaristic armed struggle; it initially seems that's what's happening, but Robinson takes a very interesting tack, in his own style, and this really makes the book something special.
(I won't go too far into it, but if you really want to know and don't mind spoilers, this site thoroughly spills the beans.)
Web: in passing -- here's a bookmarklet for the current day's Doonesbury comic strip: Today's Doonesbury.
Apache: Not content with distributing GPL'd software, Microsoft are now taking another step into the open-source world by shipping Apache-licensed code.
Not quite as big a deal as the GPL -- but still, another interesting milestone.
Spam: Some fantastic data in this paper from the Kentucky Long-Term Policy Research Center.
It's a brief 2-pager detailing the effectiveness of the CAN-SPAM Act in reducing the spam load, using a set of test addresses. The methodology is pretty good.
One point in particular is very important: 'opting out' from spam Just Does Not Work. This graph tells the whole story:

After opting out from spams received, the amount of spam received at those 'opted out' test addresses actually rose. (This even after CAN-SPAM made such activity explicitly illegal.)
Some other data:
If anyone needed proof, this shows that spammers are quite happy to break the law; strong enforcement 'teeth' are needed for any anti-spam legislation. (UK, take note: the thoroughly useless system whereby spam complaints must be submitted on paper isn't going to help!)
The Technical Details document also notes something interesting: one test address was set up to test 'opting out' of legitimate mass mail from some (unnamed) big websites, and continued to receive ads 'sometimes months after opting-out'. For shame!
(thx to John Levine for forwarding the links.)
Spam: Michael Radwin on open HTTP redirectors, and in particular noting that Yahoo! have (finally) closed their main one down. One down, several hundred to go ;)
Good history of the exploitation techniques that spammers have been using, too.
Funny: Some of the taint.org readership (that's you, Nishad) may be familiar with BEST SONG EVER.mp3 -- it's an insane, 10-minute workout: one guy ranting at a high pitch in some east-asian language at an incredible speed over some cheesy Casio, hardly taking a breath, punctuated by bizarre 7-Zark-7-style ribbits and squawks. By the end of it, he's nearly hoarse. It is incredibly bizarre. Turkopop has nothing on this.
Well, its origin has been discovered -- he's called E Pak Sa, and the style is called 'Pansori'. His version is a modern take on this ancient traditional style -- 'While singing, he would imitate the sound of all of the instruments used in the prelude and interlude, and even the sound of the whistle used to gather the tourists.' From there, he grew in popularity, especially in Japan:
'Sell-out concerts, myriad television appearances, riots at in-stores, and Japanese teens speaking Korean are all products of E Pak Sa's impact in Japan. E had infiltrated the popular culture of Japan and paved the way for other Korean artist to do the same.'
And guess what -- his Encyclopedia of Pon-Chak album can be listened to online! The YMCA cover -- track 2 -- is strongly recommended.
Spam: ever wondered what this weblog would look like if it was spam? wonder no more. (via crummy.com)
Security: Ross Anderson, crypto and security guru extraordinaire, moonlights as -- wait for it -- a street bagpipe player:
I play the pipes (the Great Highland Bagpipe and the Scottish smallpipes). I played competitively as a teenager, and thereafter paid my way through university by working as a street musician in Germany, France, the Netherlands and Denmark.
Only joking. But yes, he really does play the bagpipes. And that submission to the EU's consultation on the management of copyright and related rights is worth a read, to get an idea of how the new increased enforcement of music copyright has had chilling effects on the viability of the UK's folk music scene. (found via Karl-Friedrich Lenz.)
Hacks: Nearly-Live Planetary Desktop Backgrounds. 'a selection of desktop-sized high-quality PNG images, using near-real-time cloud data, and some very nicely rendered maps using satellite data, to create a nifty, nearly-live world map desktop background.'
Funny: Keira Knightley's photoshop boobjob has been all over the place recently -- it's a pretty extensive reworking. But then, that's standard practice nowadays...
However, best comment goes to stephendann:
In photo 2, she has the quad damage. The skin colour darkens, the chest expands, the stomach contracts and the character skin is obviously altered so the rest of the players know she's supercharged. In POTC:King Arthur, it's a more subtle damage modified than (the) UT2K4 glowing purple bow.
LOL!
Web: Worth noting for the various sites in Ireland and the UK that I've heard of recently, who have been looking for ways to do shared, collaborative calendaring of upcoming public events: upcoming.org is your man.
Pros: Clean CSS/XHTML layout; no ads; decent management; already covers European metro areas; event calendars are easily syndicated to other sites using RSS.
Works for me!
Unix: via Ted Leung, Adam Rosi-Kessel's Linux Tips page has some very useful tips, and this one's great -- to avoid getting SSH connection resets, add the following to your .ssh/config:
serveraliveinterval 300
serveralivecountmax 10
This will insure (jm: sic) that ssh will occasional send an ACK type request every 300 seconds so that the connection doesn't die.
As a similar tip that took a while to track down -- KDE users who've upgraded between KDE releases, will probably by now have seen lots of messages like this:
nameofapp (KIconLoader): WARNING: Icon directory /usr/share/icons/hicolor/ group 48x48/stock/text not valid.
It took a bit of googling about to find the cure:
Funny: BBC: DVD pirate's pitch ends in arrest:
A man has been arrested after trying to sell counterfeit DVDs - at a Trading Standards Office.
The man had apparently missed the sign on the office in Beehive Lane, Chelmsford, Essex, and asked if anyone would like to buy pirated films. Staff said they were very interested indeed in what he had to sell, but when he realised where he was he ran off, leaving his wares and £210 in cash.
Police later arrested the man in a supermarket in Chelmsford.
Movies: Hacking Netflix, via torrez.
Jason Kottke points out a great quote on a Friendster cross-site scripting attack: 'We have a policy that we are not being hacked.'
He also speculates that Google used the GMail invite-network data for whitelisting -- but whitelisting based on email address alone is trivially exploitable, so I'd doubt it.
I'm just back from a trip over to Cape Cod to meet family (halfway between here and Ireland, y'see ;) -- lots and lots of luvverly lobster and sundry shellfish -- and after a 6 day trip, had 5000 spams and a couple of thousand nonspam mails to deal with. Thankfully SpamAssassin dealt with the spams (only about 5 false negatives, no false positives I could spot) -- but I'm going to have to do something about that volume of mail. drowning in the stuff. argh.
Web: Back in 2002, it occurred to someone to check the Google search results for 'http', to figure out what the most popular sites were.
Looks like it's changed -- here's the top five results from a Google search for 'http' now:
My guess: older links are getting good PageRank under whatever newly tweaked algorithm they're using. But AltaVista beating Google? ;)
Spam: we've decided that SpamAssassin needs a logo update. If you've got the skills (and let's face it, it wouldn't be too hard to top my effort), please feel free to enter a logo in the contest!
TV: RTE's 'Prime Time' secured a fantastic interview with GWB, with Carole Coleman asking a few very pointed questions. Watch it with RealPlayer, or listen to the audio in MP3 (2.7Mb).
There's a pretty accurate transcript here:
Let me finish! How many times do I have to tell you how to do your job? See, I gotta insult France at least once. Then I gotta claim 'merica to be the most generous nation in the whole wide world, even though it's not true. And listen, let me mention that democracy in Pakistan, too. And guess what? I'm the first president to ever call for a Palestinian state and I'm damn proud of it - just look at the size of my smirk now. Listen, as long as I keep repeating myself and mouthing empty platitudes, you won't have a chance to call me on any of the bullshit coming out of my mouth.
OK, the official one is here.
It appears that the White House just dropped the ball on this one; reportedly, they had her list of questions three days in advance, but given that they suggested that she 'ask him a question on the outfit that Taoiseach Bertie Ahern wore to the G8 summit' (!!!), they weren't paying attention, and expected some kind of giggling moronic schoolgirl, or something.
Hilariously, the White House has since complained to RTE, the Irish Embassy, the Irish Government, and the reporter herself. Probably God, too. I doubt Prime Time will ever get a White House interview again, but given what they clearly expect from the poodles in the White House press corps, that's hardly much of a loss.
(I'd love to see what'd happen if he had to deal with Paxman ;)
Also, went to see Fahrenheit 9/11. Fantastic movie, and best of all, incredibly well-attended.
My favourite moment: the reminder of just how easily the US news media sold itself out during the war. Seeing Katie Couric blurting 'Navy Seals rock!!' like some kind of starstruck 5-year-old with an Action Man toy, was a classic. It's good to see that this will be immortalized in celluloid, as it was truly shocking at the time. (Not much has changed; Judith Miller is still writing for the NYT.)
Ireland: Here's a hot UL that's floating around the Irish web right now --
In a British program about Samuel L Jackson and Colin Farrell's lastest movie SWAT presented by British presenter, Kate Thornton, the following exchange occured:
... yeah, right. ;)
(Update: Actually, believe it or not, that's more or less how it really went. Here's the transcript.)
Some commentary at TheReggaeBoyz.com (quote: 'I NEARLY DEAD TO RASS!!!!') and Kuro5hin.
It looks like the TV programme does exist; no scripts online, unfortunately, so we'll never figure out if this one really happened, I think.
IMO, it's made up for sure. That last line is just a little too harsh for a primetime schmooze-a-gram, at the very least. Plus, it's the kind of thing only an Irishman would give a shit about -- the perpetual adoption of Irish celebs and worthies by the UK media is a continual source of irritation for the Irish -- as Dervala puts it:
'No, Oscar Wilde was ours. You put him in jail, though. And Shaw was ours. And Yeats. And Johnny Rotten.'
Web: Minor software announcement -- after some time using HTMLThumbnail, album, and even WebMake to build photo galleries, I finally got peeved enough, and gave in to the temptation of 'not invented here'. ;)
Presenting Uffizi, a CSS- and template-driven, themable perl script to generate photo galleries. Quoting the POD:
Image::Size and the ImageMagick convert command
I am, of course, using it on my own photo pages, and I'm very happy with it; it's been a while since I had to hack it. (I need to get it to thumbnail MPEGs as well, but apart from that it's teh nifty IMO.)
Spam: SpamAssassin is now officially an Apache top-level project! InternetNews.com coverage:
The Apache Software Foundation is taking the spam fight to a new level -- literally -- with the promotion of its Spam Assassin project to top-level status.
Hooray ;)
Spam: or, 'SlashDot spam drama'. So, a few days ago, I forwarded a link to a paper I'd been sent -- it's a great paper, and I'm not just saying that because SpamAssassin did well -- it really tests some of the popular open-source spam filters comprehensively, and correctly. (The authors have 24 years of information retrieval research between them.)
The results have been pretty incendiary. ;) Here's a timeline with links, in case you were wondering where we are right now:
UNIX: I've just made the first change to my core bash configuration in years, to add -b to the set command-line. It triggered some thinking about when the last one was.
It turns out that, apart from frequently writing scripts and aliases, I haven't changed my command-line UI in any respect for about 2 years. By contrast, I've been hacking about with GUI settings continually: new desktop backgrounds, themes, colours, etc. Odd!
Anyway, here's the tip -- it's very handy, I find.
I changed to using a 2-line prompt, with the first line containing the time and the full working directory, in a 'magic' cut-and-pasteable format:
: exit=0 Thu Jun 24 17:55:29 PDT 2004; cd /home/jm/DL
: jm 1203...;
Note that the prompt starts with ":", the shell's no-op builtin, so bash/sh treats everything up to the ";" as arguments to be ignored. The end result is that the entire line evaluates to "cd /home/jm/DL" when pasted. Hey presto, cd'ing several terminals to the same dir just involves triple-clicking in one, and middle-button-pasting into the others. nifty! Similarly, the second line has a little bit of prompt, but that snippet will be ignored when cut and pasted.
Having the exit status of the last command (bash var: $?) is useful too. The code:
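do_prompt () {
  # $1 is the previous command's exit status, passed in by PROMPT_COMMAND.
  # %q quotes the directory, so the pasted "cd" survives spaces in paths.
  printf ': exit=%s %s; cd %q\n' "$1" "$(date)" "$PWD"
}
PROMPT_COMMAND='do_prompt $?'   # executed before every prompt
do_prompt 0                     # set up first prompt
PS1=": $(whoami) \!...; "       # second line: user and history number
PS2="... >>; "                  # continuation prompt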
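To see why the first prompt line is safe to paste, here's a minimal sketch that feeds a line in that format back to the shell. The temporary directory is only a stand-in for /home/jm/DL, and eval stands in for the middle-button paste; neither is part of the real setup.

```shell
# Simulate pasting the first prompt line into another terminal.
target=$(mktemp -d)        # stand-in directory, assumed for this demo only
pasted=": exit=0 Thu Jun 24 17:55:29 PDT 2004; cd $target"

# ':' swallows "exit=0 Thu Jun ..." as arguments and does nothing;
# the ';' ends that command, so only the 'cd' has any effect.
eval "$pasted"
echo "now in: $PWD"
```

The same reasoning covers the second line: ": jm 1203...;" is just another no-op, so whatever is typed after it runs as normal.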
Software: Mark Twomey, in response to all the Win32 API stuff recently:
We now have a generation of computer users ... who have never received or sent email from a so called 'rich client', never had to send a postal order off to order something from some distant vendor, and are not amazed by something like a search engine. ....
Those ('rich client') people remind me of minicomputer users who crapped on the 'crummy little operating systems' used on 'crummy little desktop computers.'
He's right, you know -- for de yoot, Windows is generally just a way to access Hotmail.
Security: some crypto drama.
Ahmad Chalabi apparently told the Iranian government that the NSA had broken their secret code, according to 'US intelligence officials': NYTimes: Chalabi Reportedly Told Iran That U.S. Had Code. This story is still running -- Bruce Schneier has just posted his expert opinion, as has Ross Anderson. As I noted on Eric Rescorla's weblog, here's my (non-expert) theory ;)
It's known that the Iranians used Crypto AG equipment up until about 1992, and it's been widely reported that Crypto AG's systems were backdoored by the NSA and traffic routinely decrypted. (also, Baltimore Sun story, 1995)
Reportedly, the Anglo-Irish discussions of 1985 were a rather one-sided affair, because the Irish government used Crypto AG machines to communicate between their Embassy in London and Dublin, and intercepts of their reports were fed back to the UK government.
In addition, according to this article (backup), the NSA also provided Iraq with intercepts of Iranian secret traffic, while Iraq was a US ally -- which could explain why Chalabi would have known about it.
It also speculates as to how it was done:
'Knowledgeable sources indicate that the Crypto AG enciphering process, developed in cooperation with the NSA and the German company Siemans, involved secretly embedding the decryption key in the cipher text. Those who knew where to look could monitor the encrypted communication, then extract the decryption key that was also part of the transmission, and recover the plain text message. Decryption of a message by a knowledgeable third party was not any more difficult than it was for the intended receiver. (More than one method was used. Sometimes the algorithm was simply deficient, with built-in exploitable weaknesses.)'
So my opinion is that Chalabi's claim was very old news from the 80's and early 90's -- which pretty much fits in with the rest of his tip-offs to everyone else ;)
Politics: Kerry in Colorado:
"Just to put your minds all at ease, I have four words for you that I know will relieve you greatly," Kerry told the fund-raiser. "How does this sound? Vice President Hunter Thompson."
Travel: Great posting on culture shock and 'going native' at Yankee Fog.
Hacks: Dan Kaminsky's LayerOne presentation hits Slashdot. Definitely one of the highlights of that conference.
Spam: confession for two: a spammer spills it all. Interesting -- especially since the spammer winds up earning less than he would have working for Starbucks.
It's also worth noting this posting from Gary Smith on the sa-users list, in which Gary filled out a spam form with some not-entirely-valid info -- with hilarious results!
So I did talk to some of these lenders. Apparently they buy leads from www.lendergateway.com . One guy that I talked to was irritated because it costs him $100 per lead they sell him and it's supposed to only be sold to him. He apologized quite a bit and was nice enough to give me the information on who sold him the names. The number he game me goes to voicemail which I'm going to try later. A couple other people told me what I can do with myself and one lady kept saying that she couldn't give me information on who provided her with my information.
The stupid thing is each time I talk to them I tell them I'm on a cell and that I need their name and number and I'll call them right back. They give it to me... So when they hang up I start calling again and again. I've been irritating the hell out of them...
Anyways, that's the fun storing of what happens when these forms are filled out.
$100 per spurious 'lead' would make a serious dent, if enough spurious leads showed up... ;)
Net: WINW Is Not WASTE: 'WINW is a small worlds networking utility. It was inspired by WASTE ... (WINW) has diverged from its original mission to create a clean-room WASTE clone. Today, the WINW feature set is different from that of WASTE, and its protocol is incompatible with WASTE's protocol. However, WINW and WASTE achieve similar goals: they allow people who trust each other to communicate securely.'
Not quite there yet -- just a Windows version with no sharing -- but actively under development. One to keep an eye on...
Software: Economist: Unix's founding fathers (via sourcefrog.net). A very good article on Thompson, Kernighan and Ritchie's amazing achievement, with some new details I hadn't heard before:
AT&T was required under the terms of a 1958 court order in an antitrust case to license its non-telephone-related technology to anyone who asked. And so Unix and C were distributed, mostly to universities, for only a nominal fee. When one considers the ineptness of AT&T's later attempts to commercialise Unix -- after the court order ceased to be applicable because of another antitrust case which broke up AT&T in 1984 -- this restriction, an accidental boost to what would later become known as the open-source movement, becomes even more crucial.
So that's how that happened. Just think -- if it wasn't for that court case, we'd probably all be hacking on VMS. ;)
Also at sourcefrog, mbp points out that the Sulston reverse-engineering story is 'remarkably similar to that of Richard Stallman several years earlier, when the frustration of closed-source printer software helped motivate him to start the GNU project'.
Patents: yet another sourcefrog link, this time to a CNet story with a hilarious quote regarding software patents and the GIF/PNG debacle:
But Unisys credited its exertion of the LZW patent with the creation of the PNG format, and whatever improvements the newer technology brought to bear.
'We haven't evaluated the new recommendation for PNG, and it remains to be seen whether the new version will have an effect on the use of GIF images,' said Unisys representative Kristine Grow. 'If so, the patent situation will have achieved its purpose, which is to advance technological innovation. So we applaud that.'
Wow. Presumably by the same logic, they applaud al-Qaeda for improving airline security innovation, too...
Copyright: Cory Doctorow's DRM talk presented to MS research yesterday. This is a fantastic introduction to the issues regarding DRM; if you know someone who isn't convinced that DRM is A Bad Thing, this is the argument they need to read.
OSes: /.: France Considers Open Source. The usual arguments are going on in the comments, but some people still insist that they get better support from MS than from Linux vendors.
What planet are they on? Because it would have been handy for me to live there, on the occasions in the past where I've had to develop code on MS platforms, and administer networks of Windows PCs. In my experience, you do not get support from Microsoft. Instead, you do what you do with Linux -- go searching on Google, read MSDN, or post in the MSDN forums.
As far as I can see, there's zero difference between doing that with Windows, and doing exactly the same thing with Red Hat -- except in the latter case, you can turn up debug logging through a documented API or switch, use the source and fix it yourself, find the original developers and post a message to their core -dev list, or even ask them personally.
Where's this amazing support? Maybe the companies I've worked for just weren't paying enough, and therefore weren't significant blue-chip customers. Or maybe it's because we weren't based in the US, and so got support from less-skilled, less high-priority staff in a regional office. But I've certainly never experienced the support these advocates claim MS offers, which makes me think it's FUD as usual.
Literature: Happy Bloomsday Centenary! Google agrees:

You can have a read of Joyce's masterpiece online at online-literature.com, although this is certainly one text that works better on paper, to be pored over and parsed slowly. But regardless of whether it's readable on-screen or not, the legality of that copy is dubious, anyway.
As this Telegraph article notes, the copyright situation on Ulysses is, sadly, a total mess. Even 84 years after it was written, and promptly banned in the US, UK and Ireland for 'obscenity', Ulysses remains a thorny legal subject.
The novel was first published in 1922, and as such, fell into public domain in the UK in 1992, but was apparently 'pulled back' in 1996. According to this mail, due to recent copyright term extensions, the 1922 text will now remain in copyright in the EU until the end of 2011, and may not expire until 2032 in the US. And this Irish Times article notes that in Ireland, 'copyright on Joyce's works ran out on December 31st, 1991, 50 years after his death. However, EU regulations revived copyright from July 1995 when it extended the lifetime of copyright to 70 years.'
Reportedly, the Dail even had to pass emergency legislation last week to prevent an exhibition at Dublin's National Library from being sued by the Joyce Estate:
The threat to the exhibition has been caused by the 2000 Copyright Act which creates a doubt about its ability to display manuscripts bought by the State because the Joyce estate still holds copyright.
Hilarious. Recent overzealous copyright extension legislation snares governments too! But they get to rewrite the laws in emergency session to fix it ;)
All very ironic, considering Ulysses' structure was deliberately derived from The Odyssey in the first place.
Tech: Troubleshooters: Making a bootable CD from a bootable floppy image. Making a note of this for future reference -- it should be handy next time I need to do a BIOS or firmware upgrade on my Thinkpad.
I ran into the need for this recently when trying to upgrade the BIOS on my Thinkpad running Linux, so hibernation would work. IBM don't provide BIOS upgrade tools for Linux, so you have to keep a Windows partition around. (Yes, I pay the Windows Tax -- I've been bitten by proprietary firmware upgrades requiring it in the past, as in this case.)
Amazingly, however, even after paying the Tax, the 'non-diskette' BIOS upgrade (ie. the standalone Windows app) doesn't work from Windows XP! Instead, you get a hard hang when it tries to bring the machine down from XP to a single-app mode to perform the upgrade. Running from DOS similarly fails, because the BIOS upgrade app is a WIN32 application. Clever.
Eventually, I wound up reformatting my Windows partition and installing Windows 98 (!); running the BIOS upgrade app from that worked fine. But next time around, I should be able to save myself a few hours of MCSE imitation by using this floppy-to-CD trick... here's hoping. ;) PCs Are Hard.
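For reference, the general floppy-to-CD technique can be sketched with mkisofs, which treats a 1.44MB image passed via -b as an El Torito floppy-emulation boot image. This is a sketch of the usual approach rather than a copy of the Troubleshooters steps; the file names are hypothetical, and the RUN switch is just a safety catch added for this example so that nothing is written by default.

```shell
# All names below are hypothetical placeholders.
floppy_img="biosflash.img"   # bootable 1.44MB floppy image
cd_root="cdroot"             # directory that becomes the root of the CD
iso_out="biosflash.iso"      # ISO image to burn afterwards

make_boot_iso() {
    # -r: Rock Ridge extensions with sane file permissions
    # -b: El Torito boot image, given relative to the CD root
    # -c: boot catalog file, which mkisofs generates itself
    set -- mkisofs -r -b "$floppy_img" -c boot.catalog -o "$iso_out" "$cd_root"
    if [ "${RUN:-0}" = 1 ]; then
        # Real run: the boot image has to live inside the CD root.
        mkdir -p "$cd_root" && cp "$floppy_img" "$cd_root/" && "$@"
    else
        echo "$*"            # dry run: only print the command
    fi
}

cmd=$(make_boot_iso)
echo "$cmd"
```

Set RUN=1 to actually build the image. On systems that ship genisoimage instead of mkisofs, the same options should apply, though that's worth checking against the local man page.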
War: A couple of war links, I'll keep it short. ;)
High-profile air strikes 'killed only civilians'. 'The American military launched some 50 air strikes designed to kill specific targets during the Iraq war, it emerged yesterday, but none of them found its mark. Instead the air strikes had a high civilian toll, according to military officials serving at the time.' Still, it sounded good, like as if CSI were doing all the war strategification and stuff ;)
And: the Pentagon 'Torture Memos' took some tips from the torture techniques used in Northern Ireland in the 1970s.
Music: Licensing row mars iTunes launch. UK indie labels report that 'where Apple has spoken to labels the terms on offer have been commercial suicide', and as a result, they won't be selling their tunes via iTMS Europe.
I agree with Mark Twomey on this one -- bad move. This (and the prices!) reduce the Euro-iTunes offering to about the usefulness of whatever that one is that Real.com have (you know, the one you can't even remember the name of) -- and nobody in Europe buys major-label music online anyway.
Web: http://ws/ . Nifty!
Software: This mail contains a fantastic anecdote from The Common Thread: Science, Politics, Ethics and the Human Genome, by John Sulston, head of the Sanger Centre, and a joint winner of the Nobel Prize for Medicine. I'll reproduce some bits here:
Once the first fluorescence sequencing machines arrived, it became clear that we had to take control of the software. The machines worked well, but ABI (jm: the vendor) wanted to keep control of the data analysis end by forcing their customers to use their proprietary software. ...
I could not accept that we should be dependent on a commercial company for the handling and assembly of the data we were producing. The company even had ambition to take control of the analysis of the sequence, which was ridiculous. ...
So, one hot summer Sunday afternoon, I sat on the lawn at home with printouts spread all around me and decrypted the ABI file that stored the trace data. ... Within a very few days, Rodger and his group had written display software that showed the traces - and there we were. The St Louis team joined in, and they all went to decrypt more of the ABI files, so that we had complete freedom to design our own display and analysis systems. It transformed our productivity. Previously we'd only been able to get the traces as printouts, which we bound together in fat notebooks ....
I certainly feel that between us we did push ABI back a bit and denied to them complete control of this downstream software. It was the first experience of the kind of battle for control of information that I seem to have been fighting with commercial companies ever since: a foretaste of the much larger battles that would later surround the human genome.
Amazing. Was John Sulston the first Nobel Prize-winner to have to reverse-engineer a proprietary file format in the course of his research?
And would his actions be legal in the UK in a few years, once the IPR Enforcement Directive is transposed into law there?
Conferences: LayerOne was seriously great! Got to meet up with some really interesting people; discuss some nifty stuff; and get some new angles on the whole hacking scene.
Seriously, that was well worthwhile, especially in terms of potential new ways to deal with spam, and issues to watch out for in terms of spammer techniques in future. A great techie conf, and the boozing^Wsocialising was pretty good too ;)
I'm actually giving some thought to going to Defcon after that...
Spam: Reg: German hate mail spam attack stuns experts: 'Mailboxes in Germany and the Netherlands were flooded yesterday with spam containing German right-wing propaganda. Spammers used the Sober.G virus - a mass mailing worm that sends itself to email addresses harvested from infected computers - to spread their messages as widely as possible.'
The one good thing about this is that it might help some people realise that spam isn't all about porn and commercial email; any kind of mail can be spam, including political speech.
However, this may be a bit late for the US, since CAN-SPAM explicitly does not regulate political spam. ah well, you live and learn, I suppose. ;)
Security: A very interesting security paper -- Understanding Data Lifetime via Whole System Simulation. It combines virtual machines with data-flow tracking (a la perl's 'taint' mechanism, after which this site is named ;)
By modifying the Bochs VM to support tracking 'tainted' data, they found several cases in popular apps (Mozilla, emacs, and MSIE) where passwords entered from the keyboard are retained in memory, and thereby wind up on disk due to swapping.
This has been a known issue for a long time -- see the source for passwd.c from the 'shadow' package -- but aside from security-naive developers, several other factors have made it more complex recently:
memset()
In general, they suggest more use of buffer zeroing, even for low-level buffers that might not seem to require it (such as the X server's event queue, and the kernel input buffers).
BTW, a similar system they didn't mention is the Sidewinder firewall appliance, which uses what they call 'Type Enforcement' -- effectively, tainting the data based on which network interface it arrived on.
Overall, a very nifty paper. I wonder if Tal Garfinkel is related to Simson? ;)
Oil: a MeFi gem: expert opinion on depletion of the oil reserves. 'Simmons, Campbell, even the Iranian Bakhtiari agreed that the real situation of Saudi reserves is very bad. ... Not a rosy picture, even for optimists.'
Patents: Transcript of the rms talk from a couple of weeks ago.
Spam: Good Salon article on the new forms of spamming, such as Wiki and referrer-log spamming etc. Here's a good quote:
'The adult industry will likely be married to spam and its attendant distribution methods long past the evolution of man into beings of pure energy,' jokes Domenic Merenda, vice president of business development for Edge Productions, a company that operates adult-media properties.
There's a good deal of crossover -- I've seen both email and referrer-log spam advertising the same porn sites.
Web: the June part of the contest is over, but given that there's a July part still to go -- here's a 'Nigritude Ultramarine' link to Anil Dash.
I wasn't really bothered at all about this, until I came across this guy, whose technique involved spamming third-party Wiki sandboxes with backlinks. His excuse? 'A Sandbox (is) a part of a system in which everybody is urged to play around freely. Usually for testing purposes. You can post headings, paragraphs, lists and links here. The content in return will be indexed by Google.'
As this forum thread points out -- 'The SandBox page is there for a purpose: to allow users of the wiki to learn to use the software. It is not meant to be "a place where anyone can create backlinks."'
Sorry, that's spam in my book.
Mail: GMail users, check your mail; if mine was anything to go by, you should have three new invites to give out.
Web: Bernie Goldbach points to a site that's news to me: AnotherFriend.com. It's an Irish dating site.
I've had the odd discussion comparing dating culture in the US (organised 'dating') and Ireland and the UK (where it's a lot more casual), and I must say, I was really convinced that the Friendster/craigslist-style organised, web-mediated dating just wouldn't fly.
Seems I was wrong! Right now, there's 157 people online on the site, with a good half of those being logged-in, chatting users, and about 75% of those in turn being premium, paying members. Wow, not bad.
Politics: TheyWorkForYou.com is a triumph. The most incredibly detailed, and web-aware, hypertextual database of political activity I've seen yet. The web-awareness -- full of scraping, links, RSS and even community -- is what makes it amazing; the concept of being able to read news of your representative's latest speeches and voting record in your RSS aggregator is incredible. We need to get this out there for every country in the world.
It certainly beats Today in Parliament, that's for sure ;)
Aside: nice choice of username for the 'Site News' weblog:
Some sites linking to this entry
An error occurred: Connection error: Access denied for user: 'fawkesmt'@'localhost' (Using password: YES)
Weird: Incredible footage (WMV stream) of a guy who went nuts, converted a caterpillar earthmover into what is essentially a tank, and went on a GTA-style rampage through the streets of Granby, Colorado, a mountain town northwest of Denver. In the process, he destroyed the local bank, the newspaper, and several stores, seemingly working on the basis of (several) personal grudges.
Hacking: Amazing -- the Action Replay cartridge is still around!
To be honest, I'm quite surprised that the PS2 hardware platform allows any of this stuff without some mod-chip-style soldering... but then, it's pretty clear Datel have the technology to figure these things out. Impressive.
Aside: in my teens, I wrote demos on the Commodore 64 entirely in the Action Replay's built-in monitor. I tried using compilers that supported such luxuries as symbolic labels, variable names, etc., but the ability to halt the entire machine and debug extensively, with a single button press, was just too nifty ;)
Patents: lyranthe.org notes that the EU elections are coming up this Friday, 11th June. Accordingly, here's a single-issue roundup of the candidates, from what I've heard:
(PS: these are my opinions, not those of my employer. ;)
(updated: I'd left out Eoin Dubsky! my bad, now fixed.)
Perl: I've been writing a few convenience web-scrapers recently using WWW::Mechanize, with great success.
So the latest development, HTTP::Recorder, looks very nifty too:
HTTP::Recorder is a browser-independent recorder that records interactions with web sites and produces scripts for automated playback. Recorder produces WWW::Mechanize scripts by default (see WWW::Mechanize by Andy Lester), but provides functionality to use your own custom logger.
... Simply speaking, HTTP::Recorder removes a great deal of the tedium from writing scripts for web automation. If you're like me, you'd rather spend your time writing code that's interesting and challenging, rather than digging through HTML files, looking for the names of forms an fields, so that you can write your automation scripts. HTTP::Recorder records what you do as you do it, so that you can focus on the things you care about.
No SSL support yet, as far as I can see, but for simple scraping -- or as a good starting point for a more complex Mechanize script -- it looks like it'll work great.
Spam: Kasia raises a very interesting question. Here it is, in a nutshell:
Should the quality of an ISP's enforcement of its Acceptable Use Policy be a condition of its contract with its Regional Internet Registry, and therefore affect whether it can be assigned new network address space?
Head on over to her weblog if you have a comment on this.
Health: USDA orders silence on mad cow in Texas: 'The U.S. Department of Agriculture has issued an order instructing its inspectors in Texas, where federal mad cow disease testing policies recently were violated, not to talk about the cattle disorder with outside parties ... The order ... was issued in the wake of the April 27 case at Lone Star Beef in San Angelo, in which a cow displaying signs of a brain disorder was not tested for mad cow disease despite a federal policy to screen all such animals.'
Great idea -- if you want to avoid finding mad cow cases, just don't bother looking for them! The beef rendering plant in question reportedly supplies beef to McDonald's.
Press: LWN: A look at SpamAssassin 3.0 (article is subscriber-only until next week).
OSes: Kernelthread.com: Making an Operating System Faster. Great article on some OS-level optimisations Apple used in MacOS X -- including a nifty boot-time read-ahead system which reportedly more than doubles the speed of OS X reboots. nice!
Wildlife: here's another critter we encountered last weekend -- a baby Western Diamondback rattlesnake, hiding in a crevice.
Spam: The Spamometer; a 1997-vintage spamfilter along the lines of filter.plx. Interestingly, I hadn't seen this before -- who knows, if I had, SpamAssassin could have used a (0.0, 1.0) scoring system instead of the '5 point threshold'. ;) (Thanks, Gary!)
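To make the '5 point threshold' concrete: SpamAssassin adds up the scores of the rules a message hits, and calls the message spam once the total reaches the threshold (5.0 by default). A minimal sketch of that arithmetic follows; the rule names and scores are made up for illustration and are not real SpamAssassin rules.

```shell
# Hypothetical rule hits and scores, for illustration only.
hits="RULE_A 2.5
RULE_B 1.8
RULE_C 1.1"
threshold=5.0                # SpamAssassin's default threshold

# Sum the second column to get the message's total score.
total=$(printf '%s\n' "$hits" | awk '{ sum += $2 } END { printf "%.1f", sum }')

# Compare numerically; awk handles the floating-point comparison.
verdict=$(awk -v t="$total" -v th="$threshold" \
    'BEGIN { if (t + 0 >= th + 0) print "spam"; else print "ham" }')

echo "total=$total threshold=$threshold verdict=$verdict"
```

A (0.0, 1.0) scheme like the Spamometer's would presumably report a probability-style figure instead of an open-ended sum, but how it combines its tests isn't something I've checked.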
Conferences: I'm going to LayerOne; it looks interesting, and I've been hoping to bump into Danny O'Brien (who's there doing his Life Hacks talk) for a couple of drinks and a blather for quite a while. Other speakers look similarly interesting, in an 'offbeat hacker conference' way, so I think it'll be fun.
It conflicts with The Streets playing the Wiltern, but c'est la vie ;)
Life: I've learned one thing this weekend -- humans are not designed to function in the desert. I went bush-camping in the Anza-Borrego Desert state park with a few mates, and we quite simply baked in the 45C/113F degree heat. Walking 3 miles in that heat was easily equivalent to 15 miles in normal temperatures.
We did manage to catch a good look at one of the endangered bighorn sheep that live there -- the poor sheep was clearly trying to get to some water, but those damn humans kept getting in the way!
On the way back, we passed the aftermath of a forest fire near Temecula. Scorched earth.
Security: via IP -- a very scary article at Bruce Blair's Nuclear Column -- apparently, the secret unlocking codes on the launch control mechanisms of Minuteman nuclear missiles were deliberately set to '00000000' throughout the height of the cold war, because the Strategic Air Command 'remained far less concerned about unauthorized launches than about the potential of these safeguards to interfere with the implementation of wartime launch orders.'
Green: A couple of good /. comments on renewable power sources: one from a wind farm designer, and some anti-FUD figures for solar panels.
Music: The full text of The Timelords' The Manual (How To Have a Number One the Easy Way) is online:
THE JUSTIFIED ANCIENTS OF MU MU
REVEAL THEIR ZENARCHISTIC METHOD USED
IN MAKING THE UNTHINKABLE HAPPEN.
KLF 009B
1988 (YOU KNOW WHAT'S GONE)
Photos: the view out to sea from Seal Beach, just south of LA. (duh. thanks Ben, I'd b0rked the link earlier.)
Patents: via the FFII Kwiki, here's 2087 Microsoft USPTO software patents viewed roughly by subject matter. The 'Web' selection is particularly interesting.
Terror: The Atlantic: All you need is love -- how the terrorists stopped terrorism. Amazing -- marry them off!
Tourism: Pictures from Bangkok's new 'Sky Bar' -- open-air dining, 63 floors up, with no walls apart from 1.5-metre-high glass.
Funny: NYPD alerts cops to 'terror indicators'.
The NYPD has ordered its patrol force to be more vigilant about spotting and reporting possible signs of terrorism, including individuals who "express hatred for America." .... The cards advise them to contact counterterrorism investigators when they have suspicions over anyone who is, among other things, carrying driver's licenses from different states, videotaping utilities and tunnels or wearing fake uniforms.
Sounds like the Village People won't be playing NYC any time soon, then ;)
Weblogs: Greenpeace: Mysteries of the Deep -- 'the SV Rainbow Warrior left Auckland, New Zealand, on a voyage around the surrounding waters. Our mission: To highlight the irreversible damage caused to deep sea life by bottom trawling.' Official weblog maintainer for the voyage: one Daev Walsh. Nice one Daev!
Literature: So, more on this entry -- believe it or not, there's a Japanese Sourceforge project implementing a Wiki called 'mrkrgnao'. Japanese Joyce fans!
Funny: Who knew there was a Commodore 64 gang sign? PRESS PLAY ON TAPE, that's who!
Spam: Ever wanted to ask a question of one of the biggest 'e-mail deployers' on the planet? Aunty Spam's providing the venue, and accepting questions for Scott Richter, erstwhile star of the Daily Show. There's a few already up there.
Funny: via swhackit! -- Language registration: en-Spam-porn:
'One is very much tempted. It is certainly a unique orthography.'
Indeed. When I was offered "[t]ons of dwolnaoadble mvoies, pohtos and sotires", I quickly read past "mvoies" and "pohtos", but was stumped for a while by "sotires". Perhaps I was blocked by interference from "satires".
But I think that registration will fail, because there are no descriptive works provided for the Language Tag Reviewer to consult.
Marketing: It appears that MATRIX (the Multistate Anti-TeRrorism Information EXchange) at one stage did -- and may still -- include a 'terrorism quotient' field, representing 'a statistical likelihood of (people) being terrorists'.
Seisint, the company providing the system, is a Boca Raton, FL company founded by Hank Asher, previously of Database Technologies, the company that 'stripped thousands of African Americans from the Florida voter rolls before the 2000 election, erroneously contending that they were felons'. Lovely.
Boca Raton, eh? Yep, there's a spam connection -- Hank Asher also, apparently, bought eDirect.com from noted Boca-based spammer Steve Hardigree (ROKSO record).
The email in the linked article goes on to note that Asher and Hardigree had 'disagreements' regarding 'how eDirect should position itself in the Direct Marketing Community', so I doubt Asher necessarily approved of spamming -- but it does appear he had interests in Direct Marketing.
Given that, I suggest a new spin-off strategy for Seisint's 'terrorism quotient' field, courtesy of my mate Luke: terrorist-targeted direct marketing!
Those turrists are in the market for lots of high-profit-margin goods:
All Seisint have to do is SELECT Name,EmailAddress FROM MATRIX WHERE TQ > 120, do a mail run, then watch those non-consecutively-numbered US dollars roll in. Easy!
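For the curious, here's that one-liner as a runnable toy, using Python's built-in sqlite3 module. To be clear, this is only a sketch: the table layout is guessed purely from the column names in the query above, the rows are made-up placeholders, and none of it reflects how the real MATRIX system stores anything.

```python
import sqlite3

# Hypothetical schema: only the column names come from the query in the text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MATRIX (Name TEXT, EmailAddress TEXT, TQ INTEGER)")

# Placeholder rows, invented purely for illustration.
conn.executemany(
    "INSERT INTO MATRIX VALUES (?, ?, ?)",
    [
        ("Placeholder A", "a@example.com", 35),
        ("Placeholder B", "b@example.com", 121),
        ("Placeholder C", "c@example.com", 120),
    ],
)

# The query from the text, with the threshold passed as a bound parameter.
rows = conn.execute(
    "SELECT Name, EmailAddress FROM MATRIX WHERE TQ > ? ORDER BY Name",
    (120,),
).fetchall()

print(rows)
```

Note that the comparison is strict, so a row sitting exactly on the threshold doesn't make the mailing list.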
Tech: I should note this here just in case anyone finds it useful. A handy tip for anyone visiting Caesar's Palace; their 'Business Center' doesn't have wifi yet, but (cough) one of their neighbours certainly does ;)
Travel: I've just spent a week in the UK; much culture was imbibed, I got to see Michael Landy's Semi-detached at the Tate, met up with some good mates including the pregnant Lean, and was a happy camper overall.
Then I had an 11-hour transatlantic flight, stuck in the middle of a 5-seat row with pointy elbows on both sides; then, best of all, arrived at US Immigration and found myself fingerprinted and had my photo taken, in accordance with their new policies under the US-VISIT program.
Apparently the biometrics equipment providers are a company called Cross Match Technologies. Fingers crossed (arf!) they have better false positive rates than their competitor, Identix.
I'm looking forward to seeing similar false-positive-prone usage of biometric data, for US visitors to other countries in response. (With hilarious results!)
Aside: I wonder how Matt's cooking-related-program-activities injury (http://use.perl.org/%7eMatts/journal/18915) will affect his biometric profile?
Also of relevance -- apparently Boston are introducing random spot-checks of passengers' papers on their metro transport.
It's interesting that travel by train requires a passport, driver's license, or similar heavyweight documentation -- but one can zip around the country unimpeded by road. Of course, all of this is moot, seeing as the 9/11 hijackers had perfectly-in-order documentation, including driver's licenses, and travelled extensively under their real names and passports. One wonders what exactly all this has to do with the War Against Terror, given that.
Funny: Knight Foundation, featuring a downloadable David Hasselhoff Paper Plane! Don't forget, the song 'Hot Shot City' is particularly good.
Patents: According to Ciaran O'Riordan of IFSO, one key aspect of the EU Council's meeting on the software patent legalisation proposal hinged on the use of the phrase 'as such', to effectively sneak a loophole past the Council members:
I recommend that everyone listen to the recordings of the Council's meeting. Transcripts are also linked from there, but the tone of voice etc. is interesting.
Anyway, basically, the people in the room didn't understand the implications of the text (that's our fault).
Bolkestein added an amendment: "computer programs will not be patentable as such" - this (rightly) fooled most people into thinking that software would not be patentable. Really, it just means you can't patent software as software, you have to patent "software running on a computer". I think the rejected part of the German amendment would have closed this loophole. .....
Anyway, the point is that the Council members were on our side, we just hadn't told them precisely what we want .... We told them "no to software patents", and they think they've done that. We should have said "no to 'as such'", and similar textual lobbying rather than implication lobbying.
Spam: Yahoo!'s DomainKeys proposal for sender auth.
I'm in the UK this week, so commenting in detail isn't too easy right now. But briefly, the big problem I foresee for DK is dealing with mailing lists and forwarders.
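A rough illustration of why mailing lists are awkward for any scheme that signs message content. This is a sketch only, and not the DomainKeys algorithm: DomainKeys uses public-key signatures, while the code below swaps in a keyed hash (HMAC) from Python's standard library as a stand-in. The key, addresses, list name and footer are all invented for the example.

```python
import hashlib
import hmac

# Stand-in for the sending domain's signing key (hypothetical value).
SIGNING_KEY = b"hypothetical-sender-domain-key"


def sign(headers, body):
    """Return a hex signature over the given headers and body."""
    material = "\r\n".join(headers) + "\r\n\r\n" + body
    return hmac.new(SIGNING_KEY, material.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(headers, body, signature):
    """Check the signature against the message as it was received."""
    return hmac.compare_digest(sign(headers, body), signature)


# The message as it leaves the sender.
original_headers = ["From: alice@example.com", "Subject: patch review"]
original_body = "Looks good to me.\r\n"
signature = sign(original_headers, original_body)

# The same message after a list has tagged the subject and appended a footer.
list_headers = ["From: alice@example.com", "Subject: [dev-list] patch review"]
list_body = original_body + "-- \r\nTo unsubscribe, mail dev-list-request@example.org\r\n"

direct_ok = verify(original_headers, original_body, signature)
relayed_ok = verify(list_headers, list_body, signature)
print(direct_ok, relayed_ok)
```

The untouched message verifies and the list-modified copy does not, even though the author and the text she wrote are unchanged. Any intermediary that rewrites signed content runs into the same thing.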
I did spot this oddity in the patent license, though:
Yahoo! will grant a royalty-free, worldwide, non-exclusive license under any Yahoo! patent claims that are essential to implement or use any Implementations so that licensees can make, use, sell, offer for sale, import, or yodel Implementations; provided that the licensee agrees not to assert against Yahoo!, or any other Yahoo! licensees of Implementations, any patent claims of licensee that are essential to implement or use any Implementations.
My emphasis. "Yodel"? ;)
But seriously -- patents will make implementation of this tricky for open-source projects, unless those terms are extended to allow the license to be transferable and usable indefinitely.
Patents: argh. That's all I can say for now. :(
Ireland: Update update! The Stallman talk is now free (-as-in-beer), apparently. No more updates, any further news will just be on their site. ;)
Ireland: So I forgot to mention who's running the Richard Stallman talk in TCD next week.
It's IFSO, the Irish Free Software Organisation, with some help from TCD Netsoc apparently (so there'll be a nominal 3-euro charge for the room from them).
Latest news on their news page...
Compare this recent statement from Minister Mary Hanafin, Minister of State with Responsibility for the Information Society, and this extract from 'Why Microsoft Wins' advertorial written by a Microsoft product manager, Sunday Business Post, 2004-05-02:
ILUG have already written an article in response to this pretty obvious prompting of a government minister by a commercial interest.
(thanks to ompaul at lwn.net for pointing that out.)
Circularity: My long-distance provider, Primus, is using SpamAssassin for spam-filtering at their ISP end!
Does this mean I can get a discount? ;)
GNU: Hey, Dublin-based people! Richard Stallman will be giving a talk titled 'The Dangers of Software Patents' in Dublin on May 24, at 19:30. It'll be in the TCD Hamilton building, right beside Pearse St. DART station. I've never seen him speak, but I hear it's definitely worth attending, and his message needs to get out there, further into the Irish software industry and political circles.
Also on patents: good news via groklaw.net -- Germany has stated they plan to vote against the Irish software patent legalisation plan, and some French ISVs are asking Chirac to do likewise.
Funny: Here's the Daily Show segment with Scott Richter (WMV, 9.8MB).
Just ignore the lame subtitles added by whoever encoded the file... the rest of it's seriously funny! 'Clitorious', indeed.
Update! 2004-04-13: thanks to Lisa Rein, there's now a 10MB Quicktime .mov version, sans unfunny subtitles. I'd strongly suggest downloading that instead.
Patents: There's a good discussion over at Joi Ito's weblog on software patents.
Unfortunately, there's a persistent, and popular, fallacy that crops up quite frequently in these discussions, and does so here in the comments:
'much of the processing of patents has been, to use understatement, deficient. An invention that is 'silly or obvious' will likely not pass the appropriate legal test - if this test is applied by people who understand the inventive technology .... while I agree with most of your observations about deficiencies, I fail to see the logic in your solution (to simply outlaw these kinds of inventions).'
So, what the commenter is saying is that the patenting of software and business methods would be acceptable, if only the 'inventive bar' was raised so that trivial patents were not granted.
The problem with this is that they patent ideas instead of physical inventions.
A parallel would be to allow the patenting of plot-lines in fiction, meter in poetry, or combinations of ingredients and cooking methods in recipes. These are all ideas, transformed into output 'products' by performing them as input on a set of hardware (books, cooking equipment), in the same way as software patents and business method patents are abstract ideas that operate on input, generating output, when implemented on a CPU. So, should they be patentable, too?
Patenting of physical designs is fundamentally different from patenting of abstract ideas in one key way. Physical designs must function correctly under real-world physics, and this requires extensive up-front design and prototyping, before they can be turned into mass-produced products.
Abstract ideas can be developed mentally, and the up-front work required before the idea can be put down on paper is trivial by comparison.
Consider these EPO patents: EP0807891 (Sun's 'shopping cart' patent) or EP0689133 (Adobe's 'tabbed palette window' patent). The up-front work required to devise these applications is trivial to anyone with a rudimentary knowledge of UI design; the hard part appears to be writing the legalese, and I understand the patent lawyers take care of that part. ;)
Compare with US patent D0450164, a design patent for a Dyson washing machine. The level of detail is massive, the specifications are extensive, and it's clear a lot of work had gone into the process before the patent application was filed.
In addition, the commenter assumes that extensive prior art searches really do take place. From what I've heard from patent applicants, and from what I've observed in the range of granted software patents, these searches are cursory at best, and generally performed by the patent lawyer and the examiner, not the applicants themselves.
I've even observed a few patents where prior art, cited in the patent, implemented exactly what was claimed!
Toys: Argh, I've been infected by the Breedster STD!
Apparently, though, there's a way around it through reincarnation, or -- rumour has it -- through touching Asriel, the bug with the power to heal.
In the meantime, paranoia reigns, and this time of crisis has brought out the worst in some bugs:
It's an interesting piece of emergent net-art, if you ask me, but the STD is pissing me off. (it's itchy!)
Literature: Ulysses:
The cat walked stiffly round a leg of the table with tail on high.
-- Mkgnao!
-- O, there you are, Mr Bloom said, turning from the fire.
The cat mewed in answer and stalked again stiffly round a leg of the table, mewing. Just how she stalks over my writingtable. Prr. Scratch my head. Prr.
Mr Bloom watched curiously, kindly the lithe black form. Clean to see: the gloss of her sleek hide, the white button under the butt of her tail, the green flashing eyes. He bent down to her, his hands on his knees.
-- Milk for the pussens, he said.
-- Mrkgnao! the cat cried.
They call them stupid. They understand what we say better than we understand them. She understands all she wants to. Vindictive too. Cruel. Her nature. Curious mice never squeal. Seem to like it. Wonder what I look like to her. Height of a tower? No, she can jump me.
-- Afraid of the chickens she is, he said mockingly. Afraid of the chookchooks. I never saw such a stupid pussens as the pussens.
-- Mrkrgnao! the cat said loudly.
mrkrgnao.com is available ;)
Patents: Newsforge: Patents in an open source world, by Lawrence Rosen (founding partner of Rosenlaw and Einschlag).
Interesting article, but I'm not sure summary point number 2 ('continue to document our own "prior art" to prevent others from patenting things they weren't the first to invent') really helps, when the patent examiners clearly haven't performed the simplest Google check. In the past, I've found obvious prior art in 30 seconds by plugging 3 words from a patent's claims into Google (and yes, I have a reasonable idea how to read patent claims by now).
Point number 3 is interesting, since it contradicts most other advice I've read regarding patent searches: 'Conduct a reasonably diligent search for patents we might infringe. At least search the portfolios of our major competitors. (This, by the way, is also a great way to make sure we're aware of important technology advances by our competitors.) Maintain a commercially reasonable balance between doing nothing about patents and being obsessed with reviewing every one of them.'
However, this comment really is interesting and raises something major that I'd never heard of before -- users of proprietary software can also face a significant risk from the patent threat. In particular, according to the linked comment, Microsoft licensed some patented technology from a company called Timeline Inc., but the license was not sublicensable -- in other words, it did not grant their customers the rights to fully use the technology! (In fairness to MS, this was established later in court.) Result: MS SQL Server OEMs and ISVs are now being sued (http://trends.newsforge.com/comments.pl?sid=39443&cid=96153).