-
just firewall out RSTs, and the Great Firewall's keyword blocker is defeated
-
Mark Pilgrim's suggested apps for an Ubuntu desktop -- some quite good suggestions here, with lots of KDE goodness. I just wish amaroK was as user-friendly and usable as the amazing (but not well-maintained) JuK, though
-
wow, Adam Shostack joins MS!
-
The set designer for '24' helped design the operations center at the National Counterterrorism Center, apparently. I bet that really helps (via substitute)
-
'Observations on British and American English by an American linguist in the UK', via Ben. I fall between these two stools on a regular basis, or three if you count Hiberno-English as well
-
author of "Phishing Exposed", general smart guy where phishing attacks are concerned. (also: Amazon does blogs now?)
-
Bible annotation using Amazon's "Mechanical Turk" HIT service; a success. However they did invite their blog readers to participate, which would have skewed results by providing willing participants
-
a great post from John Gruber, pointing out the key problem with DRM -- it forces vendor lock-in, and precludes interoperability, as a core design goal
-
this year's CEAS, July 27-28 2006. CEAS is reliably the best anti-spam conference; worth attending, although I won't be this year
-
'[the book] is about functional programming techniques in Perl. It's about how to write functions that can modify and manufacture other functions.' wow, missed this -- sounds AWESOME
-
it's pretty hard to find decent maps of Dublin online -- these are very good; although not quite Google-Maps-shiny, they surpass GMaps' quality in terms of data (via Sander Temme)
-
Michi Henning (!) slates the history of CORBA extensively, blaming the OMG's process and praising the open source community. wow (via slashdot)
-
for sale: one partially plutonium-contaminated Pacific atoll, 718 miles from its nearest neighbour; unfortunately the golf course is closed. see also Ballard's 'Terminal Beach'
Taken last week in Aigufreda on the Costa Brava, Catalunya, Spain.
-
Mail.app proprietary crapness. 'I’m forced to migrate all my mail yet again from yet another proprietary format, and the best documentation I’ve found so far is on LiveJournal. .. somebody deserves to be fired for that.'
-
Maya meticulously recorded almost every penny/baht/kip/ringgit spent over the course of her 6-month travels through SE Asia. about right, going by my own experience; I wish I'd bought more souvenirs
Adrian Weckler posts details of Vodafone Ireland's new flat-price datacard: it costs 50 Euros per month including VAT, is fully flat-rate (hooray, something useful at last!), and they claim that they'll be rolling out HSDPA, which offers 1.2Mbps to 11Mbps rates, 'starting in Dublin in October'.
Those are great numbers, but further info seems thin on the ground; they haven't bothered updating their own website yet, amazingly.
Anyone got further info? What rates does it offer right now? How would one order such a beast?
-
'focuses on the development of interoperability solutions for digital media, and the reverse engineering of proprietary systems for which licensing options are non-existent or impractical' -- and have hired Jon Lech Johansen
-
C|Net's distributed filesystem, a la GFS, Mogilefs (via acme)
-
good data on large-scale spammer behaviour as of 2006, presented at NANOG37. Relay-IP-based techniques not so good any more, but we knew that. Unfortunately doesn't analyze SURBL/URIBL content-oriented DNSBLs, which have picked up the slack nicely
I took the plunge over the weekend, and live-upgraded to the new 'Dapper Drake' Ubuntu release -- ouch. Here are the two key lessons I learned:
Don't run "grub-install" in a misremembered attempt to update the current GRUB boot menu 'menu.lst' file with the new kernel; sadly, this will quietly remove important details from your old menu.lst, such as "initrd" lines, rendering those kernels unbootable. Moral: ensure brain is in gear before meddling with MBRs!
If you're a Kubuntu user, watch out. Ensure you run
apt-get install ubuntu-base ubuntu-desktop
-- bringing the entirety of GNOME up to date -- as well as
apt-get install kubuntu-desktop
after the upgrade; it appears that some part of a new hotplugging subsystem is not included as a dependency of kubuntu-desktop. Failure to do this results in an inability to use USB/hotpluggable devices, including internal devices like the Synaptics touchpad. No pointer devices (mice or touchpads) means no X server at boot, which is always a little annoying.
Some day I'll just do things the right way, and do a fresh-from-CD install instead. Ah well. The good stuff: the new kernel, or possibly Xorg, is proving to be a lot speedier -- window updates are noticeably smoother; and the new Ubuntu GNOME theme is similarly tasty.
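For the first lesson above, here's a rough sketch of a sanity check that could be run against menu.lst before rebooting: it lists boot stanzas that have a "kernel" line but no "initrd" line. It assumes the legacy GRUB layout (a "title" line starting each stanza); the function name and the sample kernel versions are made up for illustration, and this is not something the Ubuntu upgrade tools provide.

```python
def stanzas_missing_initrd(menu_lst_text):
    """Return the titles of stanzas with a 'kernel' line but no 'initrd' line.

    Assumes legacy GRUB menu.lst syntax: each boot entry starts with a
    'title' line, and comment lines start with '#'.
    """
    stanzas = []
    for raw in menu_lst_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        keyword = parts[0].lower()
        if keyword == "title":
            stanzas.append({
                "title": parts[1] if len(parts) > 1 else "",
                "kernel": False,
                "initrd": False,
            })
        elif stanzas and keyword in ("kernel", "initrd"):
            stanzas[-1][keyword] = True
    return [s["title"] for s in stanzas if s["kernel"] and not s["initrd"]]


# Hypothetical menu.lst fragment; the kernel versions are illustrative only.
SAMPLE = """
title  Ubuntu, kernel 2.6.15-23-386
root   (hd0,0)
kernel /boot/vmlinuz-2.6.15-23-386 root=/dev/hda1 ro quiet splash
initrd /boot/initrd.img-2.6.15-23-386

title  Ubuntu, kernel 2.6.12-10-386
root   (hd0,0)
kernel /boot/vmlinuz-2.6.12-10-386 root=/dev/hda1 ro quiet splash
"""

if __name__ == "__main__":
    for title in stanzas_missing_initrd(SAMPLE):
        print("no initrd line:", title)
```

Not every entry needs an initrd (a memtest entry typically has none), so treat the output as a list of things to eyeball rather than a list of errors.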
CVE 2006-2447, in which Radoslaw Zielinski spotted a nasty in spamd's 'vpopmail' support in pretty much all recent versions of Apache SpamAssassin.
If you use spamd with vpopmail, go read the advisory and determine if you need to take action. Not many people will need to, I think; it's a very rare setup. Still, it's important to get the warning out there anyway.
The irony is that the bug is triggered partly by the "--paranoid" switch. This was intended to increase security, by increasing paranoia when possibly-unsafe situations arose -- hence providing a great demonstration of how the addition of optional code paths, even with the best intentions, can reduce security by allowing bugs to creep in unnoticed.
-
fake 419s, 'How to Explain Enron to Your Children', and 'we falsify commodity markets so that we can deliver physical commodities to our customers at a ridiculously unsustainable price' -- all scraped from the Enron mail corpus
-
great notes on speeding up javascript; I have a Greasemonkey script this will be useful with, once I get some tuits (via yoz)
-
Microsoft CEO Steve Ballmer attends wedding; a parent asks if he'd have a look at their PC; Ballmer spends _no less than two days_ attempting to rid it of encrusted malware infestations -- before giving up and shipping it back to Redmond. hilarious
Regarding the O'Reilly/CMP "Web 2.0 (SM)" trademark shitstorm, Sean McGrath humorously suggested a workaround -- using a different revision number instead of "2.0", specifically e, 2.71....
However, it's not quite that simple in many jurisdictions, apparently. It seems that trademark law -- in the US, at least -- allows trademarks which include a number to also cover uses within roughly plus or minus 10 of that number. In other words, CMP's application will cover the range from Web -8.0 (SM) (assuming negative numbers are included?) to Web 12.0 (SM).
So much for "Web 3.0", "Web 2.1", "Web 2.71...", and so on. Back to the drawing board, Sean! ;)
(disclaimer: IANAL, of course. Credit to Craig for that tidbit.)
Update: doh, got the value of e wrong...
-
I got slashdotted yesterday! Unfortunately, stock WordPress falls over pretty quickly. Once I managed to get this plugin installed, though, things were a lot better... thumbs up for WP-Cache
-
I need to do this soon; damn copy-on-write disk images are chewing up my disk space
-
one large website's password list analysed; 1.4% of passwords were "123456", and 2.5% overall began with 1234
-
Dapper is now released -- and is live-upgradable via apt-get. am I stupid enough to do this? quite possibly; I've done it for the past 5 upgrades
-
a message router for pings, for web pages containing microformat data. Interesting to see that Upcoming.org is currently the only ping producer -- their pings are then consumed by evdb, the only third-party ping receiver listed
-
graph of request frequency over the past few days at taint.org; that spike was pretty major
-
great article on current grid computing, featuring MPI, MapReduce, Hadoop, and promising a new UNIXy thing from tbray called Sigrid (ha!). Mind-boggling quote from Jim Gray: 'Memory is the new disk. Disk is the new tape.'
-
spontaneously converts the off-patent anhydrous form of the drug into the patented hemihydrate form, which then successively converts more and more of the anhydrous form, Ice-9-style. Never mind "viral" licenses, this takes the biscuit! (via substitute)
An interesting article on blog-spam countermeasures -- Google's embarrassing mistake. Quote:
I think it's time we all agreed that the 'nofollow' tag has been a complete failure.
For those of you new to the concept, nofollow is a tag that blogs can add to hyperlinks in blog comments. The tag tells Google not to use that link in calculating the PageRank for the linked site. [...]
Since its enthusiastic adoption a year and a half ago, by Google, Six Apart, WordPress, and of course the eminent Dave Winer, I think we can all agree that nofollow has done -- nothing. Comment spam? Thicker than ever. It's had absolutely no effect on the volume of spam. That's probably because comment spammers don't give a crap, because the marginal cost of spamming is so low. Also, nofollow-tagged links are still links, which means that humans can still click on them -- and if humans can click, there's a chance somebody might visit the linked sites after all.
I agree. At the time, I pointed at this comment from Mark Pilgrim:
Spammers have it in their heads now that weblog comments are a vector to exploit. They don't look at individual results and tweak their software to stop bothering individuals. They write generic software that works with millions of sites and goes after them en masse. So you would end up with just as much spam, it would just be displayed with unlinked URLs.
Spammers don't read blogs; they just write to them.
I still think he was spot on.
However, one part of the 'Google's embarrassing mistake' article is a red herring -- I think the chilling effect on "nonspam links" is not to be worried about; as Jeremy Zawodny said, life's too short to worry about dropping links purely in the hopes of giving yourself Page Rank. I don't know if I really want links that people are leaving purely for that reason. ;)
In fact, I wouldn't be surprised to hear that Google's crawler starts treating "nofollow" links as mildly non-spammy in a future revision, due to their wide use in wikis, blogs etc.
To be honest, though -- I don't see the problem of blog-spam much anymore. As I said here:
[Weblog] comment spam should be a lot easier to deal with than SMTP spam. ... With weblog comments, you control the protocol entirely, whereas with SMTP you're stuck with an existing protocol and very little "wiggle room".
On my WordPress weblog [ie. here] -- which, admittedly, gets only about 1/4 of the traffic plasticbag.org does -- I've instituted a very simple check stolen from Jeremy Zawodny. I simply include a form field which asks the comment poster for my first name, and if they fail to supply that, the comment is dropped. In addition, I've removed the form fields to post directly, requiring that all comments are previewed; this has the nice bonus of increasing comment quality, too.
Those are the only antispam measures I'm using there, and as a result of those two I get about 1 successful spam posted per week, which is a one-click moderation task in my email. That's it.
The key is to not use the same measures as everyone else -- if every weblog has a different set of protocols, with different form fields asking different simple questions, the only spammers that can beat that are the ones that write custom code for your site -- or use human operators sitting down to an IE window.
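To make the two checks concrete, here's a minimal sketch of what the server-side test might look like. It isn't the actual WordPress code running here; the field names ("first_name_check", "previewed") and the function name are hypothetical, and the expected answer is passed in rather than hard-coded.

```python
def accept_comment(form, expected_answer):
    """Decide whether a submitted comment form should be accepted.

    `form` is a dict of submitted field values. Both field names used
    below are hypothetical; the point is that they differ per site.
    """
    # Check 1: the poster must answer the simple site-specific question.
    answer = form.get("first_name_check", "").strip().lower()
    if answer != expected_answer.strip().lower():
        return False
    # Check 2: direct posting is disabled, so only accept comments that
    # came through the preview step.
    if form.get("previewed") != "1":
        return False
    return True
```

A generic spam bot that fills in only the standard name/URL/comment fields fails the first check; one that posts straight to the submit URL fails the second. Changing the question, or the field name, costs one line.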
Trackbacks, however -- turn that off. The protocol was designed poorly, with insufficient thought given to its abuse potential; there's no point keeping it around, now that it's a spam vector.
Finally, a "perfect" solution to blog spam, while allowing comments, is unachievable. There will always be one guy who's going to sit down at a real web browser to hand-type a comment extolling the virtues of some product or another. The goal is to get it to a level where you get one of those per week, and it's a one-click operation to discard them.
(Update: This story got Slashdotted! The poor server's been up and down repeatedly -- looks like it needs an upgrade. In the meantime, WP-Cache has proven its weight in gold; recommended...)
-
argh. avoid iTunes 6 like the plague; Apple changed the DRM again, it's as yet unbroken, and once you purchase a track, your account is "locked" to the new DRM. This page gives details of the (laborious) process required to escape this nasty trap
-
Polypaudio looks like Linux sound done right (at last). questions 21-24 of this FAQ list hint at awesome possibilities for LAN-networked speaker systems, even better than http://taint.org/wk/RemotePlaybackWithEsd .
-
the Ordnance Survey has set up an online shop to sell access to out-of-copyright, public domain maps of Ireland. thanks lads, but I think there's a word for paying for something that one should be getting for free
A commenter at this post on Colm MacCarthaigh's weblog writes:
I guess I still don't understand how Open Source makes sense for the developers, economically. I understand how it makes sense for adapters like me, who take an app like Xoops or Gecko and customize it gently for a contract. Saves me hundreds of hours of labour. The down side of this is that the whole software industry is seeing a good deal of undercutting aimed at sales to small and medium sized commercial institutions.
Similarly, in the follow-up to the O'Reilly "web 2.0" trademark shitstorm, there have been quite a few comments along the lines of "it's all hype anyway".
I disagree with that assertion -- and Joe Drumgoole has posted a great list of key Web 2.0 vs Web 1.0 differentiators, which nails down some key ideas about the new concepts, in a clear set of one-liners.
Both open source software companies, and "web 2.0" companies, are based on new economic ideas about software and the internet. There's still quite a lot of confusion, fear and doubt about both, I think.
Open Source
As I said in my comment at Colm's weblog -- open source is a network effect. If you think of the software market as a single buyer and seller, with the seller producing software and selling to the buyer, it doesn't make sense.
But that's not the real picture of a software market. If you expand the picture beyond that, to a more realistic picture of a larger community of all sorts of people at all levels, with various levels interacting in a more complex maze of conversation and transactions, open source creates new opportunities.
Here's one example, speaking from experience. As the developer of SpamAssassin, open source made sense for me because I could never compete with the big companies any other way.
If I had been considering it in terms of me (the seller) and a single customer (the buyer), economically I could make a case of 'proprietary SpamAssassin' being a viable situation -- but that's not the real situation; in reality there was me, the buyer, a few 800lb gorillas who could stomp all over any puny little underfunded Irish company I could put together, and quite a few other very smart people, who I could never afford to employ, who were happy to help out on 'open-source SpamAssassin' for free.
Given this picture, I'm quite sure that I made the right choice by open sourcing my code. Since then, I've basically had a career in SpamAssassin. In other words my open source product allowed me to make income that I wouldn't have had, any other way.
It's certainly not simple economics; it's risky and complicated, and many people don't believe it works -- but it's viable as an economic strategy for developers, in my experience. (I'm not sure how to make it work for an entire company, mind you, but for single developers it's entirely viable.)
Web 2.0
Similarly -- I feel some of the companies that have been tagged as "web 2.0" are using the core ideas of open source code, and applying them in other ways.
Consider Threadless, which encourages designers to make their designs available, essentially for free -- the designer doesn't get paid when their tee shirt is printed; they get entered into a contest to win prizes.
Or Upcoming.org, where event tracking is entirely user-contributed; there's no professional content writers scribbling reviews and leader text, just random people doing the same. For fun, wtf!
Or Flickr, where users upload their photos for free to create the social experience that is the site's unique selling point.
In other words -- these companies rely heavily on communities (or more correctly certain actors within the community) to produce part of the system -- exactly as open source development relies on bottom-up community contribution to help out a little in places.
The alternative is the traditional, "web 1.0" style; it's where you're Bill Gates in the late 90's, running a commercial software company from the top down.
- You have the "crown jewels" -- your source code -- and the "users" don't get to see it; they just "use".
- Then they get to pay for upgrades to the next version.
- If you deal with users, it's via your sales "channels" and your tech support call centre.
- User forums are certainly not to be encouraged, since it could be a PR nightmare if your users start getting together and talking about how buggy your products are.
- Developers (er, I mean "engineers") similarly can't go talking to customers on those forums, since they'll get distracted and give away competitive advantage by accidentally leaking secrets.
- Anyway, the best PR is the stuff that your PR staff put out -- if customers talk to engineers they'll just get confused by the over-technical messages!
Yeah, so, good luck with that. I remember doing all that back in the '90's and it really wasn't much fun being so bloody paranoid all the time ;)
(PS: The web2.0 companies aren't using all of the concepts of open-source, of course -- not all those web apps have their source code available for public reimplementation and cloning. I wish they were, but as I said, I can't see how that's entirely viable for every company. Not that it seems to stop the cloners, anyway. ;)
-
'The surge of Nevaeh can be traced to a single event: the appearance of a Christian rock star, Sonny Sandoval of P.O.D., on MTV in 2000 with his baby daughter, Nevaeh. "Heaven spelled backwards," he said.' you stupid, stupid people
-
oh dear. tip: allowing your "VP of Corporate Communications" to respond is not the way to do it cluetrain-style
-
'Tom from GBH and guests, playing Robot-Rock, Distortion-Disko, Electronic, Rock, New Wave Hip-hop, house, punk, electro, downbeat and classics.' lots of good mashups and remixes, one 2-hour 128kbps MP3 every week
-
'That evening, Ms. Li and her brother joined 15 strangers at the store to demand a group discount on a new television, refrigerator, and washing machine.' wow (via EirePreneur)
-
old Llamasoft game images may be distributed and used free of charge to and by anyone. awesome!
-
any mention of "web 2.0" in a conference, and O'Reilly are firing legal letters -- even for events outside the US
-
Criminal Records Bureau's "erring on the side of caution" has resulted in around a 9.7% false positive rate, with 2,700 UK job-seekers falsely listed as being convicted criminals
My mate Pam is cycling in this year's AIDS/LifeCycle -- for a week from June 4 to 10, she'll be cycling from San Francisco to LA, for charity. That's 585 miles. Since she bought her bike to do this ride, she's clocked up a terrifying 2040 miles. Blimey.
It's for a good cause -- go on, make a donation!
-
I was wondering why this was such a shambles; now it makes sense. 'Inefficiency has become a virtue in government' (via waxy)
-
Actual running hardware! Looks a lot more realistic than the last mock-ups. I'm more positive now that I hear they have Chris Blizzard and Jim Gettys involved, too
I added the Fixing Email weblog to Planet Antispam a while back -- however, I'm not entirely sure at this stage that its content (which seems to be primarily news syndication) fits with the "planet" concept (which is primarily intended for first-person posts).
So -- quick poll. Let me know what you think, pro or con, Planet readers: should I remove the Fixing Email feed from that site?
Update: that was a pretty resounding 'yes'. Done!
-
the guy behind the "more DoTs more DoTs more DoTs! 50 DKP MINUS!!" WoW voice-chat recording. I don't play WoW, but this control freak's incoherent freakout is hilarious even without knowing all the details
-
I've come around to this conclusion too -- attempting to use continuations to implement a web app 'requires you to write your code in such a way that it can tolerate sudden halts, thread switches, rewinding, and forking of execution' (via Miguel de Icaza)
-
'The response to my essay on plagiarism last week (“Where Have I Read That Before?”) was swift, so here goes: Yes, it is plagiarized. 99% of it. The only original lines, in fact, are the first and the last two'
-
actually quite accurate! Deserved props for eMusic, Stereogum, Fluxblog, KCRW, Lemon-Red, ILM, and Music For Robots; missed the Hype Machine, though. mind you, that may be just as well
-
a new website-ribbon campaign from ISIPP, aimed at educating less-techie users on virus/malware avoidance; if you run a consumer-facing website, it'd be fantastic to get this up there
-
Trial a Niagara, get a free trip to SF! nice one Colm ;)
Dear Recruiters,
If you're going to (a) scrape my CV page from my website, then (b) spam me, unsolicited, offering to represent me for jobs I don't want in places I don't live, in explicit contravention of the terms of use [*] of that document -- here's a tip.
Don't compound the problem by asking me to resend the document in bloody Microsoft Word format. FFS.
([*]: Those terms were, of course, added in an attempt to stem the tide of recruiter spam. Thanks to Colm MacCarthaigh for the idea...)
-
cat-and-mouse fun with the Bank of England; interesting to hear that Google's cache is still trackable via CSS references
-
A python framework based on one-way pipes and generators, from the BBC, used to build their "Macro" super-PVR. May be some ideas for IPC::DirQueue here
Reading this post at Piaras Kelly's blog, I was struck by something -- I never realised quite how bizarre the situation with Bebo is.
If you check out the Google Trends 'country' tab, Ireland is the only country listed -- meaning that search volume for "bebo" is infinitesimal, by comparison, elsewhere! (Update: Ireland was the only country listed, because the URL used limited it to Ireland only. However, the point is still valid when other countries are included, too ;)
It is also destroying Myspace as a search term on the Irish internet. (Update: also fixed)
As a US-based company, they must be mystified by all this attention -- the Brazilian invasion of Orkut has nothing on this ;)
I'll recycle a comment I made on Joe Drumgoole's weblog as to why this happened:
My theory is that social networking systems, like Bebo, Myspace, LinkedIn, Friendster, Tribe.net, Orkut, Facebook etc. have all developed their own emergent specialisations. These are entirely driven by their users -- although the sites can attempt to push or pull in certain directions (such as Friendster banning 'non-person' accounts), fundamentally the users will drive it. All of those sites have massively different user populations; Tribe has the Burning Man crowd, Friendster the daters, Orkut the Brazilians etc.
Next, I think kids of school age form a set of small cliques. They don't want to appear cool to friends thousands of miles away, on the internet; they want to appear cool to their peer group in their local school. So all it takes is a group of influential 'tastemakers' -- the alpha males and females in a year -- to go onto Bebo, and it becomes the site for a certain school; and given enough of that, it'll spread to other schools, and soon Bebo becomes the SNS for the Irish school system. In other words, Irish kids couldn't really care less what US kids think of them; they want to be cool locally.
Also I think MySpace has a similar problem to Orkut -- it's already 'owned' by a population somewhere else, who are talking about stuff that makes little sense to Irish teenagers. As a result, it's not being used as a social system here in Ireland; instead, it's just used by musicians who want a cheap place to host a few tracks without having to set up their own website.
(Aside: part of the latter is driven by clueless local press coverage of the Arctic Monkeys -- they have latched onto their success, put the cart before the horse, and decided that they were somehow 'made' by hosting music on MySpace, rather than by the attention of their fans. duh!)
-
according to social-network graph analysis of the Enron mail corpus, "one of the 'central' players was Ken Lay's secretary". ha! (via robotwisdom)
-
Harri Hursti's report for BlackBoxVoting.org; it appears the boot loader will automatically reflash itself, if presented with a suitably-named file on PCMCIA media, and access to the PCMCIA slot is protected only by a few standard Philips-head screws. wow
-
great thread of comments sparked off by Paul Graham's rather ill-informed presentation at XTech2006. Cory's comment is spot-on, on both sides
-
Google's scrapbook-clone service. first impressions: Firefox extension = good, lots of Flash, URL's hardly catchy, no sign of RSS feeds
-
spam filters beating humans at performing spam classification quite a lot, it turns out. Everyone should give SpamOrHam.org a go!
-
good data; there does seem to be an appreciable effect
-
photo of Coldcut's live setup -- structured cabling system required
-
Downloadable filesystem images for Xen; all Linux so far, modified to run as Xen guests out of the box
-
'what if your Singleton has a handle to some limited resource, like a database or file handle? I guess you get to keep that sucker open until your program ends'. YES (via mjd)
-
excellent software-development interview advice
-
iPod-sized ambient hardware loop player from China; the tee-shirts are fantastic
-
DHS ineptitude strikes again
-
some really authoritative thumbs-down comments from Valdis Krebs and John Robb
-
interesting further notes; apparently the Trintech Smart 5000 PINPad terminals run Linux, and can be managed remotely
-
'Communities are human business debuggers. Why not know the problems, address them and prove that they’re fixed all in public?' excellent article, with the solid testimonial of Threadless backing it up
-
'former Iowa congressman Edward Mezvinsky was caught up in a 419 scam, and stole from his law clients, friends, and even his mother-in-law .. He is serving more than six years in prison after pleading guilty to thirty-one counts of fraud.' bloody hell
-
Suw Charman and a load of others (see comments) lay into the BBC's "citizen journalism" conference: "a complete waste of time". ouch
-
'Please visit and take a minute to post positive comments about BlueSecurity. BlueSecurity is encouraging us to do such things so let's help them spread the good word.' explains a lot; several other astroturf coordination forums at castlecops.com, too
new-referrer-rss.pl - generate RSS feed of new referrer URLs from access_log
SYNOPSIS
new-referrers-rss nameofsite [source ...] > new-referrers.xml
DESCRIPTION
Given the name of a web site, and a selection of Apache combined log format 'access_log' files containing referrer URL data, this will generate an RSS feed containing the latest referrers.
The script should be run periodically with 'fresh' access_log data, from cron.
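For readers who'd like to see the general idea without the Perl, here's a rough Python sketch of the same technique. It is not a port of new-referrer-rss.pl: the regular expression only handles well-formed combined-format lines, the function names are invented for this example, and how the "already seen" set is stored between cron runs is left to the caller.

```python
import re
from urllib.parse import urlsplit
from xml.sax.saxutils import escape

# Apache "combined" format: host ident user [time] "request" status bytes
# "referrer" "user-agent". Only the referrer field is captured.
COMBINED = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)


def _host(url):
    """Lower-cased hostname of a URL, or '' if it can't be parsed."""
    try:
        return (urlsplit(url).hostname or "").lower()
    except ValueError:
        return ""


def new_referrers(log_lines, site_host, seen=()):
    """Return external referrer URLs not in `seen`, in first-seen order."""
    seen = set(seen)
    fresh = []
    for line in log_lines:
        match = COMBINED.match(line)
        if not match:
            continue
        ref = match.group("referrer")
        if ref in ("", "-") or ref in seen:
            continue
        if _host(ref) == site_host.lower():
            continue  # a link from the site to itself isn't interesting
        seen.add(ref)
        fresh.append(ref)
    return fresh


def referrers_to_rss(site_name, referrers):
    """Render the referrer URLs as a minimal RSS 2.0 document."""
    items = "".join(
        "<item><title>{0}</title><link>{0}</link></item>".format(escape(ref))
        for ref in referrers
    )
    return (
        '<?xml version="1.0"?>'
        '<rss version="2.0"><channel>'
        "<title>New referrers for {0}</title>"
        "<link>http://{0}/</link>"
        "<description>Referrer URLs newly seen in access_log</description>"
        "{1}</channel></rss>"
    ).format(escape(site_name), items)
```

Run from cron, the caller would feed in the fresh access_log lines plus the referrers recorded on previous runs, write the result to the feed file, and save the updated list for next time.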
Renesys Blog: The Bluesecurity Fiasco -- in which Todd Underwood, CSO for Renesys Corporation, applies some real-world knowledge of how the internet works to the "timeline of events" press release, issued by BlueSecurity as part of their ongoing PR about the DDoS.
Judging by the comments at Slashdot, this really needs to be more widely read.
Here's some highlights:
The timeline from BlueSecurity [...] is frustratingly vague. It uses phrases like 'tampering with the Internet backbone using a technique called "Blackhole Filtering".' As Thomas Pogge, a philosophy professor of mine, used to say: that's not even wrong yet. There is no "Internet backbone", there is no technique known as "Blackhole Filtering", and blackhole routing is not normally described as tampering. So the whole explanation is nonsense. [...] Let's clear one thing up for the press and everyone else: this event just wasn't that interesting. The attack against bluesecurity was a run-of-the-mill denial of service attack.
His conclusion:
I believe that the PR engine from BS is in overdrive spinning this event as fast as they can. But the concrete facts being put out by them simply do not add up. In the process they seem to be doing two things: 1) trying to imply or state that someone at UUnet was bribed by a spammer. This is simply ridiculous. I know many of the people who work for UUnet and they are honest, hardworking and extraordinarily clever people. They would not be crooked, or stupid, enough to do such a thing and if they were, they would have been trivially caught by change-management procedures. Moreover, such a change at UUnet (or BTN) wouldn't have caused the event BS claims to have witnessed anyway. Additionally, 2) BS is trying to deflect attention from the damage that they caused at Six Apart. It would be much better if they could just claim ignorance of the DOS, apologize and move on. I recognize that that isn't going to happen, but it sure would make this whole thing easier to handle.
Well said.
Of course, this is pretty much immaterial -- the people who are using Blue Frog, and vocally supporting Blue Security, don't really care what happened. All they care about is that someone is taking some kind of direct action against spammers, in some way or another, and if there's a little "friendly fire" and some bending of the truth, why, this is a war! What, do you support the spammers?
It's disappointing -- the amount of disinformation being successfully pumped out (and accepted!) on this story is massive.
-
they're no longer shipping games, electronics, or home/garden items to Ireland. what with this and the crappy shipping, looks like they've written off the Irish market for some reason
Bubba, now safely back in Dublin after his 8000-mile flight from LAX, is getting back into exploring his old manor.
Here he is, ignoring a very brave magpie. Judging by the way the magpie was brazenly hopping around him, cawing, and the way that Bubba was ignoring him, I suspect there may be a nest nearby....
-
good points from Joe Drumgoole; what works for Irish VCs isn't necessarily aligned with what's good for Ireland's high-tech industry
-
First doodled on a placemat by Ken Thompson and Rob Pike for Plan 9 in 1992 (via era)
-
Blue Security accidentally took down large chunks of the blogosphere in an attempt to evade the DDoS targeting them; impressively inept. also, they really need to tone down their sock-puppet commenter squad (via torrez)
-
open geodata creation from OSM in a 3-day mapping-fest this weekend. great explanation of why open geodata is important in the UK and Ireland, too
-
behavioural analysis on web-search engine bots, with some pretty pics (via waxy)
-
YUM. wonder if I can find condensed milk around here
Apparently, Transport For London are planning 'e-money' trials based on their remotely-readable Oyster RFID cards.
Combine that with Kevin Mahaffey of Flexilis' talk at Black Hat last year, where he demonstrated apparatus to extend RFID read range from 4-6 inches to approximately 50 feet, and things could get messy. ;)
The slides for that talk are available here (PDF); slide 20 specifically mentions the Hong Kong "Octopus" cashless-payment card.
-
Some users of the Blue Frog software are considering this leak to be some kind of Churchillian challenge to their resolve, instead of a failure on Blue Frog's part! amazing
-
'What Wikipedia has taught us .. is that in a vacuum of politics, politics will be created. There is no vacuum of politics.' interesting article
-
'This spammer is using mailing lists he already owns and is now sending millions of such messages' -- hasn't hit any of our thousands of spamtraps, which is quite impressive in that case
Blue Frog is a company that operates a "Do Not Email" list, on the (optimistic) basis that spammers will vet their lists against it.
Reportedly, it's been compromised. If this is true, I'm not surprised -- as Dr. Aviel Rubin's report to the FTC of May 2004 regarding a Do-Not-Email list notes:
The scrubbing approach [to running a D-N-E list] requires that a list of live email addresses exist. While the party owning that list may be well intentioned, it is unlikely that such a valuable list would not leak out. History is replete with insider attacks, as well as external break-ins to highly sensitive sites, such as the Pentagon computers. The Do Not Email Registry represents the kind of prize that attracts hackers. In this case, the prize has monetary value as well. Once the list is exposed, there is no way to undo it.
Also, it's almost inevitable:
If this service were running for some time, it is more likely than not that the plaintext addresses would leak at some point, given the history of computer security incidents.
Update: it appears, according to this white paper, that the Blue Frog "Do Not Intrude" list is hashed, rather than plain-text. Rubin's advice still applies:
Without hashing, a compromise of the registry database results in exposure of all of the registered email addresses. This is a total disaster. However, even exposure of a hashed list is a catastrophe. A spammer with a copy of a hashed list of email addresses is able to find out, for any email address, if the address is in the registry. The attacker simply hashes a candidate email address and sees if the hashed value is in the list. This is very powerful. [....]
Hashing provides absolutely no security against a marketer who obtains a scrubbed list and uses that to sell the addresses that were scrubbed by the registry. Whether or not the list is hashed has no impact on a malicious marketer in the scrubbing approach.
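To make Rubin's point concrete, here's a minimal sketch in Python of the membership test he describes. Everything specific in it is an assumption for illustration: the SHA-1 hash, the lower-casing of addresses, and the example addresses themselves. I don't know what hashing scheme Blue Frog actually uses.

```python
import hashlib

def hash_address(address: str) -> str:
    """Normalise and hash an address the way a hypothetical registry might."""
    return hashlib.sha1(address.strip().lower().encode("utf-8")).hexdigest()

# Hypothetical leaked registry: hashes only, no plaintext addresses.
leaked_hashes = {hash_address(a) for a in ["alice@example.com", "bob@example.org"]}

def registered(candidates, leaked):
    """Return the candidate addresses whose hash appears in the leaked list."""
    return [a for a in candidates if hash_address(a) in leaked]

# The spammer already owns a list of candidate addresses...
spammer_list = ["alice@example.com", "carol@example.net", "Bob@Example.org"]

# ...and can now tell which of them are live, registered addresses.
confirmed = registered(spammer_list, leaked_hashes)
print(confirmed)
```

The hash never has to be reversed; the attacker only needs a list of guesses, which is exactly what a spammer has plenty of.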
Are you a student, and interested in earning $4,500 for contributing to open source, and fighting spam, over the course of the summer?
If so, get thee hence to the Google Summer of Code 2006 site, and propose a project!
Last year, we in SpamAssassin didn't get it together to mentor SoC projects. This year, however, we have a few prospective mentors (including myself), and a few sample project ideas lined up; we're all ready to go! Here's the Student FAQ. Be quick; applications end in a week and a bit.
Here's hoping we get some interesting submissions ;)
-
YouTube's bandwidth bill 'may be approaching $1 million a month'. holy crap (via waxy)
-
a really nice Flickr-like take on mapping; every street has user-contributed location geodata included; open REST API; social aspects; Google-friendly. Best mapping site I've seen
-
'Everything needed to make this episode is available in the eler-source directory in a bzipped tarball. ... Creative Commons Attribution Share-Alike license'
Here's what happens when you search for single letters on Google:
- A: Apple
- B: BHPhotoVideo.com
- C: C-SPAN
- D: D-Link
- E: E! Online
- F: FuckedCompany.com (probably due to being referred to as 'F***edCompany' or similar)
- G: Gmail (not the main Google site, interestingly)
- H: H-Net: Humanities and Social Sciences Online
- I: iTunes
- J: Jennifer Lopez (really. "J-Lo", I guess)
- K: K Mart
- L: the Council of Europe (?!?)
- M: Texas A&M University
- N: AT&T Knowledge Network Explorer: Blue Web'n Homepage
- O: O'Reilly Media
- P: The Alfred P. Sloan Foundation
- Q: Q4Music.com (the website for the UK music magazine 'Q')
- R: The R Project For Statistical Computing
- S: McDonald's (note the apostrophe-S!)
- T: AT&T
- U: The University of Arizona, Tucson Arizona
- V: V for Vendetta (the official site)
- W: W Hotels (a subsite of StarwoodHotels.com)
- X: X.Org
- Y: Yahoo! Messenger (again, not the main Y! site)
- Z: A To Z Teacher Stuff
Interestingly I got to see the new Google search results page, with the sidebar, once. It must be in the process of rolling out...
-
tomorrow afternoon, Federico Heinz and a talk on GPLv3 from Ciaran O'Riordan
-
could be interesting if true
-
'That's the only way that snails catch you up. If you weren't paying bloody attention.'
-
Las Vegas - home of Windows XP piracy!
-
via O'Reilly Radar. good to hear I'm not the only person hacking awfulness with Data::Dumper and eval()
-
hello new Linux desktop wallpaper!
-
Google Maps mash-up from John Handelaar, mapping the locations of all of Eircom's DSL-enabled POPs in Ireland. excellent! some major holes in the map: not so excellent
-
using an ALSA output plugin called raop-play, which streams to the remote server. If only Apple had just used esd's remote streaming protocol, instead of inventing their own crappy DRM-laden proprietary one, this would be a lot simpler
-
everybody seems to be taking the piss out of this, for some reason
-
Apple's postmaster on the DearAOL.com blocking fiasco, and the economics of Goodmail. 'Goodmail will likely fail on its own merits'
-
'the scammers had used open-source software called Asterisk to convert a computer into a PBX ... running an automated telephone information system. The voice system sound[ed] exactly like the bank's phone tree'
-
turns out a user named "sekrit" has actually been representing the output of the BBC 6Music radio station. awesome!
John Graham-Cumming asks, 'Are Citibank crazy?':
I blogged a while ago about Thunderbird's phishing filter trapping a seemingly innnocent mail. Now, a reader has forwarded to me a genuine email from Citibank that he says was trapped by Thunderbird. I'm not going to reproduce the email here because it contains private details of the user, but it is a valid Citibank message.
Thunderbird thinks it's a scam because Citibank uses one of the oldest phishing tricks in the book. The have a URL displayed in the message then when clicked goes to a totally different URL.
Sadly, this has proven to be really quite common. We've investigated using this as a phish-detection rule in SpamAssassin several times, without much luck. In fact, we've had to create a FAQ entry for it -- since it's such a superficially attractive but ultimately useless idea, many people have had long discussions on our lists about it!
The companies that produce these false positives in their mails include American Express, Bed Bath & Beyond, Universal Studios, Microsoft, Hilton Hotels -- and now Citibank.
A couple of other examples from real mails:
<a href="http://www65.americanexpress.com/clicktrk/Tracking?mid=MESSAGEID&msrc=ENG-ALERTS&url=https://www.americanexpress.com/estatement/?12345">https://www.americanexpress.com/estatement/?12345</a>
<A HREF="http://echo.epsilon.com/WebServices/EchoEngine/T.aspx?l=ID">https://www.hilton.com/en/ww/email/tab_email_subscriptions.jhtml</A>
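For anyone who wants to see why the rule misfires, here's a rough Python sketch of the naive check: flag any link whose visible text is a URL on a different host from its real target. This is not the SpamAssassin rule or Thunderbird's code, just my own illustration; the class and function names are made up.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkMismatchFinder(HTMLParser):
    """Collect (href, visible_text) pairs where the visible text looks like
    a URL on a different host from the one the link really points at."""

    def __init__(self):
        super().__init__()
        self.href = None       # href of the <a> we are currently inside
        self.text = []         # visible text seen inside that <a>
        self.mismatches = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.href = dict(attrs).get("href")
            self.text = []

    def handle_data(self, data):
        if self.href is not None:
            self.text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self.href is not None:
            shown = "".join(self.text).strip()
            if shown.startswith(("http://", "https://")):
                if urlparse(shown).hostname != urlparse(self.href).hostname:
                    self.mismatches.append((self.href, shown))
            self.href = None

def find_mismatches(html):
    finder = LinkMismatchFinder()
    finder.feed(html)
    finder.close()
    return finder.mismatches

# The genuine Hilton link quoted above trips the check.
legit_mail = ('<A HREF="http://echo.epsilon.com/WebServices/EchoEngine/T.aspx?l=ID">'
              'https://www.hilton.com/en/ww/email/tab_email_subscriptions.jhtml</A>')
hits = find_mismatches(legit_mail)
print(hits)
```

The check does exactly what it says, and that's the problem: click-tracking redirectors in legitimate mail look identical to the phisher's trick.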
By the way, it really is quite impressive for a bank as heavily phished as Citibank to still be making this kind of basic mistake in their mail-outs! It reinforces a point I made in a mailing list posting recently:
As far as I can see, the approach taken by pretty much all banks to their online services is simply too bureaucratic, hide-bound, and fundamentally driven by their marketing departments, to ever cope effectively with phishing. :(
(For what it's worth, I know Citi have some smart techies working there; but the rest of the company needs to start paying attention to them.)
If you routinely log into one or more remote systems using SSH, and have a flaky internet connection or an incompetent ISP, you probably already know about screen's ability to detach and reattach sessions.
However, you still have to manually type screen -r to resume a detached session, each time -- and sometimes you'll forget, start working in an SSH session, get logged out, and lose your state.
Here's the next step -- automatic screen-sessions for any remote logins: RemoteLoginAutoScreen.
Optimo have a new mix up -- the First Hour Mix:
Here's the fourth in a brief series of mixes where we present something a little different. This mix isn't really a mix in the conventional sense but rather 17 tracks blended together. To us, the first hour of Optimo, or to be more accurate, the 'Espacio' part of Optimo (Espacio) is a vital part of the night. It is our chance to play absolutely what we like without thinking about the dancefloor.
It's a great mix -- certainly not dancy, but some really interesting tracks here. The Optimo guys put together some really great music.
In fact, I went to see them play last Saturday -- or, at least, myself and a couple of mates tried to. Supposedly, they were supporting The Juan Maclean at the Bud Rising festival over the weekend, but the show was such a shambles, without anyone having a clue when it started or who was on stage at any time, I'm pretty sure we missed their set entirely.
On top of that, it was EUR20 in, and to add insult to injury, the only lager on sale was Budweiser! I mean, I wouldn't mind that if the "Bud Rising Festival" deal meant free entrance, but charging 20 squids and then cutting off the supply of decent booze as well, is just a crime.
Ah well, the Filthy Dukes were pretty good at least.
The view from Sorrell Hill, Co. Wicklow, facing north.
Looks lovely, doesn't it! Well, here's the view in the opposite direction. ;) We got a great reminder of the mercurial nature of an Irish April. This panorama captures the whole story.
So I've been using this for a few days now -- and I'm loving it. A calendaring system that deals coherently with the web:
- good RSS integration
- publishing calendars, easily, via HTTP
- subscribing to third-party calendars via HTTP
- a URL API, allowing third-party sites and user scripts to add events to your calendar
I keep finding little things that make perfect sense, and just feel more logical than what I've used elsewhere. This rocks!
One thing still needs work, though: the links to Mapping fail spectacularly, for non-US addresses at least. But that's pretty minor.
By the way, I have a feeling that Mac.com had parts of this, but really, you had to drink a lot of Apple kool-aid to use that, and I just didn't go for that. Sorry Jobs fans.
Do you know what would be cool now? If Upcoming.org published venue/location-specific iCal feeds. Oh look, they do! Awesome...
Argh! This is what happens every day to my DSL connection, at half past 12:
13 Mon Apr 10 12:26:53 2006 PP12 -WARN SNMP TRAP 2: link down
14 Mon Apr 10 12:26:53 2006 PP12 INFO ppp_ready: ch:8056167c, iface:80419f14
15 Mon Apr 10 12:26:53 2006 PP12 -WARN SNMP TRAP 3: link up
26 Tue Apr 11 12:26:46 2006 PP12 -WARN SNMP TRAP 2: link down
28 Tue Apr 11 12:26:48 2006 PP12 INFO ppp_ready: ch:8056167c, iface:80419f14
29 Tue Apr 11 12:26:48 2006 PP12 -WARN SNMP TRAP 3: link up
38 Wed Apr 12 12:26:56 2006 PP12 -WARN SNMP TRAP 2: link down
40 Wed Apr 12 12:26:58 2006 PP12 INFO ppp_ready: ch:8056167c, iface:80419f14
41 Wed Apr 12 12:26:58 2006 PP12 -WARN SNMP TRAP 3: link up
50 Thu Apr 13 12:27:00 2006 PP12 -WARN SNMP TRAP 2: link down
52 Thu Apr 13 12:27:03 2006 PP12 INFO ppp_ready: ch:8056167c, iface:80419f14
53 Thu Apr 13 12:27:03 2006 PP12 -WARN SNMP TRAP 3: link up
Worse than that, it will generally assign a different IP address to the connection when it reconnects! This buggers up any applications that rely on long-lived TCP connections, such as SSH shell logins, tunnels, remote-desktop sessions, and instant messaging; all get disconnected and have to be manually re-set up.
Initially, I thought this might have been a flaky connection. However, it appears not -- check out those timestamps; that's a scheduled, daily event. Also, there have been no other disconnections apart from those.
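The "scheduled, daily" claim is easy to check mechanically. Here's a rough Python sketch that takes the "link down" lines copied from the log above and measures the gap between consecutive drops; the function name is mine, and it assumes the router's log format stays exactly as shown.

```python
from datetime import datetime

LOG = """\
13 Mon Apr 10 12:26:53 2006 PP12 -WARN SNMP TRAP 2: link down
26 Tue Apr 11 12:26:46 2006 PP12 -WARN SNMP TRAP 2: link down
38 Wed Apr 12 12:26:56 2006 PP12 -WARN SNMP TRAP 2: link down
50 Thu Apr 13 12:27:00 2006 PP12 -WARN SNMP TRAP 2: link down
"""

def link_down_times(log_text):
    """Return the timestamp of every 'link down' entry in the log."""
    times = []
    for line in log_text.splitlines():
        if not line.endswith("link down"):
            continue
        fields = line.split()
        # fields[1:6] is e.g. ['Mon', 'Apr', '10', '12:26:53', '2006']
        stamp = " ".join(fields[1:6])
        times.append(datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y"))
    return times

downs = link_down_times(LOG)
# Time elapsed between each drop and the next one.
gaps = [later - earlier for earlier, later in zip(downs, downs[1:])]
for gap in gaps:
    print(gap)
```

All three gaps come out within about ten seconds of 24 hours, which is not what random line noise looks like.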
A discussion on the IIU mailing list revealed the reason -- it seems BT Ireland have a policy of resetting their customers' connections daily. That could be OK, if they came right back up with the same IP -- TCP/IP is designed to cope with that, and generally does -- but it does not do that. Instead the IP address is reassigned every single time.
This is turning out to be quite a nuisance. Working over the internet requires quite a few VPN connections, tunnels, and remote logins, and having to set all of those up again by hand, every day, is a pain in the neck.
I'm casting around for hacks to get around this. Right now, I have an assortment of jiggery-pokery involving ssh, a shell script 'while' loop, and screen(1), but it's messy and not working out too well. Ideally, I'd set up another VPN (via IPSec or CIPE), and set it up to reconnect on link failure, then route all other VPNs and remote logins out via that -- but I don't have spare routable IPs to do this with. Anyone got any good suggestions?
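For reference, the 'while' loop part of that jiggery-pokery boils down to something like the following, sketched here in Python rather than shell. It's only an illustration of the restart-on-exit idea, not my actual script: the function name, host, user and ports are all made up, and it does nothing about the changing IP address itself.

```python
import subprocess
import time

def keep_running(command, max_restarts=None, delay=5.0,
                 run=subprocess.call, sleep=time.sleep):
    """Re-run `command` each time it exits; return how many times it was started.

    With max_restarts=None this loops forever, like a shell 'while true'.
    """
    starts = 0
    while max_restarts is None or starts < max_restarts:
        starts += 1
        run(command)     # blocks until the connection drops and ssh exits
        sleep(delay)     # give the link a moment to come back up
    return starts

# Hypothetical tunnel: local port 8080 forwarded to port 80 on a remote host.
# The keepalive options make ssh notice a dead link instead of hanging.
TUNNEL = ["ssh", "-N",
          "-o", "ServerAliveInterval=15",
          "-o", "ServerAliveCountMax=3",
          "-L", "8080:localhost:80",
          "user@gateway.example.com"]

# To use it for real: keep_running(TUNNEL)
```

Interactive logins still need screen on the far end; this only saves re-typing the tunnel commands after the daily reset.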
By the way, it's worth noting that their FAQ fails to mention this, instead giving some incorrect information about my IP being 'removed' when my web browsing session ends:
Is it a fixed IP?
No, the product is set up with dynamic IP Addressing. This means that every time you open your browser you will be allocated a different IP address for the duration of that session. When the session ends the IP Address is removed.
That is incorrect -- this has nothing to do with web browsing sessions.
To be honest, I'd prefer not to have to switch ISPs to get away from this brokenness -- the rest of the service is quite nice, good pings, good throughput, no other disconnections or outages -- but this is quite a problem for someone using BT Broadband for telecommuting purposes. :(
I gave up smoking last year on May 26 -- that anniversary isn't too far away. Here's how much money I've saved, courtesy of QuitMeter.com:
Wow -- I could buy myself another iPod! ;)
Paul Graham's recent essay on his experience with software patenting has been making the rounds.
Now Kevin Marks has commented. Worth reading, since he demonstrates nicely the kind of crap you see in a 'hot' field, such as video (which he worked on with Apple's QuickTime):
I broadly agree with Paul Graham's essay on Software Patents, but I do think he underestimates the damage from patent trolls, and from what he calls the mafia-like behaviour of some patent holders. Paul has been lucky in the field he has worked in, but in the Audio and Video area there are many patent thickets. ... While I was at Apple on QuickTime, there was a steady stream of patent trolls claiming that Apple should pay them royalties; enough to keep several lawyers busy, and a lot of engineers spending time working on prior art evidence demonstrations. Several potential features were excluded from QuickTime due to patent thickets. The obvious one was the Unisys LZW patent that encumbered GIF, but there were other more subtle pressures that meant adopting open source codecs was discouraged. Working on the patent license agreements for MPEG meant that technology ready to ship was deferred pending legal agreement on more than one occasion.
In my experience, that's what happens -- once a field becomes "hot", patent trolls and other nuisance "inventors" start appearing en masse, and then you've got to waste a lot of time dealing with that crap.
It's my bi-monthly perl blog entry, to earn my place on planet.perl.org! ;)
Here's an interesting "gotcha". Take this code:
perl -e '%t=map{$_=>1}qw/1 2 3/;
while(($k,$v)=each %t){print "1: $k\n"; last;}
while(($k,$v)=each %t){print "2: $k\n";}'
In other words, iterate through all the key-value pairs in %t once, then do it again -- but exit early in the first loop.
You would expect to get something like this output:
1: 1
2: 1
2: 3
2: 2
Instead, you see:
1: 1
2: 3
2: 2
The "1" entry in the second loop is AWOL. Here's why -- as "perldoc -f each" notes:
There is a single iterator for each hash, shared by all "each", "keys", and "values" function calls in the program
That's all "each" calls, throughout the entire codebase, possibly in a different class entirely. Argh.
The workaround: reset the iterator using "keys" between calls to "each":
perl -e '%t=map{$_=>1}qw/1 2 3/;
while(($k,$v)=each %t){print "1: $k\n"; last;}
keys %t;
while(($k,$v)=each %t){print "2: $k\n";}'
This got us in SpamAssassin -- bug 4829.
To be honest, having to call "keys" after the loop is kludgy -- as you can see if you check the patch in bug 4829 there, we had to change from a "return inside loop" pattern to a "set variable and exit loop, reset state, then return" pattern. It'd be nice to have a scoped version of each(), instead of this global scope, so that this would work:
perl -e '%t=map{$_=>1}qw/1 2 3/;
{ while(($k,$v)=scoped_each %t){print "1: $k\n"; last;} }
# that each() iterator is now out of scope, so GC'd;
# the next call uses a new iterator, starting from scratch
{ while(($k,$v)=scoped_each %t){print "2: $k\n";} }'
Scoping, of course, has the benefit of allowing "return early" patterns to work; in my opinion, those are clearer -- at the least because they require fewer lines of code ;)
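If you don't speak Perl, the same trap can be reproduced by analogy in Python, where it only happens if you deliberately share one iterator between loops. This is a sketch of the idea, not a translation of each(); a Python dict also keeps insertion order, which a Perl hash does not.

```python
table = {"1": 1, "2": 1, "3": 1}

# One iterator shared by every loop: roughly what Perl's each() gives you.
shared = iter(table.items())

for key, _ in shared:
    first_pass = [key]
    break                     # bail out early, like the first while loop

# The second loop resumes where the first one stopped, so "1" goes missing.
second_pass = [key for key, _ in shared]

# A fresh iterator per loop: the "scoped" behaviour wished for above.
scoped_pass = [key for key, _ in table.items()]

print(first_pass, second_pass, scoped_pass)
```

The difference is that Python makes you ask for the shared iterator explicitly, whereas Perl hands it to every each() caller whether they wanted it or not.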
I just received a very nice info-pack through my front door regarding the new Dublin Metro line, which is in planning at the moment; it seems they're soliciting feedback from residents near the proposed routes. Nicely done.
Right now, Dublin has an embarrassment of good public transit, at least when compared to my previous home in Orange County. There, public transit is actively campaigned against.
My favourite claim: that it 'increases crime' -- in other words that poor people from Santa Ana would come down to Irvine and steal stuff, which they couldn't do with vehicular transport, for some reason.
The OC Weekly thought it was pretty funny, too -- and an opposing group comprehensively debunked it. Still, it seemed to work; while I was living in Irvine, I got to see the Centerline proposal gradually whittled down until it was finally killed off. During that time, in contrast, Dublin built the Luas.
Unfortunately it doesn't exactly go where I want to go, but you can't always have everything. ;)
Quick update -- I've added Ed Falk's "Spam Diaries" to http://planet.spam.abuse.net/ .
finally!
Just got a new cafetiere, so I can finally switch back from instant coffee to the real deal again for my morning coffee. My productivity has doubled. Still no DSL, though -- early next week is the current estimate, and I can hardly wait.
I went to a pub quiz last night with mates Macker, Tom and Alan -- a benefit for a new Dublin theatre company, I think. The prizes were:
- First prize: several 50 Euro vouchers for various Dublin eateries
- Second prize: two fancy scarves, a Nivea women's cosmetics kit, and a very metrosexual Nivea bath kit for a guy
- Third prize: 4 bottles of nice wine
We did very nicely -- "aglet" was correctly defined for instance -- but not nicely enough. Put it this way: guess who's wearing Nivea deodorant?
Hey lazyweb, hear my plea! What are my options for buying consumer electronics online, now that I'm back in Ireland?
I like online shopping. I dislike Argos, and I really hate Dixons, Currys and all the rest of the consumer-electronics high-street operations. Get me on the net and out of the nasty little shops and I'm happy. ;)
All in all, I'm a bit of an Amazon fan. However, now that I'm back in Ireland, I've been brought back to earth with a bang on that count; the prices are OK for items at both Amazon.com and .co.uk -- but shipping is turning out to be a total disaster.
Basically, I've put in two orders, paid through the nose for basic shipping, and neither has turned up. For example -- I ordered this phone a week and a half ago, on the 9th March, ponying up UKP 27 for the item -- and a painful UKP 7 for shipping by International Mail.
Delivery estimate on ordering was for between 5 and 7 days -- 14th to the 16th March. That was long enough -- but it still hasn't turned up, and Amazon.co.uk is still claiming that that is the current estimate, despite the 16th of March being 4 days ago ;)
On top of that, it appears they don't offer any way to track the packages using that shipping method, so who knows what's happening with the damn thing right now.
Compare that with an order I made at Amazon.com last November, in which I nabbed a handy FM transmitter for my iPod -- in that case, I got it shipped by plain old US Postal Service for $4.51, which was handily discounted as Super Saver Shipping. That -- as with pretty much all my Amazon.com orders -- arrived in 3-4 days, and for a hell of a lot cheaper too. If I'd had to pay for shipping (which I didn't anyway), $4.51 vs UKP 7 works out as a third of the price, no less.
I'm guessing this is mainly down to Amazon.co.uk being shoddy in terms of how it deals with shipping to Ireland, and there are probably sites that use better-quality shipping partners.
Surely there must be better deals with vendors in Ireland, or even elsewhere in the Eurozone? Anyone know? Please drop us a line in the comments!
Update: the items arrived -- 14 days after ordering. This is a moot point now, though, since Amazon.co.uk are no longer selling 'PC & Video Games, Toys & Games, Gift items, Electronics & Photo and Home & Garden items' to Ireland; I guess it was easier to give up on the Irish market for now. Very disappointing -- but I'm waiting to see what happens next.
So, my new employer just launched today!
It's a new search service, VAST.com. As the blog says, 'we are building a search service that extracts classified ads from across the web, structures them, and then makes them available via an open REST API for commercial and non-commercial uses.'
Now you can see why I'm excited ;)
--> Sending: ATZ
ATZ
OK
--> Sending: ATQ0 V1 E1 S0=0 &C1 &D2
ATQ0 V1 E1 S0=0 &C1 &D2
OK
--> Sending: ATH1
ATH1
OK
--> Modem initialized.
--> Sending: ATDT1892150150
--> Waiting for carrier.
ATDT1892150150
CONNECT 45333
45 measly kilobits per second! This is incredibly painful -- and expensive at 5 cents a minute! I briefly considered getting around it by hiring a 3G data-card for the couple of weeks before my DSL is activated -- but that too is insanely overpriced.
Hurry up, DSL...
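Just how expensive? A back-of-envelope sum in Python, using the 45333 bps from the CONNECT line and the 5 cents a minute call cost. It assumes the best case (the link runs flat out, no protocol overhead, no modem compression) and counts a megabyte as 10^6 bytes, so real life is worse.

```python
LINK_BPS = 45333          # bits per second, from the CONNECT line
CENTS_PER_MINUTE = 5      # call cost quoted above

bytes_per_minute = LINK_BPS / 8 * 60
minutes_per_mb = 1_000_000 / bytes_per_minute   # 1 MB taken as 10^6 bytes
cents_per_mb = minutes_per_mb * CENTS_PER_MINUTE

print(f"{minutes_per_mb:.1f} min/MB, {cents_per_mb:.1f} cents/MB")
```

That's roughly three minutes and about 15 cents for every megabyte downloaded, at best.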
As of yesterday, I have a new day-job.
I won't be working on email spam as part of the job, which is an interesting turn of events. However, I'll be sticking with the open-source Apache SpamAssassin project, and keeping up the rate of work on that [*].
I'm not sure how much I can blog about the new place just yet, but I will say it's certainly looking like it'll be very interesting work ;)
[*: modulo the next couple of weeks while I'm waiting for my bloody DSL to be installed. argh!]
Miguel de Icaza quotes Dave Winer, pointing out two patent applications from Apple which seem intended to grab major chunks of the feed syndication space as Apple "IP".
The first application is news feed viewer, 20050289147, filed April 13 2005:
A computer-implemented method for displaying a plurality of articles, the method comprising: storing a first feed bookmark in a folder, the first feed bookmark indicating a first feed, the first feed comprising a first plurality of articles; storing a second feed bookmark in the folder, the second feed bookmark indicating a second feed, the second feed comprising a second plurality of articles; aggregating the first feed and the second feed to form a third feed; and displaying the third feed.
I think there were many RSS readers that implemented this claim, and others from the patent application, before April 2005. I know Liferea, the one I use, has had UI-level aggregation since September 2004, with its VFolders.
Next, news feed browser, 20050289468, filed April 13 2005. This one contains a wide range of claims, but here's one that stands out as particularly trivial:
A computer-implemented method for discovering a feed, the method comprising: receiving a request to display a file; determining that the file includes relationship XML; determining that a Uniform Resource Locator (URL) within the relationship XML indicates a file that comprises the feed; and displaying one of a group containing the feed and a link to the feed.
That's pretty much RSS autodiscovery, as described in 2002.
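To show how little is going on in that claim, here's a minimal Python sketch of autodiscovery as I understand it: look for a link element with rel="alternate" and a feed MIME type, and resolve its href against the page URL. The class name, the sample page and the example.com URLs are mine, purely for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

FEED_TYPES = {"application/rss+xml", "application/atom+xml"}

class FeedLinkFinder(HTMLParser):
    """Collect the feed URLs advertised by a page's <link> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = {name: (value or "") for name, value in attrs}
        rels = attr.get("rel", "").lower().split()
        if "alternate" in rels and attr.get("type", "").lower() in FEED_TYPES:
            # Relative hrefs are resolved against the page's own URL.
            self.feeds.append(urljoin(self.base_url, attr.get("href", "")))

def discover_feeds(html, base_url):
    finder = FeedLinkFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.feeds

page = """<html><head>
<link rel="stylesheet" href="/style.css">
<link rel="alternate" type="application/rss+xml" title="RSS" href="/index.rdf">
</head><body>hello</body></html>"""

feeds = discover_feeds(page, "http://example.com/blog/")
print(feeds)
```

That's the whole "invention": receive a page, spot the relationship markup, follow the URL to the feed.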
The listed inventors in both patents are: Kahn, Jessica; (San Francisco, CA) ; Alfke, Jens; (San Jose, CA) ; Wilkin, Sarah Anne; (Menlo Park, CA) ; Howard, Albert Riley JR.; (Sunnyvale, CA) ; Forstall, Scott James; (Mountain View, CA) ; Lemay, Stephen O.; (San Francisco, CA) ; Melton, Donald Dale; (San Carlos, CA) ; Loofbourrow, Wayne Russell; (San Jose, CA).
Thanks, Apple! and thanks, "inventors"!
It's important to note that this is still in the application stage, and as such can be invalidated, or narrowed down to a saner level, by using the techniques described here. I strongly recommend that people working in the syndication field with sufficient knowledge and expertise who feel strongly enough about this should spend a little time doing so, before the patent is issued and it becomes a multi-million-dollar task to invalidate it. (however, IANApatentL of course ;)
Tim Bray: Which Apache project burns the most resources?
Mads: Spamassassin by a wide margin. [...]
Heh, we win ;)
Helios, the Zones server, has been an incredible resource for us. SpamAssassin isn't a traditional open-source software project in one respect: we use a lot of centralized "phone home" infrastructure to support rule and score generation. Having a virtualized server of this quality and horsepower to use for this has been fantastic.
(thanks to John O'Shea for the pointer!)
Another day, another absurd IBM software patent. Via the IP list, here's United States Patent 7,003,497:
- A method for confirming an electronic transaction, comprising the steps of: performing an electronic transaction between a first party and a second party; providing, by the first party to the second party, contact information of a third party service provider associated with the first party; contacting, by the second party, the third party service provider to obtain a location of a predetermined, private mailbox associated with the first party; sending, by the second party, a request for confirmation of the electronic transaction to the predetermined, private mailbox associated with the first party; accessing the private mailbox by the first party; and sending, by the first party, a reply message to the request for confirmation to thereby confirm authorization of the electronic transaction, wherein information regarding the private mailbox is not communicated to the second party during the electronic transaction.
There's lots of waffle in the background section about this being for e-commerce transactions, but that claim, and claims 2 and 3 at least, are easily broad enough to cover simple "confirmed opt-in" email subscription systems -- in other words, the system whereby a potential newsletter subscriber clicks on a link in order to "confirm" that they want to subscribe to a newsletter. That's the current best practice email subscription method used by pretty much everyone.
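For anyone unfamiliar with confirmed opt-in, the whole mechanism fits in a few lines. Here's a minimal Python sketch; the class and method names are invented for illustration, and a real list manager would also expire tokens and actually send the mail.

```python
import secrets

class SubscriptionList:
    """Toy confirmed opt-in list: nobody is subscribed until the token
    mailed to their address comes back."""

    def __init__(self):
        self.pending = {}          # token -> address awaiting confirmation
        self.subscribers = set()

    def request(self, address):
        """Record a signup request; return the token to mail to that address."""
        token = secrets.token_urlsafe(16)
        self.pending[token] = address
        return token

    def confirm(self, token):
        """Subscribe the address only if a token we issued is presented."""
        address = self.pending.pop(token, None)
        if address is None:
            return False
        self.subscribers.add(address)
        return True

newsletter = SubscriptionList()
token = newsletter.request("reader@example.com")

# Someone who never saw the mailbox cannot confirm on the reader's behalf...
forged = newsletter.confirm("guessed-token")
# ...but the reader, clicking the link in the confirmation mail, can.
confirmed = newsletter.confirm(token)
print(forged, confirmed, newsletter.subscribers)
```

The request goes to a mailbox the requester supplied, and a reply from that mailbox confirms it; that is the claim, more or less.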
Filed December 31, 2001. There was plenty of prior art before this date, but who would want to go up against IBM, no less, to attempt to get this invalidated, especially now that it's been issued?
Thanks USPTO, you're doing a heck of a job!
Things have really been heating up recently around the AOL/Goodmail "pay to send" CertifiedMail scheme -- the EFF and a host of other groups have launched dearaol.com, stating:
This system would create a two-tiered Internet in which affluent mass emailers could pay AOL a fee that amounts to an "email tax" for every email sent, in return for a guarantee that such messages would bypass spam filters and go directly to AOL members' inboxes. Those who did not pay the "email tax" would increasingly be left behind with unreliable service. Your customers expect that your first obligation is to deliver all of their wanted mail, and this plan is a step away from that obligation.
While I dislike this proposal, too, as far as I can tell, AOL actually have pretty reasonable intentions with this program -- nowhere near as bad as the DearAOL.com site makes out.
However, they're doing a really really crappy job of getting this information out there, or committing to reasonable limits on the program, such as announcing that they will use it only for transactional emails, as Yahoo! have done.
I'd strongly recommend reading Carl Hutzler's posting on the subject. Carl was AOL's head of anti-spam operations until last year, so he really knows what he's talking about, and he lays it out clearly -- a lot more clearly than any corporate statements from AOL do. His blog contains a fair bit more on the subject, too.
But seriously -- why isn't there a press release on the AOL site about this scheme? Some front-channel communication about now might be useful, I'd suggest, before things really get hairy -- this crapstorm is coming about partly because AOL's comments are all filtering out in dribs and drabs via third parties, and (AOLers say) are being misconstrued and misrepresented in the process. It's a classic case of missing the cluetrain.
I'd also really encourage the EFF people to tone down the rhetoric; statements like "senders will have no guarantee that their emails will be delivered" are scare-mongering, given that SMTP email already provides no such guarantee.
Update: wow, MoveOn went really overboard -- "threatening the Internet as we know it ... The very existence of online civic participation and the free Internet as we know it are under attack." OMG the sky is falling!
Side Issue: The Spam Definition
Also, another note to EFF: defining spam as "whatever you don't want to read" is a terrible mistake to make. That confuses a good, clear, enforceable and automatable definition of spam -- unsolicited bulk email -- and makes it effectively unenforceable by law, unpoliceable by ISPs, impossible to detect automatically, and incompatible with existing, effective EU and Australian legislation.
Listen to your own Chairman of the Board; he's right on this count.
PS: any luck fixing up the non-confirmed signups issue? Last time I checked I could still subscribe any address to the EFF Action Alerts without a cross-check, which is not a good thing.
A quick hack --
goog-love.pl - find out where your site's google juice comes from
This script will grind through your web site's "access.log" file (which must be in the "combined" log format). It'll pick out the top 100 Google searches found in the referer field, re-run those searches, and determine which ones are giving your website all the linky Google love -- in other words, the searches that your site 'wins' on.
The output is in plain text and a chunk of HTML.
usage:
goog-love.pl sitehost google-api-key < access.log > out.html
e.g.
cat /var/www/logs/taint.org.* | goog-love.pl \
taint.org 0xb0bd0bb5yourgoogleapikeyhere0xdeadbeef | tee out.html
NOTE: this script requires the SOAP::Lite module be installed. Install it using apt-get install libsoap-lite-perl or cpan SOAP::Lite.
It also requires a Google API key.
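The first half of the job, pulling the Google searches out of the referer field, looks roughly like this. It's a Python sketch of the idea rather than the Perl in goog-love.pl itself; the regex, the function name and the sample log lines are mine, and it doesn't attempt the second half (re-running the searches through the Google API).

```python
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Tail of a "combined" format line: "request" status bytes "referer" "user-agent"
COMBINED = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"$')

def google_queries(lines):
    """Count the Google search terms found in the referer field of each line."""
    counts = Counter()
    for line in lines:
        match = COMBINED.search(line.rstrip("\n"))
        if not match:
            continue
        referer = urlparse(match.group("referer"))
        host = referer.hostname or ""
        if not (host == "google.com" or host.startswith("www.google.")):
            continue
        # The search terms live in the "q" query parameter.
        for query in parse_qs(referer.query).get("q", []):
            counts[query.lower()] += 1
    return counts

sample = [
    '10.0.0.1 - - [20/Mar/2006:10:00:00 +0000] "GET /xplanet/ HTTP/1.1" 200 5120 '
    '"http://www.google.com/search?q=world+map+desktop+background" "Mozilla/5.0"',
    '10.0.0.2 - - [20/Mar/2006:10:05:00 +0000] "GET /xplanet/ HTTP/1.1" 200 5120 '
    '"http://www.google.ie/search?hl=en&q=World+Map+Desktop+Background" "Mozilla/5.0"',
    '10.0.0.3 - - [20/Mar/2006:10:06:00 +0000] "GET / HTTP/1.1" 200 9000 '
    '"http://example.com/links.html" "Mozilla/5.0"',
]

counts = google_queries(sample)
top = counts.most_common(100)    # the script works from the top 100 searches
print(top)
```

Each of those top searches then has to be re-run to see where the site actually ranks, which is where the API key comes in.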
For example, here are the current results for this site. You can immediately see some interesting stuff that's not obvious otherwise, such as my site being the top hit for [beardy justin] ;)
- #1 for kriskat225: http://taint.org/2006/01/20/220239a.html
- #1 for kriskat224: http://taint.org/
- #1 for mailman rss: http://taint.org/mmrss/index.html
- #1 for ray is naked: http://taint.org/2005/05/27/195421a.html
- #1 for beardy justin: http://taint.org/2005/09/10/002323a.html
- #1 for threadless rss: http://taint.org/2005/05/25/060857a.html
- #1 for louis fitzgerald: http://taint.org/2005/05/12/020118a.html
- #1 for download JusteTune: http://taint.org/index.php?tag=apple
- #1 for mobile repair delhi: http://taint.org/2005/11/11/032651a.html
- #1 for site:taint.org mythtv: http://taint.org/index.php?tag=hdtv
- #1 for "Google Map" IDS rulesets: http://taint.org/2005/09/
- #1 for spam email "prank a friend": http://taint.org/2004/11/
- #1 for site:taint.org mythtv freevo: http://taint.org/index.php?tag=mythtv
- #1 for world map desktop background: http://taint.org/xplanet/
- #1 for kate thornton + Samuel L jackson: http://taint.org/2003/12/10/185721a.html
- #1 for when did chris horn leave iona technologies?: http://taint.org/2003/05/
- #2 for natkat224: http://taint.org/
- #2 for itms linux: http://taint.org/2005/09/20/022107a.html
- #2 for msn IDs hacking software: http://taint.org/index.php?tag=hacking
- #3 for gmail spam filter: http://taint.org/2004/04/15/033025a.html
- #3 for live world map on desktop: http://taint.org/xplanet/
- #4 for moin mozex: http://taint.org/2004/10/08/081409a.html
- #4 for editable p45: http://taint.org/2005/01/27/025238a.html
- #4 for urban dead exploits: http://taint.org/index.php?tag=games
- #4 for gmail spam filtering: http://taint.org/2004/04/15/033025a.html
- #4 for world map desktop wallpaper: http://taint.org/xplanet/
- #5 for cdwow.ie: http://taint.org/2003/12/04/185038a.html
- #5 for life hacking: http://taint.org/2005/10/17/210751a.html
- #5 for Adelphi Charter: http://taint.org/index.php?tag=politics
- #6 for irish SME: http://taint.org/2005/06/23/212513a.html
- #6 for urbandead: http://taint.org/index.php?tag=hacks
- #6 for SKY NEWS IRELAND: http://taint.org/2004/05/12/205717a.html
- #7 for daniel cuthbert: http://taint.org/2005/10/12/205836a.html
- #7 for SAMUEL L. JACKSON QUOTES: http://taint.org/2003/12/10/185721a.html
- #7 for cool background pictures: http://taint.org/xplanet/
- #8 for CDWOW: http://taint.org/2003/12/04/185038a.html
- #8 for urban dead: http://taint.org/2005/10/29/224403a.html
- #8 for korea porn: http://taint.org/2003/07/12/031422a.html
- #8 for BBC port 8998: http://taint.org/2003/08/
- #8 for iftop documentation wrt: http://taint.org/index.php?tag=freevo
- #8 for php mail injection spam: http://taint.org/2005/12/08/202248a.html
- #8 for fake open source software : http://taint.org/index.php?tag=open-source
- #9 for faad symbian: http://taint.org/index.php?tag=apple
- #9 for sky news ireland: http://taint.org/2004/05/12/205717a.html
- #9 for telemarketing counter speech: http://taint.org/2002/11/12/130851a.html
- #10 for "Scratch Heads Over": http://taint.org/2003/07/12/031422a.html
- #10 for web scraper linux console: http://taint.org/2004/06/05/023726a.html
Download here (5 KiB perl script).
Notes:
if you see a lot of "502 Bad Gateway" errors, it's probably over-zealous anti-bot ACLs on Google's side. Try from another host.
Read the comments for notes on a bug in recent releases of SOAP::Lite; please let me know if you hear of it getting fixed ;)
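For anyone curious about the first stage described above (picking the top Google searches out of the referer field of a "combined" format log), here is a minimal sketch. It is written in Python rather than Perl and is not the code from goog-love.pl; the function names and the regular expression are my own assumptions, and it does not attempt the second stage of re-running the searches through the Google API.

```python
import re
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Tail of a "combined" format line:
#   "request" status bytes "referer" "user-agent"
# (hypothetical pattern; it ignores escaped quotes inside fields)
COMBINED_RE = re.compile(
    r'"[^"]*" \d{3} \S+ "(?P<referer>[^"]*)" "[^"]*"\s*$'
)


def google_query(referer):
    """Return the search string from a Google referer URL, or None."""
    parts = urlparse(referer)
    host = (parts.hostname or "").lower()
    # Accept www.google.com, www.google.ie, google.co.uk and so on.
    if "google" not in host.split("."):
        return None
    terms = parse_qs(parts.query).get("q")
    if not terms or not terms[0].strip():
        return None
    return terms[0].strip()


def top_google_queries(lines, limit=100):
    """Count Google searches seen in the referer field, most common first."""
    counts = Counter()
    for line in lines:
        match = COMBINED_RE.search(line)
        if not match:
            continue  # not a combined-format line
        query = google_query(match.group("referer"))
        if query:
            counts[query] += 1
    return counts.most_common(limit)
```

Passing an open log file (or sys.stdin) to top_google_queries() yields a list of (search, hit count) pairs. Searches are counted case-sensitively here, which is a design choice of this sketch.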
Heads-up, Dublin geeks: Vint Cerf will be speaking at the Dublin Googleplex on Thursday.
Sadly, I won't be able to make it myself -- I had to visit the UK this week. Pity; I would have loved to hear him speak :(
While driving around Ireland on a wedding-location-scouting trip, we started receiving texts talking about riots in Dublin; I texted a friend, and got a reply along these lines: "Celtic-topped scobes run riot through O'Connell St, torching cars in Nassau street, hospitalising cops and Charlie Bird. madness!"
I thought he was joking, but nope. A load of IRA-slogan-shouting scumbags really had been allowed to run riot -- with paving stones of all things left unsecured in their midst! -- and it quickly got way, way out of hand.
The blog coverage is excellent, with lots of photos. I suggest starting with Indymedia Ireland, these Flickr photos and the links on this weblog. It appears the gardai really fell down on this one.
For what it's worth, I was in town a few hours later, and the rest of Dublin was trouble-free -- just the usual Saturday night goings-on. O'Connell St. was still a rubble-strewn mess when I passed through on Sunday, though.
Good news. It appears that SourceForge are now offering full, public use of Subversion for all projects on sf.net!
The SourceForge.net: Subversion (Version Control for Source Code) document contains full details on their setup. Notable key points:
- It's using authenticated HTTPS -- which is great, going by my experiences with the ASF's setup
- Imports are done from either an existing SF.net CVS repository using cvs2svn, from a Subversion 'svnadmin dump' file, or from a CVS repository tarball
- CIAbot support is offered as standard ;)
Awesome. I'll be trying it with Uffizi, which I registered as a SourceForge project a few weeks ago for exactly this purpose. ;)
Hooray!
SpamAssassin has been voted Datamation Anti-Spam Product of the Year for 2006, earning three times as many votes as the next contender.
This is the second year in a row, which is fantastic -- and our margin is increasing each year. ;)
Some news from TREC's Gordon Cormack:
The TREC 2005 Corpus (92,000 messages - 42,000 ham; 50,000 spam) is now available for self-serve download.
TREC Spam Evaluation is a NIST program to develop methods to measure spam filter accuracy and performance. More details here.
The corpus can be picked up at Gordon's site. As far as I can tell, this should be a pretty solid corpus for spam researchers and developers.
I don't do silly blog antics much, but I got tagged by Mat for the Four Things meme. Looking around, it is indeed a bit more interesting than things like the usual LJ quiz, so why not!
I wrote this on the plane from LA to Dublin, which may have affected some of the selections, in "4 places I would rather be right now" at least ;)
4 jobs I've had:
I was Iona Technologies' first employee, and stayed there for no less than 7 years. I got to see the company grow from a handful of people, most of whom weren't getting paid (hence how I wound up as the first employee ;), all the way up to a 300-strong multinational, while the company itself formed a core of Ireland's mini dot-com boom. That was fantastic fun, and educational to boot.
my Dad's gun/fishing/sporting-goods shop. Was it really a good idea to have a teenager working near firearms? At least I wasn't the one who unplugged the fridge where the maggots were kept, so that they all hatched over the course of one weekend...
A horrible teenage job -- picking tomatoes. I can still feel the orange dust under my fingernails every time I smell fresh tomatoes :( I didn't last very long at that at all.
writing an Amiga-based kiosk system for virtually no pay whatsoever, at the age of 18 or 19. Ah, exploitation.
4 movies I can watch over and over:
Koyaanisqatsi -- it's dating a little now, since every ad agency through the 90s ripped it off. But still, the invention of a new format. I remember looking at the 405 freeway in LA, and thinking "looks like something out of Koyaanisqatsi" -- of course, it was.
Princess Mononoke -- either that, or Nausicaa. I just love the way the characters are coloured in shades of grey, rather than black and white.
the Lord of the Rings trilogy -- oh dear I'm a hopeless Tolkien fanboy.
Spinal Tap -- pure genius.
4 places I've lived:
Melbourne, Australia; around the time of the annoying TV drama, The Secret Lives Of Us;
Newport Beach, CA; around the time of the annoying TV drama, The O.C.;
Dublin, Ireland; no annoying TV drama -- so far
University of California Irvine, CA; while Irvine itself is the most soulless suburban hellhole I've ever visited, living on the UCI campus is quite fun by comparison. Take about 1000 grad students, post-docs and lecturers from around the world; put them all in the same square mile or so; remove all fun (and bars!) from the surrounding areas; watch them make their own entertainment, or go mad.
4 tv shows I love:
4 places I've vacationed:
Annapurna Base Camp, Nepal; we trekked our way up there, then trekked back down again. Unforgettable. I really want to do another Nepal trek as a result.
car-camping around the Australian state of Victoria; they have some fantastic national park campsites, which most tourists overlook
learning how to dive in Ko Tao, Thailand; great setting, great dive sites, pretty cheap too!
Yosemite; amazing, world-class natural beauty. Californians don't realise just how lucky they've got it ;)
4 of my favourite dishes:
A good Thai green curry
Laos-style green papaya salad with sticky rice
a good meaty cassoulet, from Fandango in San Luis Obispo. At least, that was the tastiest meal I've had in recent months ;)
Mangosteen -- the queen of fruit, according to the Thais. I could eat, and probably have eaten, hundreds of these
4 places I would rather be right now:
spending New Year's Day with a bunch of friends in rural West Cork or County Galway; until I moved to the US, this was one of my favourite annual traditions.
the Stag's Head Bar, Dublin, in the snug, again with a bunch of friends
sitting on the grass outside the Pavilion bar in TCD, on a sunny summer's day (hmm, that's a lot of bars!)
Chiang Mai, Thailand
4 sites I visit daily:
4 people I'm tagging:
Keith Dawson sent this on -- an interview with Jim Gray, head of Microsoft's Bay Area Research Center and winner of the ACM Turing Award, talking about new transmission systems for truly massive data collections. Very interesting:
[One] option is to send whole computers. .... We're now into the 2-terabyte realm, so we can't actually send a single disk; we need to send a bunch of disks. It's convenient to send them packaged inside a metal box that just happens to have a processor in it. I know this sounds crazy -- but you get an NFS or CIFS server and most people can just plug the thing into the wall and into the network and then copy the data.
Dave Patterson, interviewer: What's the difference in cost between sending a disk and sending a computer?
JG: If I were to send you only one disk, the cost would be double -- something like $400 to send you a computer versus $200 to send you a disk. But I am sending bricks holding more than a terabyte of data -- and the disks are more than 50 percent of the system cost. Presumably, these bricks circulate and don't get consumed by one use.
DP: Are you sending them a whole PC?
JG: Yes, an Athlon with a Gigabit Ethernet interface, a gigabyte of RAM, and seven 300-GB disks -- all for about $3,000.
DP: It's your capital cost to implement the Jim Gray version of "Netflicks." (jm: sic)
JG: Right. We built more than 20 of these boxes we call TeraScale SneakerNet boxes. Three of them are in circulation. We have a dozen doing TeraServer work; we have about eight in our lab for video archives, backups, and so on. It's real convenient to have 40 TB of storage to work with if you are a database guy. Remember the old days and the original eight-inch floppy disks? These are just much bigger.
DP: "Sneaker net" was when you used your sneakers to transport data?
JG: In the old days, sneaker net was the notion that you would pull out floppy disks, run across the room in your sneakers, and plug the floppy into another machine. This is just TeraScale SneakerNet. You write your terabytes onto this thing and ship it out to your pals. Some of our pals are extremely well connected -- they are part of Internet 2, Virtual Business Networks (VBNs), and the Next Generation Internet (NGI). Even so, it takes them a long time to copy a gigabyte. Copy a terabyte? It takes them a very, very long time across the networks they have.
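To put rough numbers on that, here is a back-of-the-envelope sketch. The box capacity (seven 300-GB disks) comes from the interview; the sustained network rate and the 24-hour courier transit time are purely my own assumptions for illustration, and the function names are made up.

```python
# Figures from the interview
DISKS_PER_BOX = 7
DISK_BYTES = 300 * 10**9                  # 300 GB, decimal gigabytes
BOX_BYTES = DISKS_PER_BOX * DISK_BYTES    # 2.1 TB per box


def network_hours(num_bytes, link_bits_per_sec):
    """Hours needed to copy num_bytes over a link at a sustained rate."""
    return num_bytes * 8 / link_bits_per_sec / 3600


def shipping_megabits_per_sec(num_bytes, transit_hours):
    """Effective bandwidth of putting num_bytes in a box and shipping it."""
    return num_bytes * 8 / (transit_hours * 3600) / 10**6


# ASSUMPTION: a sustained 100 Mbit/s end to end (not from the interview)
print(round(network_hours(BOX_BYTES, 100 * 10**6), 1), "hours over the wire")

# ASSUMPTION: 24 hours door to door by courier (not from the interview)
print(round(shipping_megabits_per_sec(BOX_BYTES, 24)), "Mbit/s by courier")
```

Under those assumptions one box takes nearly two days to copy over the network, while the courier works out at roughly twice that link's throughput. The numbers shift with whatever rates you plug in, but the copy time at the far end is not counted here.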
Boing Boing has an interesting case today:
"I filled out a web form for a contest from Miller using a throwaway junk email address and then, months after I dumped the throwaway account, I got this to my main account! Not sure I like the idea of companies tracking me down like this."
I sent a mail to follow up on this, but it's worth blogging here too.
This is, unfortunately, common practice among the "legitimate" bulk mailer companies; it's called "e-pending" (short for "email address appending"). Basically, the advertiser contacts one of the big data-mining companies, provides them with the data they have about the customer -- name, postal address, etc. -- and gets them to match that against their database; the data-miner then provides any other email addresses they may have on file for that user, even if those email addresses were provided for bills, promotional use for other companies, etc.
The advertisers contend that permission was given by the person who's being mailed; the recipients contend that permission was given to send to a specific address, not all of that person's addresses in perpetuity.
Here are a few more examples of e-pending gone bad: two Jennifer Millers, Sony scraping ancient Internic contact addresses, Spamvertized.org comment on the practice, Joe St. Sauver comments.
It's exclusively a US phenomenon, as far as I know; I think most cases of e-pending are rendered illegal under EU data protection law. Handy. ;)
Update: Brian at the Spam Kings weblog notes that 'this spooky little spam was the work of Equifax, the big credit reporting agency that shut down its Boca Raton-based spam operation, Naviant, in 2003, due to the impending passage of CAN-SPAM.'
Perfect timing! Just 5 days before I return to Ireland, Damien Mulley posts 'Broadband choices in Ireland', a good overview of the options available for consumer broadband internet connection.
I've been out of the loop for quite a while, and spoilt by the options available in suburban Southern California (which are, of course, pretty good). But this is a lot better than what was on the table when I left, 3 years ago.
What strikes me is that the upload/download speeds are quite reasonable and pretty close to what you'd see in the US. Similarly, the prices are finally near to the going rate in the US, once the various limitations and add-ons (required 'bundles', state taxes etc.) are taken into consideration.
However, virtually all of these deals use the horrendous concept of download capping! Given that I use this stuff for work, and routinely rsync around 30GB chunks of email corpora between central offices, colo servers, and my desktop, this just won't fly. It could be argued that I'm therefore not a typical broadband consumer, who these deals have been carefully designed to cater for. But seriously -- if a telecommuting software developer isn't a typical broadband consumer, who the hell is? Hey telcos: a little flexibility goes a long way -- don't fence me in. ;)
All in all, it looks like Smart Telecom are the winners; 3Mb/s download, 512Kb/s upload -- and most importantly, no cap -- for EUR 35 per month. (And check out that XHTML/WAI-compliant website!)
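For a sense of scale, here is some quick arithmetic on why a cap bites. The 3Mb/s rate and the 30GB chunk size are the figures above; the cap sizes are hypothetical, since I'm not quoting any provider's actual cap, and the helper names are invented for this sketch.

```python
def hours_for(gigabytes, megabits_per_sec):
    """Hours to move the given volume at a sustained line rate."""
    return gigabytes * 10**9 * 8 / (megabits_per_sec * 10**6) / 3600


def monthly_capacity_gb(megabits_per_sec, days=30):
    """Gigabytes a saturated line could move in a month."""
    return megabits_per_sec * 10**6 / 8 * 86400 * days / 10**9


def chunks_under_cap(cap_gb, chunk_gb=30):
    """How many whole chunks fit under a monthly cap."""
    return cap_gb // chunk_gb


print(round(hours_for(30, 3), 1), "hours per 30GB chunk at 3Mb/s")
print(round(monthly_capacity_gb(3)), "GB per month if the line is kept busy")

# HYPOTHETICAL caps, for illustration only
for cap_gb in (20, 100):
    print(cap_gb, "GB cap:", chunks_under_cap(cap_gb), "chunk(s) per month")
```

So a single chunk is a long overnight download at the full line rate, and any cap in the tens of gigabytes is used up by a handful of them at most.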
I probably would have gone with Irish Broadband, but for the past 6 months the only thing I've been hearing about them via word-of-mouth has been bad news, detailing customer service meltdown after meltdown. Even the legendarily incompetent 'biddies' of Eircom seem to be getting better reviews nowadays.
Talking of Eircon, our dear old dirty-tricks-wielding celtic-tiger-throttling incumbent telco: the top Sponsored Link on a Google search for irish broadband is:
Irish Broadband
www.eircom.ie -- More speed, prices reduced by 25%, free modem & a free connection!
Scum.
AOL and Yahoo! have been making a lot of headlines with their plans to reduce their whitelist-management workload -- and make a little pay-to-send money on the side -- with a deal with Goodmail.
Now Spamhaus have gone on the record against the plan:
On Monday, Richard Cox, chief information officer at antispam organization Spamhaus, said that "an e-mail charge will destroy the spirit of the Internet."
"The Internet has become what it is because of freedom of communication. Open discussion is what gives it value. There should be no cost for particular services, and e-mail should be free and accessible to all. This will disenfranchise people."