Justin's Linklog Posts

Dublin Riots

While driving around Ireland on a wedding-location-scouting trip, we started receiving texts talking about riots in Dublin; I texted a friend, and got a reply along these lines: "Celtic-topped scobes run riot through O’Connell St, torching cars in Nassau street, hospitalising cops and Charlie Bird. madness!"

I thought he was joking, but nope. A load of IRA-slogan-shouting scumbags really had been allowed to run riot — with paving stones of all things left unsecured in their midst! — and it quickly got way, way out of hand.

The blog coverage is excellent, with lots of photos. I suggest starting with Indymedia Ireland, these Flickr photos and the links on this weblog. It appears the gardai really fell down on this one.

For what it’s worth, I was in town a few hours later, and the rest of Dublin was trouble-free — just the usual Saturday night goings-on. O’Connell St. was still a rubble-strewn mess when I passed through on Sunday, though.

SourceForge.net now offering public Subversion

Good news. It appears that SourceForge are now offering full, public use of Subversion for all projects on sf.net!

The SourceForge.net: Subversion (Version Control for Source Code) document contains full details on their setup. Notable key points:

  1. It’s using authenticated HTTPS — which is great, going by my experiences with the ASF’s setup
  2. Imports are done from either an existing SF.net CVS repository using cvs2svn, from a Subversion ‘svnadmin dump’ file, or from a CVS repository tarball
  3. CIAbot support is offered as standard ;)

Awesome. I’ll be trying this out with Uffizi, which I registered as a SourceForge project a few weeks ago for exactly this purpose. ;)

TREC Spam Corpus

Some news from TREC’s Gordon Cormack:

The TREC 2005 Corpus (92,000 messages – 42,000 ham; 50,000 spam) is now available for self-serve download.

TREC Spam Evaluation is a NIST program to develop methods to measure spam filter accuracy and performance. More details here.

The corpus can be picked up at Gordon’s site. As far as I can tell, this should be a pretty solid corpus for spam researchers and developers.

Four Things

I don’t do silly blog antics much, but I got tagged by Mat for the Four Things meme. Looking around, it is indeed a bit more interesting than things like the usual LJ quiz, so why not!

I wrote this on the plane from LA to Dublin, which may have affected some of the selections in ‘4 places I would rather be right now’, at least ;)

4 jobs I’ve had:

  • I was Iona Technologies’ first employee, and stayed there for no less than 7 years. I got to see the company grow from a handful of people, most of whom weren’t getting paid (hence how I wound up as the first employee ;), all the way up to a 300-strong multinational, while the company itself formed a core of Ireland’s mini dot-com boom. That was fantastic fun, and educational to boot.

  • my Dad’s gun/fishing/sporting-goods shop. Was it really a good idea to have a teenager working near firearms? At least I wasn’t the one who unplugged the fridge where the maggots were kept, so that they all hatched over the course of one weekend…

  • A horrible teenage job — picking tomatoes. I can still feel the orange dust under my fingernails every time I smell fresh tomatoes :( I didn’t last very long at that at all.

  • writing an Amiga-based kiosk system for virtually no pay whatsoever, at the age of 18 or 19. Ah, exploitation.

4 movies I can watch over and over:

  • Koyaanisqatsi — it’s dating a little now, since every ad agency through the 90s ripped it off. But still, the invention of a new format. I remember looking at the 405 freeway in LA, and thinking "looks like something out of Koyaanisqatsi" — of course, it was.

  • Princess Mononoke — either that, or Nausicaa. I just love the way the characters are coloured in shades of grey, rather than black and white.

  • the Lord of the Rings trilogy — oh dear I’m a hopeless Tolkien fanboy.

  • Spinal Tap — pure genius.

4 places I’ve lived:

  • Melbourne, Australia; around the time of the annoying TV drama, The Secret Lives Of Us;

  • Newport Beach, CA; around the time of the annoying TV drama, The O.C.;

  • Dublin, Ireland; no annoying TV drama — so far

  • University of California Irvine, CA; while Irvine itself is the most soulless suburban hellhole I’ve ever visited, living on the UCI campus is quite fun by comparison. Take about 1000 grad students, post-docs and lecturers from around the world; put them all in the same square mile or so; remove all fun (and bars!) from the surrounding areas; watch them make their own entertainment, or go mad.

4 tv shows I love:

4 places I’ve vacationed:

  • Annapurna Base Camp, Nepal; we trekked our way up to there, then trekked back down again. Unforgettable. I really want to do another Nepal trek as a result

  • car-camping around the Australian state of Victoria; they have some fantastic national park campsites, which most tourists overlook

  • learning how to dive in Ko Tao, Thailand; great setting, great dive sites, pretty cheap too!

  • Yosemite; amazing, world-class natural beauty. Californians don’t realise just how lucky they’ve got it ;)

4 of my favourite dishes:

  • A good Thai green curry

  • Laos-style green papaya salad with sticky rice

  • a good meaty cassoulet, from Fandango in San Luis Obispo. At least, that was the tastiest meal I’ve had in recent months ;)

  • Mangosteen — the queen of fruit, according to the Thais. I could, and probably have, eaten hundreds of these

4 places I would rather be right now:

  • spending New Year’s Day with a bunch of friends in rural West Cork or County Galway; until I moved to the US, this was one of my favourite annual traditions.

  • the Stag’s Head Bar, Dublin, in the snug, again with a bunch of friends

  • sitting on the grass outside the Pavilion bar in TCD, on a sunny summer’s day (hmm, that’s a lot of bars!)

  • Chiang Mai, Thailand

4 sites I visit daily:

4 people I’m tagging:

The Return of Sneakernet

Keith Dawson sent this on — an interview with Jim Gray, head of Microsoft’s Bay Area Research Center and winner of the ACM Turing Award, talking about new transmission systems for truly massive data collections. Very interesting:

[One] option is to send whole computers. …. We’re now into the 2-terabyte realm, so we can’t actually send a single disk; we need to send a bunch of disks. It’s convenient to send them packaged inside a metal box that just happens to have a processor in it. I know this sounds crazy — but you get an NFS or CIFS server and most people can just plug the thing into the wall and into the network and then copy the data.

Dave Patterson, interviewer: What’s the difference in cost between sending a disk and sending a computer?

JG: If I were to send you only one disk, the cost would be double — something like $400 to send you a computer versus $200 to send you a disk. But I am sending bricks holding more than a terabyte of data — and the disks are more than 50 percent of the system cost. Presumably, these bricks circulate and don’t get consumed by one use.

DP: Are you sending them a whole PC?

JG: Yes, an Athlon with a Gigabit Ethernet interface, a gigabyte of RAM, and seven 300-GB disks — all for about $3,000.

DP: It’s your capital cost to implement the Jim Gray version of "Netflicks." (jm: sic)

JG: Right. We built more than 20 of these boxes we call TeraScale SneakerNet boxes. Three of them are in circulation. We have a dozen doing TeraServer work; we have about eight in our lab for video archives, backups, and so on. It’s real convenient to have 40 TB of storage to work with if you are a database guy. Remember the old days and the original eight-inch floppy disks? These are just much bigger.

DP: "Sneaker net" was when you used your sneakers to transport data?

JG: In the old days, sneaker net was the notion that you would pull out floppy disks, run across the room in your sneakers, and plug the floppy into another machine. This is just TeraScale SneakerNet. You write your terabytes onto this thing and ship it out to your pals. Some of our pals are extremely well connected — they are part of Internet 2, Virtual Business Networks (VBNs), and the Next Generation Internet (NGI). Even so, it takes them a long time to copy a gigabyte. Copy a terabyte? It takes them a very, very long time across the networks they have.
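To put rough numbers on Gray's point, here is a back-of-envelope sketch in Python. The 2.1 TB figure comes from the seven 300-GB disks mentioned in the interview; the link speeds and the `efficiency` parameter are illustrative assumptions of mine, not figures from the interview.

```python
# Back-of-envelope: how long does it take to push one SneakerNet box's
# worth of data over a network link?

BOX_BYTES = 7 * 300 * 10**9  # seven 300-GB disks, as described in the interview


def transfer_hours(num_bytes, link_bits_per_sec, efficiency=1.0):
    """Hours needed to move num_bytes over a link of the given raw speed.

    efficiency is the assumed fraction of the raw bandwidth actually achieved.
    """
    usable_bits_per_sec = link_bits_per_sec * efficiency
    return (num_bytes * 8) / usable_bits_per_sec / 3600


# Hypothetical link speeds, chosen only for illustration.
LINKS = {
    "10 Mb/s": 10 * 10**6,
    "100 Mb/s": 100 * 10**6,
    "1 Gb/s": 10**9,
}

if __name__ == "__main__":
    for name, speed in LINKS.items():
        hours = transfer_hours(BOX_BYTES, speed)
        print(f"{name}: {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under these assumptions, a sustained 100 Mb/s link needs roughly 47 hours to copy the box's contents, and a 10 Mb/s link more than 19 days, which is the sort of gap that makes a courier look attractive.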

E-Pending

Boing Boing has an interesting case today:

"I filled out a web form for a contest from Miller using a throwaway junk email address and then, months after I dumped the throwaway account, I got this to my main account! Not sure I like the idea of companies tracking me down like this."

I sent a mail to follow up on this, but it’s worth blogging here too.

This is, unfortunately, common practice among the "legitimate" bulk mailer companies; it’s called "e-pending" (short for "email address appending"). Basically, the advertiser contacts one of the big data-mining companies, provides them with the data they have about the customer (name, postal address, etc.), and gets them to match that against their database; the data-miner then provides any other email addresses they may have on file for that user, even if those email addresses were provided for bills, promotional use for other companies, etc.
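For the curious, the matching step can be pictured as a simple join on name and postal address. The following Python sketch is a toy illustration only: the record layout, the `normalise` and `append_emails` helpers, and all the sample data are made up, and real data brokers presumably use much fuzzier matching than this.

```python
# Toy illustration of "e-pending": matching an advertiser's customer record
# against a data broker's file to append extra email addresses.
# All names, addresses and records below are invented.


def normalise(name, postal_address):
    """Build a crude match key: lower-case, with runs of whitespace collapsed."""
    return (
        " ".join(name.lower().split()),
        " ".join(postal_address.lower().split()),
    )


def append_emails(advertiser_records, broker_records):
    """Return each advertiser record with any extra broker-held emails attached."""
    broker_index = {}
    for rec in broker_records:
        key = normalise(rec["name"], rec["address"])
        broker_index.setdefault(key, set()).add(rec["email"])

    appended = []
    for rec in advertiser_records:
        key = normalise(rec["name"], rec["address"])
        # Only addresses the advertiser did not already have count as "appended".
        extra = sorted(broker_index.get(key, set()) - {rec["email"]})
        appended.append({**rec, "appended_emails": extra})
    return appended


if __name__ == "__main__":
    advertiser = [
        {"name": "Jane Doe", "address": "1 Main St, Springfield",
         "email": "throwaway@example.com"},
    ]
    broker = [
        {"name": "jane  doe", "address": "1 MAIN ST, SPRINGFIELD",
         "email": "jane@example.org"},
        {"name": "John Roe", "address": "2 Side St, Springfield",
         "email": "john@example.net"},
    ]
    print(append_emails(advertiser, broker))
```

The point of the sketch is that the throwaway address never needs to stay alive: once the name and postal address are on file, any other address tied to them can be pulled in.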

The advertisers contend that permission was given by the person who’s being mailed; the recipients contend that permission was given to send to a specific address, not all of that person’s addresses in perpetuity.

Here’s a few more examples of e-pending gone bad: two Jennifer Millers, Sony scraping ancient Internic contact addresses, Spamvertized.org comment on the practice, Joe St. Sauver comments.

It’s exclusively a US phenomenon, as far as I know; I think most cases of e-pending are rendered illegal under EU data protection law. Handy. ;)

Update: Brian at the Spam Kings weblog notes that ‘this spooky little spam was the work of Equifax, the big credit reporting agency that shut down its Boca Raton-based spam operation, Naviant, in 2003, due to the impending passage of CAN-SPAM.’

RFID in the Grauniad, and back in Dublin

Greetings from sunny Dublin, Ireland! (really!)

I’m now back in taint.org’s native timezone, although precariously set up and experiencing occasional interruptions. If you’re waiting for a mail from me, it may take a little more time.

I did have time to be interviewed last week by Karlin Lillington for this Guardian story:

To make sure customs agents could read his cat’s chip to match him to his Pet Passport on return to Europe, Mason bought his own scanner at a cost of some £200. "I didn’t want to risk the cat being impounded for six months’ quarantine at Heathrow," he sighs.

It’s true.

Happy to be back — I think. Looking forward to my first pints, in over a year, of creamy Guinness in its native habitat. I also have a couple of half-written weblog entries I wrote on the plane, too…

Yahoo! delete b3ta newsletter mailing list?

Today’s top item on the b3ta front page, under Site News:

Yahoo please talk to us! Help! – our yahoogroups list (with over 100,000 subscribers) has been deleted. We don’t know why. If you work at Yahoo and can help us sort this out please contact me at robmanuel AT gmail dot com.

posted by rob on 10th Feb at 2pm

B3ta is a long-established UK humour site who send out a weekly newsletter, every Friday afternoon, using Yahoo! Groups as their mailing list service. They’ve been doing this for years. Yep, that’s 100,000 subscribers.

Anyway, if anyone from Y!Groups, or anyone who knows someone there, is reading, please do get in touch with the b3ta guys — this is a very serious catastrophe for them. I’d be curious to hear how/why this happened.

To tie this into spam-filtering and email operational topics, it brought this posting from Jeremy Zawodny to mind:

This all makes me wonder if it’s worth it for smaller organizations to bother running their own mail servers anymore. If Google offered small business mail the way Yahoo does, there’d be some serious competition in the market and it’d make a lot of people’s lives much easier.

While Jeremy was talking about a different service from list hosting, I think we’re seeing the other side of the email-outsourcing coin, here.

Update: fwiw, it’s back:

Yahoo update – on Friday Yahoo deleted our list of 100,000 newsletter readers email addresses, hence we didn’t send a newsletter. Today they’ve been in touch and have promised a response by Tuesday. Fingers crossed. UPDATE: It looks like it’s back! Hooray for Yahoo!

Broadband choices in Ireland

Perfect timing! Just 5 days before I return to Ireland, Damien Mulley posts ‘Broadband choices in Ireland’, a good overview of the options available for consumer broadband internet connection.

I’ve been out of the loop for quite a while, and spoilt by the options available in suburban Southern California (which are, of course, pretty good). But this is a lot better than what was on the table when I left, 3 years ago.

What strikes me is that the upload/download speeds are quite reasonable and pretty close to what you’d see in the US. Similarly, the prices are finally near to the going rate in the US, once the various limitations and add-ons (required ‘bundles’, state taxes etc.) are taken into consideration.

However, virtually all of these deals use the horrendous concept of download capping! Given that I use this stuff for work, and routinely rsync around 30GB chunks of email corpora between central offices, colo servers, and my desktop, this just won’t fly. It could be argued that I’m therefore not a typical broadband consumer, who these deals have been carefully designed to cater for. But seriously — if a telecommuting software developer isn’t a typical broadband consumer, who the hell is? Hey telcos: a little flexibility goes a long way — don’t fence me in. ;)

All in all, it looks like Smart Telecom are the winners; 3Mb/s download, 512Kb upload — and most importantly, no cap — for EUR 35 per month. (And check out that XHTML/WAI-compliant website!)

I probably would have gone with Irish Broadband, but for the past 6 months the only thing I’ve been hearing about them via word-of-mouth has been bad news, detailing customer service meltdown after meltdown. Even the legendarily incompetent ‘biddies’ of Eircom seem to be getting better reviews nowadays.

Talking of Eircon, our dear old dirty-tricks-wielding celtic-tiger-throttling incumbent telco: the top Sponsored Link on a Google search for irish broadband is:

Irish Broadband

www.eircom.ie — More speed, prices reduced by 25%, free modem & a free connection!

Scum.

Spamhaus comment on the AOL/Goodmail deal

AOL and Yahoo! have been making a lot of headlines with their plans to reduce their whitelist-management workload — and make a little pay-to-send money on the side — with a deal with Goodmail.

Now Spamhaus have gone on the record against the plan:

On Monday, Richard Cox, chief information officer at antispam organization Spamhaus, said that "an e-mail charge will destroy the spirit of the Internet."

"The Internet has become what it is because of freedom of communication. Open discussion is what gives it value. There should be no cost for particular services, and e-mail should be free and accessible to all. This will disenfranchise people."

RFID “e-Passports”

This is what passports containing RFID chips will look like:

Note the little rectangular logo at the bottom. According to Ed Hasbrouck, that’s the ICAO standard logo indicating that this is an RFID passport, and therefore:

identity thieves, terrorists, direct marketers, data aggregators, malicious governments, or anyone else with a radio receiver within 10 meters (30+ feet) or more whenever your passport is read at a border crossing, airport, etc. can secretly and remotely track you, log your movements through the unique "collision avoidance" ID number sent by the chip, and intercept and decrypt all the data (including your digital photo and, in some countries, your digitized fingerprints) needed to "clone" a perfect copy of your passport, forge other identity credentials, or impersonate you.

Of relevance are the comments over at Bruce Schneier’s weblog entry (http://www.schneier.com/blog/archives/2006/01/dutch_biometric.html) regarding the Riscure research (http://www.riscure.com/wth.html) into the Dutch Biometric Passport’s lousy security.

Interestingly, as one commenter there notes, breaking the crypto may be overkill; the knowledge that a person is carrying a passport from a certain country, or set of countries, may be enough for certain attackers.

I asked the Irish Passport Office about their RFID plans last April:

I’m an Irish citizen and passport-holder. I have been following recent discussions in the US regarding the addition of RFID computer chips to US passports, and I note that the US Department of State is now indicating that this measure was made necessary due to recent International Civil Aviation Organization (ICAO) standards — namely ICAO Doc 9303.

As a result, since Ireland is a signatory to ICAO regulations, this raises the question as to whether Irish passports shall shortly include similar RFID or "contactless chip" technology.

Can you tell me:

  • if this is planned?

  • is there a mechanism for public comment on this process?

  • who could I further email to ask about this, if you do not know?

Disappointingly, I never received a reply. :( Someday I should really chase this up.

Update, Oct 17 2006: Well, they never bothered replying. They did, however, introduce RFID chips to Irish passports:

The chip technology allows the information stored in an Electronic Passport to be read by special chip readers at a close distance. The chip incorporates digital signature technology to verify the authenticity of the data stored on the chip.

OpenWRT Wifi Repeater Recipe

Seeing as I’ve moved house, and am staying at a friend’s temporarily until I head back to .ie, internet access has become a bit of a problem. Hence, I’m posting this via some neighbour’s leeched wifi ;)

To do this, I came up with some seriously hacky IP infrastructure, to wit a repeater setup composed of two off-the-shelf router/NAT/AP boxes, since the signal is pretty weak and needed a boost to cover the useful parts of the house. If you’re curious, the details can be read over here.

Weblog Spam and Adversarial Classification

Dr. Dave, author of the Spam Karma WordPress antispam plugin, has posted an interesting article about new weblog-spammer tactics:

These spams do not present most of the idiotic traits of their lower colleagues: they do not try cramming hundreds of URLs or inserting hundreds of easily spotted junk keywords in the comment content. Instead, they use only the dedicated name and homepage fields to sneak in spam URL and keywords. The comment content is often perfectly innocuous, sometimes even topical (by copying parts of another comment or a trackbacking post). All in all, these spams could easily be missed by a human moderator who wouldn’t look carefully at the contact name and URL.

(Thanks to Kelson Vibber for the pointer to this.)

In other words, he is noting what we noticed in email anti-spam: what works well one year is likely to degrade over time as the spammers attempt to evade it, and one has to keep working to keep up.

The best term for this appears to be adversarial classification. Anti-spam activities fall into this category, and it often means that classic text classification algorithms aren’t suitable — after all, the Reuters-21578 dataset never tried to evade your classifier ;)

In a similar vein, this MS research paper is interesting:

Previous work on adversarial classification has made the unrealistic assumption that the attacker has perfect knowledge of the classifier. …. We present efficient algorithms for reverse engineering linear classifiers with either continuous or Boolean features and demonstrate their effectiveness using real data from the domain of spam filtering.

It’s akin to John Graham-Cumming’s work looking into how a spammer could get past a bayesian filter "from the outside", but with more techniques, and examining MS’ MaxEnt algorithm, too. PDF here, well worth a read.
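As a rough feel for why a linear classifier is an inviting target, here is a toy Python sketch of padding a spammy message with innocuous words until it slips under the threshold. The weights, the threshold and the `evade` helper are all invented for illustration. Note that the attacker here reads the score directly, which is exactly the "perfect knowledge" assumption the paper calls unrealistic; this is not the paper's algorithm, nor how any real filter is tuned.

```python
# Toy linear spam scorer plus a naive "pad with innocuous words" evasion.
# Weights and threshold are made up; every value is exactly representable
# as a float, so the sums below are exact.

WEIGHTS = {
    "viagra": 3.0,
    "cheap": 1.5,
    "pills": 1.0,
    "meeting": -1.0,
    "thanks": -1.5,
    "patch": -2.0,
}
THRESHOLD = 2.0


def score(tokens):
    """Sum the weights of the distinct tokens present (Boolean features)."""
    return sum(WEIGHTS.get(t, 0.0) for t in set(tokens))


def is_spam(tokens):
    """Classify as spam when the linear score exceeds the threshold."""
    return score(tokens) > THRESHOLD


def evade(tokens, candidate_words):
    """Greedily append candidate words that lower the score, stopping as
    soon as the message is no longer classified as spam."""
    padded = list(tokens)
    for word in candidate_words:
        if not is_spam(padded):
            break
        if score(padded + [word]) < score(padded):
            padded.append(word)
    return padded


if __name__ == "__main__":
    spam = ["cheap", "viagra", "pills"]
    padded = evade(spam, ["hello", "thanks", "meeting", "patch"])
    print(is_spam(spam), is_spam(padded), padded)
```

The original payload words are all still in the padded message; only the classifier's view of it has changed, which is the essence of the adversarial setting.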

(By the way, I’m in the process of moving house, so if you send me an email, it may take a while for me to reply. This situation is likely to prevail for the next few weeks, for what it’s worth — fun.)

Raw Food Crackpottery

Via RobotWisdom, a review of a new Primrose Hill cafe:

No wheat. No gluten. No sugar. No GMO. No dairy. No yeast. No shoes.

Yep, no shoes. If you want to enjoy the detoxifying glories of London’s first raw-food cafe, then please leave your clod-hoppers at the door, along with your high stress levels and your smart-arse scepticism.

I know of another cafe elsewhere which also offered a largely-raw menu. This one, however, shared a back alleyway with a shop where a friend of mine worked.

He noted that on several occasions, he’d seen rats near, or on, the pallets of plastic-wrapped fruit and vegetables. You see, the raw food was delivered to the kitchen door, where it lay outside for a short while, in the rat-infested alleyway. Rats crawling over your food, naturally, is not a good thing.

There’s a very good reason why some smart stone-age ancestor invented cooking our food — because it kills the germs that’ll make us sick!

Devotees claim that because the enzymes are destroyed when food is heated above 48C, our bodies have to utilise our own enzymes to break down the food, which can result in us feeling tired and run-down.

Yeah, devotees are pretty much talking crap there. ;) If anything, cooked food is easier to digest than raw. And good luck with the whole ‘getting by without using enzymes’ thing!

What a load of quackery.

Happy Spam-Solved Day!

Happy BillG-Scheduled Spam Solved Day!

"Two years from now, spam will be solved," Microsoft’s Bill Gates said [at the 2004 World Economic Forum in Switzerland].

So is it? Weeeeell…..

To "solve" the problem for consumers in the short run doesn’t require eliminating spam entirely, said Ryan Hamlin, the general manager who oversees [Microsoft]’s anti-spam programs. Rather, he said, the idea is to contain it to the point that its impact on in-boxes is minor.

In that way, Hamlin said, Gates’ prediction has come true for people using the right tactics and advanced filtering technology.

Ha. I am reminded of ‘weapons of mass destruction-related program activities’.

As one slashdotter says, ‘when you fail, try try again; or conversely, change the requirements and make it look like a success, which is exactly what BG has done.’

It’s not washing, though, unsurprisingly. The poll on the same page asks ‘do you agree with Microsoft’s contention that the spam problem has been "solved"?’ Right now, with 1169 votes, it has 7.2% (in other words, the MS employees) agreeing, and a whopping 92.8% not going for it.

SweetheartsConnection.com – Interesting Dating Scam

Here’s an interesting online scam. An anonymous friend, working in anti-spam, writes:

‘I’ve been covertly looking into rumours of a myspace scam and thought you might like to blog it – I don’t want to be attached to this in any way otherwise I’d write about it myself (I have a profile on there that I want to keep around in case other scams show up, but I don’t really want to advertise the profile).

It works like this:

You sign up for a myspace account and fill in your profile details. Then in a couple of days someone contacts you pretending they’re using their friend’s account because they haven’t signed up yet. They say something along the lines of "I saw your profile and thought you were cute, if you’re interested email me at (random)@yahoo". If you email them, you get a reply back being all bubbly and cute, and a link to a web page that sort of looks like a "My First Homepage" – it even says "I’m taking a course at the community college in HTML". There are pics on the page of a very cute girl, but at the bottom a teaser saucy picture in lingerie, and an Adult Pass signup to get more pics. Of course the signup is $40.

It’s a subtle scam, but definitely a scam. Here’s an example of the type of site you get sent to:

http://www.honesthost5mb.com/kristenssite/

Note the hosting service. Now delete the /kristenssite/ part and it looks legit, right? Until you click on a few links and realise they have nothing to sell.

Google has no knowledge of honesthost5mb – nobody links to them, so how did Kristen find them?

It’s indeed quite funny that there’s a terribly similar hosting service out there: http://www.jagflyhosting.com/ – yet for some reason all their links seem to work, and they have an accessible phone number. Shock. Horror!

I’m pretty sure the account being (ab)used on myspace is a stolen one – it looks pretty legit, including linked in friends and comments, so I’m suspecting a cracked password.

Anyway, thought you could blog this to warn others about it (feel free to advertise the above link – though I guess that’ll ruin the whole "google doesn’t know" thing ;-) I wish I had the guts to sign up for the extra pics to see what you end up with!’

They also passed on the email content, noting ‘here’s the email sent from yahoo webmail from an AOL account (sadly AOL proxies all web content so I can’t track it any further than New York proxies)’:

Hi [redacted] ! Hey you found me! I was a little worried you wouldn’t be able to :P so, how are you? I’m ok.. I’m sneaking a email in at work before my boss comes back in, so sorry if it’s a little short! I promise to write more later :)

So I promised you some pics:P well I will have to send you some of me when I get home (don’t have the pics here at work). In the meantime you can check out my personal homepage. It’s kind of playground while I’m taking this intro to HTML class, kind of like my blog page. Here is the link: http://www.honesthost5mb.com/kristenssite It’s not much yet but it’s getting there. hehe

So tell me more about yourself, are you a work to live or live to work kinda person? What are you looking for in a girl? Do you like myspace? I think I’ll make a profile soon, it’s free right? and you can add your own HTML? That would be cool.. So how is your 2006 going? Mine is ok, one thing I’m excited about though is that today is exactly 1 week before my birthday. Hey, maybe if we hit it off, we can go on a first date on my birthday, that would be really cool. :)

Anyways, enough with the 20 questions right? oh, I prefer to chat on IM, its more personal you know? Do you have AIM? im kriskat224 on there, msg me sometime ok?

Well I should log off and get some work done.. Write back soon! and take care!

xoxo ~ Kristen

Sure enough, a little further research on Google yields the following examples…

The earliest is this story at Jiveworld.net, of 2004-05-24, noting:

Aaron recently received an e-mail from someone he supposedly chatted with on Match.com:

Aaron: I had actually been chatting with someone I might have met there a LONG time ago. I couldn’t remember, so I gave her the benefit of the doubt. I thought it was SPAM, but hey, even my own e-mails sounds like SPAM sometimes. She sent me a picture in her e-mail, but the mail service she was using didn’t like it. So she sent me the link to her "website." It initially seemed like a real personal web space until the big ADULT BUREAU logo appeared. Oh yes, very legitimate.

This was a unique experience for me since someone actually wrote a tailored response to my e-mail, responding to specific things I had mentioned. Even though the bulk of the e-mail seemed form generated, this had to have been a time intensive process for damn near no return. Well, after the ADULT thing, I thought my response to her e-mail was inventive. Since I haven’t received another response, it’s obvious she (Or he) took the hint.

Another: a thread at FordPower.net, 2004-09-24, with a link to http://www.4mbwickedweb.com/sites/melissa/ (since expired);

Another: a Fark thread posting, 2005-01-28, scroll down to the posting of ‘2005-01-28 10:42:28 AM’ by ‘XavierCrutch’, linking to http://www.stepstonehost.com/jesshomepage/ (since expired);

Another: this weblog post, scroll down to March 13, 2005, ‘Personal ads and the great porn conspiracy’, where the poster is snared, via IM with AIM user natkat224 this time, and is sent another link to a site using http://adultbureau.sweetheartsconnection.com/ to collect the $40 fee;

Another: another weblog post, 2005-10-28.

A google search for the AIM username ‘natkat224’ reveals plenty more hits.

So here’s a list of the sites found from those links, and via google, so far:

The common host, at all stages, is ‘SWEETHEARTSCONNECTION.COM’, registered to

INTERTRANS TRADING OVERSEAS LIMITED
VASILEOS OTHONOS 21, FANEROMENIX COMPLEX, OFFICE 102, 6030 LARNACA
N/A
N/A, CA N/A
CY

lots more detail here. SweetheartsConnection.com has terms and conditions that appear to prohibit spamming — but it turns out that they themselves have a pretty scary entry at RipoffReport.com, anyway, noting:

If you want a free LIFE TIME PASSWORD with Adult Bureau.. you have to apply for a 1 month membership @$39.95 to Sweetheartsconnection.com A DATING SERIVCE ….. charge appears as IT INTERNET SERVICES.

No matter if you request cancellation of service this company will continue to bill you " it gets better " then send you to there home made collection company " Secure debt collections, " two companies in one both fraud

Phony Notices will be sent to the home demanding final payment of a service NEVER USED. They will contact you, try intimidate you into paying a Balance of $200.00 (Sweetheartsconnecton.com automatically rebills your credit card every month @$39.95.

eek.

This weblog post, of 2005-10-28, is shaping up to be the canonical support group for victims of this scam; worth reading the comments there.

Quite a scam, and interesting to note the "personal touch" via email and IM.

The C=64-izer

Ever wondered what today’s internet meme images would look like on mid-’80s home computing hardware?

Wonder no longer!

What Works in Software Development

I already posted this to the link-blog yesterday, but it’s so good it’s worth promoting more widely. If you write software for a living, you really ought to read the slides for Michael Schwern’s excellent ‘What Works In Software Development’ talk.

It’s a long presentation (108 slides!), but during the course of that, he covers:

  • effective teamwork
  • dealing with bad customers
  • dealing with bad management
  • classic coding mistakes
  • classic project management mistakes
  • classic design mistakes
  • test-driven development
  • refactoring
  • patterns

It’s a really good synthesis of what I think are the best bits of good OO design, XP, CPAN and perl’s design and coding styles, without most of the cruft. I’ll be pointing people at this for years to come, I think…

(Found via yoz.)

Planet Antispam: Beta No More

Planet Antispam has been working pretty nicely for the last couple of weeks — can’t say I’ve noticed any trouble, and its RSS feed is turning out to be a nice aggregation of anti-spam news. On top of that, John Levine was kind enough to set up a CNAME for it at a more appropriate URL — http://planet.spam.abuse.net/.

As a result, it’s now fully-fledged, and fit to lose the ‘beta’ qualifier. Please bookmark, subscribe to the feeds, and pass on the URL to others you think may be interested!

Moving Home — De-Cluttering

I’m moving home.

The flights are booked — Feb 14th, Valentine’s Day, I’ll be leaving Orange County and heading back to Dublin permanently. In the meantime, I’ve been selling stuff, throwing stuff out, decommissioning servers, and making backups.

The server

My erstwhile desktop, later my trusty back-room server, ‘jalapeno’, was sold earlier today. Thankfully, I bought a 250GB hard drive recently, so I actually had the room to back up its 70GB somewhere beforehand.

Being security-conscious, I overwrote its partitions using pseudo-random data before passing it on (‘dd if=/dev/urandom of=/dev/hda9 bs=1024k’). However, being lazy, I did this while the machine was up and running, over an SSH link.

Watching as ‘df’ produced gibberish output, and as later commands started producing nothing but bus errors, was odd; it’s a very strange feeling to be actively destroying the disk’s data like that. Here’s hoping the backups worked.

The yard sale

We had one, in the process selling about $1000 worth of IKEA furniture, books, camping equipment, bits of hardware, sports equipment, and a pink xmas tree.

The local bargain hunters started knocking on the door at 8:15am, despite the sign’s posted start time of 9am. Once we did start bringing items out to the front lawn to sell, there were already about 10 people, which quickly swelled to a mob of 20 by 8:45am. They were keen!

By the end of Saturday, we’d sold pretty much all the furniture, all of the sports and camping equipment, most of the hardware that wasn’t total crap, and only 2 of the books. One shopper’s explanation: ‘she didn’t have the time to read books’.

Still, the yard sale has netted $345. Not bad, and a good feeling to de-clutter so successfully.

Music, and iPod Shuffle

I’ve realised I like the endings of songs; whether I like a song or not depends entirely on how it ends.

Apple’s iPod shuffle algorithm is incredible. I’ve been spending quite a bit of time listening to it, and I’m sure it’s not random; I think it’s picking next tracks based partly on the similarity of metadata between the current and candidate tracks, which is quite neat as an automated mixing technique.

So is it random? Google says:

  • yes
  • no; a commenter on that article notes the same thing I’m talking about
  • yes
  • no; can’t say I’ve noticed the Beatles getting a push on mine
  • yes
  • and finally, no answer here, but a pretty cool stats experiment
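To make my hunch concrete, here’s a minimal Python sketch of what a metadata-biased shuffle could look like. To be clear, this is purely my guess at the idea, not Apple’s algorithm; the field names (artist, genre, year), the `bias` knob and both function names are assumptions made up for illustration.

```python
import random

def similarity(a, b):
    """Count matching metadata fields between two tracks (hypothetical fields)."""
    return sum(1 for field in ("artist", "genre", "year")
               if a.get(field) == b.get(field))

def pick_next(current, candidates, rng=random, bias=2.0):
    """Pick the next track, weighting each candidate by metadata similarity.

    With bias=0 every weight is 1.0, i.e. a plain uniform random pick.
    """
    weights = [1.0 + bias * similarity(current, c) for c in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]
```

With a bias above zero, tracks sharing an artist or genre with the current one come up more often than chance, which is roughly the effect I think I’m hearing.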

Google DRM and WON Authentication

So, Google have invented their own DRM, apparently. I’m keen to find out more details; Techdirt and Plasticbag.org are so far the only places I can find in the blogosphere to discuss it in any detail.

One tidbit worth noting from the LA Times coverage:

The Google copy-protection software also imposes a big restriction: The CBS shows, NBA games and other material protected by the software can be watched only on a computer that’s connected to the Internet.

"I think it’s going to be a problem," said Li, the Forrester analyst, adding that Google executives told her they were trying to fix it.

That’s interesting. Given that quote, I’ll bet Google’s DRM is something similar to the copy-protection systems used for many games since about id’s Quake 3 and Valve’s Half-Life: an online "key server" which validates codes, tracks player IDs, and who’s viewing what, "live", as the video is cued up and played.

Some more info on the Half-Life WON authentication system can be found in this GamaSutra article; subscription required — try viewing this google-cache version with Javascript off if you don’t have a sub. That’s historical now, of course, since that WON system has been replaced by a new auth protocol as part of Valve’s ‘Steam’ system.

The key factor is the network, separating the dangerous, untrustworthy user machine from the trusted key server. Since the online key server can act as a platform for trusted, known-insubvertable code to run, along with the video server, both being under Google’s control, it’s actually possible to build reasonably solid DRM on this model. That’s as opposed to the usual case, where a reasonably determined teenager can break it in a week of school-nights. ;)
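As a rough illustration of the model I have in mind, here’s a toy key server in Python. It’s a sketch under my own assumptions, not Google’s or Valve’s actual protocol: the class, its method names and the "one viewer per key" rule are all hypothetical.

```python
import secrets

class KeyServer:
    """Toy server-side key check: validates keys and tracks who is viewing what."""

    def __init__(self, valid_keys):
        self.valid_keys = set(valid_keys)
        self.sessions = {}  # key -> (viewer_id, video_id) currently playing

    def start_playback(self, key, viewer_id, video_id):
        """Return a one-off playback token, or None if the key is refused."""
        if key not in self.valid_keys:
            return None  # unknown key
        holder = self.sessions.get(key)
        if holder is not None and holder[0] != viewer_id:
            return None  # key is already in use by a different viewer
        self.sessions[key] = (viewer_id, video_id)
        return secrets.token_hex(16)

    def stop_playback(self, key):
        """Release the key so it can be used again."""
        self.sessions.pop(key, None)
```

All the checking happens on the server, so the untrusted client never gets to decide whether a key is valid.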

Anyway, that’s speculation. It remains to be seen if they’ve come up with something along the lines of WON authentication — and if it’s still easily subvertable or not.

Update: Aristotle Pagaltzis has a pretty good point in the comments:

Watching video, unlike playing a multiplayer game, is not an activity that inherently requires connecting to a server. Playing a multiplayer game, OTOH, inherently is.

So cracking a multiplayer game’s key check is fruitless, because then you can’t play online anymore, which was the whole point of the game in the first place. In contrast, a video player with a cracked key check still fulfills its purpose just fine.

I think he’s right. That’s a key point, demonstrating how WON authentication still can’t help — media playback, as a task, is itself fundamentally crackable.

Wedding Plans

Myself and the lovely C are planning on getting married, hopefully sometime this year. I’ve just come across some details about Japanese weddings, and apparently:

‘If you are attending a Japanese wedding reception, you are expected to bring cash for a gift (called Oshugi). The amount depends on your relationship with the couple and the region, unless the fixed amount is indicated on the invitation card. The average is 30,000yen ($250) for a friend’s wedding. It’s important that the cash is enclosed in a special envelope called Shugi-bukuro and your name is written on the front.’ … ‘It is a grave insult to give less than $200.’

That gives me a great idea… ;)

Planet Antispam

So a few weeks back, I mooted the idea of an anti-spam Planet site, similar to Planet GNOME, Planet Java, Planet Perl et al.

Here are the results: Planet Antispam.

It’s still got a few rough edges; notably, the URL is not permanent — I’d prefer something at a more spam-themed domain — and the logo is the generic "PlanetPlanet" one. But it’s up and running in a beta-ish fashion.

Feel free to bookmark, subscribe, post the URL on, etc.; and if you’d like to give it a better home with an A record at a spam-themed domain, drop me a line.

Update, Jan 17: Thanks to John Levine, it now has a permanent home at http://planet.spam.abuse.net/ . After several weeks of operation, I think it’s turning out to be pretty solid, too!

By the way, it also needs more source feeds. If you know of people with blogs, working on/writing about anti-spam (of the email variety), with RSS feeds that work, include the post text, and permit further redistribution of that text, drop us a line and I’ll add them.

Finally, here’s a picture of a Starbucks SPAM(r) Sandwich. (shudder)

Allowing users to have steak knives

This post on the Wikipedia/Seigenthaler spat at Corante.com contains this excellent comment from Wikipedia’s Jimmy Wales:

Imagine that we are designing a restaurant. This restaurant will serve steak. Because we are going to be serving steak, we will have steak knives for the customers. Because the customers will have steak knives, they might stab each other. Therefore, we conclude, we need to put each table into separate metal cages, to prevent the possibility of people stabbing each other.

What would such an approach do to our civil society? What does it do to human kindness, benevolence, and a positive sense of community?

When we reject this design for restaurants, and then when, inevitably, someone does get stabbed in a restaurant (it does happen), do we write long editorials to the papers complaining that "The steakhouse is inviting it by not only allowing irresponsible vandals to stab anyone they please, but by also providing the weapons"?

No, instead we acknowledge that the verb "to allow" does not apply in such a situation. A restaurant is not allowing something just because they haven’t taken measures to forcibly prevent it a priori. It is surely against the rules of the restaurant, and of course against the laws of society. Just. Like. Libel. If someone starts doing bad things in a restaurant, they are forcibly kicked out and, if it’s particularly bad, the law can be called. Just. Like. Wikipedia. I do not accept the spin that Wikipedia "allows anyone to write anything" just because we do not metaphysically prevent it by putting authors in cages.

Irish MEPs on Data Retention

So, the bad news: it appears that the European Parliament has passed the ‘Data Retention’ Directive, requiring EU states to introduce mandatory electronic surveillance of all European citizens.

Tuppenceworth.ie has looked up how the Irish MEPs voted on the Directive. I was appalled to discover that Proinsias De Rossa (Labour) was the only Irish MEP to vote for this surveillance.

I generally give a high preference to Labour when voting, and before that, Democratic Left, and I’ve voted for him several times in the past. However, I think this may be the deal-breaker. I’m extremely disappointed.

By the way, if party line was the issue, that didn’t stop Gay Mitchell (Fine Gael), who broke party line on this, saying:

I do not know why this proposal was rushed. The extremely accelerated legislation procedure has meant that there was little time for discussion, and translations were sometimes unavailable. There was also no time for a technology assessment or for a study on the impact on the internal market.

Major credit to him.

My ApacheCon Roundup

Back from ApacheCon!

I’ve got to say, I found it really useful this year. Last year, I was pretty new to the ASF, and found that my expectations of ApacheCon didn’t quite match reality; it wasn’t a rip-roaring success exactly, for me, as a result.

However, many details of how the ASF works — and how the conference itself works and is organised — are much clearer after you’ve spent some time lurking and absorbing practices in the meantime. (The visibility one gets into the process as a member of the ASF makes this a lot easier.)

Result: it was much more of a success for me this time around. Plenty of networking, putting faces to the names, hanging out, and discussing many aspects of our work.

The hackathon really worked out, too; while we didn’t produce a hell of a lot of code per se, it made for a good ‘developer summit’ and I think we established solid agreement on SpamAssassin’s short-term directions and goals. (summary: rules, and faster).

On top of that, I got to meet up with Colm MacCarthaigh and Cory Doctorow for discussion of Digital Rights Ireland. Looks like I’ll be spending a bit of time on that next year ;)

Finally: Solaris. On Monday night, I got to sit down with Daniel Price, one of the kernel engineers behind Solaris Zones, work through a quick demo of a bug I was running into with chroot(2) and zones on our rule-QA buildbot server, and watch as he visually traced it through the OpenSolaris kernel source on the web. From this — and from talking to Daniel — it’s pretty clear that things have changed at Sun. Pretty much the entire Solaris operating system is now a full-on open-source project; it’s not just a marketing gimmick. The source is up there on the web, that’s the source for the code they’re running now, and there’s no half-assed ‘freeze it, cut out the good bits, and throw it over the wall’ fake-open-source tricks.

The concept of getting this level of access to Solaris source code and engineers would have blown my mind when I was Iona’s sysadmin back in the 1990s ;) I’m very impressed.

Windows Live Local and Firefox

Windows Live Local, with its isometric, Sim City, "bird’s eye" view, is quite nice.

However, what gets me is — do MS do this deliberately? I’m referring, of course, to the way it’s broken on Firefox 1.5, requiring you to drag twice to get it scrolling around the viewport, and the jumpy, clunky UI on that browser.

Pretty lame — and lazy, too. By now, it’s essential for a new fancy website to work under Firefox; even if only 20% of your users will be using it, a good proportion of those are the bleeding-edge, ‘taste-maker’ types who’ll be blogging about it, writing reviews for newspapers and news sites, and generally generating buzz for you, and thereby attracting the other 80%.

I’m told it works great in IE, but there’s no way I’m starting Windows and opening up that app. If I want to be infected by 700 different malwares within seconds, I’ll ask. ;)

On top of that, coverage seems spotty — Ireland is AWOL, of course.

As a result, my one line summary would have to be: idea = cool, dataset = probably cool, execution = half-assed and crappy. I’m looking forward to Google doing a much better job with their implementation of the Sim City viewpoint.

Email Injection attacks in PHP via mail()

Apparently, spammers are now exploiting a hole, or holes, in multiple PHP scripts which use the mail() API.

The holes are described at the SecurePHP wiki; basically, the script author inserts CGI fields directly into a message template without stripping newlines, and this allows attackers to create new headers, take over the message body, and generally take over the mail message and destinations entirely.
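The shape of the bug is language-independent, so here’s a minimal sketch in Python rather than PHP. The function names, the template and the addresses are all made up for illustration; the point is only that a newline in a form field lets the attacker start a header of their own, and that refusing CR/LF in anything destined for a header closes that hole.

```python
def naive_message(email, body):
    """Vulnerable: the form field is pasted straight into the header block."""
    return ("To: webmaster@example.com\n"
            f"From: {email}\n"
            "Subject: Contact form\n"
            "\n"
            f"{body}")

def safe_header_value(value):
    """Refuse any value that could start a new header line."""
    if "\r" in value or "\n" in value:
        raise ValueError("newline in header field")
    return value

def safer_message(email, body):
    """Same template, but the header field is checked first."""
    return naive_message(safe_header_value(email), body)

# An attacker-supplied "email" field that smuggles in an extra Bcc header:
attack = "x@example.com\nBcc: victim@example.net"
```

Feeding `attack` to `naive_message` yields a message with an attacker-controlled `Bcc:` line in its headers, while `safer_message` raises `ValueError` instead.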

Funnily enough, these are the same holes Ronald F. Guilmette and I found in FormMail 1.9, and described in our Jan 2002 advisory Anonymous Mail Forwarding Vulnerabilities in FormMail 1.9 (PDF) on page 10, Exploitation of email and realname CGI Parameters. Ah, plus ca change…

Worth noting that perl’s venerable taint checking would have spotted these, if it were used.

ApacheCon US 2005

In a couple of weeks, I’ll be going to San Diego for ApacheCon US 2005 (including the hackathon beforehand). There’ll be quite a few other SpamAssassin committers there, too, so if you’re working with SA, or interested in getting some face time with the developers, there’s no better way of doing so.

Digital Rights Ireland launch, next Tuesday

DRI’s formal launch is next Tuesday:

December 6th sees the formal launch of Digital Rights Ireland, with a press conference in the Conference Room, Pearse St. Library, Dublin 2 at 11.30am. We would like to invite you to come along – we’d welcome your support, and the chance to chat with you about your concerns after the main conference. Please feel free to invite anyone else who you think would be interested in digital rights. To give us an idea of numbers, we’d appreciate an email to <contact AT digitalrights.ie> to let us know if you’re planning on coming along.

[thx] HAM


flickr_IMG_7139.jpg
Originally uploaded by Andy Cadaver.

I was just emailing with Sarah Carey, and she correctly noted that my weblog has been tending towards the techie-incomprehensible recently. A brief look at the front page confirms this.

So here’s a remedy: a photo of the delicious ham which the lovely C cooked up for Thanksgiving, last Thursday. Just look at that, mmmmm!

When I get back to Ireland, I will be bringing Thanksgiving with me; a holiday based around eating cooked fowl, with no religious baggage whatsoever? I’m so there.


New SpamAssassin Rule Development Tools

Recently, I’ve been working on new systems to develop SpamAssassin rules faster, and with a lower ‘barrier to entry’ to the core ruleset. Some highlights seem bloggable, seeing as it’s all web-based and I can link to it!

The ‘preflight’ BuildBot:

This uses the fantastic BuildBot continuous-integration system to monitor changes to our Subversion repository.

Every time something is checked into SVN, this wakes up and immediately runs mass-checks using that latest code and rules, allowing near-real-time viewing of changes in rule behaviour. (A ‘mass-check’ is a massive run of SpamAssassin across a corpus of hundreds of thousands of emails, en masse, to measure rule hit-rates.)

The corpus it mass-checks is split in a certain way so that results will be available very quickly — typically in under 10 minutes — with increasing quantities of results becoming available as time elapses.

Progress of the mass-checks is visible at the BuildBot here; as they complete, their results become visible on the Rule-QA app (below). (More info, if you’re curious.)
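For a feel of what a mass-check actually measures, here’s a toy Python sketch that turns per-message results into per-rule hit-rates for spam and ham. It is not the real mass-check or hit-frequencies code; the input format (a spam/ham flag plus a list of rule names) and the function name are assumptions for illustration only.

```python
from collections import Counter

def hit_frequencies(results):
    """Compute {rule: (spam_hit_pct, ham_hit_pct)}.

    results: iterable of (is_spam, [names of rules that hit]) pairs,
    one pair per message in the corpus (hypothetical format).
    """
    totals = Counter()
    hits = {True: Counter(), False: Counter()}
    for is_spam, rules in results:
        totals[is_spam] += 1
        for rule in set(rules):  # count each rule at most once per message
            hits[is_spam][rule] += 1

    table = {}
    for rule in set(hits[True]) | set(hits[False]):
        spam_pct = 100.0 * hits[True][rule] / totals[True] if totals[True] else 0.0
        ham_pct = 100.0 * hits[False][rule] / totals[False] if totals[False] else 0.0
        table[rule] = (spam_pct, ham_pct)
    return table
```

A good rule is one with a high spam percentage and a ham percentage as close to zero as possible.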

The Rule-QA App:

To date, we’ve used the basic "freqs" table — output from the hit-frequencies command-line script — as the UI for rule QA and evaluation. This is fine for a small number of developers, but it scales badly and (like mass-checks) requires a pretty complex setup on the developer’s machine.

This new component is a web application, which takes the "freqs" table, and "webifies" it — demo.

Some major improvements are also made possible; the most important, that it can now display ‘freqs’ for multiple revisions during the day, and keeps historical data for comparison. It adds several new reports from ‘hit-frequencies’; a score-map, overlaps, a performance measurement, and a boolean ‘promoteability’ measurement.

Finally, a really useful new report is the graph of rule hit-rate, as it changes over time. Here’s a cached demo, or see the same data produced ‘live’. This gives a totally new insight into how the rule hits on various people’s corpora and how that has changed over time, and allows a whole new type of rule analysis. (In fact, it also allows pretty good corpus analysis, too; can you tell which submitters bounce high-scoring spam at receipt time?)

(More info on these.)

Product idea: RAID Backup Enclosures

Cory Doctorow at Boing Boing links to an article at TechCrunch that lists Better and Cheaper Online File Storage as a product that needs to be made. However, Ben Laurie did the sums on online storage as a useful backup medium, and found them not exactly compelling (e.g. 100GB of data will take 75 days to upload over a 128Kbps link).
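The sums are easy to check for yourself. Here’s a small Python sketch, assuming decimal units (1GB = 10^9 bytes, 1Kbps = 1000 bits/s) and a link that runs flat out with no protocol overhead; the function name is mine.

```python
def upload_days(size_gb, link_kbps):
    """Days needed to push size_gb gigabytes through a link_kbps uplink."""
    bits = size_gb * 1e9 * 8              # gigabytes -> bits
    seconds = bits / (link_kbps * 1e3)    # bits / (bits per second)
    return seconds / 86400                # seconds -> days

days = upload_days(100, 128)  # a little over 72 days with these units
```

That comes out at a little over 72 days; binary units or a bit of overhead nudge it towards the 75 quoted above. Either way, it’s months, not hours.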

I tend to agree. An online host isn’t great as a backup host, since, in my experience, there are two types of backups required:

  • The important small files (for example: encrypted password lists, my address book, my ~/bin directory)
  • The massive big filesets (for example: MP3s, photos)

The first kind of fileset is amenable to an online backup-storage service, at first glance. However — in my opinion you’re better off going the whole hog for these files, and using the distributed, versioned backup method of putting it in a good networked revision control system, and checking it out everywhere, so you can also make changes and check in from any host; otherwise, you face the perils of syncing up a single backup from multiple "writers", without conflicts. So far, none of the online file storage services offer SVN as an access method, so a shell account at a colo server still seems more useful on that count.

The second kind of fileset, as Ben notes, will take donkey’s years to upload and sync as a backup mechanism; and the economics are hardly compelling for the service provider.

I think I prefer Brad Templeton’s idea to deal with large-data backups —

I propose a software RAID-5, done over a LAN with 3 to 5 drives scattered over several machines on the LAN.

Slow as hell, of course, having to read and write your data out over the LAN even at 100mbits. Gigabit would obviously be better. But what is it we have that’s taking up all this disk space? It’s video, music and photos. Things which, if just being played back, don’t need to be accessed very fast. If you’re not editing video or music, in particular, you can handle having it on a very slow device. (Photos are a bigger issue, as they do sometimes need fast access when building thumbnails etc.)

This could even be done among neighbours over 802.11g, with suitable encryption. In theory.

As a commenter notes, Linux has support for this already, in the form of software RAID and the network block device.
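If you’re wondering why RAID-5 survives losing a drive, the trick is XOR parity. Here’s a minimal Python sketch of just that idea, using equal-sized blocks on three "data drives" plus one parity block; it ignores striping, rotation of the parity block and everything else a real software RAID driver does, and the function names are mine.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def rebuild(surviving_blocks):
    """Recover the one missing block from all the survivors (data + parity)."""
    return xor_blocks(surviving_blocks)

data = [b"musi", b"vide", b"phot"]   # blocks held on three different machines
parity = xor_blocks(data)            # stored on a fourth
recovered = rebuild([data[0], data[2], parity])  # machine 2 has died
```

Because XOR-ing everything that’s left gives back the missing block, any single drive (or neighbour’s machine) can vanish without data loss.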

So: take an external IDE enclosure, add a GumStix board running Linux with software RAID, LVM, and nbd, and add wifi. Then add DAV, SMB and NFS export of the disk, and some decent UI code to organise the volumes into a single exported RAID volume (hopefully automatically!), and it’d be a pretty compelling product, in my opinion!

(hey Craig! I said GumStix! ;)

Wisdom Teeth — Complete!

On Friday, I got my lower-left wisdom tooth extracted. That’s the last one that should cause any trouble; there’s only one remaining, and it’s fully out so shouldn’t act up. After a few years of on-again-off-again twinges, and lots of irresponsible putting-off of surgery, I’ve finally taken care of it.

The downside: I’m totally zonked on painkillers, so I won’t be doing much for the next few days apart from what’s required for day-to-day day-job stuff.

Urban Dead HUD; added Inventory Sorting

I’ve updated the Urban Dead HUD Greasemonkey userscript; it now offers inventory sorting, inspired by Ikko’s userscript (albeit a little different in implementation). Here’s a screenshot:

Right now, UD is reasonably interesting — our team of plucky survivors have been helping out with the defence of Caiger Mall, a major mall towards the north-west of the city. We’ve repulsed the Church of the Resurrection‘s attempts to wipe us out, but that seems to have made us quite a juicy target; there are now no less than three separate Zombie groups ganging up on us. For now, we’re still holding out.