-
great article on current grid computing, featuring MPI, MapReduce, Hadoop, and promising a new UNIXy thing from tbray called Sigrid (ha!). Mind-boggling quote from Jim Gray: ‘Memory is the new disk. Disk is the new tape.’
-
spontaneously converts the off-patent anhydrous form of the drug into the patented hemihydrate form, which then successively converts more and more of the anhydrous form, Ice-9-style. Never mind “viral” licenses, this takes the biscuit! (via substitute)
Justin's Linklog Posts
An interesting article on blog-spam countermeasures — Google’s embarrassing mistake. Quote:
I think it’s time we all agreed that the ‘nofollow’ tag has been a complete failure.
For those of you new to the concept, nofollow is a tag that blogs can add to hyperlinks in blog comments. The tag tells Google not to use that link in calculating the PageRank for the linked site. […]
Since its enthusiastic adoption a year and a half ago, by Google, Six Apart, Wordpress, and of course the eminent Dave Winer, I think we can all agree that nofollow has done — nothing. Comment spam? Thicker than ever. It’s had absolutely no effect on the volume of spam. That’s probably because comment spammers don’t give a crap, because the marginal cost of spamming is so low. Also, nofollow-tagged links are still links, which means that humans can still click on them — and if humans can click, there’s a chance somebody might visit the linked sites after all.
I agree. At the time, I pointed at this comment from Mark Pilgrim:
Spammers have it in their heads now that weblog comments are a vector to exploit. They don’t look at individual results and tweak their software to stop bothering individuals. They write generic software that works with millions of sites and goes after them en masse. So you would end up with just as much spam, it would just be displayed with unlinked URLs.
Spammers don’t read blogs; they just write to them.
I still think he was spot on.
However, one part of the ‘Google’s embarrassing mistake’ article is a red herring: I think the chilling effect on "nonspam links" is not to be worried about; as Jeremy Zawodny said, life’s too short to worry about dropping links purely in the hopes of giving yourself PageRank. I don’t know if I really want links that people are leaving purely for that reason. ;)
In fact, I wouldn’t be surprised to hear that Google’s crawler starts treating "nofollow" links as mildly non-spammy in a future revision, due to their wide use in wikis, blogs etc.
To be honest, though — I don’t see the problem of blog-spam much anymore. As I said here:
[Weblog] comment spam should be a lot easier to deal with than SMTP spam. … With weblog comments, you control the protocol entirely, whereas with SMTP you’re stuck with an existing protocol and very little "wiggle room".
On my WordPress weblog [ie. here] — which, admittedly, gets only about 1/4 of the traffic plasticbag.org does — I’ve instituted a very simple check stolen from Jeremy Zawodny. I simply include a form field which asks the comment poster for my first name, and if they fail to supply that, the comment is dropped. In addition, I’ve removed the form fields to post directly, requiring that all comments are previewed; this has the nice bonus of increasing comment quality, too.
Those are the only antispam measures I’m using there, and as a result of those two I get about 1 successful spam posted per week, which is a one-click moderation task in my email. That’s it.
The key is to not use the same measures as everyone else — if every weblog has a different set of protocols, with different form fields asking different simple questions, the only spammers that can beat that are the ones that write custom code for your site — or use human operators sitting down to an IE window.
Trackbacks, however — turn that off. The protocol was designed poorly, with insufficient thought given to its abuse potential; there’s no point keeping it around, now that it’s a spam vector.
Finally, a "perfect" solution to blog spam, while allowing comments, is unachievable. There will always be one guy who’s going to sit down at a real web browser to hand-type a comment extolling the virtues of some product or another. The goal is to get it to a level where you get one of those per week, and it’s a one-click operation to discard them.
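For illustration, the two simple checks described above (a site-specific question, plus mandatory preview) could be sketched roughly like this in Python. This is not the actual WordPress code: the field names `first_name_check` and `previewed`, and the expected answer, are hypothetical stand-ins, and the `previewed` flag merely represents however the preview step is really enforced.

```python
# Hypothetical field names and answer; every site should pick its own,
# since the whole point is not to share a protocol with other weblogs.
CHALLENGE_FIELD = "first_name_check"
EXPECTED_ANSWER = "justin"


def accept_comment(form: dict) -> bool:
    """Return True if a submitted comment form passes both simple checks."""
    # Check 1: the poster must answer the site-specific question.
    answer = form.get(CHALLENGE_FIELD, "").strip().lower()
    if answer != EXPECTED_ANSWER:
        return False
    # Check 2: the comment must have gone through the preview step first.
    if form.get("previewed") != "1":
        return False
    return True
```

Anything that fails either check is simply dropped; whatever gets through still lands in the one-click moderation queue.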
(Update: This story got Slashdotted! The poor server’s been up and down repeatedly; looks like it needs an upgrade. In the meantime, WP-Cache has proven worth its weight in gold; recommended…)
-
argh. avoid iTunes 6 like the plague; Apple changed the DRM again, it’s as yet unbroken, and once you purchase a track, your account is “locked” to the new DRM. This page gives details of the (laborious) process required to escape this nasty trap
-
Polypaudio looks like Linux sound done right (at last). questions 21-24 of this FAQ list hint at awesome possibilities for LAN-networked speaker systems, even better than http://taint.org/wk/RemotePlaybackWithEsd .
-
the Ordnance Survey has set up an online shop to sell access to out-of-copyright, public domain maps of Ireland. thanks lads, but I think there’s a word for paying for something that one should be getting for free
A commenter at this post on Colm MacCarthaigh’s weblog writes:
I guess I still don’t understand how Open Source makes sense for the developers, economically. I understand how it makes sense for adapters like me, who take an app like Xoops or Gecko and customize it gently for a contract. Saves me hundreds of hours of labour. The down side of this is that the whole software industry is seeing a good deal of undercutting aimed at sales to small and medium sized commercial institutions.
Similarly, in the follow-up to the O’Reilly "web 2.0" trademark shitstorm, there’s been quite a few comments along the lines of "it’s all hype anyway".
I disagree with that assertion — and Joe Drumgoole has posted a great list of key Web 2.0 vs Web 1.0 differentiators, which nails down some key ideas about the new concepts, in a clear set of one-liners.
Both open source software companies, and "web 2.0" companies, are based on new economic ideas about software and the internet. There’s still quite a lot of confusion, fear and doubt about both, I think.
Open Source
As I said in my comment at Colm’s weblog — open source is a network effect. If you think of the software market as a single buyer and seller, with the seller producing software and selling to the buyer, it doesn’t make sense.
But that’s not the real picture of a software market. If you expand the picture beyond that, to a more realistic picture of a larger community of all sorts of people at all levels, with various levels interacting in a more complex maze of conversation and transactions, open source creates new opportunities.
Here’s one example, speaking from experience. As the developer of SpamAssassin, open source made sense for me because I could never compete with the big companies any other way.
If I had been considering it in terms of me (the seller) and a single customer (the buyer), economically I could make a case of ‘proprietary SpamAssassin’ being a viable situation — but that’s not the real situation; in reality there was me, the buyer, a few 800lb gorillas who could stomp all over any puny little underfunded Irish company I could put together, and quite a few other very smart people, who I could never afford to employ, who were happy to help out on ‘open-source SpamAssassin’ for free.
Given this picture, I’m quite sure that I made the right choice by open sourcing my code. Since then, I’ve basically had a career in SpamAssassin. In other words my open source product allowed me to make income that I wouldn’t have had, any other way.
It’s certainly not simple economics; it’s risky and complicated, and many people don’t believe it works. But it’s viable as an economic strategy for developers, in my experience. (I’m not sure how to make it work for an entire company, mind you, but for single developers it’s entirely viable.)
Web 2.0
Similarly — I feel some of the companies that have been tagged as "web 2.0" are using the core ideas of open source code, and applying them in other ways.
Consider Threadless, which encourages designers to make their designs available, essentially for free — the designer doesn’t get paid when their tee shirt is printed; they get entered into a contest to win prizes.
Or Upcoming.org, where event tracking is entirely user-contributed; there’s no professional content writers scribbling reviews and leader text, just random people doing the same. For fun, wtf!
Or Flickr, where users upload their photos for free to create the social experience that is the site’s unique selling point.
In other words — these companies rely heavily on communities (or more correctly certain actors within the community) to produce part of the system — exactly as open source development relies on bottom-up community contribution to help out a little in places.
The alternative is the traditional, "web 1.0" style; it’s where you’re Bill Gates in the late 90’s, running a commercial software company from the top down.
- You have the "crown jewels" — your source code — and the "users" don’t get to see it; they just "use".
- Then they get to pay for upgrades to the next version.
- If you deal with users, it’s via your sales "channels" and your tech support call centre.
- User forums are certainly not to be encouraged, since it could be a PR nightmare if your users start getting together and talking about how buggy your products are.
- Developers (er, I mean "engineers") similarly can’t go talking to customers on those forums, since they’ll get distracted and give away competitive advantage by accidentally leaking secrets.
- Anyway, the best PR is the stuff that your PR staff put out — if customers talk to engineers they’ll just get confused by the over-technical messages!
Yeah, so, good luck with that. I remember doing all that back in the ’90’s and it really wasn’t much fun being so bloody paranoid all the time ;)
(PS: The web2.0 companies aren’t using all of the concepts of open-source, of course — not all those web apps have their source code available for public reimplementation and cloning. I wish they were, but as I said, I can’t see how that’s entirely viable for every company. Not that it seems to stop the cloners, anyway. ;)
-
‘The surge of Nevaeh can be traced to a single event: the appearance of a Christian rock star, Sonny Sandoval of P.O.D., on MTV in 2000 with his baby daughter, Nevaeh. “Heaven spelled backwards,” he said.’ you stupid, stupid people
-
oh dear. tip: allowing your “VP of Corporate Communications” to respond is not the way to do it cluetrain-style
-
‘Tom from GBH and guests, playing Robot-Rock, Distortion-Disko, Electronic, Rock, New Wave Hip-hop, house, punk, electro, downbeat and classics.’ lots of good mashups and remixes, one 2-hour 128kbps MP3 every week
-
‘That evening, Ms. Li and her brother joined 15 strangers at the store to demand a group discount on a new television, refrigerator, and washing machine.’ wow (via EirePreneur)
-
old Llamasoft game images may be distributed and used free of charge to and by anyone. awesome!
-
any mention of “web 2.0” in a conference, and O’Reilly are firing legal letters — even for events outside the US
-
Criminal Records Bureau’s “erring on the side of caution” has resulted in around a 9.7% false positive rate, with 2,700 UK job-seekers falsely listed as being convicted criminals
My mate Pam is cycling in this year’s AIDS/LifeCycle (http://www.aidslifecycle.org/6081): for a week from June 4 to 10, she’ll be cycling from San Francisco to LA, for charity. That’s 585 miles. Since she bought her bike to do this ride, she’s clocked up a terrifying 2040 miles. Blimey.
It’s for a good cause, so go on, make a donation (https://www.aidslifecycle.org/donate/6081)!
-
I was wondering why this was such a shambles; now it makes sense. ‘Inefficiency has become a virtue in government’ (via waxy)
-
Actual running hardware! Looks a lot more realistic than the last mock-ups. I’m more positive now that I hear they have Chris Blizzard and Jim Gettys involved, too
I added the Fixing Email weblog to Planet Antispam a while back; however, I’m not entirely sure at this stage that its content (which seems to be primarily news syndication) fits with the "planet" concept (which is primarily intended for first-person posts).
So — quick poll. Let me know what you think, pro or con, Planet readers: should I remove the Fixing Email feed from that site?
Update: that was a pretty resounding ‘yes’. Done!
-
the guy behind the “more DoTs more DoTs more DoTs! 50 DKP MINUS!!” WoW voice-chat recording. I don’t play WoW, but this control freak’s incoherent freakout is hilarious even without knowing all the details
-
I’ve come around to this conclusion too — attempting to use continuations to implement a web app ‘requires you to write your code in such a way that it can tolerate sudden halts, thread switches, rewinding, and forking of execution’ (via Miguel de Icaza)
-
‘The response to my essay on plagiarism last week (“Where Have I Read That Before?”) was swift, so here goes: Yes, it is plagiarized. 99% of it. The only original lines, in fact, are the first and the last two’
-
actually quite accurate! Deserved props for eMusic, Stereogum, Fluxblog, KCRW, Lemon-Red, ILM, and Music For Robots; missed the Hype Machine, though. mind you, that may be just as well
-
a new website-ribbon campaign from ISIPP, aimed at educating less-techie users on virus/malware avoidance; if you run a consumer-facing website, it’d be fantastic to get this up there
-
Trial a Niagara, get a free trip to SF! nice one Colm ;)
Dear Recruiters,
If you’re going to (a) scrape my CV page from my website, then (b) spam me, unsolicited, offering to represent me for jobs I don’t want in places I don’t live, in explicit contravention of the terms of use [*] of that document — here’s a tip.
Don’t compound the problem by asking me to resend the document in bloody Microsoft Word format. FFS.
([*]: Those terms were, of course, added in an attempt to stem the tide of recruiter spam. Thanks to Colm MacCarthaigh for the idea…)
-
cat-and-mouse fun with the Bank of England; interesting to hear that Google’s cache is still trackable via CSS references
-
A python framework based on one-way pipes and generators, from the BBC, used to build their “Macro” super-PVR. May be some ideas for IPC::DirQueue here
Reading this post (http://www.pkellypr.com/blog/2006/0516/the-twelve-days-of-a-changing-irish-society/) at Piaras Kelly’s blog, I was struck by something: I never realised quite how bizarre the situation with Bebo is.
If you check out the Google Trends ‘country’ tab, Ireland is the only country listed, meaning that search volume for "bebo" is infinitesimal, by comparison, elsewhere! (Update: Ireland was the only country listed, because the URL used limited it to Ireland only. However, the point is still valid when other countries are included, too ;)
It is also destroying Myspace as a search term on the Irish internet. (Update: also fixed)
As a US-based company, they must be mystified by all this attention — the Brazilian invasion of Orkut has nothing on this ;)
I’ll recycle a comment I made on Joe Drumgoole’s weblog as to why this happened:
My theory is that social networking systems, like Bebo, Myspace, LinkedIn, Friendster, Tribe.net, Orkut, Facebook etc. have all developed their own emergent specialisations. These are entirely driven by their users; although the sites can attempt to push or pull in certain directions (such as Friendster banning ‘non-person’ accounts), fundamentally the users will drive it. All of those sites have massively different user populations; Tribe has the Burning Man crowd, Friendster the daters, Orkut the Brazilians etc.
Next, I think kids of school age form a set of small cliques. They don’t want to appear cool to friends thousands of miles away, on the internet; they want to appear cool to their peer group in their local school. So all it takes is a group of influential ‘tastemakers’ (the alpha males and females in a year) to go onto Bebo, and it becomes the site for a certain school; and given enough of that, it’ll spread to other schools, and soon Bebo becomes the SNS for the Irish school system. In other words, Irish kids couldn’t really care less what US kids think of them; they want to be cool locally.
Also I think MySpace has a similar problem to Orkut — it’s already ‘owned’ by a population somewhere else, who are talking about stuff that makes little sense to Irish teenagers. As a result, it’s not being used as a social system here in Ireland; instead, it’s just used by musicians who want a cheap place to host a few tracks without having to set up their own website.
(Aside: part of the latter is driven by clueless local press coverage of the Arctic Monkeys — they have latched onto their success, put the cart before the horse, and decided that they were somehow ‘made’ by hosting music on MySpace, rather than by the attention of their fans. duh!)
-
according to social-network graph analysis of the Enron mail corpus, “one of the ‘central’ players was Ken Lay’s secretary”. ha! (via robotwisdom)
-
Harri Hursti’s report for BlackBoxVoting.org; it appears the boot loader will automatically reflash itself, if presented with a suitably-named file on PCMCIA media, and access to the PCMCIA slot is protected only by a few standard Phillips-head screws. wow
-
great thread of comments sparked off by Paul Graham’s rather ill-informed presentation at XTech2006. Cory’s comment is spot-on, on both sides
-
Google’s scrapbook-clone service. first impressions: Firefox extension = good, lots of Flash, URL’s hardly catchy, no sign of RSS feeds
-
spam filters beating humans at performing spam classification quite a lot, it turns out. Everyone should give SpamOrHam.org a go!
-
good data; there does seem to be an appreciable effect
-
photo of Coldcut’s live setup — structured cabling system required
-
Downloadable filesystem images for Xen; all Linux so far, modified to run as Xen guests out of the box
-
‘what if your Singleton has a handle to some limited resource, like a database or file handle? I guess you get to keep that sucker open until your program ends’. YES (via mjd)
-
excellent software-development interview advice
-
iPod-sized ambient hardware loop player from China; the tee-shirts are fantastic
-
DHS ineptitude strikes again
-
some really authoritative thumbs-down comments from Valdis Krebs and John Robb
-
interesting further notes; apparently the Trintech Smart 5000 PINPad terminals run Linux, and can be managed remotely
-
‘Communities are human business debuggers. Why not know the problems, address them and prove that they’re fixed all in public?’ excellent article, with the solid testimonial of Threadless backing it up
-
‘former Iowa congressman Edward Mezvinsky was caught up in a 419 scam, and stole from his law clients, friends, and even his mother-in-law .. He is serving more than six years in prison after pleading guilty to thirty-one counts of fraud.’ bloody hell
-
Suw Charman and a load of others (see comments) lay into the BBC’s “citizen journalism” conference: “a complete waste of time”. ouch
-
‘Please visit and take a minute to post positive comments about BlueSecurity. BlueSecurity is encouraging us to do such things so let’s help them spread the good word.’ explains a lot; several other astroturf coordination forums at castlecops.com, too
new-referrer-rss.pl – generate RSS feed of new referrer URLs from access_log
SYNOPSIS
new-referrers-rss nameofsite [source ...] > new-referrers.xml
DESCRIPTION
Given the name of a web site, and a selection of Apache combined log format ‘access_log’ files containing referrer URL data, this will generate an RSS feed containing the latest referrers.
The script should be run periodically with ‘fresh’ access_log data, from cron.
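The script itself isn’t reproduced here, but the core idea can be sketched in a few lines. The following is an illustrative Python sketch, not the script’s real code: the regex, the function names, and the treatment of the site name as a hostname are all assumptions, and remembering which referrers were already seen between cron runs is left out.

```python
import re
from xml.sax.saxutils import escape

# Apache combined log format:
#   host ident user [time] "request" status bytes "referrer" "user-agent"
COMBINED_RE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"'
)


def extract_referrers(lines, site_name):
    """Yield external referrer URLs, in order of first appearance."""
    seen = set()
    for line in lines:
        m = COMBINED_RE.match(line)
        if not m:
            continue  # not a combined-format line; ignore it
        ref = m.group("referrer")
        # Skip empty referrers ("-"), self-referrals, and repeats.
        if ref in ("", "-") or site_name in ref or ref in seen:
            continue
        seen.add(ref)
        yield ref


def referrers_to_rss(referrers, site_name):
    """Render the referrer URLs as a minimal RSS 2.0 document."""
    items = "".join(
        "<item><title>{0}</title><link>{0}</link></item>".format(escape(r))
        for r in referrers
    )
    return (
        '<?xml version="1.0"?><rss version="2.0"><channel>'
        "<title>New referrers for {0}</title>"
        "<link>http://{0}/</link>"
        "<description>Latest referrer URLs</description>"
        "{1}</channel></rss>"
    ).format(escape(site_name), items)
```

Feeding it the lines of an access_log and printing the result of `referrers_to_rss(...)` gives a feed along the lines of the one described above.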
Renesys Blog: The Bluesecurity Fiasco — in which Todd Underwood, CSO for Renesys Corporation, applies some real-world knowledge of how the internet works to the "timeline of events" press release, issued by BlueSecurity as part of their ongoing PR about the DDoS.
Judging by the comments at Slashdot, this really needs to be more widely read.
Here’s some highlights:
The timeline from BlueSecurity […] is frustratingly vague. It uses phrases like ‘tampering with the Internet backbone using a technique called "Blackhole Filtering".’ As Thomas Pogge, a philosophy professor of mine, used to say: that’s not even wrong yet. There is no "Internet backbone", there is no technique known as "Blackhole Filtering", and blackhole routing is not normally described as tampering. So the whole explanation is nonsense. […] Let’s clear one thing up for the press and everyone else: this event just wasn’t that interesting. The attack against bluesecurity was a run-of-the-mill denial of service attack.
His conclusion:
I believe that the PR engine from BS is in overdrive spinning this event as fast as they can. But the concrete facts being put out by them simply to not add up. In the process they seem to be doing two things: 1) trying to imply or state that someone at UUnet was bribed by a spammer. This is simply ridiculous. I know many of the people who work for UUnet and they are honest, hardworking and extraordinarily clever people. They would not be crooked, or stupid, enough to do such a thing and if they were, they would have been trivially caught by change-management procedures. Moreover, such a change at UUnet (or BTN) wouldn’t have caused the event BS claims to have witnessed anyway. Additionally, 2) BS is trying to deflect attention from the damage that they caused at Six Apart. It would be much better if they could just claim ignorance of the DOS, apologize and move on. I recognize that that isn’t going to happen, but it sure would make this whole thing easier to handle.
Well said.
Of course, this is pretty much immaterial — the people who are using Blue Frog, and vocally supporting Blue Security, don’t really care what happened. All they care about is that someone is taking some kind of direct action against spammers, in some way or another, and if there’s a little "friendly fire" and some bending of the truth, why, this is a war! What, do you support the spammers?
It’s disappointing — the amount of disinformation being successfully pumped out (and accepted!) on this story is massive.
-
they’re no longer shipping games, electronics, or home/garden items to Ireland. what with this and the crappy shipping, looks like they’ve written off the Irish market for some reason
Bubba, now safely back in Dublin after his 8000-mile flight from LAX, is getting back into exploring his old manor.
Here he is, ignoring a very brave magpie. Judging by the way the magpie was brazenly hopping around him, cawing, and the way that Bubba was ignoring him, I suspect there may be a nest nearby….
-
good points from Joe Drumgoole; what works for Irish VCs isn’t necessarily aligned with what’s good for Ireland’s high-tech industry
-
First doodled on a placemat by Ken Thompson and Rob Pike for Plan 9 in 1992 (via era)
-
Blue Security accidentally took down large chunks of the blogosphere in an attempt to evade the DDoS targeting them; impressively inept. also, they really need to tone down their sock-puppet commenter squad (via torrez)
-
open geodata creation from OSM in a 3-day mapping-fest this weekend. great explanation of why open geodata is important in the UK and Ireland, too
-
behavioural analysis on web-search engine bots, with some pretty pics (via waxy)
-
YUM. wonder if I can find condensed milk around here
Apparently, Transport For London are planning ‘e-money’ trials (http://software.silicon.com/applications/0,39024653,39150647,00.htm) based on their remotely-readable Oyster RFID cards (http://www.rfidbuzz.com/wiki/Standards/MIFARE).
Combine that with Kevin Mahaffey of Flexilis’ talk at Black Hat last year, where he demonstrated apparatus to extend RFID read range from 4-6 inches to approximately 50 feet, and things could get messy. ;)
The slides for that talk are available here (PDF); slide 20 specifically mentions the Hong Kong "Octopus" cashless-payment card.
-
Some users of the Blue Frog software are considering this leak to be some kind of Churchillian challenge to their resolve, instead of a failure on Blue Frog’s part! amazing
-
‘What Wikipedia has taught us .. is that in a vacuum of politics, politics will be created. There is no vacuum of politics.’ interesting article
-
‘This spammer is using mailing lists he already owns and is now sending millions of such messages’ — hasn’t hit any of our thousands of spamtraps, which is quite impressive in that case
Blue Security is a company that operates a "Do Not Email" list for its Blue Frog software, on the (optimistic) basis that spammers will vet their lists against it.
Reportedly, it’s been compromised. If this is true, I’m not surprised; as Dr. Aviel Rubin’s report to the FTC of May 2004 regarding a Do-Not-Email list notes:
The scrubbing approach [to running a D-N-E list] requires that a list of live email addresses exist. While the party owning that list may be well intentioned, it is unlikely that such a valuable list would not leak out. History is replete with insider attacks, as well as external break-ins to highly sensitive sites, such as the Pentagon computers. The Do Not Email Registry represents the kind of prize that attracts hackers. In this case, the prize has monetary value as well. Once the list is exposed, there is no way to undo it.
Also, it’s almost inevitable:
If this service were running for some time, it is more likely than not that the plaintext addresses would leak at some point, given the history of computer security incidents.
Update: it appears, according to this white paper, that the Blue Frog "Do Not Intrude" list is hashed, rather than plain-text. Rubin’s advice still applies:
Without hashing, a compromise of the registry database results in exposure of all of the registered email addresses. This is a total disaster. However, even exposure of a hashed list is a catastrophe. A spammer with a copy of a hashed list of email addresses is able to find out, for any email address, if the address is in the registry. The attacker simply hashes a candidate email address and sees if the hashed value is in the list. This is very powerful. [….]
Hashing provides absolutely no security against a marketer who obtains a scrubbed list and uses that to sell the addresses that were scrubbed by the registry. Whether or not the list is hashed has no impact on a malicious marketer in the scrubbing approach.
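To make the membership test Rubin describes concrete, here’s a minimal Python sketch. The choice of unsalted SHA-1 and the lower-casing of addresses are assumptions for illustration only; the white paper’s actual hashing scheme isn’t described here, and the function names are made up.

```python
import hashlib


def hash_address(address: str) -> str:
    """Hash a normalised email address (SHA-1 chosen only for illustration)."""
    return hashlib.sha1(address.strip().lower().encode("utf-8")).hexdigest()


def addresses_in_registry(candidates, leaked_hashes):
    """Return the candidate addresses whose hashes appear in the leaked list."""
    leaked = set(leaked_hashes)
    return [a for a in candidates if hash_address(a) in leaked]
```

A spammer who already holds a big list of candidate addresses only needs to run it through something like `addresses_in_registry` to learn which of them are registered, and therefore live.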
Are you a student, and interested in earning $4,500 for contributing to open source, and fighting spam, over the course of the summer?
If so, get thee hence to the Google Summer of Code 2006 site, and propose a project!
Last year, we in SpamAssassin didn’t get it together to mentor SoC projects. This year, however, we have a few prospective mentors (including myself), and a few sample project ideas lined up; we’re all ready to go! Here’s the Student FAQ. Be quick; applications end in a week and a bit.
Here’s hoping we get some interesting submissions ;)
-
YouTube’s bandwidth bill ‘may be approaching $1 million a month’. holy crap (via waxy)
-
a really nice Flickr-like take on mapping; every street has user-contributed location geodata included; open REST API; social aspects; Google-friendly. Best mapping site I’ve seen
-
‘Everything needed to make this episode is available in the eler-source directory in a bzipped tarball. … Creative Commons Attribution Share-Alike license’