Justin's Linklog Posts

Links for 2009-01-14

Links for 2009-01-13

Hack: reassassinate

A coworker today, returning from a couple of weeks' holiday, bemoaned the quantities of spam he had to wade through. I mentioned a hack I often use in this situation: discard the spam, download the two weeks of supposed-nonspam as a huge mbox, and rescan it all with SpamAssassin. The intervening two weeks give plenty of time for the URLs to be blacklisted by URIBLs and the IPs to be listed by DNSBLs, so this generally results in better spamfilter accuracy, at least in terms of reducing false negatives (the "missed spam"). In other words, it gets rid of most of the remaining spam nicely.
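For the manual version of the hack, a rough sketch in Python might look like the following. It is not the script discussed in this post: it assumes the `spamassassin` command-line tool is on the PATH (it reads a message on stdin and writes the rewritten message to stdout), and that the default `X-Spam-Flag: YES` markup is in use. The names `rescan_mbox`, `run_spamassassin` and `is_spam` are made up for illustration.

```python
import mailbox
import subprocess
from email import message_from_bytes


def run_spamassassin(raw: bytes) -> bytes:
    """Pipe one message through the `spamassassin` CLI (assumed to be installed)."""
    result = subprocess.run(
        ["spamassassin"], input=raw, stdout=subprocess.PIPE, check=True
    )
    return result.stdout


def is_spam(scanned: bytes) -> bool:
    """Assumes SpamAssassin's default markup, which adds `X-Spam-Flag: YES` to spam."""
    flag = message_from_bytes(scanned).get("X-Spam-Flag", "")
    return flag.strip().upper() == "YES"


def rescan_mbox(src_path, ham_path, spam_path, scan=run_spamassassin):
    """Rescan every message in src_path and split the results into two mboxes."""
    src = mailbox.mbox(src_path)
    ham = mailbox.mbox(ham_path)
    spam = mailbox.mbox(spam_path)
    counts = {"ham": 0, "spam": 0}
    try:
        for msg in src:
            scanned = scan(msg.as_bytes())
            if is_spam(scanned):
                spam.add(scanned)
                counts["spam"] += 1
            else:
                ham.add(scanned)
                counts["ham"] += 1
    finally:
        # close() flushes pending writes to disk
        ham.close()
        spam.close()
        src.close()
    return counts
```

The `scan` parameter is only there so the splitting logic can be tried out with a stand-in function before pointing it at a real SpamAssassin install.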

Chatting about this, it occurred to us that it’d be easy enough to generalize this hack into something more widely useful by hooking up the Mail::IMAPClient CPAN module with Mail::SpamAssassin, and in fact, it’d be pretty likely that someone else would already have done so.

Sure enough, a search threw up this node on perlmonks.org, containing a script which did pretty much all that. Here’s a minor freshening: download

reassassinate – run SpamAssassin on an IMAP mailbox, then reupload

Usage: ./reassassinate --user jmason --host mail.example.com --inbox INBOX --junkfolder INBOX.crap

Runs SpamAssassin over all mail messages in an IMAP mailbox, skipping ones it’s processed before. It then reuploads the rewritten messages to two locations depending on whether they are spam or not; nonspam messages are simply re-saved to the original mailbox, spam messages are sent to the mailbox specified in "--junkfolder".

This is especially handy if some time has passed since the mails were originally delivered, allowing more of the URLs and IP addresses in spam mails to be blacklisted by third-party DNSBLs and URIBLs in the meantime.

Prerequisites:

  • Mail::IMAPClient
  • Mail::SpamAssassin
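To make the flow described above concrete, here is a hedged sketch of the same idea in Python using the standard `imaplib` module. It is a stand-in, not the Perl script itself. How the real script recognises already-processed mail isn't stated here, so the `X-Reassassinated` marker header is purely an assumption, as are the function names `plan` and `reassassinate`. `scan` is any callable that takes the raw message bytes and returns SpamAssassin's rewritten message.

```python
import imaplib  # only needed by callers, to build the connection object
from email import message_from_bytes

# Hypothetical marker header; the original script may track this differently.
MARKER = "X-Reassassinated"


def plan(raw: bytes, scan, inbox: str, junk: str):
    """Return (folder, rewritten_message), or None if already processed."""
    if message_from_bytes(raw).get(MARKER) is not None:
        return None
    scanned = scan(raw)
    flag = message_from_bytes(scanned).get("X-Spam-Flag", "")
    spam = flag.strip().upper() == "YES"
    tagged = MARKER.encode() + b": 1\r\n" + scanned
    return (junk if spam else inbox, tagged)


def reassassinate(imap, scan, inbox="INBOX", junk="INBOX.crap"):
    """Rescan every message in `inbox`; reupload to `inbox` or `junk`."""
    imap.select(inbox)
    _, data = imap.search(None, "ALL")
    moved = 0
    for num in data[0].split():
        _, fetched = imap.fetch(num, "(RFC822)")
        raw = fetched[0][1]
        result = plan(raw, scan, inbox, junk)
        if result is None:
            continue  # seen on a previous run
        folder, message = result
        imap.append(folder, None, None, message)
        # the old copy is replaced by the rewritten one
        imap.store(num, "+FLAGS", "\\Deleted")
        moved += 1
    imap.expunge()
    return moved
```

In real use, `imap` would be something like an `imaplib.IMAP4_SSL` connection that has already logged in. Note that this simple version doesn't carry over the original flags or delivery dates when it reuploads.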

Links for 2009-01-09

Links for 2009-01-08

  • Map/Reduce and Queues for MySQL using Gearman : A talk by Eric Day and Brian Aker at the upcoming MySQL Conference in April: ‘[Gearman] development is now active again with an optimized rewrite in C, along with features such as persistent message queues, queue replication, improved statistics, and advanced job monitoring. For MySQL, there is also a new user defined function to run Gearman jobs, as well as the possibility to write your own aggregate UDFs using Gearman. This gives you the ability to run functions in separate processes, separate servers, and in other languages. The Gearman framework gives you a robust interface to also run these functions reliably in the “cloud”. This session will introduce these concepts and give examples of sample applications.’ Persistent queues (at last)? Gearman integration directly in the DB? Excellent!
    (tags: gearman queueing mysql databases brian-aker mapreduce sql conferences talks papers)

Links for 2009-01-07

Links for 2009-01-06

Links for 2009-01-02

Links for 2008-12-28

Links for 2008-12-22

Links for 2008-12-21

Links for 2008-12-19

Links for 2008-12-18

Links for 2008-12-17

If only this were true

Some people, when facing a problem, think “I’ll use regular expressions.” Now they have HORDES OF CUTE PEOPLE WANTING TO SLEEP WITH THEM

Yoz, on twitter

Listening to music over wifi?

Hey lazyweb! Long time, no write.

I’m wondering what setup people use to deal with the following situation. Upstairs, I have an Ubuntu 8.04 server with 71GB of MP3s. Downstairs, I have a stereo system. In between the two is a wireless network. How can I listen to the music downstairs, without simply copying the lot (or subsets thereof) onto a local disk on some appliance down there?

Currently, I’m using a VNC client on a Nokia 770 to control a JuK window on the server. This works great, believe it or not! KDE 3 can be coaxed into providing a fantastic UI for a small touchscreen. The server then uses PulseAudio to transmit the sound output using the ESD protocol over TCP to the ESD server on the N770, which plays back the sound.

Until a few months ago, this worked great. However, something (either hardware changes, network topology changes, or an upgrade to Ubuntu 8.04 on the server) has resulted in effective bitrates between the server and the N770 dropping frequently — hence the audio drops out or changes pitch, rendering it unlistenable :(

I’ve tried using UPnP servers (specifically mediatomb, ushare, and Twonkymedia), with the built-in Media Streamer app on the N770. All fail. MP3s cut off near the end, M3U playlists aren’t supported, and sometimes Media Streamer just locks up. In addition, it’s pretty messy trying to get the UPnP servers to notice changes to the MP3 collection.

I’ve also tried using Squeezecenter (née Slimserver), but the MP3 stream playback support on the N770 is pretty atrocious; there are audible decoding artifacts.

So — anyone got a suggestion? Even something involving iTunes might be helpful — as long as it can at least preserve the Linux server. I’m unlikely to host the full MP3 collection on anything else…

Links for 2008-12-11

Links for 2008-12-10

Links for 2008-12-09

Links for 2008-12-08

Links for 2008-12-07

Links for 2008-12-03

Links for 2008-11-26

Recession Hits The Digital Depot

The Digital Depot is ‘an innovative, state-of-the-art building specifically designed to meet the needs of fast growing digital media companies […] developed as a joint initiative of Enterprise Ireland, Dublin City Council and The Digital Hub Development Agency.’ Generally, it’s a pretty nice place to work, and a great resource for startups and small tech companies.

However, recently, it looks like they’ve been embarking on some innovative, state-of-the-art cost-cutting exercises.

There’s a little canteen area, for companies to make tea and coffee, wash up their mugs, etc. Check out this snapshot from the canteen this morning, courtesy of JK’s phone cam:

Notice anything odd about that bottle of washing-up liquid?

Yum yum! Nothing nicer than washing your mug with a dash of toilet cleaner.

Links for 2008-11-21