Cory Doctorow at Boing Boing links to an article at TechCrunch that lists Better and Cheaper Online File Storage as a product that needs to be made. However, Ben Laurie does the sums on online storage as a useful backup medium, and found them not exactly compelling (e.g. 100GB of data will take 75 days to upload over an 128Kbps link).
I tend to agree. An online host isn’t great as a backup host, since, in my experience, there are two types of backups required:
- The important small files (for example: encrypted password lists, my address book, my ~/bin directory)
- The massive big filesets (for example: MP3s, photos)
The first kind of fileset is amenable to an online backup-storage service, at first glance. However — in my opinion you’re better off going the whole hog for these files, and using the distributed, versioned backup method of putting it in a good networked revision control system, and checking it out everywhere, so you can also make changes and check in from any host; otherwise, you face the perils of syncing up a single backup from multiple “writers”, without conflicts. So far, none of the online file storage services offer SVN as an access method, so a shell account at a colo server still seems more useful on that count.
The second kind of fileset, as Ben notes, will take donkey’s years to upload and sync as a backup mechanism; and the economics are hardly compelling for the service provider.
I think I prefer Brad Templeton’s idea to deal with large-data backups —
I propose a software RAID-5, done over a LAN with 3 to 5 drives scattered over several machines on the LAN.
Slow as hell, of course, having to read and write your data out over the LAN even at 100mbits. Gigabit would obviously be better. But what is it we have that’s taking up all this disk space ? it?s video, music and photos. Things which, if just being played back, don?t need to be accessed very fast. If you’re not editing video or music, in particular, you can handle having it on a very slow device. (Photos are a bigger issue, as they do sometimes need fast access when building thumbnails etc.)
This could even be done among neighbours over 802.11g, with suitable encryption. In theory.
As a commenter notes, Linux has support for this already, in the form of software RAID and the network block device.
So: take an external IDE enclosure, add a GumStix board running Linux with software RAID, LVM, and nbd, and add wifi. Then add DAV, SMB and NFS export of the disk, and some decent UI code to organise the volumes into a single exported RAID volume (hopefully automatically!), and it’d be a pretty compelling product, in my opinion!
(hey Craig! I said GumStix! ;)