Keith Dawson sent this on — an interview with Jim Gray, head of Microsoft’s Bay Area Research Center and winner of the ACM Turing Award, talking about new transmission systems for truly massive data collections. Very interesting:
[One] option is to send whole computers. … We’re now into the 2-terabyte realm, so we can’t actually send a single disk; we need to send a bunch of disks. It’s convenient to send them packaged inside a metal box that just happens to have a processor in it. I know this sounds crazy — but you get an NFS or CIFS server and most people can just plug the thing into the wall and into the network and then copy the data.
Dave Patterson, interviewer: What’s the difference in cost between sending a disk and sending a computer?
JG: If I were to send you only one disk, the cost would be double — something like $400 to send you a computer versus $200 to send you a disk. But I am sending bricks holding more than a terabyte of data — and the disks are more than 50 percent of the system cost. Presumably, these bricks circulate and don’t get consumed by one use.
DP: Are you sending them a whole PC?
JG: Yes, an Athlon with a Gigabit Ethernet interface, a gigabyte of RAM, and seven 300-GB disks — all for about $3,000.
DP: It’s your capital cost to implement the Jim Gray version of “Netflicks.” (jm: sic)
JG: Right. We built more than 20 of these boxes we call TeraScale SneakerNet boxes. Three of them are in circulation. We have a dozen doing TeraServer work; we have about eight in our lab for video archives, backups, and so on. It’s real convenient to have 40 TB of storage to work with if you are a database guy. Remember the old days and the original eight-inch floppy disks? These are just much bigger.
DP: “Sneaker net” was when you used your sneakers to transport data?
JG: In the old days, sneaker net was the notion that you would pull out floppy disks, run across the room in your sneakers, and plug the floppy into another machine. This is just TeraScale SneakerNet. You write your terabytes onto this thing and ship it out to your pals. Some of our pals are extremely well connected — they are part of Internet 2, Virtual Business Networks (VBNs), and the Next Generation Internet (NGI). Even so, it takes them a long time to copy a gigabyte. Copy a terabyte? It takes them a very, very long time across the networks they have.
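To get a rough feel for the arithmetic behind that last answer, here is a back-of-the-envelope sketch in Python. The box capacity (seven 300-GB disks) comes from the interview. The link speeds, the 24-hour shipping time, and the function names are my own assumptions for illustration, not figures from Gray, and the sketch ignores the time spent copying data on and off the box at each end.

```python
# Back-of-the-envelope comparison: copying over a network vs. shipping a box.
# Only BOX_BYTES comes from the interview; everything marked ASSUMED is a guess.

BOX_BYTES = 7 * 300 * 10**9          # seven 300-GB disks, as described by Gray
ASSUMED_TRANSIT_SECONDS = 24 * 3600  # assumption: overnight courier delivery

# Assumption: sustained end-to-end throughput actually achieved, in bits/second.
ASSUMED_LINKS = {
    "10 Mbit/s": 10 * 10**6,
    "100 Mbit/s": 100 * 10**6,
    "1 Gbit/s": 10**9,
}


def transfer_seconds(num_bytes, link_bits_per_sec, efficiency=1.0):
    """Seconds needed to move num_bytes over a link running at the given
    fraction (efficiency) of its nominal bit rate."""
    return num_bytes * 8 / (link_bits_per_sec * efficiency)


def shipping_bandwidth(num_bytes, transit_seconds):
    """Effective bits per second of a box that arrives after transit_seconds."""
    return num_bytes * 8 / transit_seconds


if __name__ == "__main__":
    for name, rate in ASSUMED_LINKS.items():
        hours = transfer_seconds(BOX_BYTES, rate) / 3600
        print(f"{name:>10}: {hours:8.1f} hours to copy one box's worth of data")

    mbits = shipping_bandwidth(BOX_BYTES, ASSUMED_TRANSIT_SECONDS) / 10**6
    print(f"Shipped box: roughly {mbits:.0f} Mbit/s effective")
```

Under these assumptions, the network only beats the courier if it can sustain more than the shipped box's effective rate for the whole transfer, which is the point Gray is making about his well-connected pals.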