Thanks to the lovely gPodder on my Nokia N900, I've recently discovered the net@night podcast by Leo Laporte. Like many regular podcasts, it's mostly full of random chatting and showmanship. In this respect, "new media" tend to be exactly like "old media": forced by their own schedule to blabber for the sake of it. But I digress.
The best segment of Laporte's show is usually an interview with someone from a startup, which is a good way of finding out about new services. Yesterday, it made me sign up for Backupify, a "social media backup tool" which will scrape your Gmail / Delicious / Facebook / Flickr / Twitter / Blogger / WordPress etc. and store all the resulting data in a safe place on Amazon's cloud. Not a bad idea: the first 20 years of the Age of the Internet should have taught us, if anything, that data is ephemeral and can disappear at the flick of a switch. What happened to Geocities is proof that today's giants won't necessarily be with us tomorrow. Aware of this state of things, Backupify gives you the option to drop your data on your own Amazon server, so that it will still be available if they go belly-up; quite an honest approach for a startup. It used to be a pay-only service, then went free to accelerate growth and get some venture capital; they will move to a freemium model after January 31, so you'd better try it out now if you can.
Good "Web 2.0" services usually expose APIs that make backups relatively easy for a programmer, but who's got time to write dedicated scripts AND the foresight to run them regularly? Myself, I've probably written half a dozen Gmail scrapers, but I hardly ever ran them more than once. I've exported this Blogger-powered site once, and it was a nightmare. Backupify makes it very easy to "set up and forget", and that's good. The data will only be as good as what the various sites allow; for example, you will never be able to "restore" a Twitter account, so Backupify will only give you a PDF of your (and your friends') tweets, which is the best you can expect. For Google Spreadsheets you get XLS files, for Blogger you get a big XML file containing all your posts, and so on.
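For the curious, the kind of "dedicated script" I mean looks roughly like the one below. It's only a minimal sketch, not what Backupify actually does: the function names `fetch_url` and `backup_once`, the `.bak` file naming and any export URL you feed it are all my own assumptions, and a real service would also need authentication.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional
from urllib.request import urlopen


def fetch_url(url: str) -> bytes:
    """Download one export document (the URL is whatever export endpoint a service offers)."""
    with urlopen(url, timeout=30) as response:
        return response.read()


def backup_once(fetch: Callable[[], bytes], dest_dir: Path, name: str) -> Optional[Path]:
    """Save one snapshot, unless it is identical to the most recent one.

    `fetch` is any zero-argument callable returning the exported bytes.
    The file naming scheme is hypothetical, chosen only for this sketch.
    """
    dest_dir.mkdir(parents=True, exist_ok=True)
    data = fetch()
    digest = hashlib.sha256(data).hexdigest()

    # Snapshot names embed a nanosecond timestamp, so sorting by name puts the newest last.
    previous = sorted(dest_dir.glob(f"{name}-*.bak"))
    if previous and hashlib.sha256(previous[-1].read_bytes()).hexdigest() == digest:
        return None  # nothing changed since the last run, so don't store a duplicate

    target = dest_dir / f"{name}-{time.time_ns()}.bak"
    target.write_bytes(data)
    return target
```

You would call it with something like `backup_once(lambda: fetch_url(export_url), Path.home() / "backups", "blog")`, where `export_url` is a placeholder for a real export address. Writing this is the easy part; the hard part is remembering to schedule it (with cron, say) and then checking that it still works, which is exactly what I never do.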
The only problem with the site is the password anti-pattern: in order to get at your data, they often have to ask for your login details, and will store them on their servers. They do use OAuth if the service supports it (like Facebook or Google), but otherwise you'll have to trust them with your credentials. A server holding that many stored logins is a very attractive target for black-hat hackers, among other things. I do hope they know what they are doing.
 
