Musing on Failover Backups - tuesday 2007-01-23 0956 last modified 2007-01-24 2014
Categories: Nerdy

For those of us administering a community machine, backups are a major concern. While we do make them, we don't like the way they're organized. I guess when you throw a bunch of perfectionist engineers at a problem, a set of hand-grown shell and Perl scripts to take care of each distinct type of data starts to look inelegant. There are a number of things that bewilder me about the open source world of server maintenance, and the lack of 'just do it' tools for deploying mail, web, monitoring, and (yes, you guessed it) backups is rather disheartening. I mean, maybe there's a cabal behind all the distros to keep full-time sysadmins in business; why else would this still be so tedious? An overabundance of choice is not always good.

Anyway, on to the perfect remote backup system. The perfect system doesn't distinguish between master and failover backup(s); they're always totally in sync, so if the master (master only because it's been technically and socially designated as such) dies, it only takes the flip of a DNS switch to move to the backup, and everything looks the same once the change propagates. It does all of this without me ever looking at it; once the master comes back to life, it takes a bit of time to resynchronize before we flip the DNS switch back.

The perfect system is totally secure despite the fact that it can write changes to any file on either machine. It uses the absolute smallest amount of bandwidth at the lowest possible transfer rate, and takes very little time, CPU, or disk to calculate the difference between the oldest and the latest backup.

Obviously nothing comes close on all these fronts. I'm willing to trade total bandwidth, CPU, and a bit of disk to get the rest. Part of what I'd really like to see is a more capable filesystem: one that does version control and triggering together. In version control parlance, any saved change (or copy, or move, or deletion) would produce a commit; in backup terms, it would also immediately trigger a diff transfer to the backup.
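Short of a filesystem that does this for me, the closest cheap approximation I can think of is to poll: take a snapshot of the tree, take another later, and treat every difference as a "commit" that should trigger a transfer. Here's a minimal sketch of that idea, assuming that comparing modification time and size is good enough to spot a change. The names snapshot and diff_snapshots are made up for illustration, and nothing here does the actual transfer.

```python
import os


def snapshot(root):
    """Map each file's path (relative to root) to its (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                st = os.stat(full)
            except OSError:
                continue  # file vanished between listing and stat
            state[os.path.relpath(full, root)] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return (added, changed, deleted) as sorted lists of relative paths."""
    added = sorted(p for p in new if p not in old)
    deleted = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, deleted
```

Polling like this is exactly the CPU and disk cost I said I'd be willing to pay; a real triggering filesystem would hand over the change list for free.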

But that doesn't exist. There are a bunch of names in this department, and the one we've settled on for the moment is unison. But I can't just say unison /. I suppose it's fair that I have to pick out which paths unison should transfer. But then I have to figure out a mostly secure way to do it, because having root do this on both ends is an absolute security no-no if I want to do it automatically on a schedule.
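To make the "weeding out" concrete, here's a rough sketch of what a scheduled job might hand to unison instead of a bare unison /. It only builds the argument list and doesn't run anything. The -batch, -path, and -ignore options and the ssh:// root syntax are unison's own; the function name, the mirror@backup.example.org login, and the /srv paths are placeholders I've invented, not our actual setup.

```python
def unison_command(local_root, remote_login, remote_root, paths, ignore_names=()):
    """Build an argv list for a non-interactive unison run over ssh.

    remote_root is an absolute path, so the resulting URL has a double
    slash after the host, which is how unison marks an absolute remote root.
    """
    argv = ["unison", local_root, f"ssh://{remote_login}/{remote_root}", "-batch"]
    for path in paths:
        # Each -path is relative to the roots and limits what gets synced.
        argv += ["-path", path]
    for pattern in ignore_names:
        # "Name <glob>" ignores any file whose final path component matches.
        argv += ["-ignore", f"Name {pattern}"]
    return argv


# Hypothetical example: sync only mail and www under /srv, skipping temp files.
cmd = unison_command("/srv", "mirror@backup.example.org", "/srv",
                     ["mail", "www"], ["*.tmp"])
```

Logging in as a dedicated unprivileged account rather than root is the part that addresses the security worry, at the cost of that account needing read and write access to everything listed.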

And then there are databases and version control repositories that just aren't fit for unison use. Their files are read and written through protocols of their own, independent of regular operating system reads and writes, so a copy taken mid-write might synchronize to a totally unusable state. So backup starts to look a bit like Rube Goldberg designed it, and instead of a nice, simple 'backup /', it's five tools and here's this over here that needs to be moved after transfer and this needs a binary diff tool and here's a chicken on a bicycle...
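The usual way around this is to ask the database itself for a consistent copy and sync that copy instead of the live files. As an illustration only, here's what that looks like with SQLite, which I'm using as a stand-in because Python ships with it; this post doesn't say which databases we actually run, and each one has its own dump or backup tool. The function name and paths are hypothetical.

```python
import sqlite3


def snapshot_sqlite(live_path, snapshot_path):
    """Copy a live SQLite database to snapshot_path using SQLite's backup API.

    The backup API goes through the database's own locking, so the result
    is a consistent copy even if other connections are writing meanwhile.
    """
    src = sqlite3.connect(live_path)
    dst = sqlite3.connect(snapshot_path)
    try:
        src.backup(dst)
    finally:
        dst.close()
        src.close()
```

The sync tool would then be pointed at the snapshot file and told to ignore the live one, which is one more moving part on the Rube Goldberg machine.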

Ugh. No more musing. Headache.

When we have a working solution, I'll write a bit more. Hopefully the failover document doesn't end up more than two pages long.
