Backup Strategies

June 20th, 2009

With my primary hard drive (a three-year-old WD Raptor WD740) having been on life support, so to speak, for the last three months, I’ve been a lot more diligent about keeping backup copies of my data. Every couple of days, I log out entirely and run a simple rsync script to copy my entire /home directory to a dedicated partition on my secondary disk, which I keep mounted at /mnt/backup for simplicity’s sake.
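A minimal sketch of what such a script might look like follows. The flag and the helper name backup_home are assumptions for illustration, not necessarily what the original script uses.

```shell
#!/bin/sh
# Minimal sketch of a copy-the-tree-as-is backup. The -a flag is an
# assumption, not necessarily the one used in the original script.

# backup_home SRC DEST  (hypothetical helper name)
#   -a  archive mode: recurse and keep permissions, owners, times, symlinks
# The trailing slash on SRC means "the contents of SRC", so files land
# directly in DEST rather than in DEST/<last component of SRC>.
backup_home() {
    rsync -a "$1" "$2"
}

# Example call with the paths from the post (run as root so that file
# ownership can be preserved):
#   backup_home /home/ /mnt/backup/
```

Adding --delete would also remove files from the backup once they disappear from the source, which keeps the copy an exact mirror. It removes anything else that lives in the destination too, so it is left out of the sketch.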

While rsync’s parameter handling can be a bit quirky, I find it extremely useful for two reasons:

  • The first more or less negates the quirky parameter handling: clear and thorough documentation, with lots of example program calls.
  • The second is that it saves me a lot of time in copying the files. Similar to the DeltaRPM feature I raved about with Fedora 11, it copies over only what has changed instead of the entire directory tree. With my home directory at nearly 20 GB, incrementally updating my backup like this spares a good 90% or more of the data from being copied again.
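One way to see how little actually needs copying is to ask rsync for a dry run first. The sketch below assumes the same -a flag as a plain archive-mode copy, and the helper name preview_backup is made up for illustration. Note that when source and destination are both local paths, rsync by default copies each changed file whole. Its block-level delta transfer applies mainly to copies over a network.

```shell
#!/bin/sh
# preview_backup SRC DEST  (hypothetical helper name)
#   -a  archive mode, as in a normal backup run
#   -n  dry run: report what would be done without writing anything
#   -i  itemize changes: one line per item that would be updated
preview_backup() {
    rsync -a -n -i "$1" "$2"
}

# Example call with the paths from the post:
#   preview_backup /home/ /mnt/backup/
```

Files that are already up to date in the backup produce no output, so a short listing means a short backup run.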

In this way, I know that I have at least two copies of my data at any given time. A major plus to copying the directory tree as-is is that, once the drive does die and I replace it, I merely need to copy the backup over to the new drive, without changing anything, unpacking huge tarballs, applying diffs, and so on.

The disadvantage to this is that I only have one consistent backup copy of my data at a given time, and that backup is on a hard drive in the same computer. So, should there be a massive system failure of some sort (knock on wood!), I would lose my data for certain. I also intend to purchase CD-RWs for this purpose – that is, as an additional backup medium – in the near future. But for right now, the second on-disk copy suffices. I also want to set up a RAID system in my next computer build… but that’ll have to wait. 🙂

So this simple rsync method, as with any storage decision, has its benefits and drawbacks:

Pros:

  • Easy to configure;
  • Can be automatically run (e.g., in a cron job);
  • Updates copy only what has changed, not the full tree;
  • Backup data is “as-is”, and can be used immediately after copying.

Cons:

  • Only one backup copy;
  • Physical proximity to original data;
  • Requires space for an entire duplicate of the directory tree.
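
For the cron option listed among the pros, a wrapper along these lines could work. It is only a sketch: the function name, script path, log path and schedule are all hypothetical, and a cron run copies files while they may be in use, unlike a run made after logging out.

```shell
#!/bin/sh
# Sketch of a cron-friendly wrapper; all names and paths are hypothetical.
# A crontab entry to run it at 03:00 every day might read:
#   0 3 * * * /usr/local/bin/backup-home.sh
# assuming the script is saved under that name and ends with the call
#   cron_backup /home/ /mnt/backup/ /var/log/backup-home.log

# cron_backup SRC DEST LOG
cron_backup() {
    src=$1
    dest=$2
    log=$3
    # Refuse to run when the destination directory does not exist.
    # (This does not detect an unmounted partition whose empty mount
    # point is still present.)
    if [ ! -d "$dest" ]; then
        echo "$(date): $dest not found, skipping backup" >> "$log"
        return 1
    fi
    # Append rsync's messages and errors to the log; the function's
    # exit status is rsync's.
    rsync -a "$src" "$dest" >> "$log" 2>&1
}
```

Logging to a file matters here because nobody is watching the terminal when cron runs the job.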

For me, though, this method works out well. Do others have a similar system? Would you suggest any improvements/simplifications? I’d like to hear your thoughts on the matter! Thanks.