[CLUG-tech] Memory problems with rdiff-backup in my backup system
David
wizzardx at gmail.com
Tue May 6 12:12:55 SAST 2008
Hi list.
Apologies in advance for the long mail.
I've posted this to the rdiff-backup list, but haven't had a response
yet, so I thought I'd try here.
On one of my work's backup servers rdiff-backup is using excessive
amounts of memory (almost 2 GB used, out of an available 1 GB RAM and
1 GB swap).
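For anyone who wants to watch the same numbers on their own box, here is a minimal sketch of reading a process's memory figures from /proc on Linux. The function names are my own (hypothetical), and it assumes the kernel exposes the VmSize and VmRSS fields in /proc/<pid>/status:

```python
def parse_vm_fields(status_text, fields=("VmSize", "VmRSS")):
    """Extract selected memory fields (in kB) from /proc/<pid>/status text."""
    result = {}
    for line in status_text.splitlines():
        key, _, value = line.partition(":")
        if key in fields:
            # Lines look like "VmRSS:    123456 kB"; keep the number only.
            result[key] = int(value.split()[0])
    return result


def read_process_memory(pid):
    """Read VmSize/VmRSS (kB) for a running process; Linux only."""
    with open("/proc/%d/status" % pid) as handle:
        return parse_vm_fields(handle.read())
```

Calling read_process_memory() with the rdiff-backup PID every few minutes should show whether usage grows steadily or jumps at one particular directory.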
I've checked the rdiff-backup FAQ - My rdiff-backup version is
1.1.5-4, and librsync1 is 0.9.7-1, so I shouldn't have the librsync
memory leak mentioned there. These are Debian versions, I'm running
rdiff-backup on a Debian Etch system.
More information about my backup system:
I'm using a combination of rsync and rdiff-backup to preserve history
for millions of files (a backup server which backs up another backup
server).
For the sake of explanation:
backup1 - original backup server (pulls data from workstations, other
servers, etc)
backup2 - backup server which pulls data from backup1 and keeps history.
backup1 also uses a combination of rsync and rdiff-backup. So backup2
also backs up rdiff-backup metadata from backup1.
I use rdiff-backup to preserve history, but use rsync to fetch the
files (because it's easy to install DeltaCopy on Windows boxes. Also,
rsync has extra options I like to use).
I have a slightly convoluted backup logic - this is necessary as I
need to preserve disk space on the backup server (can't have a
temporary copy that is the same size as the backup store).
My logic goes something like this (on backup2):
1) I start off with a directory structure like this:
/data/backup/files/ - Backups are stored here
/data/backup/files/rdiff-backup-data - rdiff-backup stores its
metadata and history here
2) Make a temporary work directory:
/data/backup/new/
I create this directory with rsync - I tell it to create hardlinks to
the files under /data/backup/files/, but excluding the
rdiff-backup-data sub-directory
3) Use rsync to pull files from backup1 and update /data/backup/new/
(also deleting files from /data/backup/new/ which no longer exist on backup1)
Also, I preserve hardlinks (and numeric user IDs) at this stage,
because some of the backups are of servers and I want to be able to
restore them with rsync, preserving hardlinks, users, groups, etc.
4) Use rdiff-backup to push any changes from /data/backup/new/ to
/data/backup/files, and create history.
rdiff-backup gets a bit confused here because of occasionally-changing
inodes, so I give it a few options:
--preserve-numerical-ids --no-compare-inode --force
This is also the step where rdiff-backup uses excessive memory.
5) Remove the temporary /data/backup/new/ directory.
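To make steps 2 to 5 concrete, here is a rough sketch that only builds the command lines and prints them; it does not run anything. The rdiff-backup options are the ones listed in step 4. The rsync options (--link-dest, --exclude, -H, --numeric-ids, --delete) are my assumption of what matches the description, and the remote source "backup1::backups/" and the function name are placeholders:

```python
import shlex


def build_backup_commands(remote_src,
                          files_dir="/data/backup/files/",
                          work_dir="/data/backup/new/"):
    """Return the argv lists for steps 2-5, in order. Nothing is executed."""
    return [
        # Step 2: populate the work directory with hardlinks to the current
        # mirror, leaving out rdiff-backup's own metadata directory.
        ["rsync", "-a", "--link-dest=" + files_dir,
         "--exclude=/rdiff-backup-data/", files_dir, work_dir],
        # Step 3: pull from backup1, keeping hardlinks and numeric IDs and
        # deleting files that have disappeared from the source.
        ["rsync", "-aH", "--numeric-ids", "--delete", remote_src, work_dir],
        # Step 4: push the changes into the mirror and record history.
        ["rdiff-backup", "--preserve-numerical-ids", "--no-compare-inode",
         "--force", work_dir, files_dir],
        # Step 5: remove the temporary work directory.
        ["rm", "-rf", work_dir],
    ]


if __name__ == "__main__":
    # "backup1::backups/" is a placeholder rsync daemon module, not a real one.
    for argv in build_backup_commands("backup1::backups/"):
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Returning argv lists rather than shell strings means the same lists could later be handed to subprocess without worrying about quoting.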
Some more information about my memory problem. I ran rdiff-backup with
-v9 (verbose). Here are the last few lines logged:
=====OUTPUT STARTS=====
Mon May 5 06:49:03 2008 Touching
/data/backups/rrbackups/uberbackup1_mirror/files/rdiff-backup-data/increments/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/dwww.2007-10-11T02:01:47+02:00.dir.2007-10-04T15:54:38+02:00.missing
Mon May 5 06:49:03 2008 Renaming
/data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/rdiff-backup.tmp.538914
to /data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/dwww.2007-10-11T02:01:47+02:00.dir
Mon May 5 06:49:07 2008 Processing changed file
data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/mp3join.2007-10-09T01:47:48+02:00.dir
Mon May 5 06:49:07 2008 Regular copying ('data', 'backups',
'rrbackups', 'complete_server_bkps', '192.168.0.23', 'files',
'rdiff-backup-data', 'increments', 'var', 'www',
'mp3join.2007-10-09T01:47:48+02:00.dir') to
/data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/rdiff-backup.tmp.538915
Mon May 5 06:49:07 2008 Writing file object to
/data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/rdiff-backup.tmp.538915
Mon May 5 06:49:07 2008 Copying attributes from ('data', 'backups',
'rrbackups', 'complete_server_bkps', '192.168.0.23', 'files',
'rdiff-backup-data', 'increments', 'var', 'www',
'mp3join.2007-10-09T01:47:48+02:00.dir') to
/data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/rdiff-backup.tmp.538915
Mon May 5 06:49:07 2008 Setting time of
/data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/rdiff-backup.tmp.538915
to 1185779828
Mon May 5 06:49:07 2008 Incrementing mirror file
/data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/mp3join.2007-10-09T01:47:48+02:00.dir
Mon May 5 06:49:07 2008 Touching
/data/backups/rrbackups/uberbackup1_mirror/files/rdiff-backup-data/increments/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/mp3join.2007-10-09T01:47:48+02:00.dir.2007-10-04T15:54:38+02:00.missing
Mon May 5 06:49:07 2008 Renaming
/data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/rdiff-backup.tmp.538915
to /data/backups/rrbackups/uberbackup1_mirror/files/data/backups/rrbackups/complete_server_bkps/192.168.0.23/files/rdiff-backup-data/increments/var/www/mp3join.2007-10-09T01:47:48+02:00.dir
=====OUTPUT ENDS=====
(These are the real directory paths, unlike the simplified directory
names in my previous examples).
6 hours later, rdiff-backup was still busy with that last line. I took
a look at the directory it was busy with (rdiff-backup.tmp.538915).
Total size: 83M
Total files: 252220
Total sub-directories: 12744
It looks like most of the files are '.missing' rdiff-backup markers.
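Figures like the ones above can be gathered with something along these lines. This is only a sketch, not what I actually ran; the function name and the returned keys are made up for illustration:

```python
import os


def summarize_tree(root, marker_suffix=".missing"):
    """Count files, sub-directories, marker files and total bytes under root."""
    files = subdirs = markers = total_bytes = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not follow symlinked directories by default, but it
        # does list them in dirnames, so they are counted here as well.
        subdirs += len(dirnames)
        for name in filenames:
            try:
                # lstat measures a symlink itself instead of its target.
                size = os.lstat(os.path.join(dirpath, name)).st_size
            except OSError:
                # The file vanished while we were walking (backup running).
                continue
            files += 1
            total_bytes += size
            if name.endswith(marker_suffix):
                markers += 1
    return {"files": files, "subdirs": subdirs,
            "markers": markers, "bytes": total_bytes}
```

Comparing "markers" against "files" gives the share of '.missing' entries without having to eyeball a directory listing.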
Memory usage is as mentioned earlier (almost 2 GB).
Should I expect that kind of memory usage?
So, there you have it. Does anyone have suggestions for:
1) Improving my somewhat complicated backup system
2) Using rdiff-backup better, so I don't have to override its inode warnings
3) Solving the memory usage problem?
I welcome suggestions for different backup methods, but they need to
meet these criteria:
* Needs to handle large outlook mail folders efficiently.
* Needs to be usable for restoring servers
In other words:
* Need to use an efficient method to transfer deltas over the network
* Needs to be usable on Windows and Linux
* File history needs to be space efficient:
- Reverse deltas are ideal
- Hard links and compressed versions of old files won't work well
- In other words, backup software like BackupPC (which uses pools
of old file versions) probably won't work well.
* Permissions, ownerships, hardlinks, etc need to be preserved
Thanks in advance for any suggestions.
David.