rsync-backup.pl
NAME
rsync-backup.pl -- manage backups of remote systems via rsync
SYNOPSIS
rsync-backup.pl [ SWITCHES ] [ -f /path/to/config_file ] [ label label... ]
DESCRIPTION
rsync-backup.pl uses rsync over ssh to perform backups of remote systems. It supports multiple host definitions, allowing you to specify unique remote paths, exclusions, local backup targets and so on. You can even mount a filesystem before starting the backup, and unmount it upon completion (dangerous for multiuser environments, but handy for toasters).
rsync-backup.pl is designed to be run from cron, for multiple daily backups and optional archiving of daily, weekly and monthly snapshots. Snapshots are done via hard links, so disk usage is minimal, and since rsync only transfers changes since the last run, and uses compression to boot, bandwidth requirements are light too.
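To illustrate why hard-linked snapshots cost so little disk, here is a minimal Python sketch of roughly what a command like cp -al does. It is not taken from rsync-backup.pl, which is written in Perl and calls GNU cp. The function name hardlink_snapshot and the directory names in the usage note are invented for the example.

```python
import os


def hardlink_snapshot(src, dst):
    """Recreate the directory tree of src at dst, hard-linking every file.

    Directories are created fresh, regular files are hard-linked (so no
    file data is copied), and symlinks are recreated with the same target.
    """
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = dst if rel == "." else os.path.join(dst, rel)
        os.makedirs(target_dir, exist_ok=True)

        # os.walk lists symlinks to directories in `dirs` without descending
        # into them, so they have to be recreated here.
        for name in dirs:
            source = os.path.join(root, name)
            if os.path.islink(source):
                os.symlink(os.readlink(source), os.path.join(target_dir, name))

        for name in files:
            source = os.path.join(root, name)
            target = os.path.join(target_dir, name)
            if os.path.islink(source):
                os.symlink(os.readlink(source), target)
            else:
                os.link(source, target)  # new name, same inode
```

Calling hardlink_snapshot("working", "hourly.0") would leave both trees pointing at the same inodes. A file only starts consuming extra space once rsync replaces it in the working copy with a changed version.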
PREREQUISITES
This script requires the following packages:
- rsync, available from http://rsync.samba.org/
- perl 5.x
- File::Rsync
- Proc::ProcessTable (optional; for BSD systems)
- Sys::Syslog
- GNU cp
- GNU tar
Note: On BSD systems, install the coreutils port to get GNU cp and tar.
COMMAND-LINE SWITCHES
The following switches are supported:
-h        Display short help summary
-f        Path to the configuration file
-v        Enable verbose logging
-D        Enable debug logging (implies -v)
-t        Run configuration tests; no transfers
-n        The number of backup operations to perform at once
labels    Execute only the named backup configurations
CONFIGURATION
rsync-backup.pl requires a configuration file containing one or more "config blocks", each of which defines a remote host targeted for backup. Here's a sample config block:
example {
hostname eg.mydomain.com
path /
snapshots-hourly 4
snapshots-daily 7
snapshots-weekly 4
snapshots-monthly 1
snapshot-path /mnt/backups
excludes /backups/:/proc/:/dev/:tmp/:/usr/src/:/var/db/mysql/
mount-dev /dev/da0s1a
mount-point /mnt/backups
mount-type ufs
mount-flags -fu
mount-on-startup yes
umount-on-shutdown yes
create-tarballs yes
tarball-size 4000m
}
This block tells rsync-backup.pl to back up the entire contents of host 'eg.mydomain.com' 4 times daily. This configuration would preserve one monthly backup, plus the most recent 4 weeks and the last 7 days. It would be advisable with this setup to archive the monthly backup to permanent media, before it is overwritten the following month. If your snapshots are small (or your disks are large), you might save 12 monthly backups, giving you a year's history at a glance.
You may have as many such config blocks in your config file as you like; rsync-backup.pl will process each one in turn. Note that each block must begin with a label, used to identify this backup configuration.
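For readers who want to inspect or validate a config file outside the script, the block layout can be read with a few lines of Python. This is a sketch based only on the sample block above; it is not the parser rsync-backup.pl itself uses, and it assumes one "key value" pair per line, an opening line of the form "label {", and a closing "}" on its own line. The name parse_config is made up for the example.

```python
def parse_config(text):
    """Parse 'label { key value ... }' blocks into {label: {key: value}}."""
    blocks = {}
    label = None  # label of the block currently being read, if any
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue
        if label is None:
            # Outside a block: the only thing allowed is 'label {'.
            if not line.endswith("{"):
                raise ValueError(f"line {lineno}: expected 'label {{'")
            label = line[:-1].strip()
            if not label:
                raise ValueError(f"line {lineno}: block has no label")
            blocks[label] = {}
        elif line == "}":
            label = None
        else:
            # Inside a block: first word is the option, the rest its value.
            parts = line.split(None, 1)
            blocks[label][parts[0]] = parts[1] if len(parts) > 1 else ""
    if label is not None:
        raise ValueError(f"block '{label}' is never closed")
    return blocks
```

Every value comes back as a string, so an option such as snapshots-hourly still needs converting with int() before use.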
A description of the configuration options follows:
- hostname
The fully-qualified domain name, or IP address, of the host to back up. The host must have rsync installed, and be configured to allow the ssh user to run rsync. See SSH and rsync, below.
- path
The path on the remote host you wish to back up.
- snapshots-hourly
Save at most this many snapshots of the remote host in the hourly/ directory. You must run rsync-backup.pl at least this many times per day. If you run it more times than you've specified here, the oldest hourly snapshot will be removed.
- snapshots-daily
The number of daily snapshots to retain. Each time the script runs, it checks to see when it last ran -- if it was any day other than today, the most recent hourly snapshot is moved into the daily/ directory. If more than snapshots-daily snapshots already exist, the oldest is removed.
- snapshots-weekly
The number of weekly snapshots to retain. Weekly snapshots are created by the first run of the script on Sundays, by moving the newest existing daily snapshot into the weekly/ directory. If more than snapshots-weekly snapshots already exist, the oldest is removed.
- snapshots-monthly
The number of monthly snapshots to retain. Monthly snapshots are created on the first of the month, by moving the newest weekly snapshot into the monthly/ directory. If more than snapshots-monthly snapshots already exist, the oldest is removed.
- snapshot-path
The local path where snapshots will be stored. The backups for each host will be created as subdirectories inside snapshot-path, using the label from the config block. So in the example above, the backups for eg.mydomain.com would be created in /mnt/backups/example.
- excludes
A colon (:)-separated list of path names rsync should not attempt to back up. Sensible things to include on this list are open database files, tmp directories, and so forth.
- mount-dev, mount-point
If values are given for both, rsync-backup.pl will try to verify that the specified device is mounted on the specified mount point before beginning the backup sequence.
- mount-type
Optional; if specified along with mount-dev and mount-point, rsync-backup.pl will only launch the backup sequence if the specified mount's filesystem matches mount-type.
This value is also passed to mount(1) when attempting to mount filesystems (see mount-on-startup).
- mount-flags
Optional additional flags to pass to mount(1), e.g., -u.
- mount-on-startup
Optional; if set to yes, try to mount mount-dev on mount-point, using the mount-type and mount-flags, if specified, before launching rsync. This attempt is only made if the filesystem in question isn't already mounted. Ignored unless both mount-dev and mount-point are defined.
- umount-on-shutdown
Optional; if set to yes, rsync-backup.pl will unmount the filesystem specified by mount-point when the backup sequence is complete.
- create-tarballs
Optional; if set to yes, when rsync-backup.pl creates a monthly snapshot, it will also create a gzipped tar file of that snapshot, and place it in the tarballs/ directory. If the resulting archive is larger than the value of tarball-size, the archive will be split into chunks of tarball-size. See split(1).
- tarball-size
The size, in bytes, of the chunks a tarball should be split into. If a 'k' is appended to the value, tarball-size is interpreted as kilobytes. If an 'm' is appended to the value, tarball-size is interpreted as megabytes.
Note: A tarball-size of 4000m would be a good size for writing DVD-Rs.
- use-rsyncd
Optional; if set to "yes", rsync-backup.pl will attempt to contact the host via rsyncd on the default port (873), rather than using SSH to initiate the connection.
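The aging rules described under snapshots-daily, snapshots-weekly and snapshots-monthly can be summarized in a short Python sketch. It only models the decisions, not the file moves, and it is one reading of the descriptions above rather than the script's actual code. In particular, it assumes monthly promotion also happens only on the first run of the day. The names promotions and prune are invented for the example.

```python
from datetime import date


def promotions(last_run, today):
    """Return the snapshot promotions the current run would perform.

    last_run and today are datetime.date values for the previous run
    and the current run.
    """
    if last_run == today:
        return []  # not the first run of the day: hourly snapshot only
    steps = ["hourly -> daily"]
    if today.weekday() == 6:  # Monday is 0, so 6 is Sunday
        steps.append("daily -> weekly")
    if today.day == 1:
        steps.append("weekly -> monthly")
    return steps


def prune(snapshots, keep):
    """Split a newest-first list of snapshots into (kept, removed)."""
    return snapshots[:keep], snapshots[keep:]
```

For example, promotions(date(2010, 1, 2), date(2010, 1, 3)) covers the first run on a Sunday, so it reports both the daily and the weekly promotion.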
USAGE NOTES
Local Configuration
Before running rsync-backup.pl, edit the script and alter the values of the *_cmd variables to match your specific system layout. The defaults are:
my $cp_cmd     = '/usr/local/bin/cp -alf';
my $touch_cmd  = '/usr/bin/touch';
my $ssh_cmd    = '/usr/bin/ssh';
my $mount_cmd  = '/sbin/mount';
my $umount_cmd = '/sbin/umount';
my $tar_cmd    = '/usr/local/bin/bin/tar';
my $find_cmd   = '/usr/bin/find';
cron
rsync-backup.pl is designed to run from cron. Furthermore, to properly manage weekly and monthly snapshots, the script needs to run at least on Sundays, and on the first of every month. Thus it is recommended that you create a cron job to run the script daily, as many times as is needed by the highest value of snapshots-hourly in your config file. For example, the config block shown above would suggest the following crontab entry:
0 0,6,12,18 * * * /path/to/rsync-backup.pl -f conf_file
See crontab(5) for details.
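A crontab entry takes five time fields (minute, hour, day of month, month, day of week) followed by the command. As a rough aid, the Python sketch below builds an entry that spreads a given number of daily runs evenly across the day, on the hour. The helper name cron_line is hypothetical, and restricting the run count to divisors of 24 is a simplification made for the example.

```python
def cron_line(runs_per_day, command):
    """Build a crontab entry running `command` at evenly spaced hours."""
    if not 1 <= runs_per_day <= 24 or 24 % runs_per_day:
        raise ValueError("runs_per_day must be between 1 and 24 and divide 24")
    step = 24 // runs_per_day
    hours = ",".join(str(hour) for hour in range(0, 24, step))
    # Fields: minute, hour, day of month, month, day of week, command.
    return f"0 {hours} * * * {command}"
```

With runs_per_day set to the largest snapshots-hourly value in your config file, the script runs often enough to fill the hourly/ directory each day.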
Logging
As of version 2.0, a separate log file (called backup.log) is created in the snapshot directory for each host. The log files are truncated at each run, so no rotating is necessary.
By default no output is sent to STDOUT. You may override this behaviour by specifying the -v switch; this will cause a copy of entries in each host's log file to be echoed to STDOUT.
Also new with version 2.0 is support for logging to syslog via Sys::Syslog. By default only errors are directed there; this can be overridden by enabling debug output (via the -D switch). Doing so will cause all output to be copied to syslog in addition to STDOUT/STDERR and the host log files, as well as enabling various debug-only messages.
Fatal errors are always sent to syslog and to STDERR.
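The routing rules above can be condensed into a small decision function. The Python sketch below is only one reading of this section, not the script's logging code. The level names ("debug", "info", "error", "fatal"), the destination labels, and the function name destinations are all assumptions made for illustration.

```python
LEVELS = ("debug", "info", "error", "fatal")


def destinations(level, verbose=False, debug=False):
    """Return the set of places a message of the given level would go.

    verbose corresponds to the -v switch and debug to the -D switch.
    """
    if level not in LEVELS:
        raise ValueError(f"unknown level: {level}")
    verbose = verbose or debug  # -D implies -v
    if level == "debug" and not debug:
        return set()  # debug-only messages are dropped without -D
    out = {"hostlog"}  # the per-host backup.log
    if verbose:
        out.add("stdout")
    if debug or level in ("error", "fatal"):
        out.add("syslog")
    if level == "fatal":
        out.add("stderr")
    return out
```

Under these assumptions, an ordinary message with no switches lands only in the host's backup.log, while a fatal error also reaches syslog and STDERR.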
SSH and rsync
Unless you want to hang around and enter a password every time rsync-backup.pl launches rsync to back up a remote host, you're going to want to use key-based (public-key) authentication for the ssh user.
Additionally, if you want to do full system backups with rsync, you're probably going to need to run rsync-backup.pl as root, and allow root to ssh into the remote host and run rsync. Allowing remote logins by root can be dangerous, however. What follows is an overview of my solution to this problem; I strongly recommend you familiarize yourself with the security implications of this setup before blindly charging forth. The author will accept no responsibility for your being foolish, yadda yadda yadda.
- Allow root SSH for authorized commands only
To do this, simply set PermitRootLogin to forced-commands-only in your remote host's sshd_config. Now the root user will be permitted to log in via SSH, but may only execute the command you specify in the authorized_keys file.
- Configure root's authorized commands
Edit root's authorized_keys file on the remote host, and modify the line containing your backup host's key thusly:
command="/root/bin/ssh_allowed.sh" ssh-dss ... root@backup-host
This will force every root login from backup-host to run the shell script ssh_allowed.sh. By interrogating the $SSH_ORIGINAL_COMMAND environment variable in this script, we can decide whether or not to permit the command to be executed. Here's a simple ssh_allowed.sh:
#!/bin/sh
#
# spawned by ssh to execute valid commands remotely
#
case "$SSH_ORIGINAL_COMMAND" in
    *\&*)
        echo "Rejected"
        ;;
    *\;*)
        echo "Rejected"
        ;;
    rsync\ --server\ --sender\ -logDtprRz\ .\ /*)
        $SSH_ORIGINAL_COMMAND
        ;;
    *)
        echo "$SSH_ORIGINAL_COMMAND" >> /var/log/root_ssh_rejected.log
        echo "Rejected"
        ;;
esac
Note: depending on your calling parameters and rsync version, the exact sequence of arguments on the rsync --server line may or may not match this example; if your rsyncs are failing, check the rejected log to see what args are being passed and modify the script accordingly.
And of course, ensure your ssh_allowed.sh's permissions are set to 500.
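When adjusting the allowed pattern, it can help to test candidate command strings before touching the remote host. The Python sketch below mirrors the order of checks in the ssh_allowed.sh example (reject anything containing '&' or ';', then require the rsync --server prefix), using fnmatch as a stand-in for shell case patterns. The names ALLOWED and is_allowed are invented here, and the pattern must be kept in step with whatever your own script uses.

```python
from fnmatch import fnmatchcase

# Same pattern as the rsync branch of the case statement, unescaped.
ALLOWED = "rsync --server --sender -logDtprRz . /*"


def is_allowed(command):
    """Return True if the case statement would execute this command."""
    if "&" in command or ";" in command:
        return False  # the first two case branches reject these outright
    return fnmatchcase(command, ALLOWED)
```

Feeding it a line from the rejected log shows quickly whether a revised pattern would let that command through.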
Restoring a split tarball
If you find yourself in the position of needing to restore a backup from a tarball which has been split into chunks, simply copy all the pieces of the tarball into a directory, and execute:
% cat tarball.tgz_* | gnu-tar --preserve -xzf -
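If you would rather check the pieces before extracting anything, the same concatenation can be done in Python. This is a sketch, not part of rsync-backup.pl. It assumes the chunks carry split(1)'s default alphabetical suffixes, so that sorting the names restores the original order, and the function name join_chunks is made up for the example.

```python
import glob
import shutil


def join_chunks(pattern, joined_path):
    """Concatenate all files matching `pattern`, in name order, into joined_path.

    Returns the list of chunk paths that were joined.
    """
    chunks = sorted(glob.glob(pattern))
    if not chunks:
        raise FileNotFoundError(f"no chunks match {pattern}")
    with open(joined_path, "wb") as out:
        for chunk in chunks:
            with open(chunk, "rb") as part:
                shutil.copyfileobj(part, out)  # streams, so large chunks are fine
    return chunks
```

After join_chunks("tarball.tgz_*", "joined.tgz"), the result can be listed with tarfile.open("joined.tgz", "r:gz").getnames() to confirm the archive is intact. Pick a joined_path that does not itself match the pattern, or a second run will pick it up as a chunk.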
Backing up mysql databases
Trying to rsync mysql databases while mysql is running on the remote host will result in broken tables (and kvetching lusers). It is recommended the remote host run mysqlhotcopy from a cron job some time before the rsync backup is scheduled, such that rsync can back up copies of the databases rather than the databases themselves. Such a crontab entry might look like this:
2 3 * * * mysqlhotcopy --addtodest -u user --password=... dbname /path/to/backups
Consult the mysql documentation for details.
CAVEATS
At this time, rsync-backup.pl has no brains for checking disk space before engaging in possibly-dangerous things like creating multiple gigantic tarballs of whole filesystems. Be thou therefore careful with thine tars.
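Until the script grows such a check, a wrapper can do a rough one before create-tarballs is enabled. The Python sketch below is an external helper, not part of rsync-backup.pl. It uses the uncompressed size of a snapshot as a pessimistic estimate of the tarball's size, and the names tree_size and has_room are invented for the example.

```python
import os
import shutil


def tree_size(path):
    """Sum the sizes, in bytes, of all files under path (symlinks not followed)."""
    total = 0
    for root, dirs, files in os.walk(path):
        for name in files:
            total += os.lstat(os.path.join(root, name)).st_size
    return total


def has_room(target_dir, needed_bytes):
    """Return True if the filesystem holding target_dir has needed_bytes free."""
    return shutil.disk_usage(target_dir).free >= needed_bytes
```

A call such as has_room("/mnt/backups", tree_size(snapshot_dir)), with snapshot_dir pointing at the monthly snapshot about to be archived, gives a conservative answer, since a gzipped tarball is usually smaller than the tree it was made from.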
VERSION
This is version 2.1 of rsync-backup.pl.
CHANGES SINCE 2.0
- fixed a bug where logging was attempted before logfile's path existed.
CHANGES SINCE 1.7
- rsync-backup now forks to execute multiple backups simultaneously.
- only tries to load Proc::ProcessTable on bsd-like systems.
- tarball creation and unmounts now happen per-label, to play nice with threads.
- verbose/debug logging reimplemented, including syslog support
- added -n switch to limit number of child processes
- added -D switch to enable debug output
CHANGES SINCE 1.6
- Added support for multiple paths.
- Added support for rsyncd servers with the use-rsyncd flag.
- Added support for the bandwidth-limit flag.
CHANGES SINCE 1.5
- Fixed a bug where args in the *_cmd variables would be stripped
- Improved rsync error message parsing
CHANGES SINCE 1.4
- The recreate-symlinks business introduced in 1.3 was slow (!) and prone to breakage; we now use rsync to copy symbolic links from the working directory to the snapshot directory.
- Added test mode and the -v switch
CHANGES SINCE 1.3
Modified behaviour of the mount options. We can now:
- mount a filesystem, do the backup, then umount the filesystem;
- verify a filesystem is mounted before the backup;
- mount a filesystem if it isn't already mounted;
- unmount a filesystem when the backup is complete, regardless of whether or not we mounted it.
CHANGES SINCE 1.2
- Since one cannot create a hard link of a symlink, snapshots contained none of the symlinks in the working directory. To resolve this, rsync-backup.pl now does a find of all symlinks in the working directory and recreates them in the newly-created hourly snapshot. The mtime, modes and ownership of the symlinks are all preserved.
CHANGES SINCE 1.1
- Changed the aging scheme to move the newest snapshot from daily to weekly, and weekly to monthly. Prior versions moved the oldest, which seems dumb.
CHANGES SINCE 1.0
- Added support for labels on the command-line
- Minor additions to documentation
- Removed unused '$date' variable
TO DO
- bandwidth-aware threading
- automagick writing of monthly tarballs to cd/dvd would rock
AUTHOR
rsync-backup.pl was written by Greg Boyington < greg [at] automagick.us >.
ACKNOWLEDGEMENTS
The basic structure of the backup scheme isn't mine; it belongs to Stu
Sheldon, < stu [at] actusa.net >, whose mirror script I found linked on
Mike Rubel's excellent article, "Easy Automated Snapshot-Style Backups
with Linux And Rsync." You can read the article here:
http://www.mikerubel.org/computers/rsync_snapshots/
License
All source code, tools and scripts on http://automagick.us is Copyright © 2007 - 2010 Greg Boyington, and licensed under a Creative Commons Attribution-Share Alike 3.0 United States License, except where otherwise noted.
