Choosing between Red Hat Enterprise Linux backup utilities

Determining a backup schedule and which Linux utilities, such as tar or dump/restore, to use in your customer's environment can be crucial decisions in case of a disaster.

Solution provider takeaway: The Linux backup utilities or backup schedule that you choose for your customers can be important for ensuring that their data is safe during a disaster. Backup utilities can provide help for tasks such as bundling files for distribution to other sites.

One of the most neglected tasks of system administration is making backup copies of files on a regular basis. The backup copies are vital in three instances: when the system malfunctions and files are lost, when a catastrophic disaster such as a fire or earthquake occurs and when a user or the system administrator deletes or corrupts a file by accident. Even when you set up RAID, you still need to back up files. Although RAID provides fault tolerance—helpful in the event of disk failure—it does not help when a disaster occurs or when a file is corrupted or accidentally removed. It is a good idea to have a written backup policy and to keep copies of backups in a fireproof vault or safe located in another building, at home or at a completely different facility or campus.

The time to start thinking about backups is when you partition the disk. Make sure the capacity of the backup device and your partition sizes are comparable. Although you can back up a partition onto multiple volumes, it is easier not to do that and much easier to restore data from a single volume.

You must back up filesystems on a regular basis. Backup files are usually kept on magnetic tape or some other removable media. Exactly how often you should back up which files depends on the system and your needs.

Use this criterion when determining a backup schedule: If the system crashes, how much work are you willing to lose? Ideally you would back up all files on the system every few minutes so you would never lose more than a few minutes of work.

But there is a tradeoff: How often are you willing to back up the files? The backup procedure typically slows down the system for other users, it takes a certain amount of your time, and it requires that you have and store the tape or disk media holding the backup.

Avoid backing up an active filesystem. The results may be inconsistent, and restoring from the backup may be impossible. This requirement is a function of the backup program and the filesystem you are backing up.

Another question is when to run the backup. Unless you plan to kick users off and bring the system down to single-user mode, you want to perform this task when the machine is at its quietest. Depending on the use of the system, sometime in the middle of the night can work well. Then the backup is least likely to affect users, and the files are not likely to change as they are being read for backup.

A full backup makes copies of all files, regardless of when they were created or accessed. An incremental backup makes copies of those files that have been created or modified since the last—usually full—backup.

The more people using the system, the more often you should back up the filesystems. One popular schedule is to perform an incremental backup one or two times a day and a full backup one or two times a week.
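As a rough sketch of such a schedule, the following shell function chooses between a full and an incremental backup based on the day of the week. The function name backup_kind and the choice of Sunday for the full backup are assumptions made for illustration; they are not part of any backup utility.

```shell
# Hypothetical helper: decide which kind of backup to run on a given weekday.
# $1 is the ISO weekday number as printed by `date +%u` (1 = Monday ... 7 = Sunday).
backup_kind() {
    case "$1" in
        7) echo full ;;              # assumed policy: full backup on Sunday
        [1-6]) echo incremental ;;   # incremental backup on every other day
        *) echo "invalid weekday: $1" >&2; return 1 ;;
    esac
}

# Report what today's backup would be under this assumed policy.
echo "Today's backup: $(backup_kind "$(date +%u)")"
```

A job scheduler such as cron could call a script built around a function like this one during the quiet hours discussed earlier.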

Choosing a Backup Medium

If the local system is connected to a network, you can write your backups to a tape drive on another system. This technique is often used with networked computers to avoid the cost of having a tape drive on each computer in the network and to simplify management of backing up many computers in a network.

Most likely you want to use a tape system for backups. Because tape drives hold many gigabytes of data, using tape simplifies the task of backing up the system, making it more likely that you will take care of this important task regularly.

Other options for holding backups are writable CDs, DVDs and removable hard disks. These devices, although not as cost-effective or able to store as much information as tape systems, offer convenience and improved performance over using tapes.

Backup Utilities

A number of utilities can help you back up the system, and most work with any media. Most Linux backup utilities are based on one of the archive programs (tar or cpio) and augment these basic programs with bookkeeping support for managing backups conveniently.

You can use any of the tar, cpio, or dump/restore utilities to construct full or partial backups of the system. Each utility constructs a large file that contains, or archives, other files. In addition to file contents, an archive includes header information for each file it holds.

This header information can be used when extracting files from the archive to restore file permissions and modification dates. An archive file can be saved to disk, written to tape or shipped across the network while it is being created.

In addition to helping you back up the system, these programs offer a convenient way to bundle files for distribution to other sites. The tar program is often used for this purpose, and some software packages available on the Internet are bundled as tar archive files.

The Advanced Maryland Automatic Network Disk Archiver (AMANDA) utility, one of the more popular backup systems, uses dump or tar and takes advantage of Samba to back up Windows systems. The amanda utility backs up a LAN of heterogeneous hosts to a single tape drive. You can use yum to install amanda; refer to the amanda man page for details.

tar: Archives Files

The tar (tape archive) utility stores and retrieves files from an archive and can compress the archive to conserve space. If you do not specify an archive device, tar uses standard output and standard input. With the -f option, tar uses the argument to -f as the name of the archive device. You can use this option to refer to a device on another system on the network. Although tar has many options, you need only a few in most situations. The following command displays a complete list of options:

# tar --help | less

Most options for tar can be given either in a short form (a single letter) or as a descriptive word. Descriptive-word options are preceded by two hyphens, as in --help. Single-letter options can be combined into a single command-line argument and do not need to be preceded by a hyphen. For consistency with other utilities, it is good practice to use the hyphen anyway.

Although the following two commands look quite different, they specify the same tar options in the same order. The first version combines single-letter options into a single command-line argument. The second version uses descriptive words for the same options:

# tar -ztvf /dev/st0
# tar --gzip --list --verbose --file /dev/st0

Both commands tell tar to generate a (v, verbose) table of contents (t, list) from the tape on /dev/st0 (f, file), using gzip (z, gzip) to decompress the files. Unlike the original UNIX tar utility, the GNU version strips the leading / from absolute pathnames.
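To try the two forms without a tape drive, the following sketch builds a small gzip-compressed archive in a scratch directory and lists it both ways. It assumes GNU tar; the directory and file names are made up for the demonstration.

```shell
# Build a small gzip-compressed archive in a scratch directory, then list it
# with the short and the long option forms and compare the two listings.
workdir=$(mktemp -d)
mkdir -p "$workdir/src"
echo "hello" > "$workdir/src/a.txt"
echo "world" > "$workdir/src/b.txt"

# -C changes to the source directory first, so member names are relative.
tar -czf "$workdir/demo.tar.gz" -C "$workdir/src" .

short_listing=$(tar -ztvf "$workdir/demo.tar.gz")
long_listing=$(tar --gzip --list --verbose --file "$workdir/demo.tar.gz")

if [ "$short_listing" = "$long_listing" ]; then
    echo "short and long option forms produce the same listing"
fi
rm -rf "$workdir"
```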

The options in Table 16-1 tell the tar program what to do. You must include exactly one of these options in a tar command.

Table 16-1: The tar utility

Option              Effect
--append (-r)       Appends files to an archive
--catenate (-A)     Adds one or more archives to the end of an existing archive
--create (-c)       Creates a new archive
--delete            Deletes files in an archive (not on tapes)
--diff (-d)         Compares files in an archive with disk files
--extract (-x)      Extracts files from an archive
--help              Displays a help list of tar options
--list (-t)         Lists the files in an archive
--update (-u)       Like the -r option, but the file is not appended if a newer version is already in the archive

The -c, -t, and -x options are used most frequently. You can use many other options to change how tar operates. The -j option, for example, compresses or decompresses the file by filtering it through bzip2.
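The following sketch exercises the most common operations against an ordinary file instead of a tape: it creates an archive, lists it, compares it with the files on disk and extracts it into a separate directory. It assumes GNU tar (the compare operation is not available in every tar implementation), and all paths are scratch names invented for the example.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src/docs" "$workdir/restored"
echo "first"  > "$workdir/src/docs/one.txt"
echo "second" > "$workdir/src/two.txt"

archive="$workdir/backup.tar"

# -c creates the archive; -C makes member names relative to the source directory.
tar -cf "$archive" -C "$workdir/src" .

# -t lists the members without extracting them.
member_count=$(tar -tf "$archive" | wc -l)

# -d compares archive members with the files on disk; exit status 0 means no differences.
if tar -df "$archive" -C "$workdir/src"; then
    diff_status=same
else
    diff_status=different
fi

# -x extracts the members, here into a separate directory.
tar -xf "$archive" -C "$workdir/restored"
restored_text=$(cat "$workdir/restored/docs/one.txt")

echo "members: $member_count, compare: $diff_status, restored: $restored_text"
rm -rf "$workdir"
```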

cpio: Archives Files

The cpio (copy in/out) program is similar to tar but can use archive files in a variety of formats, including the one used by tar. Normally cpio reads the names of the files to insert into the archive from standard input and produces the archive file as standard output. When extracting files from an archive, cpio reads the archive as standard input.

As with tar, some options can be given in both a short, single-letter form and a more descriptive word form. However, unlike tar, the syntax of the two forms differs when the option must be followed by additional information.

In the short form, you must include a space between the option and the additional information; with the word form, you must separate the two with an equal sign and no spaces.

Running cpio with ––help displays a full list of options.
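The difference between the two forms can be seen with the archive format option, which takes an argument. The sketch below writes the same small directory twice, once with -H newc and once with --format=newc, and compares the listings. It assumes GNU cpio is installed (the script reports when it is not), and the scratch paths are invented for the example.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/src/sub"
echo "sample" > "$workdir/src/sub/file.txt"

if command -v cpio >/dev/null 2>&1; then
    # Short form: a space separates -H from its argument.
    (cd "$workdir/src" && find . -depth | cpio -o -H newc) > "$workdir/short.cpio" 2>/dev/null
    # Word form: an equal sign, with no spaces, joins the option and its argument.
    (cd "$workdir/src" && find . -depth | cpio -o --format=newc) > "$workdir/long.cpio" 2>/dev/null

    # -i with -t lists an archive read from standard input without extracting it.
    short_names=$(cpio -it < "$workdir/short.cpio" 2>/dev/null)
    long_names=$(cpio -it < "$workdir/long.cpio" 2>/dev/null)

    if [ -n "$short_names" ] && [ "$short_names" = "$long_names" ]; then
        result=identical
    else
        result=different
    fi
else
    result="cpio not installed"
fi
echo "archive listings: $result"
rm -rf "$workdir"
```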

Performing a Simple Backup

When you prepare to make a major change to a system, such as replacing a disk drive or updating the Linux kernel, it is a good idea to archive some or all of the files so you can restore any that become damaged if something goes wrong. For this type of backup, tar or cpio works well. For example, if you have a SCSI tape drive as device /dev/st0 that is capable of holding all the files on a single tape, you can use the following commands to construct a backup tape of the entire system:

# cd /
# tar -cf /dev/st0 .

All of the commands in this section start by using cd to change to the root directory so you are sure to back up the entire system. The tar command then creates an archive (c) on the device /dev/st0 (f). If you would like to compress the archive, replace the preceding tar command with the following command, which uses j to call bzip2:

# tar -cjf /dev/st0 .

You can back up the system with a combination of find and cpio. The following commands create an output file and set the I/O block size to 5120 bytes (the default is 512 bytes):

# cd /
# find . -depth | cpio -oB > /dev/st0

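The next command restores the files in the /home directory from the preceding backup. The options extract files from an archive (-i) in verbose mode (-v), keeping the modification times (-m) and creating directories as needed (-d). The pattern is quoted so the shell passes it to cpio unchanged; it must match the pathnames as they are stored in the archive, which you can check by listing the archive with cpio -it.

# cd /
# cpio -ivmd '/home/*' < /dev/st0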

Excluding Some Directories from a Backup

In practice, you will likely want to exclude some directories from the backup process. For example, not backing up /tmp or /var/tmp (or its link, /usr/tmp) can save room in the archive. Also, do not back up the files in /proc. Because the /proc filesystem is not a disk filesystem but rather a way for the Linux kernel to provide information about the operating system and system memory, you need not back up /proc—you cannot restore it later. You do not need to back up filesystems that are mounted from disks on other systems in the network.

Do not back up FIFOs; the results are unpredictable. If you plan on using a simple method, similar to those just discussed, create a file naming the directories to exclude from the backup and use the appropriate option with the archive program to read the file.
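With tar, one way to do this is the -X (--exclude-from) option, which reads exclusion patterns from a file. The sketch below assumes GNU tar and uses a scratch directory as a stand-in for the root filesystem; the file name exclude.list and the directory contents are made up for the demonstration.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/root/etc" "$workdir/root/tmp" "$workdir/root/proc"
echo "config"  > "$workdir/root/etc/app.conf"
echo "scratch" > "$workdir/root/tmp/junk"
echo "virtual" > "$workdir/root/proc/fake"

# One pattern per line, written the way the names appear in the archive.
cat > "$workdir/exclude.list" <<'EOF'
./tmp
./proc
EOF

# -X (--exclude-from) reads the exclusion patterns from the named file.
tar -cf "$workdir/backup.tar" -X "$workdir/exclude.list" -C "$workdir/root" .

# The listing should show the etc hierarchy but neither excluded directory.
listing=$(tar -tf "$workdir/backup.tar")
echo "$listing"
rm -rf "$workdir"
```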

Although all of the archive programs work well for such simple backups, utilities such as AMANDA provide more sophisticated backup and restore systems. For example, to determine whether a file is in an archive, you must read the entire archive. If the archive is split across several tapes, this process is particularly tiresome. More sophisticated utilities, including AMANDA, assist you in several ways, including keeping a table of contents of the files in a backup.

dump, restore: Back Up and Restore Filesystems

The dump utility, which first appeared in Unix version 6, backs up either an entire filesystem or only those files that have changed since the last dump. The restore utility restores an entire filesystem, an individual file or a directory hierarchy. You will get the best results if you perform a backup on a quiescent system so that the files are not changing as you make the backup.

The next command backs up all files—including directories and special files— on the root (/) partition to SCSI tape 0. Frequently there is a link to the active tape drive, named /dev/tape, which you can use in place of the actual entry in the /dev directory.

# dump -0uf /dev/st0 /

The 0 option specifies that the entire filesystem is to be backed up (a full backup). There are ten dump levels: 0 through 9. Zero is the highest (most complete) level and always backs up the entire filesystem. Each additional level is incremental with respect to the level above it.

For example, 1 is incremental to 0 and backs up only files that have changed since the last level 0 dump, and 2 is incremental to 1 and backs up only files that have changed since the last level 1 dump and so on. You can construct a very flexible schedule using this scheme. You do not need to use sequential numbers for backup levels. You can perform a level 0 dump, followed by level 2 and 5 dumps.
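The rule behind these levels can be sketched in a few lines of shell: a new dump is incremental to the most recent earlier dump that has a lower level number. The function name incremental_base is hypothetical, and the sketch only models the bookkeeping; it does not call dump or read /etc/dumpdates.

```shell
# Hypothetical helper: $1 is the level of a new dump; the remaining arguments
# are the levels of the earlier dumps in the order they were made. Prints the
# position (1-based) and level of the dump the new one is incremental to.
incremental_base() {
    new_level=$1
    shift
    position=0
    base=""
    for level in "$@"; do
        position=$((position + 1))
        # Later matches overwrite earlier ones, so the most recent one wins.
        if [ "$level" -lt "$new_level" ]; then
            base="dump $position (level $level)"
        fi
    done
    if [ -n "$base" ]; then
        echo "$base"
    else
        echo "none (no earlier dump with a lower level)"
    fi
}

# Earlier dumps: level 0, then level 2, then level 5.
incremental_base 5 0 2 5   # a new level 5 dump is relative to the level 2 dump
incremental_base 1 0 2 5   # a new level 1 dump is relative to the level 0 dump
```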

The u option updates the /etc/dumpdates file with filesystem, date, and dump level information for use by the next incremental dump. The f option and its argument write the backup to the device named /dev/st0.

The following command makes a partial backup containing all files that have changed since the last level 0 dump. The first argument is a 1, specifying a level 1 dump:

# dump -1uf /dev/st0 /

To restore an entire filesystem from a tape, first restore the most recent complete (level 0) backup. Perform this operation carefully because restore can overwrite the existing filesystem. When you are logged in as Superuser, cd to the directory the filesystem is mounted on and give this command:

# restore -if /dev/st0

The i option invokes an interactive mode that allows you to choose which files and directories to restore. As with dump, the f option specifies the name of the device that the backup medium is mounted on. When restore finishes, load the next lower level (higher-number) dump tape and issue the same restore command.

If multiple incremental dumps have been made at a particular level, always restore with the most recent one. You do not need to invoke restore with special arguments to restore an incremental dump. It will restore whatever appears on the tape.
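The order in which to load the tapes can be worked out from the sequence of dump levels. The following sketch, built around a hypothetical restore_order function, keeps a dump only if no later dump was made at the same or a lower level number, which is one way to express the rules above. It models the selection only and does not run restore.

```shell
# Hypothetical helper: the arguments are the levels of the dumps in the order
# they were made. Prints, oldest first, the dumps to restore.
restore_order() {
    chain=""      # entries "position:level", oldest first
    position=0
    for level in "$@"; do
        position=$((position + 1))
        kept=""
        for entry in $chain; do
            # A newer dump at the same or a lower level number supersedes this one.
            if [ "${entry#*:}" -lt "$level" ]; then
                kept="$kept $entry"
            fi
        done
        chain="$kept $position:$level"
    done
    for entry in $chain; do
        echo "dump ${entry%%:*} (level ${entry#*:})"
    done
}

# Dumps made at levels 0, 1, 2, 1 and 3, in that order.
restore_order 0 1 2 1 3
```

In this example the first level 1 dump and the level 2 dump are skipped because the later level 1 dump already covers everything that changed since the level 0 dump.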

You can also use restore to extract individual files from a tape by using the x option and specifying the filenames on the command line. Whenever you restore a file, the restored file will appear in the working directory. Before restoring files, make sure you are working in the correct directory.

The following commands restore the /etc/nsswitch.conf file from the tape on /dev/st0. The filename of the dumped file does not begin with / because all dumped pathnames are relative to the filesystem that you dumped—in this case /. Because the restore command is given from the / directory, the file will be restored to its original location of /etc/nsswitch.conf:

# cd /
# restore -xf /dev/st0 etc/nsswitch.conf

If you use the x option without specifying a file or directory name to extract, the entire dumped filesystem is extracted. Use the r option to restore an entire filesystem without using the interactive interface. The following command restores the filesystem from the tape on /dev/st0 to the working directory without interaction:

# restore -rf /dev/st0

You can also use dump and restore to access a tape drive on another system. Specify the argument to the f option as host:file, where host is the hostname of the system the tape drive is on and file is the name of the tape device on that system.

Occasionally, restore may prompt you with the following message:

You have not read any volumes yet.
Unless you know which volume your file(s) are on you should start
with the last volume and work towards the first.
Specify next volume #:

Enter 1 (one) in response to this prompt. If the filesystem spans more than one tape or disk, this prompt allows you to switch tapes.

When restore reaches the end of the dump, you will receive another prompt:

set owner/mode for '.'? [yn]

Answer y to this prompt when you are restoring entire filesystems or files that have been accidentally removed. Doing so will restore the appropriate permissions to the files and directories being restored. Answer n if you are restoring a dump to a directory other than the one it was dumped from. The working directory permissions and owner will then be set to those of the person doing the restore (typically root).

Various device names can access the /dev/st0 device. Each name accesses a different minor device number that controls some aspect of how the tape drive is used. After you complete a dump when you use /dev/st0, the tape drive automatically rewinds the tape to the beginning. Use the nonrewinding SCSI tape device (/dev/nst0) to keep the tape from rewinding on completion. This feature allows you to back up multiple filesystems to one volume. Following is an example of backing up a system where the /home, /usr, and /var directories reside on different filesystems:

# dump -0uf /dev/nst0 /home
# dump -0uf /dev/nst0 /usr
# dump -0uf /dev/st0 /var

The preceding example uses the nonrewinding device for the first two dumps. If you use the rewinding device, the tape rewinds after each dump, and you are left with only the last dump on the tape.
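The pattern generalizes to any list of filesystems. The sketch below only prints the commands, so it can be run safely without a tape drive; the function name print_dump_plan is hypothetical. Alongside each command it shows how many tape files would have to be skipped with mt fsf to reach that dump later.

```shell
# Hypothetical helper: print (without running) one level 0 dump command per
# filesystem, using the nonrewinding device for all but the last one.
print_dump_plan() {
    total=$#
    index=0
    for filesystem in "$@"; do
        if [ $((index + 1)) -lt "$total" ]; then
            device=/dev/nst0    # keep the tape positioned after this dump
        else
            device=/dev/st0     # last dump: let the tape rewind
        fi
        echo "dump -0uf $device $filesystem   # files to skip with mt fsf: $index"
        index=$((index + 1))
    done
}

print_dump_plan /home /usr /var
```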

You can use mt (magnetic tape), which is part of the mt-st package, to manipulate files on a multivolume dump tape. The following mt command positions the tape (fsf 2 instructs mt to skip forward past two files, leaving the tape at the start of the third file). It names the nonrewinding device so the tape stays in position when mt finishes. The restore command restores the /var filesystem from the previous example:

# mt -f /dev/nst0 fsf 2
# restore -rf /dev/nst0

This excerpt is from Mark Sobell's A Practical Guide to Fedora and Red Hat Enterprise Linux (5th Edition), published by Prentice Hall Professional. For more information, visit: www.informit.com/title/0137060882.
