Storage network backup: Server components

Storage network backup servers consist of a range of components. This excerpt from "Storage Networks Explained" discusses the main ones. Learn what they do and how they work.

Backup servers consist of a whole range of components. The following sections discuss the main ones: job scheduler (Section 7.3.1), error handler (Section 7.3.2), metadata database (Section 7.3.3) and media manager (Section 7.3.4).

7.3.1 Job scheduler

The job scheduler determines what data will be backed up when. It must be carefully configured; the actual backup then takes place automatically.

With the aid of job schedulers and tape libraries, many computers can be backed up overnight without the need for a system administrator to change tapes on site. Small tape libraries have one tape drive, a magazine with space for around ten tapes and a media changer that automatically moves the tapes back and forth between magazine and tape drive. Large tape libraries have several dozen tape drives, space for several thousand tapes and one or two media changers to insert the tapes in the drives.

7.3.2 Error handler

If a regular automatic backup of several systems has to be performed, it becomes difficult to monitor whether all automated backups have run without errors. The error handler helps to prioritize and filter error messages and generate reports. This avoids the situation in which problems in the backup are not noticed until a backup needs to be restored.
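
As a rough illustration of what prioritizing and filtering could look like, the following Python sketch drops messages below a severity threshold and lists the most severe ones first. The `Message` fields, the severity levels and the example messages are assumptions made for this sketch; they are not taken from any particular backup product.

```python
from dataclasses import dataclass

# Hypothetical severity levels; a higher number means more urgent.
SEVERITY = {"info": 0, "warning": 1, "error": 2}

@dataclass
class Message:
    client: str    # computer whose backup job produced the message
    severity: str  # one of the keys in SEVERITY
    text: str

def build_report(messages, min_severity="warning"):
    """Drop messages below min_severity and list the most severe first."""
    threshold = SEVERITY[min_severity]
    kept = [m for m in messages if SEVERITY[m.severity] >= threshold]
    # sorted() is stable, so messages of equal severity keep their order.
    return sorted(kept, key=lambda m: SEVERITY[m.severity], reverse=True)

messages = [
    Message("pc-01", "info", "backup completed"),
    Message("db-02", "error", "tape write failed"),
    Message("pc-07", "warning", "3 files skipped (in use)"),
]
report = build_report(messages)
for m in report:
    print(f"{m.severity.upper():8} {m.client}: {m.text}")
```

With a report of this kind, the failed job on `db-02` appears at the top instead of being buried among routine success messages.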

7.3.3 Metadata database

The metadata database and the media manager are two components that tend to be hidden. The metadata database is the brain of a network backup system. It contains the following entries for every backed-up object: name, computer of origin, date of last change, date of last backup, name of the backup medium, etc. For example, an entry is made in the metadata database for every file to be backed up.

The cost of the metadata database is worthwhile: in contrast to the backup tools provided by operating systems, network backup systems permit the incremental-forever strategy, in which a file system is fully backed up only in the first backup. In subsequent backups, only those files that have changed since the previous backup are backed up. The backup server can then calculate the current state of the file system by means of database operations on the original full backup and all subsequent incremental backups, so that no further full backups are necessary. These calculations in the metadata database are generally faster than a new full backup.
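
The calculation of the current state can be pictured as replaying the incremental backups over the first full backup. The following Python sketch shows the idea under simplifying assumptions: the metadata is reduced to a mapping from file path to backup medium, a deletion is recorded as `None`, and the function name `current_state` and the tape labels are hypothetical.

```python
def current_state(full_backup, incrementals):
    """Replay incremental backups over the initial full backup.

    full_backup:  dict mapping file path -> medium holding the file
    incrementals: list of dicts in chronological order, mapping
                  path -> medium for changed or new files, or
                  path -> None for files deleted on the client
    """
    state = dict(full_backup)
    for inc in incrementals:
        for path, medium in inc.items():
            if medium is None:
                state.pop(path, None)  # file no longer exists
            else:
                state[path] = medium   # newest copy supersedes older one
    return state

full = {"/home/a.txt": "tape-001", "/home/b.txt": "tape-001"}
incs = [
    {"/home/a.txt": "tape-002"},                       # a.txt changed
    {"/home/b.txt": None, "/home/c.txt": "tape-003"},  # b deleted, c created
]
print(current_state(full, incs))
```

Only metadata is touched here; no file contents are read, which is why such a calculation can be much cheaper than a new full backup.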

Even more is possible: if several versions of the files are backed up on the backup server, a whole file system or a subdirectory dated three days ago, for example, can be restored (point-in-time restore) – the metadata database makes it possible.
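
A point-in-time restore amounts to selecting, for every file, the newest version that was backed up on or before the requested date. The Python sketch below illustrates this with a hypothetical version history; the structure of `versions`, the dates and the tape labels are assumptions, and deleted files are ignored for brevity.

```python
from datetime import date

# Hypothetical version history: path -> list of (backup date, medium).
versions = {
    "/docs/report.txt": [(date(2007, 7, 1), "tape-001"),
                         (date(2007, 7, 4), "tape-002"),
                         (date(2007, 7, 6), "tape-003")],
    "/docs/new.txt":    [(date(2007, 7, 6), "tape-003")],
}

def point_in_time(versions, as_of):
    """Return path -> medium of the newest version not later than as_of."""
    result = {}
    for path, history in versions.items():
        candidates = [(d, medium) for d, medium in history if d <= as_of]
        if candidates:
            # max() compares the dates first, so it picks the latest version.
            result[path] = max(candidates)[1]
    return result

print(point_in_time(versions, date(2007, 7, 5)))
```

For 5 July the sketch selects the 4 July version of `report.txt` and omits `new.txt`, which had not yet been backed up at that time.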

7.3.4 Media manager

Use of the incremental-forever strategy can considerably reduce the time taken by the backup in comparison to the full backup. The disadvantage of this is that over time the backed up files can become distributed over numerous tapes. This is critical for the restoring of large file systems because tape mounts cost time. This is where the media manager comes into play. It can ensure that only files from a single computer are located on one tape. This reduces the number of tape mounts involved in a restore process, which means that the data can be restored more quickly.

A further important function of the media manager is so-called tape reclamation. As a result of the incremental-forever strategy, more and more data that is no longer needed is located on the backup tapes. If, for example, a file is deleted or changed very frequently over time, earlier versions of the file can be deleted from the backup medium. The gaps on the tapes that thus become free cannot be directly overwritten using current techniques. In tape reclamation, the media manager copies the remaining data that is still required from several tapes, of which only a certain percentage is used, onto a common new tape. The tapes that have thus become free are then added to the pool of unused tapes.
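
Tape reclamation can be sketched as follows: find the tapes whose share of still-required data has fallen below a threshold, copy that data onto one new tape and return the old tapes to the pool. The function `reclaim`, the capacity of 100 units and the threshold are assumptions for this Python sketch, and object names are assumed to be unique across tapes.

```python
def reclaim(tapes, capacity, threshold):
    """Consolidate sparsely used tapes onto one new tape.

    tapes: dict mapping tape label -> {object name: size} still required
    Returns (contents of the new tape, labels of the freed tapes).
    """
    sparse = [label for label, objs in tapes.items()
              if sum(objs.values()) / capacity < threshold]
    new_tape, freed, used = {}, [], 0
    for label in sparse:
        size = sum(tapes[label].values())
        if used + size > capacity:
            continue  # does not fit on the new tape; leave for a later run
        new_tape.update(tapes[label])
        used += size
        freed.append(label)
    return new_tape, freed

tapes = {
    "T1": {"a": 20, "b": 10},  # 30% still required
    "T2": {"c": 90},           # 90% still required, left alone
    "T3": {"d": 25},           # 25% still required
}
new_tape, freed = reclaim(tapes, capacity=100, threshold=0.4)
print(new_tape, freed)
```

In this example two partly used tapes are replaced by one new tape, so the pool of unused tapes grows by one.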

There is one further technical limitation in the handling of tapes: current tape drives can only write data to the tapes at a certain speed. If the data is transferred to the tape drive too slowly, the write process is interrupted: the tape rewinds a little and the write process restarts. The repeated rewinding of the tapes costs performance and causes unnecessary wear to the tapes, so they have to be discarded sooner. It is therefore better to send the data to the tape drive quickly enough for it to write the data onto the tape in one go (streaming).

The problem with this is that in network backup the backup clients send the data to be backed up via the LAN to the backup server, which forwards the data to the tape drive. On the way from backup client via the LAN to the backup server there are repeated fluctuations in the transmission rate, which means that the streaming of tape drives is repeatedly interrupted. Although individual clients can achieve streaming by additional measures such as the installation of a separate LAN between backup client and backup server (Section 7.7), these measures are expensive and technically not scalable at will, so they cannot be realized economically for all clients.

The solution: the media manager manages a storage hierarchy within the backup server. To achieve this, the backup server must be equipped with hard disks and tape libraries. If a client cannot send the data fast enough for streaming, the media manager first of all stores the data to be backed up to hard disk. When writing to a hard disk it makes no difference what speed the data is supplied at. When enough of the data to be backed up has been temporarily saved to the hard disk of the backup server, the media manager automatically moves large quantities of data from the hard disk of the backup server to its tapes. This process only involves recopying the data within the backup server, so that streaming is guaranteed when writing the tapes.
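
The staging behaviour described above can be modelled in a few lines of Python. In this sketch the class `StagingPool`, its byte-count threshold and the in-memory lists standing in for hard disk and tape are all assumptions; a real media manager would apply its own migration rules.

```python
class StagingPool:
    """Toy model of a disk buffer in front of a tape drive."""

    def __init__(self, migrate_threshold):
        self.migrate_threshold = migrate_threshold  # bytes buffered before migrating
        self.disk = []  # chunks received from clients, at any speed
        self.tape = []  # one entry per uninterrupted write to tape

    def receive(self, chunk):
        """Store client data on disk; migrate once enough has accumulated."""
        self.disk.append(chunk)
        if sum(len(c) for c in self.disk) >= self.migrate_threshold:
            self.migrate()

    def migrate(self):
        """Move everything buffered on disk to tape in one large write."""
        if self.disk:
            self.tape.append(b"".join(self.disk))
            self.disk.clear()

pool = StagingPool(migrate_threshold=10)
for chunk in (b"abcd", b"efgh", b"ijkl", b"mn"):
    pool.receive(chunk)
print(pool.tape, pool.disk)
```

The slow, irregular arrival of small chunks only affects the disk buffer; the tape sees a single large write, which is what allows the drive to stream.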

This storage hierarchy is used, for example, for the backup of user PCs (Figure 7.2). Many user PCs are switched off overnight, which means that backup cannot be guaranteed overnight. Therefore, network backup systems often use the midday period to back up user PCs. Use of the incremental-forever strategy means that the amount of data to be backed up every day is so low that such a backup strategy is generally feasible. All user PCs are first of all backed up to the hard disk of the backup server in the time window from 11:15 to 13:45. The media manager in the backup server then has a good twenty hours to move the data from the hard disks to tapes. Then the hard disks are once again free so that the user PCs can once again be backed up to hard disk in the next midday break.
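
The time available for migration follows directly from the backup window. As a quick check in Python (the calendar dates are arbitrary placeholders; only the times of day come from the text):

```python
from datetime import datetime, timedelta

window_end = datetime(2007, 7, 2, 13, 45)         # midday window closes
next_window_start = datetime(2007, 7, 3, 11, 15)  # next day's window opens

# Dividing one timedelta by another yields a plain number of hours.
hours_for_migration = (next_window_start - window_end) / timedelta(hours=1)
print(hours_for_migration)
```

The result is 21.5 hours between the end of one window and the start of the next.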

Figure 7.2 The storage hierarchy in the backup server helps to back up user PCs efficiently. First of all, all PCs are backed up to the hard disks of the backup server (1) during the midday period. Before the next midday break the media manager copies the data from the hard disks to tapes (2).

In all operations described here the media manager checks whether the correct tape has been placed in the drive. To this end, the media manager writes an unambiguous signature to every tape, which it records in the metadata database. Every time a tape is inserted the media manager compares the signature on the tape with the signature in the metadata database. This ensures that no tapes are accidentally overwritten and that the correct data is written back during a restore operation.
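
The signature check can be sketched as a simple lookup and comparison. In the following Python sketch the class `MediaManager`, the use of a random UUID as signature and the dictionary standing in for the metadata database are assumptions for illustration only.

```python
import uuid

class MediaManager:
    def __init__(self):
        # Stand-in for the metadata database: tape label -> signature.
        self.metadata_db = {}

    def label_tape(self, label):
        """Create a unique signature and record it for this tape."""
        signature = uuid.uuid4().hex
        self.metadata_db[label] = signature
        return signature  # in a real system this is written to the tape

    def verify(self, expected_label, signature_on_tape):
        """True only if the inserted tape carries the recorded signature."""
        expected = self.metadata_db.get(expected_label)
        return expected is not None and expected == signature_on_tape

mm = MediaManager()
sig_a = mm.label_tape("TAPE-A")
sig_b = mm.label_tape("TAPE-B")
print(mm.verify("TAPE-A", sig_a))  # correct tape inserted
print(mm.verify("TAPE-A", sig_b))  # wrong tape inserted
```

A write or restore would proceed only when `verify` returns `True`, so an operator who inserts the wrong tape cannot cause it to be overwritten.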

Furthermore, the media manager monitors how often a tape has been used and how old it is, so that old tapes are discarded in good time. If necessary, it first copies data that is still required to a new tape. Older tape media formats also have to be wound back and forwards now and then so that they last longer; the media manager can also automate the winding of tapes that have not been used for a long time.

A further important function of the media manager is the management of data in a so-called off-site store. To this end, the media manager keeps two copies of all data to be backed up. The first copy is always stored on the backup server, so that data can be quickly restored if it is required. However, in the event of a large-scale disaster (fire in the data centre) the copies on the backup server could be destroyed. For such cases the media manager keeps a second copy in an off-site store that can be several kilometres away. The media manager supports the system administrator in moving the correct tapes back and forth between backup server and off-site store. It even supports tape reclamation for tapes that are currently in the off-site store.

ABOUT THE BOOK:
Storage networks will become a basic technology like databases or local area networks. According to market research, 70% of external storage devices will be connected via storage networks in 2003. The authors have hands-on experience of network storage hardware and software, teach customers about concrete network storage products, understand the concepts behind storage networks, and show customers how storage networks address their business needs. This book explains how to use storage networks to fix malfunctioning business processes, covering the technologies as well as applications, a hot topic that will become increasingly important in the coming years. Purchase the book from Wiley Publishing.
ABOUT THE AUTHOR:
Authors Ulf Troppens and Rainer Erkens are both employed at the IBM TotalStorage Interoperability Center in Mainz, Germany, a testing, development and demonstration laboratory for storage products and storage networks. Both authors work at the interface between technology and customers. Wolfgang Müller is currently working as a software architect in the Storage Software Development Department at IBM in Mainz, Germany, where the focus is on software development projects supporting open standards such as SMI-S/CIM/WBEM and IEEE 1244.

This was first published in July 2007
