Storage network backup of databases

Storage network backup of databases requires a fundamental understanding of database operating methods. This excerpt from Storage Networks Explained outlines these operating methods and the options available for conventional and unconventional backup methods.

Databases are the second most important organizational form of data after the file systems discussed in the previous section. Despite the measures introduced in Section 6.3.5, it is sometimes necessary to restore a database from a backup medium. The same questions are raised regarding the backup of the metadata of a database server as for the backup of file servers (Section 7.9.1). On the other hand, there are clear differences between the backup of file systems and databases. The backup of databases requires a fundamental understanding of the operating method of databases (Section 7.10.1). Knowledge of the operating method of databases helps us to perform both the conventional backup of databases without storage networks (Section 7.10.2) and also the backup of databases with storage networks and intelligent storage subsystems (Section 7.10.3) more efficiently.

7.10.1 Operating method of database systems

One requirement of database systems is the atomicity of transactions, with transactions bringing together several write and read accesses to the database to form logically coherent units. Atomicity of transactions means that a transaction involving write access should be performed fully or not at all.

Transactions can change the content of one or more blocks that can be distributed over several hard disks or several disk subsystems. Transactions that change several blocks are problematic for the atomicity. If the database system has already written a few of the blocks to be changed to hard disk and has not yet written others and then the database server goes down due to a power failure or a hardware fault, the transaction has only partially been performed. Without additional measures the transaction can neither be completed nor undone after a reboot of the database server because the information necessary for this is no longer available. The database would therefore be inconsistent.

The database system must therefore store additional information regarding transactions that have not yet been concluded on the hard disk in addition to the actual database. The database system manages this information in so-called log files. It first of all notes every pending change to the database in a log file before going on to perform the changes to the blocks in the database itself. If the database server fails during a transaction, the database system can either complete or undo incomplete transactions with the aid of the log file after the reboot of the server.
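The logging principle described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular database system implements its log: the class `TinyDatabase`, its methods and the `crash_after` parameter are hypothetical, the log is assumed to survive the crash, and recovery here always undoes incomplete transactions (the text also allows completing them).

```python
class TinyDatabase:
    """Toy block store with a log file; every name here is illustrative."""

    def __init__(self):
        self.blocks = {}  # stands in for the blocks on hard disk
        self.log = []     # stands in for the log file, assumed to survive a crash

    def run_transaction(self, txn_id, changes, crash_after=None):
        """Apply {block_no: new_value}; crash_after=n simulates a failure after n block writes."""
        # Note every pending change (with the old value) in the log first.
        for block_no in changes:
            self.log.append(("change", txn_id, block_no, self.blocks.get(block_no)))
        # Only then change the blocks in the database itself.
        for written, (block_no, new_value) in enumerate(changes.items()):
            if written == crash_after:
                return False  # server goes down: transaction only partially performed
            self.blocks[block_no] = new_value
        self.log.append(("done", txn_id))
        return True

    def recover(self):
        """After a reboot: undo all transactions that were never concluded."""
        concluded = {rec[1] for rec in self.log if rec[0] == "done"}
        undone = set()
        for kind, txn_id, *rest in reversed(self.log):
            if kind == "change" and txn_id not in concluded:
                block_no, old_value = rest
                if old_value is None:
                    self.blocks.pop(block_no, None)  # block did not exist before
                else:
                    self.blocks[block_no] = old_value
                undone.add(txn_id)
        # Mark undone transactions as concluded so a second recovery is a no-op.
        for txn_id in undone:
            self.log.append(("done", txn_id))


db = TinyDatabase()
db.run_transaction(1, {10: "a", 11: "b"})
db.run_transaction(2, {10: "x", 11: "y"}, crash_after=1)  # only block 10 reaches disk
db.recover()
print(db.blocks)  # {10: 'a', 11: 'b'}
```

Without the log, the state after the simulated crash (block 10 changed, block 11 not) could not be repaired, because the old content of block 10 would be lost.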

Figure 7.18 shows a greatly simplified version of the architecture of database systems. The database system fulfils the following two main tasks:

  • Database: storing the logical data structure to block-oriented storage. First, the database system organizes the data into a structure suitable for the applications and stores this on the block-oriented hard disk storage. In modern database systems the relational data model, which stores information in interlinked tables, is the main model used for this. To be precise, the database system stores the logical data directly onto the hard disk, circumventing a file system, or it stores it to large files. The advantages and disadvantages of these two alternatives have already been discussed in Section 4.1.1.
  • Transaction machine: changing the database. Second, the database system realizes methods for changing the stored information. To this end, it provides a database language and a transaction engine. In a relational database the users and applications initiate transactions via the database language SQL and thus call up or change the stored information. Transactions on the logical, application-near data structure thus bring about changes to the physical blocks on the hard disk. The transaction system ensures, amongst other things, that the changes to the data set caused by a transaction are either completed or not performed at all. As described above, this condition can be guaranteed with the aid of log files even in the event of computer or database system crashes.

Figure 7.18 Users start transactions via the database language (SQL) in order to read or write data. The database system stores the application data in block-oriented data (database) and it uses log files to guarantee the atomicity of the transactions.

The database system changes blocks in the data area, in no specific order, depending on how the transactions occur. The log files, on the other hand, are always written sequentially, with each log file being able to store a certain number of changes. Database systems are generally configured with several log files written one after the other. When all log files have been fully written, the database system first overwrites the log file that was written first, then the next, and so on.

A further important function for the backup of databases is the backup of the log files. To this end, the database system copies full log files into a file system as files and numbers these sequentially: logfile 1, logfile 2, logfile 3, etc. These copies of the log files are also called archive log files. The database system must be configured with enough log files that there is sufficient time to copy the content of a log file that has just been fully written into an archive log file before it is once again overwritten.
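The interplay of cyclically overwritten log files and archive log files can be illustrated with a small Python model. It is only a sketch: the class `LogRing` and all of its attributes are assumptions made for this example, and a real database system would typically stall or raise an alert rather than throw an exception when a log file has not yet been archived.

```python
class LogRing:
    """Toy model of log files written one after the other and then overwritten."""

    def __init__(self, num_files, changes_per_file):
        if num_files < 2:
            raise ValueError("the model needs at least two log files")
        self.files = [[] for _ in range(num_files)]
        self.capacity = changes_per_file
        self.current = 0      # index of the log file currently being written
        self.pending = []     # full log files that still await archiving
        self.archived = {}    # sequence number -> archive log file
        self.next_seq = 1

    def write(self, change):
        if len(self.files[self.current]) == self.capacity:
            self._switch()
        self.files[self.current].append(change)

    def _switch(self):
        nxt = (self.current + 1) % len(self.files)
        if nxt in self.pending:
            # Overwriting now would lose changes that exist nowhere else.
            raise RuntimeError("log file %d has not been archived yet" % nxt)
        self.pending.append(self.current)
        self.files[nxt] = []  # overwrite the oldest log file
        self.current = nxt

    def archive_one(self):
        """Copy the oldest full log file into a numbered archive log file."""
        if not self.pending:
            return None
        slot = self.pending.pop(0)
        seq = self.next_seq
        self.next_seq += 1
        self.archived[seq] = list(self.files[slot])
        return seq


ring = LogRing(num_files=2, changes_per_file=2)
for change in ["a", "b", "c", "d"]:
    ring.write(change)
ring.archive_one()   # logfile 1 now holds ["a", "b"]
ring.write("e")      # safe: the first log file was archived before being reused
```

If `archive_one()` were omitted in the usage above, the final `write` would fail, which is the situation the sizing rule in the text is meant to prevent: more log files give the archiving step more time.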

7.10.3 Next generation backup of databases

The methods introduced in the previous section for the backup of databases (cold backup, hot backup and fuzzy backup) are excellently suited for use in combination with storage networks and intelligent storage subsystems. In the following we show how the backup of databases can be performed more efficiently with the aid of storage networks and intelligent storage subsystems.

Combining hot backup with instant copies is an almost perfect tool for the backup of databases. The following steps should be performed in order:

  1. Switch the database over into hot backup mode so that there is a consistent data set in the storage system.
  2. Create the instant copy.
  3. Switch the database back to normal mode.
  4. Back up the database from the instant copy.

This procedure has two advantages. First, access to the database is possible throughout the process. Second, steps 1–3 take only a few seconds, so the database system has to catch up on comparatively few transactions after switching back to normal mode.
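The four steps can be expressed as a short orchestration routine. The Python sketch below uses stand-in classes; `begin_hot_backup`, `end_hot_backup` and `create_instant_copy` are hypothetical names chosen for illustration and do not correspond to the interface of any specific database system or disk subsystem.

```python
class DemoDatabase:
    """Stand-in for a database system; the method names are hypothetical."""

    def __init__(self, events):
        self.events = events

    def begin_hot_backup(self):
        self.events.append("hot backup mode on")

    def end_hot_backup(self):
        self.events.append("hot backup mode off")


class DemoDiskSubsystem:
    """Stand-in for an intelligent disk subsystem offering instant copies."""

    def __init__(self, events):
        self.events = events

    def create_instant_copy(self, volume):
        self.events.append("instant copy of " + volume)
        return volume + "-copy"


def backup_via_instant_copy(database, disk_subsystem, volume, write_to_backup_medium):
    database.begin_hot_backup()                            # step 1
    try:
        copy = disk_subsystem.create_instant_copy(volume)  # step 2
    finally:
        database.end_hot_backup()                          # step 3, even if step 2 fails
    write_to_backup_medium(copy)                           # step 4, reads only the copy
    return copy


events = []
backup_via_instant_copy(
    DemoDatabase(events),
    DemoDiskSubsystem(events),
    "dbvol",
    lambda copy: events.append("backed up " + copy),
)
print(events)
```

The `try`/`finally` is a design choice of this sketch: it keeps the database from remaining in hot backup mode if the instant copy fails. The slow part, step 4, runs after the database is already back in normal mode.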

Application server-free backup extends backup via instant copies so that the database server is additionally relieved of the backup load (Section 7.8.5). The concept shown in Figure 7.11 is also very suitable for databases. Due to the large quantity of data involved in the backup of databases, LAN-free backup is often used, unlike in the figure, to back up the data generated using instant copy.

In the previous section (Section 7.10.2) we explained that the time of the last backup is decisive for the time that will be needed to restore a database to the last data state. If the last backup was a long time ago, a lot of archive log files have to be reapplied. In order to reduce the restore time for a database it is therefore necessary to increase the frequency of database backups.

To increase the backup frequency of a database, the data volume to be transferred per backup must be reduced. This is possible by means of an incremental backup of the database on block level. The most important database systems offer backup tools that can generate such database increments. Many network backup systems provide special adapters (backup agents) that are tailored to the backup tools of the database system in question. However, the format of the increments is unknown to the backup software, so the incremental-forever strategy cannot be realized in this manner. Doing so would require manufacturers of database systems to publish the format of the increments.

The backup of databases using the incremental-forever strategy therefore requires that the backup software knows the format of the incremental backups, so that it can calculate the full backups from them. To this end, the storage space of the database must be provided via a file system that can be incrementally backed up on block level using the appropriate backup client. Because the backup software then knows the format of the increments, the incremental-forever strategy can be realized for databases via the circuitous route of file systems.
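How a full backup can be calculated from block-level increments is shown in the following Python sketch. It is a simplified illustration, not the format of any real backup product: a volume is modelled as a dictionary from block number to content, blocks are assumed never to disappear, and the function names `changed_blocks` and `synthesize_full` are made up for this example.

```python
def changed_blocks(previous, current):
    """Block-level increment: only the blocks that differ from the previous state."""
    return {no: data for no, data in current.items() if previous.get(no) != data}


def synthesize_full(initial_full, increments):
    """Calculate the latest full backup from one full backup plus all increments."""
    full = dict(initial_full)
    for increment in increments:  # apply oldest first so newer blocks win
        full.update(increment)
    return full


day0 = {0: "A", 1: "B", 2: "C"}    # the one and only full backup
day1 = {0: "A", 1: "B2", 2: "C"}
day2 = {0: "A3", 1: "B2", 2: "C"}

inc1 = changed_blocks(day0, day1)  # {1: 'B2'}
inc2 = changed_blocks(day1, day2)  # {0: 'A3'}

assert synthesize_full(day0, [inc1, inc2]) == day2
```

Each increment transfers a single block here instead of the whole volume, yet the backup software can still reconstruct the current full state because it understands the increment format.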

Storage networks will become a basic technology like databases or local area networks. According to market research, 70% of external storage devices will be connected via storage networks in 2003. The authors have hands-on experience of network storage hardware and software; they teach customers about concrete network storage products, understand the concepts behind storage networks, and show customers how storage networks address their business needs. This book explains how to use storage networks to fix malfunctioning business processes, covering the technologies as well as applications, a hot topic that will become increasingly important in the coming years.

Purchase the book from Wiley Publishing.

Authors Ulf Troppens and Rainer Erkens are both employed at the IBM TotalStorage Interoperability Center in Mainz, Germany, a testing, development and demonstration laboratory for storage products and storage networks. Both authors work at the interface between technology and customers. Wolfgang Müller is currently working as a software architect in the Storage Software Development Department at IBM in Mainz, Germany, where the focus is on software development projects supporting open standards such as SMI-S/CIM/WBEM and IEEE 1244.
