When helping customers with their backup design, getting the maximum sustainable performance out of their backup targets -- whether tape or disk -- initially has nothing to do with those targets themselves. First you have to know the backup source, the connection, the backup software and the backup server.
The source, also known as the client, is the server to be backed up, and there are several variables to understand about it. First, if you are using a backup application that has the client "push" its data to the backup server, as opposed to the backup server "pulling" the data from the source, the overall processing capability of that server is critical. This is because the source server has to scan its file system for files that need to be backed up, package those files into a backup stream and then send that stream across the network to the backup server. All of this takes time and CPU resources; the slower the CPU, the longer it will take.
If the backup application uses a "pull" method, then the CPU power of the local client is less important, but of course the processing power of the backup server becomes even more critical.
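The push model's three phases -- scan, package, send -- can be sketched in a few lines. This is a minimal illustration, not any vendor's client: the tar packaging and the in-memory "wire" standing in for the network socket are assumptions made for the sake of a self-contained example, but they show where the client's CPU time goes.

```python
import io
import os
import tarfile
import tempfile

def push_backup(source_dir: str, sink: io.BytesIO) -> int:
    """Scan source_dir, package its files into a tar stream, and 'send'
    them to sink (a stand-in for the socket to the backup server).
    Returns the number of files packaged. All three phases -- the scan,
    the packaging, the write -- consume CPU on the client."""
    count = 0
    with tarfile.open(fileobj=sink, mode="w") as tar:
        for root, _dirs, files in os.walk(source_dir):   # scan phase
            for name in files:
                path = os.path.join(root, name)
                # package + send phase
                tar.add(path, arcname=os.path.relpath(path, source_dir))
                count += 1
    return count

# Demo with a throwaway directory standing in for the client's file system.
with tempfile.TemporaryDirectory() as src:
    for i in range(3):
        with open(os.path.join(src, f"file{i}.txt"), "w") as f:
            f.write("payload" * 100)
    wire = io.BytesIO()   # stand-in for the connection to the backup server
    n = push_backup(src, wire)
    print(n, "files packaged,", len(wire.getvalue()), "bytes on the wire")
```

In a pull-model product, roughly the same loop runs on the backup server instead, which is why the server's CPU becomes the critical resource there.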
Second, you need to pay attention to the applications that are running on the source itself -- even if there is no application but the server is instead functioning as a file server. Some applications are notoriously slow to feed data to the backup client software. This is especially so when you consider using backup agents on the source server that try to back up granular components of the application, like messages or records within the database. Capturing this type of detail is time-consuming. When developing a backup design, you need to be aware of the limits of the backup application and how it interacts with the application (whether it's Exchange, a database, etc.).
When the source is acting as a file server, the client may have millions of files to inspect and package up for transmittal to the backup application. It can take longer to inspect a file system with millions of files than to send those files across the network, so you'll need to account for that when designing a customer's backup system.
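A quick back-of-the-envelope calculation shows how the scan can dominate the backup window. Every number below -- the metadata operation rate, the average file size, the usable network throughput -- is an illustrative assumption, not a measurement; substitute figures from your own environment.

```python
# Assumed figures for a file server with many small files.
files = 5_000_000      # files on the file server
stat_rate = 500        # file-system metadata operations per second (assumed)
avg_file_kb = 64       # average file size in KB (assumed)
net_mb_per_s = 100     # usable network throughput in MB/s (assumed)

# Time to walk the file system vs. time to move the data.
scan_hours = files / stat_rate / 3600
transfer_hours = files * avg_file_kb / 1024 / net_mb_per_s / 3600

print(f"scan: {scan_hours:.1f} h, transfer: {transfer_hours:.1f} h")
```

With these assumed numbers the scan takes roughly three times as long as the transfer, which is why incremental strategies that avoid a full walk matter so much on dense file servers.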
The final backup design issue to consider related to the source is what type of network it's connected to. Even if the source has sufficient processing power and does not get bogged down by application-specific backups, it's limited by how robust the network connection is to the backup server. In the case of 10 Gigabit Ethernet (10 GbE) or a Fibre Channel SAN connection, this is not a major problem, but most sources are on a 1 Gbps network, which significantly limits overall backup performance.
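The arithmetic behind that limit is worth making explicit. A 1 Gbps link tops out at 125 MB/s in theory; the 80% efficiency factor below is an assumed allowance for protocol overhead, not a measured value.

```python
def backup_hours(data_gb: float, link_gbps: float, efficiency: float = 0.8) -> float:
    """Hours to move data_gb across a link of link_gbps, assuming the
    given fraction of the line rate is usable after protocol overhead.
    (Gbps uses decimal units, 1000 Mb = 1 Gb; data uses binary GB.)"""
    mb_per_s = link_gbps * 1000 / 8 * efficiency
    return data_gb * 1024 / mb_per_s / 3600

print(f"2 TB over 1 Gbps:  {backup_hours(2048, 1):.1f} h")
print(f"2 TB over 10 GbE:  {backup_hours(2048, 10):.1f} h")
```

Under these assumptions, a 2 TB full backup needs close to six hours on a 1 Gbps link but well under an hour on 10 GbE -- the difference between fitting in a nightly window and not.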
Here is Curtis Preston's story on tape storage performance:
Tape backup best practices: How to improve tape storage performance
Despite all the hype around the use of disk-based data backup and recovery, tape storage is still the primary target for most backups. Not only is most data stored on tape, but most data is backed up directly to tape without using disk as a buffer.
With this in mind, what can you do to improve the performance of your tape drives?
Know thy tape drive
The most important aspect of obtaining good tape performance is to understand the tape drive to which you are backing up. Modern tape drives are streaming tape drives, which means that they are designed to transfer data at a certain rate, and you need to know what that rate is (or what those rates are) in order to keep that tape drive happy, and keep it streaming during backups and restores.
So start by learning all you can about the tape drives that you use. The first thing you need to learn is the tape drive's maximum native (uncompressed) transfer rate. For example, an LTO-4 tape drive's maximum native (uncompressed) transfer rate is 120 MBps.
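Those two published LTO-4 figures -- the 120 MBps native rate and the 800 GB native cartridge capacity -- are enough for a first sizing sketch. The source throughput values below are assumptions for illustration.

```python
NATIVE_RATE_MBPS = 120     # LTO-4 maximum native transfer rate, MB/s
NATIVE_CAPACITY_GB = 800   # LTO-4 native cartridge capacity

def fill_time_hours(source_mb_per_s: float) -> float:
    """Hours to fill one cartridge. The drive can't go faster than its
    native rate, and when the source feeds it more slowly the drive has
    to stop and reposition ('shoe-shining'), so real-world times for a
    starved drive are even worse than this simple division suggests."""
    effective = min(source_mb_per_s, NATIVE_RATE_MBPS)
    return NATIVE_CAPACITY_GB * 1024 / effective / 3600

print(f"streaming at full rate: {fill_time_hours(120):.1f} h")
print(f"starved at 40 MB/s:     {fill_time_hours(40):.1f} h")
```

The point of the sketch: a source that can only deliver a third of the drive's native rate at least triples the time per cartridge, before you even account for the repositioning penalty.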
Read the rest of Curtis' story about tape backup design and how to improve tape storage performance.
About the author
George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for data centers across the United States, he has seen the birth of such technologies as RAID, NAS and SAN. Prior to founding Storage Switzerland, George was chief technology officer at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection. Find Storage Switzerland's disclosure statement here.
This was first published in August 2009