SQL Server 2005 enhancements
I have discussed several storage internal concepts, including several enhancements for SQL Server 2005 based on customer and PSS experience with data recovery in SQL Server 2000 and previous versions.
A discussion of new enhancements for SQL Server 2005 data recovery would not be complete without talking about BACKUP/RESTORE and DBCC CHECKDB.
The most noteworthy enhancement to BACKUP is verification. If you execute the T-SQL BACKUP command using the WITH CHECKSUM option, two things happen:
- As each database page is read from the database file from disk, the server verifies the checksum value in the page header if it exists. If it fails, the server raises an error and stops the backup.
- The server calculates a checksum value for all the bits from all pages in the backup and writes this in the backup media. This is called the backup checksum.
These both serve important purposes. By verifying the checksum on pages before the server writes them to the backup, you can prevent the server writing bad pages in your backup. Second, by writing a checksum value for the entire backup stream, the server can verify using RESTORE whether any bits in the backup media itself were damaged after backup writes were submitted to the target media. It is important to know that no new checksum value is written during BACKUP. So, if the database page did not contain a checksum value, a new one is not calculated and then written to the page before it is written to the backup media. But this page is still used as part of calculating the backup checksum.
If you do encounter an error when using WITH CHECKSUM because of a damaged page, you can choose to ignore any error using the WITH CONTINUE_AFTER_ERROR option. The best course of action is to resolve the checksum problem before taking the backup; in an emergency situation to ensure the rest of the database is backed up, however, you might consider using this option.
One other nice addition for SQL Server 2005 BACKUP is full-text data. Previously, to back up your full-text data, you had to manually back up the full-text catalog files associated with the database. Now if your database has full-text data, SQL Server automatically includes the full-text catalog information into the backup information associated with the database. You can even just back up the full-text catalog on it own using the T-SQL BACKUP command. (Refer to the Books Online reference for BACKUP for the proper syntax.)
Until SQL Server 2005, the RESTORE VERIFYONLY command was not all that useful. This is because this option did not really verify much of what was contained in the backup set. That has changed for SQL Server 2005 in two ways:
- If the backup media contains a backup checksum, by default the server verifies it. The server also verifies any checksum that exists on a given page in the backup.
- Even if the backup does not contain any checksum, the server verifies that the page ID in the page header is valid.
The same verification occurs by default for the standard RESTORE command, not just with VERIFYONLY. But VERIFYONLY is now a useful command in your arsenal. With checksums, it becomes a quick solution to verifying the integrity of the backup media. (Remember, however, this doesn't mean the backup will actually restore successfully, because recovery could still fail.)
Another option for the RESTORE command that will be useful in emergency situations is CONTINUE_AFTER_ERROR. Before this option was added and a customer encountered a corrupt backup, there was no way to know whether a single page was damaged or whether the entire backup was bad.
If you encounter an error with RESTORE (such as failure verifying checksum) on SQL Server 2005, you can use the CONTINUE_AFTER_ERROR option. The engine just ignores all errors and loads all pages in the database as they exist in the backup. You can then decide how to fix any remaining errors (for example, using DBCC CHECKDB with repair options). This is not an option you want to use on a regular basis. If you do, it means you have problems with your backups. However, it might come in handy on some late night when you do have bad backups but want to extract as much as you can from them.
The last important enhancement for RESTORE is the page-level restore. A PAGE option has been added to the RESTORE command to specify page numbers (maximum of 1,000). The addition of this option is yet another to help increase database availability. Consider a case where you encounter a checksum error on a single page for a 10TB database. If you want to restore just this page, you can from the latest full backup and restore subsequent log backups to make the transactions affecting the page consistent. And with Enterprise Edition, you can do all of this online, allowing other users to access unaffected areas of the database. Of course, the speed of the RESTORE alone is improved, because the server only has to write one or more pages from the backup to the database. (It still takes time to find and read the page from the backup.) I fully expect customers to take advantage of this option, especially those who call PSS and want us to help with some type of advanced recovery because they cannot afford the downtime of a full restore. There is an exercise at the end of chapter you can use to see how to use page-level restore after encountering a damaged page.
DBCC CHECKDB Enhancements
Another area of significant investment for SQL Server 2005 is DBCC CHECKDB. A former PSS colleague of mine, Ryan Stonecipher, now owns this code. He worked with Paul Randal, who used to own it, to incorporate feedback from customers and PSS throughout the beta. The result is some of the following feature enhancements.
DBCC CHECKDB can verify all types of things. But in previous releases of SQL Server, one thing it did not verify was the validity of values within certain data types (such as date/time). The SQL Server development team actually added an undocumented trace flag in SQL Server 2000 to check this because a few customers reported databases with invalid datetime or decimal data within the column values. This led to some strange problems including what appeared to be incorrect results. So, the development team decided in SQL Server 2005 to just add an option to check datetime and decimal columns for valid ranges of values as specified for that type.
For a new SQL Server 2005 database, DATA_PURITY checks are on by default (unless you use the WITH PHYSICAL_ONLY option). For an upgraded database, you must use WITH DATA_PURITY one time, and then it is implied from that point forward.
One common question for users of DBCC CHECKDB is a bit like your kids asking you in the car, "When do we get there?" Customers call PSS after DBCC CHECKDB has been running for an hour and say, "When will it be done?" So, the SQL Server development team added progress reporting capabilities for CHECKDB. A user can query the percent_complete and command columns of the sys.dm_exec_requests dynamic management view to see the current progress of CHECKDB. The command column displays a set of predefined values that describe the phase of execution of CHECKDB. (For example, DBCC SYS CHECK means CHECKDB is checking the consistency of system tables. The percent_complete column marks the progress within this phase.) SQL Server Books Online has a complete description of the phases and whether progress is reported for that phase.
"Last Known Good"
One of the questions I have asked customers when investigating a case involving database corruption is this: "When was the last time DBCC CHECKDB reported no errors for this database?" If the customer did not save all CHECKDB results or if the ERRORLOG files have wrapped, there is no way to answer this question. SQL Server 2005 saves in the database information about the last time a DBCC CHECKDB was run without errors on the database. Anytime the database is started, the information about the "last known good" clean DBCC is reported in the ERRORLOG like the following:
2005-09-22 11:56:48.42 Server Database mirroring has been enabled on this instance of SQL Server. 2005-09-22 11:56:48.42 spid5s Starting up database 'master'. 2005-09-22 11:56:48.73 spid5s Recovery is writing a checkpoint in database 'master' (1). This is an informational message only. No user action is required. 2005-09-22 11:56:48.84 spid5s CHECKDB for database 'master' finished without errors on 2005-09-22 11:31:45.990 (local time). This is an informational message only; no user action is required.
As you can see in this message, a CHECKDB was deemed to be "good." This information is updated every time a CHECKDB completes without errors for a specific database.
In addition to saving and recording the last clean CHECKDB, the SQL Server development team also enhanced the report in the ERRORLOG for each CHECKDB execution. The server now includes a duration value, so you can see how long it typically takes to run CHECKDB for your database. Here is an example of this ERRORLOG output:
2006-03-19 19:35:40.15 spid51 DBCC CHECKDB (troy) executed by NORTHAMERICA bobward found 0 errors and repaired 0 errors. Elapsed time: 0 hours 0 minutes 2 seconds.
Online Uses Database Snapshot
In SQL Server 2000, online DBCC CHECKDB reads the transaction log to check the consistency of the database. By online, I mean that the database is in MULTI_USER mode. Although this technique works, in some cases this might cause false CHECKDB errors. Therefore, in SQL Server 2005, CHECKDB takes advantage of the new snapshot database feature. An online CHECKDB now creates a database snapshot of the current database and uses the snapshot to check database consistency. This now makes online CHECKDB extremely simple. Just run the consistency check on the snapshot and you are guaranteed a consistent set of pages at the point in time CHECKDB was run. If a snapshot cannot be created (for example, because the databases are stored on the FAT file system), the table locks are used to ensure consistency.
Enhanced CHECKCATLOG Integrated
Any veteran SQL Server support engineer knows that DBCC CHECKCATALOG is not really worth running. In SQL Server 2000 and previous versions, only a few system tables were actually included in this check. Furthermore, the number of times an actual system table referential integrity occurs is few. Along with the new system table architecture, the SQL Server development team decided to actually implement a full catalog consistency check and include it by default when DBCC CHECKDB runs. DBCC CHECKCATALOG can still be run independently, but it is a quick operation as part of the overall CHECKDB execution.
Emergency Mode Repair
In SQL Server 2000, if the database is marked SUSPECT and you do not have a backup to restore, your options are pretty limited. One option some customers have chosen is to call Microsoft PSS to see whether they can help repair the database. This procedure involves some advanced undocumented commands that could end up resulting in a rebuild of the transaction log. This procedure can result in a CHECKDB with no errors, but logical consistency is now compromised. This request has come in so often that the SQL Server development team decided to include a recovery feature in the product so that customers could perform this operation themselves.
Use the following table of contents to navigate to chapter excerpts, or click here to view Data corruption and recovery issues in its entirety.
|About the book:|
|In this book, bestselling author and SQL Server guru Ken Henderson has worked with the SQL Server Development team to write "the" troubleshooting SQL Server 2005 book. All of the content comes directly from the creators and developers of SQL Server. Ken has edited each chapter to ensure that the writing is clear and sound. Written by the people who know the product best - the people who built it - SQL Server 2005 Practical Troubleshooting teaches developers and dbas how to troubleshoot and diagnose problems they may encounter with SQL Server. Purchase the book from Addison-Wesley Publishing|
|ABOUT THE AUTHOR:|
|Ken Henderson has been a developer for more than 25 years. He has worked with SQL Server since 1990 and has built software for a number of firms throughout his career, including H&R Block, the Central Intelligence Agency, the U.S. Navy, the U.S. Air Force, Borland International, JP Morgan, and various others. He joined Microsoft in 2001 and is currently a developer in the Manageability Platform group within the SQL Server development team. He is the creator of SQL Server 2005's SQLDiag facility and spends his days working on SQL Server management tools and related technologies. He is the author of eight books on a variety of computing topics, including the popular Guru's Guide series of SQL Server books available from Addison-Wesley. He lives with his family in the Dallas area and may be reached via email at firstname.lastname@example.org.|