Service provider takeaway: Learn how to determine the reliability of your client's system and find out the various methods of evaluating performance. This section of the chapter excerpt titled "System Recovery and Diagnostic Tricks" is taken from the book Tricks of the Microsoft Windows Vista Masters.
"It is an immutable law in business that words are words, explanations are explanations, promises are promises but only performance is reality." What Harold S. Geneen, CEO of ITT from 1959 to 1977, was trying to say is that "talk is cheap," but performance stands on its own. The question for computer users is how do we determine whether our computers are performing up to par.
Each computer has a baseline. A baseline is the optimum, standard way of operating for a computer under its current set of hardware and software. Once you know your system's baseline, you can watch to see whether time or a new application takes a toll on the system's performance and reliability. But literally "seeing" that happen requires a good performance monitoring tool. Vista includes a new version of its Performance Monitor of old.
What is your standard method for determining the reliability of your system? Most of us determine a system's reliability by how long it has been since it has blue-screened on us or forced us to reboot. Not a truly "technical" way to assess reliability, huh?
So, it is with open arms that sys admins welcome the new reliability monitor. To open it, you can type perfmon.msc into any Search field or go to your Administrative Tools and select the Reliability and Performance Monitor.
The main goal of the Reliability Monitor is to track reliability events: changes to your system that could affect its stability, and other events that might indicate instability. Events monitored include:
- Windows updates
- Software installs and uninstalls
- Device driver installs, updates, rollbacks, and uninstalls
- Application hangs and crashes
- Device drivers that fail to load or unload
- Disk and memory failures
- Windows failures, including boot failures, system crashes, and sleep failures
Figure 8.6 shows a system that is becoming more unreliable over time. You can literally watch as your Vista decays. Using the monitor, you can see what is causing the instability. Is it an application or a set of applications? Did it begin with the addition of something new?
The System Stability Chart gives you a visual on how reliable your system looks over time. You are given an overall stability index score: 10 is perfection; 1 is the lowest. The Reliability Monitor retains up to a year's worth of data so you can really see how your system has been performing.
If you see a drop in the stability, you can check the date the drop began and then see if it was one of the following that caused the instability: Software (Un)Installs, Application Failures, Hardware Failures, Windows Failures, or Miscellaneous Failures.
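Finding the date a drop began can be sketched in a few lines of Python. This is only an illustration: it assumes you have copied daily stability index values out of the chart by hand, and the `drop_start` helper, the `min_drop` cutoff, and the sample readings are all hypothetical rather than anything the Reliability Monitor provides.

```python
from datetime import date


def drop_start(history, min_drop=0.5):
    """Return the first date on which the index fell by at least min_drop.

    history is a list of (date, stability_index) pairs in date order.
    Returns None if no day-over-day fall reaches min_drop.
    """
    for (_, previous), (day, score) in zip(history, history[1:]):
        if previous - score >= min_drop:
            return day
    return None


# Hypothetical readings copied by hand from the System Stability Chart.
readings = [
    (date(2007, 5, 23), 10.0),
    (date(2007, 5, 24), 10.0),
    (date(2007, 5, 25), 8.17),
    (date(2007, 5, 26), 5.77),
]

print(drop_start(readings))
```

With the date in hand, you can go back to the monitor and look at which of the event categories recorded something on that day.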
Figure 8.6: Reliability Monitor lets you see if your system is something you can depend on.
Many have questioned the validity of the Reliability Monitor. Some of the questions asked by prominent writer Ed Bott include:
If my Stability Index slips below 5, is it time to do a complete reinstall? Is it really fair to conclude that my overall system stability dropped from a perfect 10 to 8.17 because Explorer crashed twice on May 25, or that it then slid all the way down to 5.77 the next day because OneNote 2007 Beta 1 stopped working twice (and hasn't failed since)?
The monitor, especially that line chart, needs to be taken with a grain of salt. Its real strength lies in an organized way to see system performance decline based upon specific, recorded situations.
The Resource Monitor isn't a different tool, actually. When you first open the Reliability and Performance Monitor, you are presented with real-time views of your CPU, disk, network, and memory in four charts. You can click the down arrows next to each category to see a more detailed list of what is being done by any one of those resources at that moment (see Figure 8.7).
You can stop the Resource Monitor by clicking the Stop button on the toolbar. You can also quickly navigate to any of the more detailed lists by putting your cursor over a chart (it forms a target icon) and clicking in the chart.
Figure 8.7: Resource Monitor lets you see your system activity at that moment for CPU, disk, network, and memory.
Performance Monitor shows you a visual representation of your system so you can inspect a variety of components, beyond what the Resource Monitor shows you. Initially, you won't see more than the % Processor Time counter. You can add more performance metrics, called counters, by clicking the + sign. When you first see the number of possible counters and instances, you can see that the task of choosing which items to monitor can be overwhelming (see Figure 8.8). Any given system has roughly 85 different performance objects (you can monitor the local system or a remote one). Each of those objects contains counters (there are way too many to know them all). After you have all your counters set up, you can make changes to the way they are displayed. For example, you can change the line colors for each counter to make it easier to determine which line you are watching. You can also change the format of the display from a graph to a histogram to a report (numeric display).
Figure 8.8: Adding counters to your performance monitor can be a daunting task.
Baselines and Bottlenecks
Before you can truly know the performance of your system, you need to be able to compare it to something. A baseline is a collection of performance data for your system over a set period of time that indicates where your system normally performs under normal working conditions. Without a baseline you have nothing with which to compare your system. You need to compare it to itself at a better time in its past, if that makes sense. When you do see items in the future that indicate your system is running more slowly, these are called bottlenecks, because they are tying up your system's traffic in some way. You need to eliminate bottlenecks.
John Kellett (http://www.johnkellett.co.uk) gives the following advice about creating your baseline:
You may want to take a baseline reading with Performance Monitor before installing an application, install the application, and then take the reading again using the same Performance Monitor counters while the application is in use. Taking the baseline reading itself has been known to throw up a few issues. If your system has decent hardware and a user that doesn't really push it to the limit, I would be expecting the system to be pretty much flatlining. If one counter is a lot higher than you are expecting, then you can troubleshoot this before going any further.
Keep in mind that even though you technically can monitor and perform a baseline remotely, because of the congestion on a network for remote monitoring, Microsoft recommends that you perform baselines locally. In addition, you should log across multiple days at multiple time intervals to really see the performance. Finally, you should check the system's baseline at regular intervals (once a month or every other month) to see how it's doing. Keep in mind that, for larger networks, you might want to obtain third-party monitoring solutions.
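The before-and-after comparison described above can be sketched in Python. This is a minimal illustration, not a prescribed method: it assumes each log has been exported to CSV with a timestamp in the first column and one counter per remaining column, and the function names and the 25% tolerance are hypothetical choices.

```python
import csv
import io


def counter_averages(csv_text):
    """Average every counter column in a CSV performance log."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    names = header[1:]  # first column is assumed to be the timestamp
    totals = {name: [0.0, 0] for name in names}
    for row in reader:
        for name, cell in zip(names, row[1:]):
            try:
                value = float(cell)
            except ValueError:
                continue  # blank or non-numeric sample
            totals[name][0] += value
            totals[name][1] += 1
    return {name: total / count
            for name, (total, count) in totals.items() if count}


def compare_to_baseline(baseline, current, tolerance=0.25):
    """Return {counter: fractional change} for counters that moved
    more than `tolerance` away from their baseline average."""
    changed = {}
    for name, base in baseline.items():
        if name not in current or base == 0:
            continue  # a zero baseline cannot be compared as a ratio
        change = (current[name] - base) / base
        if abs(change) > tolerance:
            changed[name] = change
    return changed
```

A typical use would be to call `counter_averages` on the log taken before installing an application and on the log taken while it is in use, then pass both results to `compare_to_baseline` to see which counters deserve a closer look.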
Important Objects and Counters to Monitor
We already mentioned that you have many options from which to choose when using Performance Monitor. That's why we went to the masters to ask them which ones are the most important ones to know.
Richard Brucrew of Sebring, Florida, recalls the objects and counters that the Microsoft engineer certification exams treated as most important. Here are some of them:
- Pages/sec -- Over 20 pages/sec could indicate too little RAM. This counter shows the transfer of data from the physical memory in your system to your pagefile. When this counter is too high, it indicates a memory shortage. What's the solution? Adding more memory to your system might be your first thought, but before doing that you should consider stopping unnecessary services and background applications that might be using up your memory.
- Available Bytes -- This should be more than 4MB. If it drops under 4MB, this indicates a memory issue. This counter reports the amount of memory that is available after the working sets of applications and the cache have been served.
Independent technical trainer and consultant
From Windows XP Professional Exam Cram, 1st Edition
Memory is often the first performance bottleneck in the real world. The counters related to processor and hard drive utilization might be well beyond their thresholds simply because inadequate memory is causing paging, which impacts those two components.
- % Total Processor Time -- A continuously high value can indicate a bottleneck. Anything over 80% for extended periods of time should give you cause to worry. Make sure, though, this isn't during the running of some of those elaborate screensavers (they take up a lot of processor time when running) and that your memory isn't the main cause for the problem.
Although a high processor time can indicate the need for a faster processor, you should check the queue length, too. If this is above 2 on average, you might consider adding a second processor or trying to remove pressure from your system by moving certain processes to other systems.
- Processor Queue Length -- This should be less than 2, on average. This measures the number of threads waiting in the queue to be processed.
- Logical Disk: % Free Space -- Lets you see how much space you have left on your disk.
- Physical/Logical Disk: % Disk Time -- This counter shows the percentage of time the disk spends servicing read and write requests. Anything from 50% to 100% indicates a bottleneck.
- Physical/Logical Disk: Disk Queue Length -- Similar to the processor queue, this should be under 2 for read/write requests that are pending.
Where do you turn next? To Guy Thomas, Microsoft MVP. He has a lot of information on his site regarding performance counters. He has an ebook titled The Art and Science of Performance Monitoring that is worth downloading from http://www.computerperformance.co.uk/ebooks.htm.
- To generate a System Health Report, the simple method is to open Control Panel, go through System and Maintenance, and then click Performance Information and Tools (or you can go through the System applet and select the Windows Experience Index). In the Tasks pane, select Advanced Tools. Several items there are worth noting for later reference: quick links to several performance tools, plus performance issue tips at the top to help you fix some of your performance problems. Locate the Generate a System Health Report option and then wait one minute while the test runs. You'll notice that, even though you went through a different path, you are still using the Reliability and Performance Monitor to run the report.
- Another way to run the same report is to do it from the Reliability and Performance Monitor itself. Open the tool; then, under the system data collector sets, you'll notice four preconfigured tests. Select System Diagnostics, right-click, and select Start. Then, under Reports, open System Diagnostics and select the report that is running (the latest one). You will see that it's the same as the previously mentioned method.
- The simplest method to run the test is to open a command prompt (elevated or nonelevated -- it asks you for permission to proceed if it is nonelevated) and then type perfmon /report.
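Returning to the counter thresholds listed earlier in this section (Pages/sec, Available Bytes, % Total Processor Time, Processor Queue Length, % Disk Time, and Disk Queue Length), the rules of thumb can be collected into a small Python check. This is a sketch only: the labels are plain names rather than full counter paths, the `find_bottlenecks` helper is hypothetical, and the cutoffs simply restate the figures quoted above, which are guidelines rather than hard limits.

```python
MB = 1024 * 1024

# Label -> (test that is True when the average looks unhealthy, note).
# The labels are plain names for this sketch, not full counter paths.
RULES = {
    "Pages/sec": (lambda v: v > 20, "over 20 pages/sec; possible memory shortage"),
    "Available Bytes": (lambda v: v < 4 * MB, "under 4MB available"),
    "% Total Processor Time": (lambda v: v > 80, "over 80% for an extended period"),
    "Processor Queue Length": (lambda v: v >= 2, "queue of 2 or more threads"),
    "% Disk Time": (lambda v: v >= 50, "disk busy 50% of the time or more"),
    "Disk Queue Length": (lambda v: v >= 2, "2 or more pending requests"),
}


def find_bottlenecks(averages):
    """Return {label: note} for every supplied average that breaks its rule.

    averages maps a label from RULES to a value averaged over the
    logging period; labels without a rule are ignored.
    """
    warnings = {}
    for label, value in averages.items():
        rule = RULES.get(label)
        if rule and rule[0](value):
            warnings[label] = rule[1]
    return warnings
```

Feeding it averages rather than single samples matters, because the thresholds above are described as sustained or average values, and a momentary spike is not a bottleneck.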
Data Collector Sets and Reports
Although the real-time view of your system is fun to watch for about a minute, to really collect data and manage it for future comparison you need to know about data collector sets and reports, which are new to Vista.
Data Collector sets allow you to put together a collection of alerts and thresholds that allow you to monitor your system immediately or over a period of time. Data collector sets can contain the following types of data collectors: performance counters, event trace data, and system configuration information (including Registry key values). You can create a data collector set from a template, from an existing set of data collectors in a Performance Monitor view, or by selecting individual data collectors and setting each individual option in the data collector set properties.
Network admin and Internet blogger
The library of data collector sets you can configure may seem a bit overwhelming. Start off by setting your performance counters and then saving that as a data collector set. Then you know exactly what you are monitoring. You'll graduate over time to the bigger items.
Performance logs are created with a .blg extension and are kept in the PerfLogs folder by default. You don't open them from the folder, but from within Performance Monitor itself. If you want to convert these .blg files to other types or you review your logs frequently to see recent data, we recommend that you use limits to automatically segment your logs. You can use the relog command to segment long log files or combine multiple short log files. Type relog /? from a command prompt to learn more about this tool.
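As a rough illustration of scripting relog, the Python sketch below builds a command line that converts or combines logs. The helper names and file names are hypothetical; the `-f`, `-o`, and `-t` switches are the ones relog /? lists for output format, output file, and writing every nth record, but check that output on your own system before relying on them.

```python
import subprocess
import sys


def relog_command(sources, output, fmt="csv", every_nth=None):
    """Build a relog command line.

    sources:   one or more input log files (several inputs are combined).
    output:    path of the file relog should write.
    fmt:       output format passed to -f (for example csv, tsv, or bin).
    every_nth: if given, keep only every nth record (-t) to thin a long log.
    """
    cmd = ["relog", *sources, "-f", fmt, "-o", output]
    if every_nth is not None:
        cmd += ["-t", str(every_nth)]
    return cmd


def run_relog(sources, output, **options):
    """Run relog; only meaningful on a Windows machine."""
    if sys.platform != "win32":
        raise RuntimeError("relog is a Windows command")
    return subprocess.run(relog_command(sources, output, **options), check=True)


# Hypothetical file names, for illustration only.
print(relog_command(["monday.blg", "tuesday.blg"], "week.csv", every_nth=4))
```

Building the argument list separately from running it makes it easy to inspect the command before anything touches your logs.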
System Diagnostics Report
Data collector sets are absolutely incredible. Diagnostic capability in Vista is much better than in previous versions. Previously, we mentioned that you can set counters, save these as a data collector set, and view your log files. This has been available for some time. But moving to the next level, you can configure a more granular view of your system. To prove it, there are system data collector sets that really impress us.
Nick White, Microsoft product manager for Windows Client, has this to say about Vista's diagnostic capabilities:
One of my favorites, as with many others here internally at Microsoft, is the ability to create a System Health Report. This report will help you diagnose your system's health and provides possible solutions to issues that may be affecting your PC's health.
To see a quick system checkup, you can use any of the three methods listed earlier: the Control Panel route, the System Diagnostics data collector set, or perfmon /report from a command prompt.
If you wanted to run a different data collector set, you could type perfmon /report "Name of Data Collector Set" to start it.
You might wonder why you should use this complicated way when you could use the simple Control Panel method. It's a fair point; however, the Control Panel method doesn't give you the other three preconfigured tests -- namely, LAN Diagnostics, System Performance, and Wireless Diagnostics. These are also great to work with from the new diagnostics tools to see how your system is doing.
System Information Tool
In scouring the world for Vista Masters, we gathered together a few additional tools and tricks for you to use. First is System Information (located under Accessories, System Tools). D. David Dugan, the president of DD&C (http://www.dugancom.com), an IT consulting and solution providing organization, has written several posts regarding the importance of the System Information tool (msinfo32.exe). This free tool will really surprise you with the level of detail regarding your system's hardware configuration, the components in your system, and the software installed (including drivers and the services that are running). If you aren't sure whether this tool has value for you, just open it one time. Just once and you will clearly see the level of immediate information that is placed before you…and you'll love it.
You'll notice a high volume of Vista tips and tricks sites on the Web, and most of them will offer the same information. Computer Power User (http://www.computerpoweruser.com) gave us this tip that was worth repeating:
Vista users have a hidden resource, systeminfo, that gives them a quick, comprehensive snapshot of their installed hardware and even minutiae such as the original installation date of the OS, BIOS version, installed and available memory, and much more. To bring up systeminfo, click Start, Run; type cmd in the Open field; and click OK. At the command window prompt, type systeminfo and press Enter.
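If you want to work with that snapshot in a script rather than read it on screen, a minimal Python sketch follows. It assumes the `/fo csv` output format of systeminfo (a header row followed by one data row), and the function names are hypothetical.

```python
import csv
import io
import subprocess
import sys


def parse_systeminfo_csv(text):
    """Turn the CSV output of systeminfo into a dict of field -> value."""
    rows = list(csv.DictReader(io.StringIO(text.strip())))
    return rows[0] if rows else {}


def read_systeminfo():
    """Run systeminfo on the local machine; only meaningful on Windows."""
    if sys.platform != "win32":
        raise RuntimeError("systeminfo is a Windows command")
    result = subprocess.run(
        ["systeminfo", "/fo", "csv"],
        capture_output=True, text=True, check=True,
    )
    return parse_systeminfo_csv(result.stdout)
```

On a Vista machine you could then look up individual fields by their column heading, for example the original install date or the BIOS version, instead of scanning the full listing.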
What's New in Task Manager
Task Manager, for many of us, is our go-to guy for problems. You have a problem; you go to Task Manager -- it's almost ingrained in us. You'll see quick and dirty information about your processes, CPU usage, memory, network, and so forth. So any changes that can benefit us are worth considering.
For one thing, the first time you start it, you see only your own processes. You can also choose to see processes from all users of the system. One thing you'll notice right away is the new Description column on the Processes tab (see Figure 8.9).
Figure 8.9: The new Task Manager adds a Description column and a Services tab.
One of the new features of Task Manager is the capability to create a minidump file of an application that is running. You can right-click an application or process that is running and select Create Dump File (refer to Figure 8.9). You will be presented with a dialog box that shows you where that file has been written. You can use this feature to discover why a particular application might be crashing so often; conversely, if a process has already crashed and is no longer responding, you can try to discover the cause.
Mitch Tulloch, a Microsoft MVP and president of MTIT Enterprises (www.mtit.com), gave some great pointers on using the new Task Manager in an article on WindowsNetworking.com (http://www.windowsnetworking.com/articles_tutorials/Managing-Processes-Tasks-Windows-Vista.html).
After you have the dump file, you need to install the symbols for Vista (which you can get at http://www.microsoft.com/whdc/devtools/debugging/symbolpkg.mspx) and then install the latest debugging tools (which you can find at http://www.microsoft.com/whdc/devtools/debugging/default.mspx). Then, as Mitch says, "Then, I can run the Windows Debugger (WinDbg), load the symbols, open the crashdump file, and try to determine what went wrong."
Obviously, that sounds a lot simpler than it really is. Reading dump files is a specialized talent that requires a bit of study and research on the Web. But there is a starter article for beginners at http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx.
What Else Can Task Manager Do?
There's still more that you can do with Task Manager. For one thing, it now has a Services tab. From here, you can see all your services, some descriptive information regarding them (description and group information), and whether they are running. You can stop or start services from here. So, now you don't have to open your Services console to simply stop or start a service. You will still need to use that console if you want to do any permanent service adjustment (disabling a service, for example).
You can also right-click an application and select the Properties option, which is new in Vista. This takes you to the properties of that particular executable so you can change things such as the Compatibility options or other aspects of the program.
Process Monitor v10.21
Some of you might already be using tools created by Mark Russinovich, such as Filemon and Regmon. These are some of the most popular tools Sysinternals has offered the world. Microsoft acknowledged the strength of these tools by acquiring Sysinternals and bringing Mark on board. They are still offered freely on the Microsoft site at http://www.microsoft.com/technet/sysinternals/.
Filemon and Regmon had some limitations, such as a lack of detailed event information, limited filtering, poor scalability, and no insight into process events. Process Monitor, on the other hand, addresses all of those limitations. It has been likened to putting Windows under an x-ray machine. The tool is free on the Microsoft TechNet site. Learning to use it may take some time, though.
You'll never have the advantage of Mark Russinovich sitting in your living room and explaining to you the inner workings of his tools, but here is the next best thing: a set of videos Mark made with David Solomon that are absolutely incredible. You can get them from the Microsoft site, or the Solomon site at http://www.solsem.com/videolibrary.html.
If the videos seem a bit pricey for you (although they're worth every penny), you can check out their book, Microsoft Windows Internals, Fourth Edition: Microsoft Windows Server 2003, Windows XP, and Windows 2000 (Pro-Developer) (hardcover; ISBN 0-7356-1917-4).
If you want to see RADAR in action, you can use Performance Monitor. RADAR divides the current level of committed virtual memory by the commit limit, which is roughly the size of physical memory plus the paging file. When the percentage reaches 100%, RADAR warns you. You can set the Memory object in Performance Monitor to track the % Committed Bytes in Use counter and the Committed Bytes and Commit Limit counters. This gives you a visual representation of what RADAR works with in the background.
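The ratio itself is simple arithmetic, as this small Python sketch shows. It only mirrors the division described above using values you would read from the Committed Bytes and Commit Limit counters; the function name and the sample figures are hypothetical, and this is not RADAR's actual implementation.

```python
GB = 1024 ** 3


def committed_percent(committed_bytes, commit_limit):
    """Committed virtual memory as a percentage of the commit limit."""
    if commit_limit <= 0:
        raise ValueError("commit limit must be positive")
    return 100.0 * committed_bytes / commit_limit


# Hypothetical readings: 3GB committed against a 4GB commit limit.
print(committed_percent(3 * GB, 4 * GB))
```

The result should track what the % Committed Bytes in Use counter reports for the same moment.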
Sometimes you might be surprised if your memory gives you a problem. Kerry Brown, Microsoft MVP, says, "Get a second opinion here http://www.memtest.org/."
About the author
J. Peter Bruzzese is an independent consultant and trainer for a variety of clients, including New Horizons and ONLC.com. Over the past 10 years, Peter has worked for and with Goldman Sachs, CommVault Systems and Microsoft, among other companies. He focuses on corporate training. Peter is the author of Tricks of the Microsoft Windows Vista Masters and writes for Redmond Magazine. He travels frequently to speak at conferences and has been an MCT since 1998.