Solution provider takeaway: In this second installment of our four-part series on server virtualization health check services, solution providers can learn how to approach three critical parts of a virtualization health check: backup, networking and unvirtualizing servers.
As part of your server virtualization health check services, you should examine what areas of your customer's environment have been exposed as a result of server virtualization. Typically, both the new infrastructure and the virtual machines within that infrastructure need to be protected. Oftentimes, customers either don't put the right tools in place before implementation, or they don't put enough thought into the backup process.
Your health check services should also examine your customer's messaging and storage networks. Because server virtualization rollouts happen very quickly, many customers compromise network security by opening up the network just to keep the process moving along. It's also important to make sure your customers have an exit strategy in case they need to unvirtualize a system. Evaluating your customer's status in these areas enables you to restore some of the data center safeguards that were in place before the virtualization project began.
The data protection strategy has to be updated when the infrastructure is virtualized, and your health check service should attempt to verify whether or not that update occurred. Unfortunately, customers often simply re-create the protection strategy they used in the physical infrastructure, usually by installing a backup agent in the virtual machine just as they would on a standalone physical machine.
The problem with this approach is that it becomes tricky to balance the load correctly when the backup process begins. If all the virtual machines start backing up at once, the resources of the physical machine can get consumed very quickly. There are few processes in the data center more resource-intensive than a backup, and running that process multiple times simultaneously on the same physical hardware does not produce reliable results.
VMware and other server virtualization providers have created multiple protection options. For a variety of reasons, these options can all be difficult to implement. As part of your health check, make sure you triple-check data protection -- no matter who implemented it. According to a survey by Applied Research - West Inc., 60% of respondents indicated that their backups had failed.
Customers often try to get by with a combination of guest OS backups and a few basic VMware utilities to get a complete snapshot of the environment. But there are many options for protecting a virtualized environment: snapshots, an NDMP dump, guest OS backup, or an advanced utility from their server virtualization provider or from a third party. For example, VMware Consolidated Backup (VCB) requires shared storage and special agents from the backup applications. It can be challenging to implement, but once installed, it provides a single protection process for the environment, integrated into the same process that the rest of the data center is using.
The server virtualization protection approach that you recommend will depend more on the customer than anything else. If your customer has budgeted for a new backup application or is upgrading its current application and has already implemented a SAN, VCB is a viable option for them. Your customer is likely to need guidance in choosing the right product and scripting to connect the solution to the backup devices; both jobs are tailor-made for integrators.
If, on the other hand, your customer has limited funds and/or does not want to change its current backup methodology, guide it toward a targeted third-party utility. The current crop of these third-party tools is relatively straightforward to implement and use. They also tend to do more than just basic data protection, offering capabilities like migration, replication and demigration. The downside is that they require a separate process from the overall data protection process in the data center, and most have no integration to tape for long-term retention.
There's a third area where you can offer protection-related services: guest OS backup. You can redesign a customer's backup schedules so that the backups are timed for minimal impact on the physical machines during the backup. As part of the service you should design in a series of backups to protect the core configuration of the virtual infrastructure itself. And you could plan where future virtual machines should land in the schedule.
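The schedule redesign described above can be sketched in a few lines of Python. This is only an illustration: the inventory format (a list of VM name and host name pairs), the function name stagger_backups and the one-hour slot length are all assumptions, not part of any backup product. The idea is simply that no physical host should run more than a set number of guest backups in the same time slot.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def stagger_backups(vms, window_start, slot_minutes=60, max_per_host=1):
    """Assign each VM a backup start time so that no physical host runs
    more than max_per_host guest backups in the same slot.

    vms is a list of (vm_name, host_name) tuples; this inventory format
    is hypothetical and would come from the customer's own records.
    """
    per_host_count = defaultdict(int)  # VMs already scheduled on each host
    schedule = {}
    for vm_name, host_name in vms:
        # Fill the current slot on this host, then move to the next slot.
        slot = per_host_count[host_name] // max_per_host
        schedule[vm_name] = window_start + timedelta(minutes=slot * slot_minutes)
        per_host_count[host_name] += 1
    return schedule
```

A new virtual machine can be planned into the schedule by adding it to the inventory list and rerunning the function; VMs on the same host are pushed into later slots rather than stacked on top of each other.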
Conducting a server virtualization health check on a customer's networking infrastructure is a little more complicated than the data protection check; it involves both the messaging network and the storage network (iSCSI, Fibre Channel or NAS). From a health check service perspective, you should look for networks that have been opened up too wide, where too many machines have access to too many storage pools or where the messaging network has been collapsed to a single network and subnets have been essentially eliminated.
For example, SAN storage LUNs are commonly assigned to multiple physical machines in the virtual cluster, if not all of them, because it's nice knowing that any virtual machine can be moved to any other physical host -- and then there's the "wow" factor in such a measure. But in reality, this kind of flexibility isn't needed. In today's infrastructure, enabling migration of VMs to just one or two other hosts in the environment should be sufficient.
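A quick way to spot this kind of overexposure during a health check is to tally how many hosts can see each LUN. The Python sketch below assumes you have already exported the LUN-to-host mappings into a simple dictionary; the data format, the function name find_overexposed_luns and the default limit of three hosts (a home host plus two migration targets) are illustrative assumptions, not output from any particular SAN tool.

```python
def find_overexposed_luns(lun_map, max_hosts=3):
    """Return (lun_id, host_count) pairs for LUNs visible to more than
    max_hosts physical hosts, sorted by LUN id.

    lun_map is a dict of lun_id -> set of host names; the format is
    hypothetical and would be built from the customer's SAN zoning
    or masking export.
    """
    return sorted(
        (lun_id, len(hosts))
        for lun_id, hosts in lun_map.items()
        if len(hosts) > max_hosts
    )
```

The resulting list gives you a starting point for the conversation about which mappings can be trimmed without giving up the migration paths the customer actually uses.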
Once such network weaknesses are identified, the health check should make recommendations for building walls in the network to improve security and routing. The network is also an ideal point to recommend an infrastructure virtualization tool that would allow customers to power on cold machines and to make real-time storage and network connections.
The next part of the health check service is to determine whether your customer has a viable exit strategy -- meaning, the ability to move from virtual back to physical, also known as V2P. This may sound like sacrilege. Why would you ever want to leave the safe confines of a virtualized world and go back to the antiquated one-app-per-machine days? There are three common reasons: The first has to do with support issues, the second stems from the need to provide dedicated processing power and the third is disaster recovery planning.
The key is to help your customer identify where they might need to unvirtualize a server as well as the best tool to get them there. While a standalone utility will do the trick for many customers, they can also leverage an infrastructure virtualization tool.
As part of your health check service, you should identify workloads that might need to be unvirtualized. Being able to unvirtualize all systems simplifies troubleshooting problems, but there are probably specific virtual machines where this would be a high priority -- for example, machines running applications that aren't officially supported in the virtual environment, such as Oracle databases or proprietary vertical applications.
You should also try to identify applications that at certain times could be greatly enhanced by running on their own physical systems. A good example is a batch processing application for payroll; it sits idle most of the time and then every pay period has a huge spike in processing demand. A tool like those from Tek-Tools or Akorri can be very valuable in identifying such applications, or you can count on the instincts of the IT staff to help find them.
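If no monitoring tool is in place, a rough first pass is to compare each virtual machine's peak CPU utilization with its average. The Python sketch below assumes you can export utilization samples per VM into a dictionary; the data format, the function name find_bursty_workloads and the 3-to-1 threshold are assumptions chosen for illustration, not values taken from any vendor's product.

```python
from statistics import mean


def find_bursty_workloads(cpu_samples, min_ratio=3.0):
    """Return (vm_name, peak_to_average_ratio) pairs for VMs whose peak
    CPU utilization is at least min_ratio times their average, with the
    most bursty VM first.

    cpu_samples is a dict of vm_name -> list of CPU utilization
    percentages; the format is hypothetical.
    """
    results = []
    for vm_name, samples in cpu_samples.items():
        if not samples:
            continue  # no data collected for this VM
        average = mean(samples)
        if average == 0:
            continue  # a VM that is always idle is not a candidate
        ratio = max(samples) / average
        if ratio >= min_ratio:
            results.append((vm_name, round(ratio, 2)))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

A mostly idle payroll server that spikes once per pay period shows a high ratio, while a steadily busy Web server does not. Treat the output as a shortlist to review with the IT staff rather than a final answer.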
Finally, in a disaster recovery situation, many customers will use server virtualization to optimize their DR site and to keep associated costs down. In this case they may replicate standalone physical servers to a virtualized server in the DR site. In the event of a failure they may want the ability to move that virtual version to a standalone machine quickly and easily. A tool that can do V2P is ideal here as well. Your goal should be to identify servers that are being replicated to the virtual environment at the DR site, which should be relatively straightforward, and map out a plan of how they will be used at the DR site. If they will stay virtualized, make sure there is capacity for that; if they need to become physical machines, make sure there is a process and tools in place to enable that.
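The two checks in that plan, capacity for servers that stay virtualized and a V2P process for those that must become physical, can be captured in a simple inventory review. The Python sketch below is a minimal illustration; the record fields (name, ram_gb, dr_mode, v2p_tested), the function name review_dr_plan and the use of RAM as the only capacity measure are all assumptions, and a real review would also consider CPU, storage and network capacity.

```python
def review_dr_plan(servers, dr_ram_capacity_gb):
    """Summarize gaps in a DR plan for replicated servers.

    servers is a list of dicts with hypothetical fields:
      name       - server name
      ram_gb     - memory the server needs at the DR site
      dr_mode    - "virtual" (stays virtualized) or "physical" (needs V2P)
      v2p_tested - True if a V2P process and tool are in place
    """
    virtual_ram = sum(s["ram_gb"] for s in servers if s["dr_mode"] == "virtual")
    missing_v2p = sorted(
        s["name"]
        for s in servers
        if s["dr_mode"] == "physical" and not s.get("v2p_tested", False)
    )
    return {
        "virtual_ram_needed_gb": virtual_ram,
        "capacity_shortfall_gb": max(0, virtual_ram - dr_ram_capacity_gb),
        "missing_v2p_process": missing_v2p,
    }
```

A nonzero shortfall or a nonempty list of servers without a V2P process gives you concrete findings to put in the health check report.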
The next installment of our series on server virtualization health check services will detail ways you can help your customers expand their virtualization. While this expansion may sound counterintuitive given the economic situation, remember that virtualization offers strong ROI, and the more it's implemented, the more cost-effective it becomes.
About the author
George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for data centers across the United States, he has seen the birth of such technologies as RAID, NAS and SAN. Prior to founding Storage Switzerland, George was chief technology officer at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.