Solution provider takeaway: In this first installment of our four-part series on virtualization health checks, solution providers will learn about developing their own virtualization health check services and how to execute on the first phase, gap analysis.
Virtualization is being rolled out by your customers faster than any other IT initiative in recent years -- maybe ever. And beyond the obvious opportunity you have to help customers with these rollouts, there's another, potentially larger opportunity in the form of a virtualization health check. Server virtualization implementation is typically a fast-tracked project at most companies, and fast-tracked projects are characterized by guesses, assumptions and shortcuts. This dynamic presents an excellent opportunity for you to fine-tune customers' virtualization projects after -- and even during -- implementation.
To prepare for offering virtualization health check services, you need to do some internal work. First, you should assess where the need is among existing customers. You can do this by meeting with your engineering team to discuss what's lacking in the virtualization projects they're involved in. Then, figure out whether you can address these needs with existing staff. Do you have the knowledge internally to deliver this service? If offering this service requires training or additional hires, you'll have to weigh those costs against the potential of the service.
Assuming that the cost/benefit analysis confirms it makes sense to continue along the path to virtualization health check services, you should then determine what the service entails and formalize the service with a brochure that salespeople and engineers would use when rolling out the service to prospective customers. Virtualization health checks won't sell themselves, especially since they optimize something the customer believes is working.
Virtualization project goals vs. reality
In general, most customers begin a virtualization project with the same broad goals: reducing physical server count while increasing resource utilization; improving the organization's disaster recoverability while reining in its costs; reducing application outages due to server maintenance; shortening the time it takes to create a new server; and gaining some level of power/cooling efficiency.
A server virtualization health check should assess how well your customer's goals measure up with reality. Your assessment of the difference between the two is the gap analysis; when you present the gap analysis to your customer, you should accompany that analysis with recommendations for filling the gap. In addition, you could point to other advantages your customers may not be aware of because they are not "thinking virtually" and are limiting themselves to the constraints of a physical world. For example, your customer may be thinking of virtual machine migration only as a solution for availability, whereas they could easily benefit by using their virtual infrastructure to load-balance virtual machine use during peak server loads.
It's important to note that a customer's virtualization project doesn't need to be complete (most virtualization projects are still in progress) before you suggest a virtualization health check, since it's key to measure progress along the path, just as you would check your map occasionally on a long road trip to make sure you're traveling in the right direction and at the right pace. The health check should be presented as an independent validation of a customer's progress along the road to complete virtualization. The health check should also be performed regularly. As the virtualization project progresses, so will technology; what might have been impractical at the beginning of the project may become viable along the way.
After presenting initial recommendations to the customer, you should offer two additional phases of the health check: examining exposures created by the virtualization project and examining how well the customer is leveraging their virtual environment and how that environment can be extended. We'll cover those phases in more detail in later articles.
Drilling down on the gap analysis
So, how to conduct the gap analysis? First, keep in mind that while the gap analysis should compare the before and after of the virtualization project, oftentimes you won't have a before picture to use as a baseline. That's because many virtualization projects start rather randomly, and companies keep little documentation of the before state.
In many cases, the before state is no longer important, since it no longer exists. What is important is how much further the customer can go in achieving their goal. For example, if one of the goals was to optimize server utilization, you should measure how much further they can go from where they are now. By virtualizing more standalone servers or by decommissioning parts of the virtual infrastructure that were overallocated, you may be able to optimize the environment by another 50%.
Or maybe the client would be better served by a three-node virtual cluster instead of a four-node one. Eliminating that fourth node reduces power and cooling as well as licensing costs. The key goal of resource optimization is to use as much processing power as possible while leaving room for server migration and peak processing loads.
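That "room for migration and peak loads" test can be reduced to simple arithmetic. The sketch below is a hypothetical illustration -- the function name, capacity figures and 25% headroom reserve are all assumptions, not measurements from any real environment -- but it shows the kind of check that tells you whether a fourth node is really needed.

```python
# Hypothetical sketch: can a virtual cluster drop a node and still keep
# enough headroom for VM migration and peak processing loads? All
# numbers below are illustrative assumptions.

def cluster_fits(node_capacity_ghz, num_nodes, peak_vm_load_ghz, headroom=0.25):
    """Return True if `num_nodes` hosts can carry the summed peak VM load
    while reserving `headroom` (a fraction of total capacity) for
    migrations and load spikes."""
    total_capacity = node_capacity_ghz * num_nodes
    usable_capacity = total_capacity * (1 - headroom)
    return peak_vm_load_ghz <= usable_capacity

per_node_ghz = 24.0   # assumed aggregate CPU capacity of one host
peak_load_ghz = 50.0  # assumed summed peak CPU demand of all VMs

print(cluster_fits(per_node_ghz, 4, peak_load_ghz))  # True
print(cluster_fits(per_node_ghz, 3, peak_load_ghz))  # True -- the fourth node is surplus
```

If three nodes still clear the bar with headroom intact, that is the data point your gap analysis report can use to justify decommissioning the fourth.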
While much of this utilization tracking can be done manually, that approach is time-consuming and you may not capture every aspect of the environment. Tools from companies like Tek-Tools or Akorri ease resource utilization monitoring. Some companies will even offer the software on a rental basis for use in a snapshot type of project such as health checks.
Disaster recovery site optimization is another commonly measured area. It's typical for a lack of disaster recovery optimization to crop up in the gap analysis simply because the customer didn't have the skill set or time (or both) to address it. This section of your report, then, could recommend that your organization carry the ball forward. It should also note what the customer environment needs in order to implement a DR site that leverages virtualization -- such as shared storage, connectivity, automation and, obviously, a remote site. If you're an integrator that offers products, be prepared to offer all of these.
If your customer has implemented a DR strategy that leverages virtualization, be prepared to examine the server load and the DR site's response capabilities. Customers often "overload" the expectations of the remote virtual host by planning to run too many virtual machines on it. In a real disaster, where they can't return to the primary data center for a few weeks, these overloaded systems are not practical: most of the remote virtual machines can't be started without risking bringing down the other systems.
This predicament is essentially another capacity management issue -- only it occurs at a seldom-used remote site. Your report should point out that overutilization could be a problem if most of the virtual machines are started -- and offer alternatives. Many customers overload these primary DR systems to save on power and cooling costs in the remote facility. To address DR needs without incurring those costs, customers could deploy additional remote physical systems with virtualization software preloaded but the servers kept idle. In the event of a disaster, your customer would remotely power on these idle servers and start reallocating virtual machines off of the primary DR host. Infrastructure virtualization products like those from Scalent Systems or Egenera can simplify this operation.
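Spotting an overloaded DR host on paper is, again, straightforward arithmetic. The following sketch is purely illustrative -- the VM sizes, host RAM and hypervisor reserve are assumed figures you would replace with the customer's own inventory -- but it captures the check worth running during the health check.

```python
# Hypothetical sketch: flag a DR host that is overloaded on paper --
# i.e., the VMs planned for failover need more RAM than the host has.
# All sizes are illustrative assumptions.

def dr_ram_shortfall(host_ram_gb, vm_ram_gb, hypervisor_reserve_gb=8):
    """Return the RAM shortfall in GB if every planned VM is started,
    keeping `hypervisor_reserve_gb` for the hypervisor itself.
    A result of 0 means the failover plan fits."""
    needed = sum(vm_ram_gb) + hypervisor_reserve_gb
    return max(0, needed - host_ram_gb)

planned_vms_gb = [16, 16, 8, 8, 4, 4, 4]  # assumed RAM per failover VM

print(dr_ram_shortfall(64, planned_vms_gb))   # 4 -- overloaded by 4 GB
print(dr_ram_shortfall(128, planned_vms_gb))  # 0 -- plan fits
```

A nonzero shortfall is exactly the finding that justifies the idle standby servers described above.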
Another goal of virtualization -- leveraging server virtualization to reduce application or server outages -- essentially involves virtual machine migration. While migration itself is fairly straightforward, it is often left undone on the project's task list or opens up areas of exposure. Most commonly, the customer either doesn't have shared storage -- a key migration requirement -- or doesn't have the experience to configure that storage to properly support virtual machine migration. As with disaster recoverability, this section of the virtualization health check recommendations may be as simple as proposing additional products and services to complete that phase.
Similarly, if a customer has implemented server migration, make sure they have properly planned for shifts in resource requirements. You should also examine how they have assigned the shared storage resource. For speed of deployment, customers often open up all the storage to the entire cluster of physical hosts. To address this, you should recommend limiting each physical host's storage visibility to what it actually needs. It also makes sense to train the customer to set up these connections manually or to use a virtual infrastructure tool that handles them programmatically, in real time, as needed.
Of all the goals in a server virtualization project, measuring gaps between power savings goals and reality is the most difficult. Most IT departments have little understanding of how the data center affects a company's overall power consumption and certainly don't have the initial power bill to use as a baseline. As a result, you'll need to use different metrics -- for example, the number of servers planned for purchase but not bought or the number of servers decommissioned.
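Those proxy metrics can still be turned into a dollar figure. The sketch below is a rough, hypothetical model -- the per-server wattage, the cooling overhead factor and the electricity rate are assumptions to be replaced with the customer's own numbers -- but it shows how a count of decommissioned or never-purchased servers becomes an estimated annual savings.

```python
# Hypothetical sketch: estimate annual power/cooling savings from
# servers that were decommissioned or never purchased. Wattage, cooling
# overhead and electricity rate are illustrative assumptions.

def annual_power_savings(servers_removed, watts_per_server=400,
                         cooling_factor=1.5, cost_per_kwh=0.10):
    """Rough annual savings in dollars: server draw, multiplied by a
    cooling overhead factor, over 8,760 hours per year."""
    kw = servers_removed * watts_per_server / 1000
    kwh_per_year = kw * cooling_factor * 8760
    return kwh_per_year * cost_per_kwh

# e.g., 20 servers decommissioned or avoided through virtualization
print(round(annual_power_savings(20)))
```

Even a rough estimate like this gives the gap analysis a concrete number where the customer has no baseline power bill to work from.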
In addition to the top five goals in a server virtualization project, your customer likely has other goals that will need to be measured in the virtualization health check. Gauge these the same way as the top five goals, looking at the original goal, determining how much of that goal was accomplished, and, based on your experience, asking what can be done to further complete that goal.
In the next article, we will look at challenges presented by the virtualization project, some of which are inherent to virtualization and some of which are caused by the speed at which the project is implemented.
About the author
George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for data centers across the United States, he has seen the birth of such technologies as RAID, NAS and SAN. Prior to founding Storage Switzerland, George was chief technology officer at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection. Find Storage Switzerland's disclosure statement here.
This was first published in October 2008