Solution provider takeaway: Solution providers that offer virtualization services can prevent or correct virtual machine (VM) sprawl using a variety of management tools.
Server virtualization has been very successful for both the customers that have implemented it and the resellers that have become experts at it. But like any technology, it brings its own set of problems. Virtual machine (VM) sprawl is one of those problems.
Prior to server virtualization, the rollout time for new applications and servers was governed by physical factors. In response to a new server/application request, and assuming the budget for a new server was in place, your customer had to size the server, procure it, wait for it to arrive, schedule time with the networking and storage teams to connect everything, load the OS and install the application before handing it over to the user. At most companies, this process took 30 to 45 days. As a result, they could deploy only so many servers so fast and were forced to make compromises. Most commonly, lower-priority applications started life on a shared server and never justified the processing requirements to move off of it.
In a virtualized environment, on the other hand, servers can be deployed from templates in hours, with little if any interaction needed from the networking or storage teams -- and no need to wait on the budget approval and procurement process. In many virtualized environments, a new server can be deployed in two to three hours. There is little question that this is a very good thing, and integrators have been doing an excellent job of helping this process along.
The result of this easy rollout scenario is massive growth in server counts (although the growth is in virtual servers), which detracts from the efficiencies your customers initially gained from server virtualization. We know of one customer that less than four years ago had 40 servers (all physical) but now has 1,200 servers (fewer than 40 of them physical) -- and that ratio is typical of companies that have implemented server virtualization. Ironically, many of these virtual servers are idle because they were created for a point-in-time need and/or a test scenario. But an idle server -- physical or virtual -- still consumes resources, whether processing and memory on the virtualization host, capacity on the storage platform or bandwidth dedicated to data replication, if that's been implemented.
Tools for managing VM sprawl
Customers are trying to manage this virtual environment manually with tools like Excel or Word. While these applications make sense in tracking the slow-rollout physical environment, they are no match for the virtual. The virtual environment by its very nature changes too quickly to be manually tracked and updated. In a virtual environment, a weekly update to an Excel or Word document may have to account for hundreds of changes and be out of date before the update is even completed. It's obvious: Your customers need tools that automatically capture, monitor and trend this information so they and you can make real-time decisions on how the virtual infrastructure should be managed.
Companies such as Tek-Tools, Monosphere, Akorri and PlateSpin have tools on the market that can monitor and manage VM sprawl. You can provide these tools, which are similar to those for server management and storage resource management, as part of a quarterly virtualization health check service (more on this in a future article), or the software can be sold directly to customers for their own use.
While most of these products are marketed as VMware tools, they often aren't specifically promoted for managing VM sprawl. They do, however, provide the information needed to help prevent it. They start by trending current resource utilization from a per-VM perspective and then aggregate that data at the virtualization host level. The tools can chart processor utilization over time so you can gauge peak times, idle times and each VM's impact on the virtualization host.
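The kind of per-VM trending these tools perform can be sketched in a few lines. This is a hypothetical illustration, not code from any of the products mentioned; the VM names, sample data and the 5% idle threshold are made-up assumptions.

```python
# Hypothetical sketch: trend per-VM CPU samples and flag idle VMs.
# VM names, samples and the 5% idle threshold are illustrative
# assumptions, not values from any specific monitoring product.

def trend(samples):
    """Return (average, peak) utilization for a list of CPU samples (%)."""
    return sum(samples) / len(samples), max(samples)

vms = {
    "web01":  [55, 72, 91, 60],   # busy production VM
    "test07": [1, 0, 2, 1],       # forgotten test VM
}

for name, samples in vms.items():
    avg, peak = trend(samples)
    status = "idle" if peak < 5 else "active"
    print(f"{name}: avg={avg:.1f}% peak={peak}% -> {status}")
```

A real tool would collect these samples continuously and trend them over weeks, but the decision logic -- compare peak and average use against thresholds -- is the same.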
These products will also give a similar view on storage resource and network resource use, trending overall bandwidth use as well as peak and idle time identification. This trend information is then mapped from the virtual to physical worlds, allowing for critical path decisions to be made to optimize the physical environment and reduce the sprawl of the virtual environment. Those decisions essentially boil down to VMs being grouped into one of the following three categories.
- VMs that aren't using significant resources of the current virtualization host but are still active. These VMs should be moved to a less powerful, less resource-consuming host. This host could also be "greener," saving your customers power.
- VMs that are being throttled due to constraints in their current host. These VMs should be moved up to a better-performing host. Some of the tools from the companies mentioned above can provide "what if" load simulators so that the impact of these moves can be examined beforehand.
- VMs that aren't being used or are rarely used. These should either be consolidated or archived to free the resources they hold. Consolidation combines the light workloads running on several VMs into a single VM; archiving moves the unused VM's image out of the production environment altogether.
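The three-way triage above can be sketched as a simple decision rule. The 5% and 80% thresholds and the two inputs are illustrative assumptions; real tools derive these cutoffs from the trended monitoring data described earlier.

```python
# Hypothetical sketch of the three-way VM triage. The thresholds
# (5% and 80%) and inputs are illustrative assumptions, and real
# decisions would weigh storage and network trends as well as CPU.

def categorize(peak_cpu, host_cpu):
    """Map a VM's peak CPU % and its host's sustained CPU % to an action."""
    if peak_cpu < 5:
        return "consolidate-or-archive"   # unused or rarely used VM
    if host_cpu > 80:
        return "move-to-stronger-host"    # VM throttled by a busy host
    return "move-to-lighter-host"         # active but undemanding VM

print(categorize(peak_cpu=2, host_cpu=30))
```

The "what if" load simulators mentioned above serve the same purpose as the second branch: confirming that a candidate host can absorb the VM before the move is made.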
Consolidation can be accomplished using tools typically built right into the virtualization product. In VMware, for example, the vcbMounter utility can export a backup copy of a virtual machine. The program is invoked at the command line or from a script, with parameters indicating the name of the virtual machine to archive and the destination directory for the archive. vcbMounter is typically used for backup to disk, but it's also ideal for sending the virtual machine image to a disk-based archive: It exports all the files needed to re-create or restore the virtual machine, producing a complete, file-system-consistent copy. If the archived virtual machine is needed again, it can be quickly recovered from the disk archive using VMware's vcbRestore utility, which imports a copy created by vcbMounter and can completely recover a virtual machine to its original state on the original or an alternate ESX server host.
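For scripted use, the export and restore commands can be assembled like this. The host name, credentials, VM name and paths are placeholders; the flags follow VCB's documented full-VM backup usage, but verify them against your VCB release before relying on them.

```python
# Illustrative sketch: build vcbMounter/vcbRestore command lines for
# use in a script. Host, credentials, VM name and paths below are
# placeholders; the flags (-h, -u, -p, -a, -r, -t, -s) follow VCB's
# documented usage but should be checked against your VCB release.

def export_cmd(host, user, password, vm, dest):
    """vcbMounter full-VM export of `vm` into directory `dest`."""
    return ["vcbMounter", "-h", host, "-u", user, "-p", password,
            "-a", f"name:{vm}", "-r", dest, "-t", "fullvm"]

def restore_cmd(host, user, password, src):
    """vcbRestore import of a previously exported VM from `src`."""
    return ["vcbRestore", "-h", host, "-u", user, "-p", password,
            "-s", src]

# subprocess.run(export_cmd(...), check=True) would run the export
# on a VCB proxy; the command lists are only constructed here.
print(" ".join(export_cmd("esx1", "admin", "secret",
                          "test07", "/archive/test07")))
```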
VM archiving, on the other hand, involves moving the VM instance to a disk archive. Disk is ideal because it allows for rapid recovery of VMs. Also, if the disk archive has data deduplication capability, it can store many virtual images because there is a high degree of redundant data in a typical virtual environment. Companies like Permabit, Data Domain and Copan make products that are ideal for disk-based archiving. The speed of the disk archive also enables a more aggressive (possibly even seasonal) archive of VMs. For example, many servers are only needed at peak times in a quarter or during the year, but in a nonvirtualized environment, it's too difficult to run through the full cycle -- from deployment to un-deployment to re-deployment -- to make better use of those servers when they're not in use. With virtualization, proper monitoring tools and disk archiving of VMs, the cycle can be efficient and painless. With the archive, the entire state of the machine is saved and it merely needs to be accessible. Most disk archives are accessed by a simple network mount point and most virtualization environments have a means of importing a virtual server image from a disk archive via that mount point.
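The reason deduplicating archives hold so many VM images can be seen with a toy example: Images built from the same guest OS share most of their blocks, and a deduplicating store keeps each unique block only once. The chunk size and the two tiny "images" below are made-up example data, far smaller than real dedupe chunks.

```python
# Toy illustration of why deduplicating disk archives can hold many
# VM images: fixed-size chunks are hashed, and duplicate chunks are
# stored once. Chunk size and the two "images" are made-up examples.
import hashlib

def unique_chunks(images, chunk=4):
    """Count total chunks across images vs. unique chunks stored."""
    seen, total = set(), 0
    for data in images:
        for i in range(0, len(data), chunk):
            total += 1
            seen.add(hashlib.sha256(data[i:i + chunk]).hexdigest())
    return total, len(seen)

# Two VM images that share most of their content (same guest OS blocks).
a = b"OSOSOSOSOSOSAPP1"
b = b"OSOSOSOSOSOSAPP2"
total, unique = unique_chunks([a, b])
print(f"{total} chunks stored as {unique} unique")  # 8 chunks stored as 3 unique
```

Real deduplication products use variable-size chunking and far larger blocks, but the effect is the same: Dozens of archived VMs built from the same template consume little more space than one.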
About the author
George Crump is president and founder of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments. With 25 years of experience designing storage solutions for data centers across the United States, he has seen the birth of such technologies as RAID, NAS and SAN. Prior to founding Storage Switzerland, George was chief technology officer at one of the nation's largest storage integrators, where he was in charge of technology testing, integration and product selection.