IT Channel.com

Enabling VMware HA, DRS: Advanced vSphere features

By Eric Siebert

Solution provider's takeaway: vSphere features such as VMware HA and DRS may be more complicated than you think. Check out how to enable and configure each feature to optimize the resources in your customer's environment.

Advanced features are what really add value to vSphere and help distinguish it from its competitors. The features covered in this chapter provide protection for virtual machines (VMs) running on ESX and ESXi hosts, as well as optimize resources and performance and simplify VM management. These features typically have many requirements, though, and they can be tricky to set up properly. So, make sure you understand how they work and how to configure them before using them.

High Availability (HA)

HA is one of ESX's best features and is a low-cost alternative to traditional server clustering. HA does not provide 100% availability of VMs, but rather provides higher availability by rapidly recovering VMs on failed hosts. The HA feature continuously monitors all ESX and ESXi hosts in a cluster, detects failures, and automatically restarts VMs on other hosts in the cluster when a host fails.

How HA works

HA is based on a modified version of the EMC/Legato Automated Availability Manager (AAM) 5.1.2 product that VMware bought to use with VMware VI3. HA works by taking a cluster of ESX and ESXi hosts and placing an agent on each host to maintain a "heartbeat" with the other hosts in the cluster; loss of a heartbeat initiates a restart of all affected VMs on other hosts. vCenter Server is not a single point of failure for this feature, which will continue to work even if the vCenter Server is unavailable. In fact, if the vCenter Server goes down, HA clusters can still restart VMs on other hosts; however, information regarding availability of extra resources will be based on the state of the cluster before the vCenter Server went down.

HA monitors whether sufficient resources are available in the cluster at all times in order to be able to restart VMs on different physical host machines in the event of host failure. Safe restart of VMs is made possible by the locking technology in the ESX Server storage stack, which allows multiple ESX Servers to have access to the same VM's file simultaneously. HA relies on what are called "primary" and "secondary" hosts; the first five hosts powered on in an HA cluster are designated as primary hosts, and the remaining hosts in the cluster are considered secondary hosts. The job of the primary hosts is to replicate and maintain the state of the cluster and to initiate failover actions. If a primary host fails, a new primary is chosen at random from the secondary hosts. Any host that joins the cluster must communicate with an existing primary host to complete its configuration (except when you are adding the first host to the cluster). At least one primary host must be functional for VMware HA to operate correctly. If all primary hosts are unavailable, no hosts can be successfully configured for VMware HA.

HA uses a failure detection interval that is set by default to 15 seconds (15000 milliseconds); you can modify this interval by using an advanced HA setting:

das.failuredetectiontime = 15000

A host failure is detected after the HA service on a host has stopped sending heartbeats to the other hosts in the cluster. A host stops sending heartbeats if it is isolated from the network, it crashes, or it is completely down due to a hardware failure. Once a failure is detected, the other hosts in the cluster treat the host as failed, while the host declares itself isolated from the network. By default, the isolated host leaves its VMs powered on, but the isolation response is configurable on a per-VM basis. These VMs can then successfully fail over to other hosts in the cluster. HA also has a restart priority that can be set for each VM so that certain VMs are started before others. This priority can be set to low, medium, or high, and can also be disabled so that VMs are not automatically restarted on other hosts.
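As an illustrative sketch only (not VMware's actual agent code; the function and variable names here are hypothetical), the failure detection interval amounts to a simple timeout check against the last heartbeat received from a host:

```python
# Hypothetical sketch of HA's failure detection check; the real logic is
# internal to the AAM-based HA service on each host.
FAILURE_DETECTION_MS = 15000  # default das.failuredetectiontime value


def host_considered_failed(last_heartbeat_ms: float, now_ms: float,
                           detection_interval_ms: int = FAILURE_DETECTION_MS) -> bool:
    # A host is treated as failed once no heartbeat has arrived within
    # the configured detection interval.
    return (now_ms - last_heartbeat_ms) > detection_interval_ms
```

Raising das.failuredetectiontime makes HA slower to declare a failure but less prone to false triggers on a congested or briefly interrupted network.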

The HA feature was enhanced starting with ESX 3.5, and now provides VM failure monitoring in case of operating system failures such as the Windows Blue Screen of Death (BSOD). If an OS failure is detected due to loss of a heartbeat from VMware Tools, the VM will automatically be reset on the same host so that its OS is restarted. This new functionality allows HA to also monitor VMs via a heartbeat that is sent every second when using VMware Tools, and further enhances HA's ability to recover from failures in your environment.

When this feature was first introduced, it was found that VMs that were functioning properly occasionally stopped sending heartbeats, which caused unnecessary VM resets. To avoid this scenario, the VM monitoring feature was enhanced to also check for network or disk I/O activity on the VM. Once heartbeats from the VM have stopped, the I/O stats for the VM are checked. If no activity has occurred in the preceding two minutes, the VM is restarted. You can change this interval using the HA advanced setting
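The combined heartbeat-plus-I/O check can be sketched as follows (an illustration with hypothetical names, not VMware's implementation):

```python
# Hypothetical sketch of the VM monitoring decision: a VM whose VMware Tools
# heartbeat has stopped is only reset if it has also shown no disk or
# network I/O within the das.iostatsInterval window.
DEFAULT_IOSTATS_INTERVAL_S = 120  # two minutes, the default interval


def should_reset_vm(heartbeat_lost: bool, seconds_since_last_io: float,
                    iostats_interval_s: int = DEFAULT_IOSTATS_INTERVAL_S) -> bool:
    # Both conditions must hold: no heartbeat AND no recent I/O activity.
    return heartbeat_lost and seconds_since_last_io >= iostats_interval_s
```

The I/O check is what prevents the unnecessary resets described above: a VM that is still doing disk or network I/O is assumed to be alive even if its Tools heartbeat has stalled.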

das.iostatsInterval

VMware enhanced this feature even further in version 4.1 by adding application monitoring to HA. With application monitoring, an application's heartbeat will also be monitored, and if it stops responding, the VM will be restarted. However, unlike VM monitoring, which relies on a heartbeat generated by VMware Tools, application monitoring requires that an application be specifically written to take advantage of this feature. To do this, VMware has provided an SDK that developers can use to modify their applications to take advantage of this feature.

Configuring HA

HA may seem like a simple feature, but it's actually rather complex, as a lot is going on behind the scenes. You can set up the HA feature either during your initial cluster setup or afterward. To configure it, simply select the cluster on which you want to enable HA, right-click on it, and edit the settings for it. Put a checkmark next to the Turn On VMware HA field on the Cluster Features page, and HA will be enabled for the cluster. You can optionally configure some additional settings to change the way HA functions. To access these settings, click on the VMware HA item in the Cluster Settings window.

The Host Monitoring Status section is new to vSphere and is used to enable the exchange of heartbeats among hosts in the cluster. In VI3, hosts always exchanged heartbeats if HA was enabled, and if any network or host maintenance was being performed, HA could be triggered unnecessarily. The Enable Host Monitoring setting allows you to turn this on or off when needed. For HA to work, Host Monitoring must be enabled. If you are doing maintenance, you can temporarily disable it.

The Admission Control section allows you to enable or disable admission control, which determines whether VMs will be allowed to start if, by doing so, they will violate availability constraints. When Admission Control is enabled, any attempt to power on a VM when there is insufficient failover capacity within the cluster will fail. This is a safety mechanism to ensure that enough capacity is available to handle VMs from failed hosts. When Admission Control is disabled, VMs are allowed to power on even if doing so reduces the spare capacity needed to handle VMs from failed hosts. If you do disable Admission Control, HA will still work, but you may experience issues when recovering from a failure event if you do not have enough resources on the remaining hosts to handle the VMs that are being restarted.

The Admission Control Policy section allows you to select a type of policy to use. The three available policies are described in the sections that follow.

Host Failures Cluster Tolerates

This is used to ensure that there is sufficient capacity among the remaining host servers to be able to handle the additional load from the VMs on failed host servers. Setting the number of host failures allowed will cause the cluster to continuously monitor that sufficient resources are available to power on additional VMs on other hosts in case of a failure. Specifically, only CPU and memory resources are factored in when determining resource availability; disk and network resources are not. You should set the number of host failures allowed based on the total number of hosts in your cluster, their size, and how busy they are.


vCenter Server supports up to four host failures per cluster; if all five primaries were to fail simultaneously, HA would not function properly. For example, if you had four ESX hosts in your cluster, you would probably only want to allow for one host failure; if you had eight ESX hosts in your cluster, you might want to allow for two host failures; and if you had a larger cluster with 20 ESX hosts, you might want to allow for up to four host failures. This policy uses a slot size to determine the necessary spare resources to support the number of host failures that you select. A slot is a logical representation of the memory and CPU resources that satisfy the requirements for any powered-on VM in the cluster. HA automatically calculates slot sizes using CPU and memory reservations, and then the maximum number of slots that each host can support is determined. It does this by dividing the host's CPU resource amount by the CPU component of the slot size, and rounds down the result. The same calculation is made for the host's memory resource amount. The two numbers are then compared, and the lower number is the number of slots that the host can support. The failover capacity is computed by determining how many hosts (starting from the largest) can fail and still leave enough slots to satisfy the requirements of all powered-on VMs. Slot size calculations can be confusing and are affected by different things. For more information on slot sizes, see the vSphere Availability Guide.
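The slot arithmetic above can be sketched in a few lines of Python. This is an illustration with hypothetical numbers and function names, not vCenter's actual code:

```python
# Hypothetical sketch of HA slot-based failover capacity. Slot CPU/memory
# components are derived by HA from VM reservations; here they are inputs.


def slots_per_host(host_cpu_mhz: int, host_mem_mb: int,
                   slot_cpu_mhz: int, slot_mem_mb: int) -> int:
    # Divide each host resource by the slot component, round down,
    # and take the lower of the two numbers.
    cpu_slots = host_cpu_mhz // slot_cpu_mhz
    mem_slots = host_mem_mb // slot_mem_mb
    return min(cpu_slots, mem_slots)


def failover_capacity(hosts, slot_cpu_mhz, slot_mem_mb, powered_on_vms):
    # hosts: list of (cpu_mhz, mem_mb). Determine how many hosts, starting
    # from the largest, can fail while still leaving enough slots for all
    # powered-on VMs.
    remaining = sorted(slots_per_host(c, m, slot_cpu_mhz, slot_mem_mb)
                       for c, m in hosts)
    failures = 0
    while remaining:
        remaining.pop()  # simulate failure of the largest remaining host
        if sum(remaining) >= powered_on_vms:
            failures += 1
        else:
            break
    return failures
```

For example, with two 8-slot hosts and one 4-slot host running 10 VMs, the cluster can tolerate one host failure: losing one large host leaves 12 slots, but losing a second leaves only 4.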

Percentage of Cluster Resources Reserved As Failover Capacity

Instead of using slot sizes, HA uses calculations to ensure that a percentage of the cluster's resources are reserved for failover. It does this by calculating the total resource requirements for all powered-on VMs in the cluster. Next, it calculates the total number of host resources available for VMs. Finally, it calculates the current CPU failover capacity and current memory failover capacity for the cluster, and if they are less than the percentage that is specified for the configured failover capacity, admission control will be enforced. The resource requirements for powered-on VMs comprise two components, CPU and memory, and are calculated just like slot sizes are. The total number of host resources available for VMs is calculated by summing the host's CPU and memory resources. The current CPU failover capacity is computed by subtracting the total CPU resource requirements from the total host CPU resources and dividing the result by the total host CPU resources. The current memory failover capacity is calculated similarly. This method is a bit more balanced than specifying host failures, but it is not as automated because you have to manually specify a percentage.
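The percentage-based calculation can be sketched similarly (again a hypothetical illustration with made-up names, not vCenter's implementation):

```python
# Hypothetical sketch of percentage-based admission control. Resource
# figures are aggregate cluster totals (CPU in MHz, memory in MB).


def current_failover_capacity(total_host_resources: float,
                              total_vm_requirements: float) -> float:
    # (host resources - VM requirements) / host resources
    return (total_host_resources - total_vm_requirements) / total_host_resources


def admission_allowed(host_cpu, host_mem, vm_cpu_req, vm_mem_req,
                      reserved_fraction) -> bool:
    # Admission control is enforced (power-on denied) if EITHER the CPU or
    # memory failover capacity falls below the configured percentage.
    cpu_capacity = current_failover_capacity(host_cpu, vm_cpu_req)
    mem_capacity = current_failover_capacity(host_mem, vm_mem_req)
    return cpu_capacity >= reserved_fraction and mem_capacity >= reserved_fraction
```

With 10,000 MHz and 40,000 MB of cluster resources, VM requirements of 6,000 MHz and 20,000 MB leave 40% CPU and 50% memory failover capacity, so a 25% reservation passes but a 45% reservation blocks further power-ons.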

Specify a Failover Host

This method is the simplest, as you are specifying a single host onto which to restart failed VMs. If the specified host has failed, or if it does not have enough resources, HA will restart the VMs on another host in the cluster. You can only specify one failover host, and HA will prevent VMs from being powered on or moved to the failover host during normal operations to ensure that it has sufficient capacity.

If you select a cluster in the vSphere Client and then choose the Summary tab, you can see the cluster's current capacity percentages. It is important to note that when a host fails, all of its VMs will be restarted on the single ESX host that has the lightest workload, which can quickly overload that host. When this occurs, the Distributed Resource Scheduler (DRS) kicks in to spread the load across the remaining hosts in the cluster. If you plan to use HA without DRS, you should ensure that you have plenty of extra capacity on your ESX hosts to handle the load from any one failed host. Additionally, you can set restart priorities so that you can specify which VMs are restarted first, and even prevent some VMs from being restarted in case of a failure.

The Virtual Machine Options section is for the cluster default settings that will apply to all VMs in the cluster by default, as well as individual VM settings. The cluster default settings apply to each VM created in or moved to the cluster unless a VM has its own individual settings specified. The first setting is VM Restart Priority, which is the priority given to a VM when it is restarted on another host in case of a host failure. This can be set to High, Medium, Low, or Disabled. Any VM set to Disabled will not be restarted in case of a host failure.

The second setting is Host Isolation Response, which is used to determine what action the failed host that is isolated should take with the VMs that are running on it. This is used if a host is still running but has a failure in a particular subsystem (e.g., a NIC or host bus adapter [HBA] failure) or a connectivity problem (e.g., cable or switch) and is not completely down. When a host declares itself isolated and the VMs are restarted on other hosts, this setting dictates what happens on the failed host. The options include leaving the VM powered on, powering off the VM (hard shutdown), and shutting down the VM (graceful shutdown). If you choose to shut down the VM, HA will wait five minutes for the VM to shut down gracefully before it forcefully shuts it down. You can modify the time period that it waits for a graceful shutdown in the Advanced Configuration settings. It is usually desirable to have the VM on the failed host powered off or shut down so that it releases its lock on its disk file and also does not cause any conflicts with the new host that powers on the VM.

One reason you may choose to leave the VM powered on is if you do not have network redundancy or do not have a reliable network. In this case, you may experience false triggers to HA where the ESX host is okay but has just lost network connectivity. If you have proper network redundancy on your ESX hosts, HA events should be very rare. This setting will not come into play if the failed host experiences a disastrous event, such as completely losing power, because all the VMs will be immediately powered off anyway. In two-host configurations, you almost always want to set this to leave the VM powered on.

The last section is for Virtual Machine Monitoring (and Application Monitoring in vSphere 4.1), which restarts VMs on the same host when their operating systems fail. You can enable or disable this feature by checking the Enable VM Monitoring field and then using the slider to select a sensitivity from Low to High; if you want to customize your settings, you can click the Advanced Options button (vSphere 4.1) or check the Custom field (vSphere 4.0) and customize the settings on the screen that appears. In vSphere 4.1, instead of checking a box to enable VM monitoring, you choose Disabled, VM Monitoring, or VM & Application Monitoring.

You can also set individual VM settings that are different from the cluster defaults.

vSphere 4.1 added another new feature to HA that checks the cluster's operational status. Available on the cluster's Summary tab, this detail window, called Cluster Operational Status, displays more information about the current HA operational status, including the specific status and errors for each host in the HA cluster.

Advanced configuration

Many advanced configuration options can be set to tweak how HA functions. You can set these options through the Advanced Options button in the HA settings, but you have to know the setting names and their values to be able to set them. These options are not displayed by default, except for the options that are set if you enable Virtual Machine Monitoring, and you must manually add them if you wish to use them.

Distributed Resource Scheduler (DRS)

DRS is a powerful feature that enables your virtual environment to automatically balance itself across your ESX host servers in an effort to eliminate resource contention. It utilizes the VMotion feature to provide automated resource optimization through automatic migration of VMs across hosts in a cluster. DRS also provides automatic initial VM placement on any of the hosts in the cluster, and makes automatic resource relocation and optimization decisions as hosts or VMs are added to or removed from the cluster. You can also configure DRS for manual control so that it only provides recommendations that you can review and carry out.

How DRS works

DRS works by utilizing resource pools and clusters that combine the resources of multiple hosts into a single entity. Multiple resource pools can also be created so that you can divide the resources of one or more hosts into separate entities. Currently, DRS will only migrate VMs based on the availability and utilization of CPU and memory resources. It does not take high disk or network utilization into account when load-balancing VMs across hosts.

When a VM experiences increased load, DRS first evaluates its priority against the established resource allocation rules and then, if justified, redistributes VMs among the physical servers to try to eliminate contention for resources. VMotion will then handle the live migration of the VM to a different ESX host with complete transparency to end users. The dynamic resource allocation ensures that capacity is preferentially dedicated to the highest-priority applications, while at the same time maximizing overall resource utilization. Unlike the HA feature, which will still operate when vCenter Server is unavailable, DRS requires that vCenter Server be running for it to function.

Configuring DRS

Similar to HA, the DRS feature can be set up in a cluster either during its initial setup or afterward. To configure DRS, simply select the cluster on which you want to enable DRS, right-click on it, and edit its settings, or select the cluster and in the Summary pane click the Edit Settings link. Put a checkmark next to the Enable VMware DRS field on the General page, and DRS will be enabled for the cluster. You can optionally configure some additional settings to change the way DRS functions. To access these settings, click on the VMware DRS item in the Cluster Settings window.

Once you enable DRS, you must select an automation level that controls how DRS will function. You can choose from three levels: Manual, Partially Automated, and Fully Automated.

When considering an automation level, it is usually best to choose Fully Automated and let DRS handle everything. However, when first enabling DRS, you might want to set the automation level to Manual or Partially Automated so that you can observe its recommendations for a while before turning it loose on Fully Automated. Even when selecting Fully Automated, you can still configure individual VM automation levels, so you can specify that certain VMs are not migrated at all (Disabled) or are set to Manual or Partially Automated. To configure individual VM automation levels, click on Virtual Machine Options, located under DRS. For the migration threshold, the default midpoint (three-star) setting is a good starting point and works well for most environments. Be careful when choosing more aggressive levels, as you could have VMs moving very frequently between hosts (i.e., VM pong), which can create performance issues because the constant VMotions cause the entire LUN to be locked during each operation (i.e., SCSI reservations).

About the book

This chapter excerpt on Advanced vSphere Features is taken from the book Maximum vSphere: Tips, How-Tos, and Best Practices for Working with VMware vSphere 4. Solution providers can use this book to learn about vSphere 4 storage, networking, performance monitoring and advanced features such as High Availability, Distributed Resource Scheduler, Distributed Power Management and VMotion.

DRS makes its recommendations by applying stars to indicate how much the recommendation would improve the cluster's performance. One star indicates a slight improvement, and four stars indicate a significant improvement. Five stars, the maximum, indicates a mandatory move because of a host entering maintenance mode or affinity rule violations. If DRS is set to work in Fully Automated mode, you have the option to set a migration threshold based on how much a move would improve the cluster's performance. The lowest threshold, which is the most conservative, only applies five-star recommendations; the highest threshold, which is very aggressive, applies all recommendations. There are also settings in between to only apply two-, three-, or four-star recommendations.
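The threshold behavior amounts to filtering recommendations by a minimum star rating. This small sketch uses hypothetical names; the real recommendation engine is internal to vCenter:

```python
# Hypothetical sketch of DRS migration threshold filtering.
# Stars: 5 = mandatory move, 1 = slight improvement. The most conservative
# threshold applies only 5-star moves; the most aggressive applies all.


def recommendations_to_apply(recommendations, min_stars: int):
    # recommendations: list of (vm_name, stars) pairs.
    return [(vm, stars) for vm, stars in recommendations if stars >= min_stars]
```

At min_stars=5 only mandatory moves (maintenance mode, affinity violations) are applied; at min_stars=1 every recommendation is, which is where "VM pong" between hosts becomes a risk.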

You can configure affinity rules in DRS to keep certain VMs either on the same host or on separate hosts when DRS migrates VMs from host to host. These affinity rules (not to be confused with CPU Affinity) are useful for ensuring that when DRS moves VMs around, it has some limits on where it can place the VMs. You might want to keep VMs on the same host if they are part of a tiered application that runs on multiple VMs, such as a web, application, or database server. You might want to keep VMs on different hosts for servers that are clustered or redundant, such as Active Directory (AD), DNS, or web servers, so that a single ESX failure does not affect both servers at the same time. Doing this ensures that at least one will stay up and remain available while the other recovers from a host failure. Also, you might want to separate servers that have high I/O workloads so that you do not overburden a specific host with too many high-workload servers.

Because DRS does not take into account network and disk workloads when moving VMs, creating a rule for servers that are known to have high workloads in those areas can help you to avoid disk and network I/O bottlenecks on your hosts. In general, try to limit the number of rules you create, and only create ones that are necessary. Having too many rules makes it more difficult for DRS to place VMs on hosts to balance the resource load. Also watch out for conflicts between multiple rules, which can cause problems.

Once you have DRS enabled, you can monitor it by selecting the cluster in vCenter Server and choosing the Summary tab. Here you can see load deviations, the number of faults and recommendations, and the automation level. By clicking the resource distribution chart you can also see CPU and memory utilization on a per-VM basis, grouped by host. Additionally, you can select the DRS tab in vCenter Server to display any pending recommendations, faults, and the DRS action history. By default, DRS recommendations are generated every five minutes; you can click the Run DRS link to generate them immediately if you do not want to wait. You can also click the Apply Recommendations button to automatically apply the pending recommendations.

Distributed Power Management

DPM is a subcomponent of DRS and a green feature, introduced in VI 3.5, that powers down hosts during periods of inactivity to help save power. All the VMs on a host that will be powered down are relocated to other hosts before that host is powered down. When activity increases on the other hosts and DPM deems that additional capacity is needed, it will automatically power the host back on and move VMs back onto it using DRS. DPM requires that the host have a supported power management protocol so that it can be automatically powered on after it has been powered off. You can configure DPM in either manual or automatic mode. In manual mode, it will simply make recommendations similar to DRS, and you will have to manually approve them before they are applied. You can also exclude certain host servers from DPM, as well as specify that certain hosts are always automatic or always manual.

How DPM works

Although DPM existed in VI3, it was not officially supported and was considered experimental. This was because it relied on Wake On LAN (WOL) technology that exists in certain network adapters but was not always a reliable means for powering a server on. Being able to power up servers when needed is critical when workloads increase, so a more reliable technology was needed for DPM to be fully supported. For this, VMware turned to two technologies in vSphere: Intelligent Platform Management Interface (IPMI) and HP's Integrated Lights-Out (iLO).

IPMI is a standard that was started by Intel and is supported by most major computer manufacturers. It defines a common set of interfaces that can be used to manage and monitor server hardware health. Since IPMI works at the hardware layer and does not depend on an operating system, it works with any software or operating system that is designed to access it. IPMI relies on a Baseboard Management Controller (BMC) that is a component in server motherboards and monitors many different sensors on the server, including temperature, drive status, fan speed, power status, and much more. IPMI works even when a server is powered off, as long as it is connected to a power source. The BMC is connected to many other controllers on the server and administration can be done using a variety of methods. For instance, out-of-band management can be done using a LAN connection via the network controller interconnects to the BMC. Other out-of-band management options include remote management boards and serial connections.

For HP servers, DPM uses HP's proprietary iLO remote out-of-band management controllers that are built into every HP server. HP has used its iLO technology under several different names for years, and has only recently begun to also embrace the IPMI standard. Dell's Remote Access Card (DRAC) remote management controllers provide the same functionality as HP's iLO, but Dell fully supports IPMI. Typically, on Dell DRAC boards you need to enable IPMI via the server BIOS to be able to use it.

In addition to the IPMI and iLO power management protocols, vSphere also now fully supports WOL. However, if multiple protocols exist in a server, they are used in the following order: IPMI, iLO, WOL.
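That preference order amounts to picking the first supported protocol from an ordered list; a trivial sketch (hypothetical function name, not a vSphere API):

```python
# Sketch of the power management protocol preference order stated above.
PROTOCOL_PREFERENCE = ["IPMI", "iLO", "WOL"]


def select_power_protocol(available):
    # Return the most-preferred protocol the host supports, or None.
    for protocol in PROTOCOL_PREFERENCE:
        if protocol in available:
            return protocol
    return None
```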

Configuring DPM

DPM requires that the host be a member of a DRS-enabled cluster. Before you can configure DPM in vSphere, you typically have to enable the power management protocol for whatever method you are using. If you are using WOL, this is usually enabled in your host server's BIOS. For IPMI, depending on the server, you can usually enable it either in the server BIOS or in the web-based configuration utility for the server's out-of-band management board (e.g., Dell DRAC); this setting is usually referred to as IPMI over LAN. For HP's iLOs, make sure the Lights-Out functionality is enabled in the iLO web-based configuration utility; it usually is by default. Both IPMI and iLO require authentication to access any of their functionality; WOL does not.

You can determine whether a NIC supports the WOL feature and whether it is enabled by selecting a host in the vSphere Client, choosing the Configuration tab, and then, under Hardware, selecting Network Adapters. All of the host's NICs will be displayed, and one of the columns will show whether WOL is supported. Once you have configured the protocol that you will use with DPM, you can enable and configure DPM in the cluster settings.

Once DPM is enabled and configured properly, you will want to test it before you use it. To test it, simply select the host in the vSphere Client, right-click on it, and select Enter Standby Mode, which will power down the host. You will be prompted if you want to move powered-off/suspended VMs to other hosts. Powered-on VMs will automatically be migrated using VMotion; if they are not capable of using VMotion, they will be powered off. Your host will begin to shut down and should power off after a few minutes. Verify that your host has powered off, and then, in the vSphere Client, right-click on the host and select Power On. If the feature is working, the host should power back on automatically.

Once you have verified that DPM works properly, you need to enable DPM. To do this, edit the settings for your cluster; next, under the DRS category, select Power Management. You can then select either the Off, Manual, or Automatic option. The Manual option will only make recommendations for powering off hosts; the Automatic option will enable vCenter Server to automatically execute power management-related recommendations. You can also set a threshold for DPM, as you can with DRS, which will determine how aggressively it powers off hosts. The DRS threshold and the DPM threshold are essentially independent; you can differentiate the aggressiveness of the migration and host power state recommendations they respectively provide. The threshold priority ratings are based on the amount of over- or underutilization found in the DRS cluster and the improvement that is expected from the intended host power state change. Priority-one recommendations are the biggest improvement, and priority-five the least. The threshold ranges from conservative, which only applies priority-one recommendations, to aggressive, which applies priority-five recommendations.
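The DPM threshold logic can be sketched the same way as the DRS threshold, with the scale inverted: priority one is the biggest improvement, so the conservative end applies only priority-one recommendations. This is an illustration with hypothetical names, not vCenter code:

```python
# Hypothetical sketch of DPM threshold filtering. Priorities run 1 (biggest
# expected improvement) to 5 (least); a conservative threshold of 1 applies
# only priority-one recommendations, an aggressive threshold of 5 applies all.


def dpm_recommendations_to_apply(recommendations, threshold_priority: int):
    # recommendations: list of (host_name, priority) pairs.
    return [(host, p) for host, p in recommendations if p <= threshold_priority]
```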

If you select Host Options, you can change the Power Management settings on individual hosts to have them use the cluster default, always use Manual, or always use Automatic; you can also disable the feature for them.

DPM Considerations

Although DPM is a great technology that can save you money and that every large datacenter should take advantage of, you should be aware of the following considerations before you use it.

Eric Siebert is a 25-year IT veteran whose primary focus is VMware virtualization and Windows server administration. He is one of the 300 vExperts named by VMware Inc. for 2009. He is the author of the book VI3 Implementation and Administration and a frequent TechTarget contributor. In addition, he maintains vSphere-land.com, a VMware information site.

Printed with permission from Pearson Publishing. Copyright 2010. Maximum vSphere: Tips, How-Tos, and Best Practices for Working with VMware vSphere 4 by Eric Siebert. For more information about this title and other similar books, please visit http://www.pearsonhighered.com.

02 Feb 2011
