By Yuval Shavit, Features Writer
The basic goal of storage virtualization is to make several disks appear as a single drive, but the approaches to achieving that result vary significantly. Storage virtualization technologies include software-based, appliance-based and switch-based approaches, each with its own advantages and disadvantages. There are also hybrid approaches that combine storage virtualization with other technologies, like standard storage switching.
Storage virtualization crops up in various places. Storage can be virtualized either on a block level -- for use with SANs, for instance -- or on a file level, with NAS devices being virtualized. Some vendors also use virtualization within their arrays, so that what they present as a single device may actually be several drives internally with a virtualization layer. In this article, we will focus specifically on block-level storage virtualization technologies that you can install for customers in a heterogeneous storage environment.
There are three basic approaches to storage virtualization. With host-based virtualization, a server provides the virtualization layer and presents a single drive to its applications. Appliance-based virtualization uses a hardware appliance that sits on the storage network; a network-based approach is similar, but works at the switching level. You can also recommend what Ray Lucchesi, president of Silverton Consulting in Broomfield, Colo., calls a subsystem approach, which combines a standard storage unit with a storage virtualization appliance.
Not everyone agrees on how storage virtualization technologies should be classified. Jim Damoulakis, chief technology officer at GlassHouse Technologies Inc. in Framingham, Mass., says the distinction between appliances and network-based products isn't as important as the distinction between in-band and out-of-band approaches. Appliances are always in-band, whereas network-based products can work either in-band or out-of-band. Since host-based virtualization doesn't affect the storage infrastructure, it isn't considered either in-band or out-of-band.
Host-based storage virtualization technologies
The simplest form of storage virtualization, the host-based approach relies solely on software running on a server, often at the OS level. The tools that enable this functionality are more commonly known as volume managers and have been around for years; the term "host-based storage virtualization" is essentially used to fit volume management within the larger scope of storage virtualization.
Volume managers are fairly easy to set up and don't require any changes to your client's infrastructure. The volume manager on the server is configured to take several drives and present them as a single resource, which can then be divided up as needed. The disadvantage of this approach is that configuration has to be done for each server, which can be cumbersome for large systems. Although volume managers are useful in their own right -- they can do things like dynamically resize partitions -- as full-blown storage products they are best suited for relatively small environments.
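To make the idea of presenting several drives as a single resource concrete, here is a minimal Python sketch of the address translation a volume manager might perform when it simply concatenates disks. It is an illustration only, not how any particular product works: the class name, the disk names and the sizes are all hypothetical, and real volume managers typically add striping, mirroring and on-disk metadata.

```python
from bisect import bisect_right
from itertools import accumulate


class ConcatenatedVolume:
    """Presents several disks as one contiguous range of logical blocks."""

    def __init__(self, disk_sizes):
        # disk_sizes maps a (hypothetical) disk name to its size in blocks.
        self.names = list(disk_sizes)
        # Running totals mark where each disk's range ends in the logical space.
        self.ends = list(accumulate(disk_sizes.values()))

    @property
    def size(self):
        """Total number of logical blocks across all member disks."""
        return self.ends[-1]

    def locate(self, logical_block):
        """Translate a logical block number to (disk name, block on that disk)."""
        if not 0 <= logical_block < self.size:
            raise ValueError("block outside the volume")
        index = bisect_right(self.ends, logical_block)
        start = self.ends[index - 1] if index else 0
        return self.names[index], logical_block - start


# Illustrative disk names and sizes; the application only ever sees 400 blocks.
volume = ConcatenatedVolume({"disk_a": 100, "disk_b": 250, "disk_c": 50})
print(volume.size)         # 400
print(volume.locate(120))  # ('disk_b', 20)
```

Because this translation table lives on the server, every server needs its own copy, which is the per-server configuration burden described above.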
Appliances and network-based storage virtualization
Appliances and network-based products both work at the storage infrastructure level and, unlike host-based solutions, work for the entire infrastructure. They are more manageable for large systems, because a single change can propagate transparently across several storage units and hosts. For instance, you can migrate data from one storage unit to another without having to reconfigure any of the servers that rely on the data; they still point to the same virtual disk, and the virtualization layer handles the remapping.
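The migration example can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a vendor's implementation: the class, the method names and the unit names are hypothetical, and storage units are modeled as in-memory dictionaries. The point it illustrates is that hosts address a virtual disk by name, so moving the data only changes the layer's internal mapping.

```python
class VirtualizationLayer:
    """Toy virtualization layer that maps virtual disks to storage units."""

    def __init__(self, unit_names):
        # Each storage unit is modeled as a dict keyed by (virtual disk, block).
        self.units = {name: {} for name in unit_names}
        self.mapping = {}  # virtual disk name -> name of its backing unit

    def create_virtual_disk(self, vdisk, unit):
        self.mapping[vdisk] = unit

    def write(self, vdisk, block, data):
        self.units[self.mapping[vdisk]][(vdisk, block)] = data

    def read(self, vdisk, block):
        return self.units[self.mapping[vdisk]][(vdisk, block)]

    def migrate(self, vdisk, target_unit):
        """Move a virtual disk's blocks to another unit, then remap it."""
        source = self.units[self.mapping[vdisk]]
        target = self.units[target_unit]
        for key in [k for k in source if k[0] == vdisk]:
            target[key] = source.pop(key)
        self.mapping[vdisk] = target_unit


layer = VirtualizationLayer(["old_array", "new_array"])
layer.create_virtual_disk("vdisk1", "old_array")
layer.write("vdisk1", 0, b"payroll")

layer.migrate("vdisk1", "new_array")

# The host still asks for "vdisk1"; only the layer knows the data moved.
print(layer.read("vdisk1", 0))  # b'payroll'
```

A production system would also have to handle writes that arrive during the copy, which this sketch ignores.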
Appliances are always in-band, meaning that they sit on the data path between the storage subsystems and the host. This approach is relatively straightforward in that you don't need to set up a separate metadata path to help hosts figure out where data is; they just query the appliance as if it were a storage unit, and it redirects the request to the appropriate unit.
The disadvantage of this approach is that appliances introduce a potential bottleneck, so they need to match or exceed the performance and availability of your client's best storage system. You should also build in redundancy by installing at least two of the appliances in parallel, and possibly more. To reduce the latency involved in relaying each request, appliances often maintain a cache for both read and write operations, which increases their price.
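The in-band data path and the role of the cache can be illustrated with a short Python sketch. Everything here is an assumption made for illustration: the class and attribute names are hypothetical, the cache is an unbounded dictionary, and writes are passed straight through to the backing unit, whereas real appliances may use bounded or write-back caches.

```python
class InBandAppliance:
    """Toy in-band appliance: every host request passes through it."""

    def __init__(self, backends, placement):
        self.backends = backends    # unit name -> {block number: data}
        self.placement = placement  # block number -> unit name
        self.cache = {}             # block number -> data (unbounded, for brevity)
        self.backend_reads = 0      # counts reads that reached a storage unit

    def read(self, block):
        # Serve from cache when possible to avoid a second hop to the array.
        if block in self.cache:
            return self.cache[block]
        self.backend_reads += 1
        data = self.backends[self.placement[block]][block]
        self.cache[block] = data
        return data

    def write(self, block, data):
        # Write-through: update the backing unit, then keep a cached copy.
        self.backends[self.placement[block]][block] = data
        self.cache[block] = data


appliance = InBandAppliance(
    backends={"array_1": {0: b"alpha"}, "array_2": {1: b"beta"}},
    placement={0: "array_1", 1: "array_2"},
)
appliance.read(0)
appliance.read(0)               # second read is served from the cache
print(appliance.backend_reads)  # 1
```

Since the host talks only to the appliance, an appliance failure cuts off every unit behind it, which is why the redundancy advice above matters.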
Instead of intercepting each request, out-of-band network storage virtualization technologies act as tables of contents. When a host needs to access data, it queries the network device to find out which storage subsystem has the data it needs, and then accesses that system directly. In industry parlance, this approach separates the data path -- the flow of information to and from the disks -- from the control path, which establishes where that data resides.
Because out-of-band products don't need to intercept, interpret and relay each request, this approach can decrease latency. On the other hand, each host needs to be configured so that it knows to query the virtualization device before accessing the data, which means installing drivers on the host that make it aware of the separate control path.
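The split between control path and data path can be sketched as follows. This is a simplified Python model, not a real driver: the `MetadataServer` and `HostDriver` classes and the array names are hypothetical, and storage units are again plain dictionaries. The sketch shows the host asking the virtualization device where a block lives and then reading it from the storage unit directly.

```python
class MetadataServer:
    """Toy out-of-band device: answers 'where is it?', never carries data."""

    def __init__(self, placement):
        self.placement = placement  # block number -> unit name
        self.lookups = 0

    def locate(self, block):
        self.lookups += 1
        return self.placement[block]


class HostDriver:
    """Toy host-side driver that knows about the separate control path."""

    def __init__(self, metadata_server, arrays):
        self.metadata_server = metadata_server
        self.arrays = arrays        # unit name -> {block number: data}

    def read(self, block):
        unit = self.metadata_server.locate(block)  # control path
        return self.arrays[unit][block]            # data path, direct to the unit


arrays = {"array_1": {0: b"alpha"}, "array_2": {1: b"beta"}}
server = MetadataServer({0: "array_1", 1: "array_2"})
driver = HostDriver(server, arrays)

print(driver.read(1))  # b'beta'
print(server.lookups)  # 1
```

Note that the lookup logic lives in `HostDriver`, which is the piece that has to be installed and maintained on every host.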
The decision to go with an in-band, appliance-based approach or an out-of-band, network-based approach is largely a matter of preference, Damoulakis said. One key consideration is whether your client prefers to manage storage at the appliance or switch level, he said, adding that you should also consider your client's storage vendors. For instance, EMC tends to favor out-of-band storage virtualization technologies with Invista, whereas HDS promotes an in-band approach with its Universal Storage Platform.