The disk array controller is the brain of a storage array. It is essentially a compute engine that receives data and then sends it to the integrated hard disks. It also manages and manipulates stored data using features that we have come to count on, like thin provisioning, snapshots, clones and replication.
The problem is that all of these features, plus the increased rate of inbound and outbound I/O (thanks to faster networks) and the increased number of disks that need to be supported on the back end (thanks to increased storage requirements), are working together to turn the disk array controller into a processing bottleneck.
Vendors are addressing this performance problem in a variety of ways, and each of these methods impacts the capabilities, performance, scalability and usability of the storage system. As a result, it is important for you to understand which technique makes the most sense for each customer. So let’s examine each of these approaches.
Solution: Increased processing power. The first option is to simply use faster and more capable processors. When Intel launched its Westmere processors, several vendors whose systems were totally dependent on processing power for performance saw a significant increase in capabilities and scalability. They could manage more snapshots and support more disks. The advantage of this approach is that every time Intel or another processor manufacturer brings out a new processor, the capabilities of the current system are likely to improve. We call this riding the Intel wave.
The downside is that you’re totally dependent on the next processor for your performance boost, and the capabilities of even the latest, greatest processor may not be able to meet the demands on the storage system. And then there’s cost: The latest processor is also going to be the most expensive.
Solution: Improved storage software. The second option is to use newer, more efficient storage software. More sophisticated software comes to market on a continual basis, with startups always working to build a better mousetrap. It’s always easier to design higher-level functionality (snapshots or automated tiering, for instance) into a storage system’s code from the beginning instead of adding it on later. The code that’s in a product from the beginning will likely be more efficient than code added to an existing product.
The downside to improving disk array controller performance by using more efficient software is that it takes talented people to write that software, and those people are in short supply. This can impact you and your customers since it could lead to increased costs or delayed feature delivery. And it’s not practical to change storage vendors on a recurring basis. Once a customer has “invested” in a particular vendor’s code base, switching to a different vendor isn’t easy. As code ages, it becomes increasingly difficult to add new features and adapt to new technologies.
Solution: Task-specific CPUs. Another way to make the storage processor more efficient is to move portions of code to silicon or a field-programmable gate array (FPGA). This allows those sections of code to execute faster and the system to then deliver those functions without impacting overall performance.
But it costs money and takes time for vendors to create and support task-specific CPUs or FPGAs, which can increase a product’s cost and slow its time to market. If those two factors can be minimized, task-specific CPUs will become increasingly attractive as a way to handle the performance demand of additional services.
Solution: Scale-out storage. The fourth option is to build a cluster, or grid, of disk array controllers that work together to tackle storage performance demands. Known as scale-out storage, these systems scale by adding servers, often called nodes, to the storage system. Each node includes additional capacity, I/O and processing power.
While scale-out, especially in the cluster sense, is all the rage right now in the storage world, it is not without downsides. First, density could be problematic. Scaling to 100-plus nodes may sound good, but racking, powering and cooling dozens of nodes for a storage system may be a problem for some customers because of the space and cost requirements. Second, performance could suffer. When dozens of nodes have to communicate, the overhead created by that communication may introduce new types of performance issues.
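To make the scale-out trade-off concrete, here is a minimal back-of-the-envelope sketch in Python. It is not drawn from any vendor's system: the `Node` figures, the `cluster_totals` helper and the `overhead_per_peer` penalty are all hypothetical, and the assumption that each node loses a fixed fraction of its I/O capability for every peer it coordinates with is only one simple way to model communication overhead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """One scale-out node. All figures are hypothetical placeholders."""
    capacity_tb: float = 50.0   # capacity the node adds to the cluster
    iops: float = 20_000.0      # I/O the node could serve on its own


def cluster_totals(node_count, node=Node(), overhead_per_peer=0.002):
    """Return (capacity_tb, effective_iops) for a cluster of identical nodes.

    Assumption: every node gives up a fixed fraction of its I/O capability
    (overhead_per_peer) for each peer it has to communicate with.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    peers = node_count - 1
    # Efficiency never drops below zero, however many peers there are.
    efficiency = max(0.0, 1.0 - overhead_per_peer * peers)
    capacity = node_count * node.capacity_tb
    effective_iops = node_count * node.iops * efficiency
    return capacity, effective_iops


if __name__ == "__main__":
    for n in (1, 4, 16, 64, 128):
        capacity, iops = cluster_totals(n)
        print(f"{n:>3} nodes: {capacity:>7.0f} TB, "
              f"{iops / n:>8.0f} effective IOPS per node")
```

Under these assumptions, capacity grows in a straight line as nodes are added, while the I/O each node effectively delivers shrinks as the cluster gets larger. Real systems will behave differently, so treat the numbers as a conversation starter with a customer rather than a sizing tool.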
Solution: Shift storage tasks to virtual servers. A final option may be to leverage the hypervisor within server and/or desktop virtualization infrastructures to perform more of the data services tasks. Many hypervisors today can perform thin provisioning, snapshots, cloning and even tiering (by moving virtual machine images between disk platforms or LUNs). The value in using the hypervisor is that it can scale as hosts are added to the storage infrastructure since each host can provide additional compute power to the storage infrastructure. It also means that data services are performed closer to the application, and that should improve performance and response time.
But there are a few downsides to moving data services to the hypervisor. The first is that the storage services within hypervisors aren’t yet fully mature. In some cases, there is a significant performance hit when capabilities such as thin provisioning, snapshots and cloning are used. These shortcomings, though, can be easily overcome by using third-party software solutions that augment the hypervisor’s storage capabilities, such as storage management applications or enhanced file systems.
Another downside is that many storage systems already come with some or all of the storage functionality that we have been discussing, and as a result, your customer would be buying the functionality twice. You can avoid this problem by guiding them toward hardware without such services. By consolidating storage services at the hypervisor, your customer would gain a single point of data management and be able to more easily mix their storage hardware.
The final downside is that the hypervisor can provide these services only on virtual hosts. Standalone servers and their applications would be out of luck. As more of the data center becomes virtualized, this will be less of an issue, but in the short term, your customer may need to manage storage services from multiple systems.
As is always the case, no storage system is perfect. Each has its pros and cons. For many customers, the basic system will give them all the performance they need, but for an increasing number, performance and the ability to scale that performance over time are becoming a problem. Being able to walk through these issues with customers and determine their best path to addressing the problems will elevate you to trusted-advisor status in their eyes.
George Crump is president of Storage Switzerland, an IT analyst firm focused on the storage and virtualization segments.