Channel Impact

The RAID rebuild process and its impact on customers

In a recent tech tip, “RAID technology advances with wide striping and erasure coding,” Stephen Foskett correctly described the modern advances in drive-based data protection, also known as RAID. But let’s back the conversation up a bit to consider how this topic is relevant to resellers. It’s important for VARs to understand the typical RAID problems, particularly with the RAID rebuild process, and why customers should consider something else. Don’t make the assumption that they know.

RAID has served the data center well. It takes a group of individual hard drives and aggregates them to provide better performance and protection from failure. It is that failure or, more importantly, the recovery from failure that has led many, including me, to predict that RAID, at least in its current form, is on its last legs.

When a drive fails in a RAID group, the RAID algorithm uses data from the other drives in the group, leveraging parity data, to reassemble the data from the failed drive onto a new drive. In principle this method of protection makes sense, but because modern drives hold so much data, it takes far too long to rebuild those drives. With high-capacity drives, we talk about RAID rebuilds in terms of days instead of hours.
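To put rough numbers on that, here is a minimal back-of-the-envelope sketch. It assumes the rebuild is limited only by how fast the replacement drive can be rewritten end to end; the function name, the capacities and the rebuild rates are illustrative assumptions, not figures from any particular array.

```python
def rebuild_hours(capacity_tb: float, rebuild_mb_per_s: float) -> float:
    """Estimate the hours needed to rewrite one replacement drive end to end.

    Assumes a constant sustained rebuild rate, which real arrays rarely hold
    while they are also serving application I/O.
    """
    capacity_mb = capacity_tb * 1_000_000  # decimal TB to MB
    return capacity_mb / rebuild_mb_per_s / 3600


# Hypothetical inputs: (drive capacity in TB, sustained rebuild rate in MB/s)
scenarios = [(0.3, 50), (2, 50), (2, 10)]

for capacity_tb, rate in scenarios:
    hours = rebuild_hours(capacity_tb, rate)
    print(f"{capacity_tb} TB at {rate} MB/s: {hours:.1f} hours ({hours / 24:.1f} days)")
```

Under these assumed rates, a larger drive or a slower effective rebuild rate is what pushes the estimate from hours toward days.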

But the problem isn’t just the amount of time that the RAID rebuild takes; it’s also the impact of the rebuild process on applications and users. In most cases, storage performance drops significantly during the RAID rebuild, which means that applications can grind to a halt or come close to it. With many arrays, you can throttle the resources allocated to the rebuild process so that regular storage performance is not hindered. But with that strategy, the rebuild takes longer, and the RAID group stays exposed to a further drive failure for a longer period of time.
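The throttling trade-off can be sketched with a deliberately simple model. It assumes the array has a fixed amount of bandwidth that is split between the rebuild and the applications, and that rebuild time scales inversely with the rebuild's share. The function name, the 100 MB/s unthrottled rate and the 2 TB drive are hypothetical values chosen for illustration.

```python
def throttle_tradeoff(capacity_tb: float, full_rate_mb_s: float, rebuild_share: float):
    """Return (rebuild_hours, application_share) for a given rebuild throttle.

    rebuild_share is the fraction of array bandwidth given to the rebuild.
    This linear split is an assumption; real arrays behave less neatly.
    """
    if not 0 < rebuild_share <= 1:
        raise ValueError("rebuild_share must be in (0, 1]")
    effective_rate = full_rate_mb_s * rebuild_share
    hours = capacity_tb * 1_000_000 / effective_rate / 3600
    return hours, 1 - rebuild_share


# Hypothetical 2 TB drive and 100 MB/s unthrottled rebuild rate
for share in (1.0, 0.5, 0.1):
    hours, app_share = throttle_tradeoff(2, 100, share)
    print(f"rebuild share {share:.0%}: {hours:.1f} h rebuild, "
          f"{app_share:.0%} of bandwidth left for applications")
```

In this model, every reduction in the rebuild's share protects application performance and lengthens the rebuild by the same factor.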

In addition, for the duration of the RAID rebuild -- which can be shorter or longer depending on whether you’ve throttled the rebuild process to avoid hindering system performance -- the RAID group is exposed: it fails completely if one additional drive (under RAID 5) or two additional drives (under RAID 6) fail. If your customer has a complete RAID failure of this nature, typically, the only option is to begin a full recovery from backup, which of course takes time. While this type of failure might seem to be extremely rare, the impact of the failure is enormous. I’m also not sure just how rare a total RAID failure is anymore. While it’s certainly not common, every week I speak to end users who have experienced it.
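How much the length of the rebuild window matters can be sketched with a simple probability model. It assumes drives fail independently at a constant annualized failure rate (AFR). The 3% AFR, the eight-drive group and the function names are hypothetical. The model also ignores unrecoverable read errors and correlated failures, so treat it as a rough illustration rather than a prediction.

```python
from math import comb

HOURS_PER_YEAR = 8760


def p_drive_fails(afr: float, window_hours: float) -> float:
    """Probability that one drive fails within the window, given its AFR."""
    return 1 - (1 - afr) ** (window_hours / HOURS_PER_YEAR)


def p_array_loss(surviving_drives: int, extra_failures_to_lose: int,
                 afr: float, window_hours: float) -> float:
    """Probability that enough additional drives fail during the rebuild.

    extra_failures_to_lose is 1 for RAID 5 and 2 for RAID 6, counted after
    the first drive has already failed.
    """
    p = p_drive_fails(afr, window_hours)
    n = surviving_drives
    # Probability that fewer than the fatal number of drives fail (binomial)
    p_fewer = sum(comb(n, k) * p**k * (1 - p) ** (n - k)
                  for k in range(extra_failures_to_lose))
    return 1 - p_fewer


# Hypothetical eight-drive group with one drive already failed (7 survivors)
for hours in (6, 24, 72):
    raid5 = p_array_loss(7, 1, 0.03, hours)
    raid6 = p_array_loss(7, 2, 0.03, hours)
    print(f"{hours:>2} h rebuild: RAID 5 loss {raid5:.5%}, RAID 6 loss {raid6:.7%}")
```

Even in this optimistic model, the chance of losing the group grows with every hour the rebuild runs, which is why a throttled rebuild carries more risk.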

As Foskett points out in his tip, there are ways around these typical RAID problems, with wide striping, erasure coding and other techniques. And as we mentioned in our recent article on SSD reliability, solid-state storage systems could be another option because they are smaller in capacity per drive and of course very fast, especially on reads. SSD-only storage systems could return RAID rebuild times to minutes.

Now you know the “why” of all the discussion of RAID failures. Start your conversation with the customer with this foundation. You can add value for the customer by educating them on exactly what the RAID problem is and how they can plan around it or -- as Foskett’s article points out -- leverage technology to overcome today’s RAID challenge.

This was first published in March 2011
