Data storage management: Tuning performance

Enterprise Strategy Group's Tony Asaro discusses data storage management challenges companies face in tuning storage performance in part two of this Gear6 podcast transcript.

The following is part two of a transcribed Gear6 storage performance podcast in which Tony Asaro, senior analyst and consultant with the Enterprise Strategy Group, and Gary Orenstein, vice president of marketing at Gear6, discuss data storage management and the challenges of performance tuning. The following table of contents will help you navigate to each part.

Storage performance podcast transcript

 Part 1: Storage performance trends and challenges
 Part 2: Tuning storage performance
 Part 3: Identifying storage performance problems

Gary Orenstein: What do you think are some of the challenges for companies and IT managers in reaching or achieving appropriate levels of storage performance?

Tony Asaro: First of all, I think that there's sort of an issue. When we talk about storage performance, it's incredibly difficult to measure until you've actually implemented a system. So one challenge is to look at all of these specs, have a requirement for performance, understand what my application might need and where it might grow to or scale. But how do I know that whatever it is I'm going to implement is actually going to work in the way that I want it to work? One of the biggest challenges with storage performance in general is that I just don't know. I can make a guesstimate. I have an educated guess because I've been doing this for a long time. But unfortunately, you're not going to know until you've actually implemented a storage system. So you have to understand how these things work, what the components are, and what characteristics of a storage system are going to help you with performance. I think that's number one.

Gary: Let's jump into some of the more tactical means of tuning storage performance. What do you see as some of the common means to increase performance across different areas, such as SANs or NAS, and then maybe discuss some key storage applications, such as replication, backup and recovery?

Tony: First of all, you do have to understand the architecture of your storage system. You need to understand how many processors it has. Processors are incredibly important to performance. Some storage systems only have two processors in them, others have dozens of processors. So you need to understand that aspect of it.

You also need to understand disk performance; disks are probably the slowest part of the entire performance chain. If your application is I/O-intensive, if it does a lot of reading and writing straight from the disk, then you're going to need to understand that, too. It's the speed of the disks, but it's also how many disks you can use in parallel so that you have multiple actuators reading and writing data for you in parallel. If it's a bandwidth-intensive application, how many Fibre Channel ports, or iSCSI ports or Ethernet ports for NFS. . . You need to understand how to architect from a bandwidth perspective and if you can actually aggregate those things.
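As a rough illustration of the sizing arithmetic Tony describes here, the sketch below estimates how many disks and ports a workload might need if performance scaled perfectly with parallel actuators and aggregated ports. The function names and the per-disk and per-port figures are hypothetical, chosen only for the example, and real systems rarely scale this cleanly.

```python
import math


def disks_needed(target_iops: float, iops_per_disk: float) -> int:
    """Smallest disk count whose combined IOPS meets the target.

    Assumes ideal striping: every disk (actuator) contributes its
    full IOPS in parallel. This is an optimistic simplification.
    """
    if iops_per_disk <= 0:
        raise ValueError("iops_per_disk must be positive")
    return math.ceil(target_iops / iops_per_disk)


def ports_needed(target_mb_per_s: float, mb_per_s_per_port: float) -> int:
    """Smallest port count whose aggregated bandwidth meets the target.

    Assumes the ports can be aggregated with no overhead.
    """
    if mb_per_s_per_port <= 0:
        raise ValueError("mb_per_s_per_port must be positive")
    return math.ceil(target_mb_per_s / mb_per_s_per_port)


if __name__ == "__main__":
    # Hypothetical workload: 5,000 IOPS and 1,500 MB/s.
    # Hypothetical hardware: 180 IOPS per disk, 400 MB/s per port.
    print("disks:", disks_needed(5000, 180))
    print("ports:", ports_needed(1500, 400))
```

An estimate like this is only a starting point; as Tony notes earlier, you will not know the real result until the system is implemented.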

One thing I happen to lean very strongly toward is caching. This is a very controversial issue because some people think caching is not important. I just think those people are crazy. Caching is extremely important. Being able to read data out of cache is incredibly important, writing data to cache is important. You know, cache is 1,000 times faster than disk and therefore, ultimately, it's going to be faster when you read and write out of cache. Intelligent caching algorithms can predict what data you are going to be looking for. If you keep the data you frequently access in cache, even random data, then you're going to get it instantly. So caching is also an important part of it.
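To make the caching point concrete, here is a minimal sketch of two ideas: how the cache hit ratio drives average access time, and how a simple cache can keep frequently accessed blocks. It uses a least-recently-used (LRU) policy as a stand-in; the transcript does not say which algorithms any particular storage system uses. The class and function names and the latency figures are assumptions for illustration, with cache taken as 1,000 times faster than disk as stated above.

```python
from collections import OrderedDict


def effective_latency_ms(hit_ratio: float, cache_ms: float, disk_ms: float) -> float:
    """Average access time given the fraction of reads served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


class LRUCache:
    """Toy block cache that evicts the least recently used block."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._blocks: OrderedDict = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block_id) -> bool:
        """Record a read; return True on a cache hit, False on a miss."""
        if block_id in self._blocks:
            self._blocks.move_to_end(block_id)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self._blocks[block_id] = True  # load the block into cache
        if len(self._blocks) > self.capacity:
            self._blocks.popitem(last=False)  # evict least recently used
        return False

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    for block in [1, 2, 1, 1, 3, 1]:  # block 1 is accessed frequently
        cache.read(block)
    # Hypothetical latencies: 5 ms disk, 0.005 ms cache (1,000x faster).
    print(effective_latency_ms(cache.hit_ratio(), 0.005, 5.0))
```

Because the disk term dominates the average, even a modest drop in hit ratio raises latency sharply, which is why the access pattern matters as much as the amount of cache.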

Gary: Across these different areas, do you think the industry is moving forward and are there particular areas ready for some additional improvement?

Tony: There are ways of tuning and improving performance. You can engineer anything to work the way you want and you can go out and get as many drives as you need, you can go out and get as many ports as you need, you can go out and get a multiprocessor system, you can go out and buy tons of memory. The problem is, can you afford it? Is it practical for your business? I think we have to start looking toward solutions that allow you to maximize performance without breaking your budget. That's the critical thing. So when we talk about performance, we always have to associate it with price. We need ways to transparently and cost-effectively improve performance. The other thing is it's not a static environment. Our environments will change, so we have to make sure that we have solutions that will allow us to adapt to that change. And typically, it changes for the worse. In other words, you have greater demands on you, not fewer.
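One simple way to tie performance to price, as Tony suggests, is to compare options by cost per unit of performance and pick the cheapest one that still meets the requirement. The sketch below shows that comparison; the option names, prices and IOPS figures are invented for illustration and do not describe any real product.

```python
def cost_per_iops(price: float, iops: float) -> float:
    """Price paid for each unit of delivered IOPS."""
    if iops <= 0:
        raise ValueError("iops must be positive")
    return price / iops


def cheapest_meeting_target(options, target_iops):
    """Return the name of the lowest-priced option that meets the target.

    `options` is a list of (name, price, iops) tuples.
    Returns None when no option is fast enough.
    """
    candidates = [opt for opt in options if opt[2] >= target_iops]
    if not candidates:
        return None
    return min(candidates, key=lambda opt: opt[1])[0]


if __name__ == "__main__":
    # Hypothetical upgrade options: (name, price in dollars, IOPS delivered).
    options = [
        ("more_disks", 80_000, 12_000),
        ("add_cache", 30_000, 10_000),
        ("small_array", 10_000, 4_000),
    ]
    for name, price, iops in options:
        print(name, round(cost_per_iops(price, iops), 2))
    print("choice:", cheapest_meeting_target(options, 9_000))
```

Since environments change, the target itself should be revisited over time rather than treated as fixed.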

Gary: I was at a conference recently and heard a gentleman talking about how he finds it very easy to provision, and more importantly deprovision, resources at the host layer by adding CPU resources or memory resources at the server and application layer, but he feels like there's a big gap at the storage layer in order to complement that. I was wondering if you had any thoughts on that statement.

Tony: We see innovations on the server and application side. I think that there is flexibility there that we don't have yet on the storage side. Right now we look at storage in a very archaic way. We've been doing it the same way for decades now. And certainly there have been improvements and certainly there continue to be improvements. But the problem is that we see all these systems as discrete systems. We say OK, here's my storage system no. 1, and if I'm going to improve performance on that particular system this is what I need to do. Here is storage system no. 2; I've got to do the same thing here and so on and so forth. To us, we feel that's a waste. You are not greater than the sum of your parts, right? You are at exactly the same value as the sum of your parts, or you are less than that. We are doing things very inefficiently. There have to be better ways that we can improve performance that allow us to do that within our storage environment and not just our storage systems.

