Listen to the podcast with George Crump
What exactly is file virtualization?
The easiest way to [explain] file virtualization is [to use the example of a DNS server]. With a DNS server, if I go to SearchStorageChannel.com, I don't need to know the IP address of the servers you guys use to present the Web site. I just type in www.searchstoragechannel.com. The same thing happens with file virtualization. Instead of having to know that my file is on Server_1/Subdirectory_home/George, I just go to George.doc, and the file virtualization layer automatically does the mapping for me.
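The DNS analogy above can be sketched as a simple lookup table. This is a minimal illustration, not how any particular product works: the class name `GlobalNamespace`, its methods, and the `NAS_B/archive` path are all hypothetical, and only `Server_1/Subdirectory_home/George` comes from the interview.

```python
from typing import Dict


class GlobalNamespace:
    """Toy lookup table: logical file name -> current physical location.

    Hypothetical sketch of the DNS-like mapping described in the interview;
    real file virtualization products are far more involved.
    """

    def __init__(self) -> None:
        self._locations: Dict[str, str] = {}

    def publish(self, logical_name: str, physical_path: str) -> None:
        """Register where a file currently lives."""
        self._locations[logical_name] = physical_path

    def resolve(self, logical_name: str) -> str:
        """Return the physical location, as a DNS server returns an IP address."""
        return self._locations[logical_name]

    def migrate(self, logical_name: str, new_physical_path: str) -> None:
        """Point the same logical name at a new location after a move."""
        if logical_name not in self._locations:
            raise KeyError(logical_name)
        self._locations[logical_name] = new_physical_path


if __name__ == "__main__":
    ns = GlobalNamespace()
    ns.publish("George.doc", "Server_1/Subdirectory_home/George/George.doc")
    print(ns.resolve("George.doc"))

    # The file moves to another (hypothetical) NAS; the user-facing name stays the same.
    ns.migrate("George.doc", "NAS_B/archive/George.doc")
    print(ns.resolve("George.doc"))
```

The point of the sketch is that the user only ever asks for `George.doc`; moving the file changes the table entry, not the name the user sees.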
So from an end user perspective, it's invisible?
Totally transparent. All they see is one big home directory that has all their files in it, but they don't realize that half of the files are on File Server A and half of the files are on NAS B.
What about if somebody tends to work on their hard drive rather than out on the network?
This only works if you have your data stored on a network-attached storage (NAS) system.
What are some of the different forms that file virtualization solutions can take?
Typically, there are three types. The first is a NAS-based solution: some of the NAS vendors, like OnStor and NetApp, offer a module that you can activate to distribute files among their NAS heads; [you need to use that vendor's brand]. The second type is an appliance-based model that sits inline between the users and the NASes. In that case, [you could use] different manufacturers' NASes. So you could have a NetApp box be your primary NAS and then migrate data as it ages to a less expensive NAS for retention. And the final type is a software application that plugs into or enhances the file virtualization technologies that already exist in many operating systems. A good example of that is Microsoft's DFS, or Distributed File System; Brocade has a product that enhances DFS and lets you build a fairly robust file virtualization solution as a result.
How do these three types compare in terms of cost?
From a cost perspective, it varies. Clearly, in the NAS case, you have to actually buy the NAS and invest in that platform, so if purchasing a new NAS isn't on your [customer's] radar screen at this time, that could be a fairly expensive proposition. The appliance and/or software approach is merely adding that component to the overall solution and could be substantially less expensive than buying all new storage.
What do VARs and integrators need to know about their customers' environments before they talk to them about file virtualization?
Before I'd bring up file virtualization to a customer, the first thing, obviously, that I'd want to do is make sure that there's a payoff for making the investment. You're looking for a fair amount of unstructured data; in other words, data that's file-based rather than stored in databases. If there's a fair amount of file-based data, and it's all on expensive primary storage like a NAS device, the concept of transparently moving that data off onto a secondary, more cost-effective, capacity-based platform might be very attractive to them. So the first thing is to understand the amount of unstructured data that exists at the client site. The second [issue to pay attention to is performance]. Many VARs will think of a file server or a NAS as a place to hold Office or Word documents, but many, many customers use NAS-based systems to serve up video or audio or any kind of [file requiring full-speed access]. If the customer does that and you put in one of the appliance-based or software-based models, you might impact overall performance. So it's something to be aware of.
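One rough way to size up the file-based data on a share is to walk the directory tree and total the files by extension. The sketch below is an assumption-laden illustration, not a tool mentioned in the interview: the function name `survey_file_data` is hypothetical, and the demo runs against a throwaway temporary directory rather than a real NAS mount.

```python
import os
import tempfile
from typing import Dict, Tuple


def survey_file_data(root: str) -> Dict[str, Tuple[int, int]]:
    """Return {extension: (file_count, total_bytes)} for every file under root."""
    totals: Dict[str, Tuple[int, int]] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(full_path)
            except OSError:
                # Skip files that vanish or cannot be read during the scan.
                continue
            ext = os.path.splitext(name)[1].lower() or "(none)"
            count, total = totals.get(ext, (0, 0))
            totals[ext] = (count + 1, total + size)
    return totals


if __name__ == "__main__":
    # Demo against a temporary directory standing in for a file share.
    with tempfile.TemporaryDirectory() as share:
        with open(os.path.join(share, "report.doc"), "wb") as f:
            f.write(b"x" * 1024)
        with open(os.path.join(share, "clip.mp4"), "wb") as f:
            f.write(b"x" * 4096)
        for ext, (count, total) in sorted(survey_file_data(share).items()):
            print(ext, count, total)
```

Grouping by extension also gives a quick hint about the performance question: a share dominated by video or audio files is a different conversation than one full of office documents.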
You mentioned ROI. What are the key ROI benefits?
The key one is that a file virtualization system allows you to migrate data very aggressively off of primary storage. We've had the concept of data migration around for years, but it's typically been to optical or tape. So the thinking was "Wait until data has aged three or four years, if ever," before you actually moved it. Now with file virtualization and a disk repository, moving this data very quickly and much more aggressively becomes a reality. With clients we've worked with, we see files moved to secondary storage three or four days after they become inactive. The result is a massive reduction in the amount of primary storage being utilized, and in some cases, we've seen customers that don't need to buy any additional primary storage for the next two or three years. Second, and probably equally important, [file virtualization] has a very positive impact on the backup process and backup window. By moving all this old data out of the primary storage area, you don't have to back it up as often. One of the hardest things for a backup application to do is to back up unstructured data, because it has to go in and check every single file to see if it's changed. By taking thousands if not millions of files out of the equation, you transfer less data but, maybe more importantly, have to check fewer files to see if they've changed.
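The aggressive migration policy described above amounts to a simple age filter. Here is a minimal sketch, assuming a three-day inactivity threshold taken from the "three or four days" figure in the interview; `FileRecord`, `select_inactive`, `migration_summary`, and the sample files are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    last_access: datetime


def select_inactive(files: List[FileRecord], now: datetime,
                    inactive_days: int = 3) -> List[FileRecord]:
    """Return files whose last access is older than the inactivity threshold."""
    cutoff = now - timedelta(days=inactive_days)
    return [f for f in files if f.last_access < cutoff]


def migration_summary(inactive: List[FileRecord]) -> Tuple[int, int]:
    """Return (files removed from the backup scan, bytes freed on primary storage)."""
    return len(inactive), sum(f.size_bytes for f in inactive)


if __name__ == "__main__":
    now = datetime(2024, 1, 10)
    files = [
        FileRecord("report.doc", 1000, datetime(2024, 1, 9)),     # still active
        FileRecord("old_plan.doc", 2000, datetime(2024, 1, 2)),   # inactive
        FileRecord("archive.xls", 3000, datetime(2023, 12, 1)),   # inactive
    ]
    inactive = select_inactive(files, now)
    print(migration_summary(inactive))  # prints (2, 5000)
```

The two numbers in the summary correspond to the two benefits named in the answer: bytes freed on primary storage, and files the backup application no longer has to check.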
What about implementation? What are the concerns that VARs and integrators should have about implementation?
The primary concern arises when a customer is using NAS in a very performance-centric environment. In other words, not office productivity suite stuff -- possibly streaming video, things of that nature. In the appliance- and software-based models, some of these solutions tend to be inline, which means they could also impact performance. So you want to make sure you understand how the customer is using their NAS or file servers, and make sure the solution you propose will either scale to match the performance they get today or not touch that particular area of the data store.