Host-based vs. VTL vs. NAS data deduplication

Leveraging data deduplication can maximize your client's storage capabilities. Learn how to choose the right data deduplication technology to meet your client's storage needs.

Service provider takeaway: Service providers can get to data deduplication sales and services faster by helping customers choose a dedupe technology approach first, product second.

In the storage business, data deduplication is all the rage. Customers are clamoring to cash in on the savings, but most don't yet understand how to properly apply the technology to their environment. Service providers who help customers sort through the three basic approaches and extract real value from data deduplication quickly will earn the trust of clients and gain lasting business.

As you know, deduplication offers a number of improvements over traditional storage for backups. But with those benefits comes a confusing set of questions from customers, the key question being: How do we choose the best dedupe technology? In answering that question, it's important not to jump ahead to specific products. By first choosing the product type (host-based, VTL-based or NAS-based), you can simplify the decision process for customers. Here's how the three approaches break down.

Host-based data deduplication

Host-based deduplication requires the backup client to do a lot of the dedupe work. In many cases, that's not a problem, especially when the client is not CPU-bound. Host-based dedupe helps most when backup bandwidth is constrained, either by small wide area network (WAN) pipes or by many virtual servers consolidated onto shared hardware.
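To make the idea concrete, here is a minimal sketch of client-side deduplication, assuming fixed-size chunks and SHA-256 fingerprints. The function name, the chunk size and the in-memory index are hypothetical and chosen only for illustration; commercial products use their own chunking and indexing schemes. The point is simply that the client spends CPU on hashing so that only chunks the server has not already seen cross the network.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size in bytes (an assumption)


def backup_with_client_dedupe(data: bytes, server_index: set) -> tuple:
    """Split data into chunks and decide which ones must be sent.

    server_index is the set of chunk fingerprints the backup server
    already stores. Returns (to_send, recipe): to_send maps fingerprint
    to chunk bytes for new chunks only; recipe is the ordered list of
    fingerprints needed to rebuild the data later.
    """
    to_send = {}
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()  # the client does the hashing work
        recipe.append(digest)
        # Skip chunks the server already has or that are already queued.
        if digest not in server_index and digest not in to_send:
            to_send[digest] = chunk
    return to_send, recipe


if __name__ == "__main__":
    index = set()
    data = b"A" * 4096 + b"B" * 4096 + b"A" * 4096
    first, _ = backup_with_client_dedupe(data, index)
    index.update(first)  # pretend the server stored the new chunks
    second, _ = backup_with_client_dedupe(data, index)
    print(len(first), len(second))
```

In this sketch, a repeat backup of unchanged data sends no chunks at all, which is why the approach suits narrow WAN links, and the hashing loop is why it is a poor fit for clients that are already CPU-bound.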

Host-based data deduplication solutions usually require you to replace traditional backup software with the dedupe backup software, so before you recommend such a change, make sure that the benefits are significant enough to justify the switch.

Remote office backups to the corporate site will benefit from host-based deduplication because it eliminates most or all of the backup hardware located at the remote site and optimizes the network bandwidth required to centralize backups to corporate data centers. VMware backups benefit from host-based deduplication by limiting the network bandwidth required to back up multiple guest machines concurrently.

Examples of host-based data deduplication technology include EMC Avamar and Symantec NetBackup PureDisk.

Virtual tape library (VTL) data deduplication

Deduped virtual tape libraries (VTLs) work well when the backups are localized to the data center or when bandwidth between the client and backup storage is not an issue. Naturally, many customers will want to take advantage of deduplication in their existing or planned virtual tape infrastructure. VTLs are already very common in midsized and large enterprises and consume a significant part of many companies' overall storage budget. Deduping at the VTL should be simple for customers because almost all backup software platforms support VTLs. In addition, deduped VTLs are a good fit for disaster recovery replication and for customers who want to replace tape for primary backups. Given the increased efficiency and deduped VTL-to-VTL replication, there may finally be an opportunity to show real ROI for backup to disk instead of tape.

Examples of VTL dedupe technology include EMC DL3D, Data Domain's DD Appliance series, Diligent ProtecTIER, Sepaton S2100 and Quantum DXi Series. (All of the big storage players have deduped VTL solutions that are either shipping now or will ship within the next six months.)

Primary network-attached storage (NAS) data deduplication

VTLs introduce a lot of the same challenges that physical tape presents, such as tape contention, poor cartridge utilization and intolerance of high storage area network (SAN) latencies. In some cases, customers want the benefits of target hardware-based deduplication without the complexity and limitations of tape. In these cases, deduped NAS file systems may be the perfect remedy. Deduped NAS storage has some impressive cost advantages because it doesn't require SAN connections or VTL licensing in the backup software. In some cases, the deduped NAS storage can be used for more than just backups, such as highly redundant archive data where throughput is less important than space savings.

Examples of NAS data deduplication technology include NetApp NearStore with Advanced Single Instance Storage (A-SIS), Data Domain's DD Series, EMC's DL1500 and Quantum DXi Series appliances.
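For service providers who like a checklist, the guidance above can be condensed into a rough first-pass screen. The sketch below is an illustration only: the field names, the function name and the order in which the rules are checked are assumptions, not a formal methodology, and a real engagement would weigh cost, existing licenses and recovery objectives as well.

```python
from dataclasses import dataclass


@dataclass
class Environment:
    # All field names are hypothetical labels for points made in this article.
    bandwidth_constrained: bool        # small WAN pipes or consolidated virtual servers
    clients_cpu_bound: bool            # clients have no spare CPU for dedupe work
    can_replace_backup_software: bool  # customer will swap backup software
    has_vtl_infrastructure: bool       # existing or planned virtual tape


def suggest_approach(env: Environment) -> str:
    """Return a starting-point dedupe approach: host-based, VTL or NAS."""
    # Host-based pays off when bandwidth is the bottleneck, clients have
    # CPU to spare and the backup software can be replaced.
    if (env.bandwidth_constrained
            and not env.clients_cpu_bound
            and env.can_replace_backup_software):
        return "host-based"
    # Customers invested in virtual tape can dedupe at the VTL.
    if env.has_vtl_infrastructure:
        return "VTL"
    # Otherwise, target-side dedupe without tape's complexity.
    return "NAS"


if __name__ == "__main__":
    remote_office = Environment(
        bandwidth_constrained=True,
        clients_cpu_bound=False,
        can_replace_backup_software=True,
        has_vtl_infrastructure=False,
    )
    print(suggest_approach(remote_office))
```

Treat the result as a conversation starter with the customer rather than a final answer; the value is in settling the approach before any product names come up.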

About the author
Brian Peterson is an independent IT infrastructure analyst, with a background in enterprise storage and open systems computing platforms. A recognized expert in his field, Brian has held positions of responsibility on both the supplier and customer sides of IT.
