Active archive software: Not just for the enterprise anymore

Active archives, once thought of as practical just for big companies, are becoming more useful for SMB customers in a variety of vertical industries.

Small and midsize organizations face data storage management challenges just like large enterprises. Even though disks are growing in capacity, the amount of data being created, processed and stored continues to grow at ever-increasing rates. While disk-based storage becomes cheaper, facility costs for power, space and cooling continue to grow.

Even for small and medium-sized businesses, retaining idle data that is too valuable to delete but too expensive to store on spinning disks is a perfect task for robotic tape libraries with the right software management system. This setup is considered an "active archive" -- which refers to archive data that's accessible but not instantly available -- though an active archive can be stored on disk as well as on tape at some point in its lifetime.

The need for quick access to archived data can be seen in many types of small and midsized organizations. For instance, high schools are creating video archives of sporting events, just as professional sports teams and colleges have been doing for years. In retail stores, surveillance cameras are increasing in number, as is the resolution of the video they produce. Software developers and testers are leveraging the flexibility of virtual machines and in the process creating hundreds or thousands of virtual machine disk files (VMDKs).

These videos and virtual machine files are initially accessed from primary storage, but, over time, the need to maintain these big files on expensive primary storage decreases. They can be archived as long as they can still be searched and the data retrieved. So, in the example above, a high school that wants to help its standout football player get recruited by a top college could provide video clips of four years of great plays. Or, store detectives can review month-old footage in pursuit of a suspected shoplifter. In the software developer example, a tested scenario can be recalled and tweaked following bug reports logged through the company's support help desk. All of these scenarios are enabled by retrieval of large, archived files.

Considerations for active archive providers

Whether a small solution provider can offer these services depends on how easy the solution is to deploy, how expensive it is to acquire and scale, and how compact and cool-running the system is. The customers mentioned above probably don't have dedicated IT facilities, so the system may need to sit next to a desk or in an uncooled closet. Plug-and-play simplicity is also important; the data needs to be as easy to find as files on a desktop computer.

Tape is a great solution for these customers. Idle tape uses no power, is very dense and stays cool. Huge amounts of data can be stored in a robotic tape library, and once the needed tape is recalled, data streams off it at high rates. But a library is not a solution by itself. Active archiving uses disk and software to cache or buffer the data and move it to and from the tape library. The active archive appears as a network resource, like a network-attached storage device. If executed well, the user shouldn't know or care whether the data is coming from the tape or the disk cache.
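The cache-in-front-of-tape idea can be illustrated with a small sketch. This is a toy model only, not how any of the products named in this article work: the class name `ActiveArchive`, the slot-based cache size and the least-recently-used eviction rule are all assumptions made for illustration, and an in-memory dictionary stands in for the tape library.

```python
from collections import OrderedDict


class ActiveArchive:
    """Toy model: a small disk cache in front of a slower tape tier."""

    def __init__(self, cache_slots):
        self.cache_slots = cache_slots   # how many files the disk cache holds
        self.disk_cache = OrderedDict()  # name -> data, least recently used first
        self.tape = {}                   # name -> data, stands in for the library
        self.tape_recalls = 0            # how often a read had to go to tape

    def write(self, name, data):
        # New data is written to tape and kept on disk while it is still "hot".
        self.tape[name] = data
        self._cache(name, data)

    def read(self, name):
        # The caller uses one interface and never sees which tier answered.
        if name in self.disk_cache:
            self.disk_cache.move_to_end(name)  # mark as recently used
            return self.disk_cache[name]
        data = self.tape[name]                 # cache miss: recall from tape
        self.tape_recalls += 1
        self._cache(name, data)
        return data

    def _cache(self, name, data):
        self.disk_cache[name] = data
        self.disk_cache.move_to_end(name)
        while len(self.disk_cache) > self.cache_slots:
            self.disk_cache.popitem(last=False)  # evict least recently used


archive = ActiveArchive(cache_slots=2)
archive.write("game_2010.mp4", b"older footage")
archive.write("game_2011.mp4", b"newer footage")
archive.write("game_2012.mp4", b"newest footage")
clip = archive.read("game_2010.mp4")  # evicted from disk, so recalled from tape
```

The caller only ever uses `read` and `write`; whether the bytes came from the disk cache or from tape shows up only in the `tape_recalls` counter, which mirrors the point that the user shouldn't have to care.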

With software as the enabler, solutions such as Quantum Corp.'s StorNext, Silicon Graphics International Corp.'s Data Migration Facility, FileTek Inc.'s StorHouse and Crossroads Systems Inc.'s StrongBox allow a small library like Spectra Logic Corp.'s T200 to offer an archive of 1.3 PB with LTO-6 drives in just 20U (35 inches tall). Tapes can be ejected and maintained offline to allow for an even less expensive fourth tier of storage.
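The capacity and height figures can be sanity-checked with some quick arithmetic. The sketch below rests on assumptions not stated in this article: a fully populated library of 200 cartridge slots, and the LTO-6 cartridge capacities of 2.5 TB native and 6.25 TB at the format's nominal 2.5:1 compression. Real-world compression varies with the data, and video in particular often compresses poorly.

```python
SLOTS = 200                # assumption: fully populated 200-slot library
LTO6_NATIVE_TB = 2.5       # LTO-6 native cartridge capacity
LTO6_COMPRESSED_TB = 6.25  # LTO-6 capacity at the nominal 2.5:1 compression
RACK_UNIT_INCHES = 1.75    # height of one rack unit (1U)


def library_capacity_pb(slots, tb_per_cartridge):
    """Total capacity in decimal petabytes (1 PB = 1,000 TB)."""
    return slots * tb_per_cartridge / 1000


native_pb = library_capacity_pb(SLOTS, LTO6_NATIVE_TB)
compressed_pb = library_capacity_pb(SLOTS, LTO6_COMPRESSED_TB)
height_inches = 20 * RACK_UNIT_INCHES

print(f"native: {native_pb} PB, compressed: {compressed_pb} PB")
print(f"20U = {height_inches} inches")
```

Under these assumptions the compressed total comes to 1.25 PB, in line with the roughly 1.3 PB quoted above, while the native total is 0.5 PB. The 20U height works out to 35 inches.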

An active archive appliance includes the software, server and primary disk.

The key to an active archive solution, then, rests in simplicity, flexibility and cost. The software requires a system to run on, disk to cache or buffer the files, and a connection to the library. An appliance approach is ideal, minimizing the complexity of setup and support.

How the data is stored on tape is also important. Crossroads (alone among the vendors listed above) uses the LTFS technology developed by IBM; it writes a standalone file system on each tape. This allows the system to load an individual tape into a compatible tape drive and read it without having the original software that wrote the tape. Contents of LTFS tapes are easy to transport and use at remote sites. The LTFS format makes it easier to share data with other sites or partners or move tapes out of the library. While the use of LTFS tapes is not necessary for an active archive system, this tape format provides great flexibility for long-term storage of your data.
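Because an LTFS tape carries its own file system, a mounted tape can be browsed with ordinary file operations rather than vendor-specific tooling. The sketch below assumes the tape has already been mounted by an LTFS driver; the function name `list_archive_files` and the mount point in the usage note are hypothetical, and nothing in the code is specific to LTFS beyond that assumption.

```python
import os


def list_archive_files(mount_point):
    """Return sorted (relative_path, size_in_bytes) pairs under mount_point."""
    entries = []
    for folder, _subdirs, files in os.walk(mount_point):
        for filename in files:
            full_path = os.path.join(folder, filename)
            relative = os.path.relpath(full_path, mount_point)
            entries.append((relative, os.path.getsize(full_path)))
    return sorted(entries)
```

A remote site that receives the tape could, for example, call `list_archive_files("/mnt/ltfs_tape")` (a hypothetical path) to see what is on it without the software that originally wrote it.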

With an active archive, a school can consolidate and manage a massive video library. A store can crack down on previously successful thieves, and a small software developer can quickly respond to bugs by bringing the testers, developers and support together with a virtual machine simulating the problematic case.

While tape's traditional role for backup is being displaced by cloud and disk, and while snapshots, replication and deduplication are minimizing its relevance, the ever-increasing size and scale of data suggest that even small organizations would benefit from an active archive.

About the author
Toby Zellers is vice president of strategy and solutions for TVAR Solutions, a federally focused VAR in McLean, Va.