Managing data with an object storage system
Object-based storage, or object storage, isn't a product, per se, but a technology that's enhancing storage systems, enabling them to bring more value to existing applications and to be used in more applications overall. For example, a global file system based on object storage improves on traditional file systems built on block storage (such as those in traditional NAS), because it can support extremely large numbers of files while maintaining file access performance. In this tip, we'll explain what object storage is, how it's being used and why it's a hot topic that should interest VARs and MSPs.
What is object storage?
Object storage can be defined as a data architecture that stores data in discrete containers called "objects," which are individually accessible through a standard HTTP-based protocol using a unique object identifier and a simple index. Like a file system, object storage enables data sharing, but without the overhead and size restrictions of traditional protocols, such as NFS or SMB (CIFS).
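To make the definition concrete, here is a minimal in-memory sketch of the idea: a flat namespace in which each object is stored and retrieved by a unique identifier through a simple index. The class and method names (ObjectStore, put, get) are hypothetical and chosen only for illustration; a real object storage system would expose comparable operations over HTTP rather than as local function calls.

```python
import uuid


class ObjectStore:
    """Toy object store: a flat namespace keyed by object ID (illustrative only)."""

    def __init__(self):
        # The "simple index": object ID -> object contents. No directories, no paths.
        self._objects = {}

    def put(self, data: bytes) -> str:
        """Store the data and return the unique ID assigned to it."""
        object_id = uuid.uuid4().hex
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        """Retrieve an object directly by its ID."""
        return self._objects[object_id]


store = ObjectStore()
oid = store.put(b"quarterly report contents")
print(oid, store.get(oid))
```

Note that the caller never supplies a path or folder. The ID alone is enough to find the object, which is what lets the index stay simple as object counts grow.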
In addition to an ID number, each object contains extensive metadata that provides information about the object's content and its context within the collection of objects stored. This metadata can also be used to support policies for storage management, such as tiering and data protection. Object metadata also includes a hash signature, a fingerprint computed from the object's contents, for use in deduplication, compliance and other functions.
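The sketch below shows one way metadata and a hash signature could travel with each object, and how the hash supports deduplication. SHA-256 is our assumption here, since no specific hash function is named above, and the StoredObject class, the deduplicate function and the metadata keys are hypothetical.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class StoredObject:
    data: bytes
    metadata: dict = field(default_factory=dict)

    def __post_init__(self):
        # Record a hash signature of the contents alongside the other metadata.
        self.metadata["sha256"] = hashlib.sha256(self.data).hexdigest()


def deduplicate(objects):
    """Keep one object per distinct content hash."""
    unique = {}
    for obj in objects:
        unique.setdefault(obj.metadata["sha256"], obj)
    return list(unique.values())


a = StoredObject(b"same bytes", {"owner": "finance", "retention_years": 7})
b = StoredObject(b"same bytes", {"owner": "legal"})
c = StoredObject(b"other bytes")
print(len(deduplicate([a, b, c])))  # prints 2
```

Objects a and b carry different descriptive metadata but identical contents, so their hashes match and only one copy needs to be kept.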
Use case No. 1: Cloud
Object storage is becoming a hot topic because its characteristics fit well with several use cases, one of the most visible being cloud storage. Public cloud providers (and private cloud users) need virtually unlimited scalability with consistent access times, a simple access methodology and the option to geographically disperse data for protection. Object storage addresses these needs very well, but most VARs probably don't consider cloud providers their typical customers. However, object storage has another strong use case that many more companies have a need for: storing content.
Use case No. 2: Content storage
Content storage (and, by extension, content management systems) stores and maintains relatively static files for future access by people or by applications, such as Web servers. Content must be available quickly, so most content storage systems use a simplified access methodology, not a typical file system with hierarchical structure. This access requirement makes content storage systems an ideal use case for object-based storage architectures.
Content can also be very valuable, very sensitive or both, so it must be protected from system-level failure and bit-level corruption for long periods of time. Some content storage use cases involve data with compliance requirements, such as personal, financial and legal information. Historically, these data were put on optical storage because of its longevity and immutability (it couldn't be changed).
Content-addressable storage, or CAS, was developed for many of these use cases. Because it was disk-based, it provided better access times than tape archives and was much simpler than maintaining jukeboxes full of relatively low-capacity optical discs.
The first CAS system was EMC's Centera product, which was object-based. CAS systems were typically expensive, on a per-gigabyte basis, so their use cases were primarily compliance-driven. But as object storage has expanded, it's providing an economical alternative to CAS systems as well.
Aside from data that's subject to regulations about how it is protected and stored, most companies have a lot of files that need to be kept and made available but don't need to be on tier-one storage. For these use cases, a content storage system can be a good way to free up valuable space, and the object-based architecture can be the ideal data structure to make it work.
Beneficial aspects of an object storage architecture
Storage tiering. The ability of object storage to capture extensive metadata about each object’s context and content provides insight into where those objects should be stored and how they need to be protected. Rather than relying on simple information, such as file types and access histories, object storage can tell you which files are used together. This results in much more effective storage tiering decisions and more efficient data movement to support that process.
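As a rough illustration of metadata-driven tiering, the sketch below places objects on a tier based on access history and then keeps objects that are used together on the same tier. The metadata keys (project, days_since_access), the tier names and the 90-day threshold are all assumptions made for the example, not features of any particular product.

```python
TIER_ORDER = ["tier-one", "capacity"]  # fastest first


def choose_tier(metadata: dict) -> str:
    """Pick a tier for one object from its access history alone."""
    if metadata.get("days_since_access", 0) > 90:
        return "capacity"
    return "tier-one"


def place_by_project(catalog: dict) -> dict:
    """Keep objects sharing a project tag together, on the fastest tier any member needs."""
    best = {}
    for meta in catalog.values():
        project = meta.get("project")
        tier = choose_tier(meta)
        if project not in best or TIER_ORDER.index(tier) < TIER_ORDER.index(best[project]):
            best[project] = tier
    return {oid: best[meta.get("project")] for oid, meta in catalog.items()}


catalog = {
    "obj-1": {"project": "audit-2012", "days_since_access": 400},
    "obj-2": {"project": "audit-2012", "days_since_access": 3},
    "obj-3": {"project": "old-scans", "days_since_access": 900},
}
print(place_by_project(catalog))
```

Judged on access history alone, obj-1 would move to the capacity tier. Because its metadata ties it to the recently used obj-2, both stay on tier one, while obj-3 moves down.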
The ILM promise. As the lowest tier of online storage, content storage systems need to scale in capacity more than other tiers, since they represent the disk repository of last resort. Object storage can scale to almost unlimited file counts without suffering performance degradation. Because they also provide a more effective foundation for storage tiering, object storage systems may finally be able to fulfill the promises made over a decade ago about information (or data) lifecycle management.
Erasure coding. Object storage systems support erasure coding, a "forward error correction" technology originally developed for the communications industry to improve transmission success on low-quality networks. Erasure coding divides a data set into multiple components, adding a number of redundant components in the process. It can then reproduce the original data from a subset of these components and provide data protection without requiring redundant copies.
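The simplest possible instance of this idea is a single XOR parity fragment, sketched below: the data is split into k fragments, one redundant fragment is added, and any one lost fragment can be rebuilt from the survivors. This is a deliberately reduced illustration. Production systems typically use codes such as Reed-Solomon that add several redundant fragments and tolerate several simultaneous losses; the function names here are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> list:
    """Split data into k equal fragments and append one XOR parity fragment."""
    frag_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(frag_len * k, b"\0")
    fragments = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    return fragments + [reduce(xor_bytes, fragments)]


def recover(fragments: list) -> list:
    """Rebuild exactly one missing fragment (marked None) by XOR-ing the others."""
    missing = fragments.index(None)
    survivors = [f for f in fragments if f is not None]
    repaired = list(fragments)
    repaired[missing] = reduce(xor_bytes, survivors)
    return repaired


def decode(fragments: list, original_len: int) -> bytes:
    """Join the data fragments (all but the parity) and strip the padding."""
    return b"".join(fragments[:-1])[:original_len]


data = b"object storage payload"
pieces = encode(data, k=4)
pieces[2] = None  # simulate a lost drive or node
restored = decode(recover(pieces), len(data))
print(restored == data)  # prints True
```

Five fragments are stored for four fragments' worth of data, yet the original survives the loss of any one of them, with no second full copy required.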
In storage systems, erasure coding can also ensure data integrity without using RAID. This avoids the capacity overhead of keeping multiple copies and the processing overhead of running RAID calculations on very large data sets. The result is data protection for very large storage systems without the risk of very long RAID rebuild cycles.
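The capacity argument can be checked with simple arithmetic. The sketch below compares the raw capacity needed to hold the same usable data under full replication and under erasure coding. The figures used (100 TB, three copies, a layout of 10 data fragments plus 6 redundant fragments) are illustrative assumptions, not numbers from any specific system.

```python
def raw_for_replication(usable_tb: float, copies: int) -> float:
    """Raw capacity needed when every object is stored as full copies."""
    return usable_tb * copies


def raw_for_erasure_coding(usable_tb: float, data_frags: int, redundant_frags: int) -> float:
    """Raw capacity needed when objects are split into data plus redundant fragments."""
    return usable_tb * (data_frags + redundant_frags) / data_frags


usable = 100  # TB of usable data (assumed)
print(raw_for_replication(usable, copies=3))                               # prints 300
print(raw_for_erasure_coding(usable, data_frags=10, redundant_frags=6))    # prints 160.0
```

Under these assumptions the erasure-coded layout needs roughly half the raw capacity of three-way replication, and it can still lose any six fragments of an object without losing data.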
To maintain long-term data integrity, object storage systems can also run error correction checks on stored files in the background, eliminating full data backups in many cases. Finally, the rich metadata available to object storage systems enables them to better leverage deduplication and compression processes on these data sets, further reducing costs.
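A background integrity check of this kind can be sketched as a scrub pass that recomputes each object's hash and compares it with the one recorded at write time. The store layout (a dictionary of object ID to a data-and-digest pair) and the use of SHA-256 are assumptions for the example. A real system would go on to repair a failed object from its redundant fragments, which this sketch does not attempt.

```python
import hashlib


def write(store: dict, object_id: str, data: bytes) -> None:
    """Store the data together with the hash computed at write time."""
    store[object_id] = (data, hashlib.sha256(data).hexdigest())


def scrub(store: dict) -> list:
    """Return the IDs of objects whose contents no longer match their recorded hash."""
    return [
        object_id
        for object_id, (data, digest) in store.items()
        if hashlib.sha256(data).hexdigest() != digest
    ]


store = {}
write(store, "obj-1", b"contract scan")
write(store, "obj-2", b"patient record")

# Simulate silent bit-level corruption of one object on disk.
data, digest = store["obj-2"]
store["obj-2"] = (b"patient recorc", digest)

print(scrub(store))  # prints ['obj-2']
```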
Content storage systems based on an object storage architecture are a simple solution for several storage issues. They can provide a compliant storage platform for long-term archiving of sensitive data or just an economical storage solution that takes the pressure off expensive tier-one disk systems.
Object storage is becoming an attractive option for storing ever-growing unstructured data sets, even for long periods of time. By replacing redundant copies, backups and even RAID protection, these systems can improve on the economics of "capacity" disk storage. Unlike tiered storage of the past, object storage systems can provide the access performance, scalability and compliance features to make the solution viable for more use cases.
For VARs and MSPs, object storage systems can be a very disruptive solution to bring into an account that's only using traditional storage. The technology is gaining a lot of recognition; there was even an Object Storage Summit held recently. But the window of opportunity is closing. While there are a number of independent companies offering object-based storage solutions, all of the major disk vendors either have them available or have them on the roadmap.
Eric Slack is a senior analyst with Storage Switzerland.