Capacity Optimization With Content Platforms
by Hu Yoshida on Sep 29, 2011
Recently a lot of attention has been given to Capacity Optimization, and there are plenty of solutions to choose from. Compression, dedupe, dynamic tiering, and thin provisioning are the ones most often mentioned, and each is effective at reducing capacity in its own way. Choosing the right solution is important too. For instance, using dedupe to reduce the allocated but unused space in a volume is not as efficient as thin provisioning the volume and never provisioning the unused allocation in the first place.
Compression and dedupe require additional steps to decompress and rehydrate the data before it can be used. Dynamic tiering is more about moving data to lower cost tiers of storage than about reducing capacity. In fact, if the tiering is done on a volume basis, more capacity could be required, since you need capacity for the entire volume on each tier involved. Thin provisioning works best with “thin provision-friendly” file systems, which do not write signatures across the data space and which tell the storage system when they delete a file (so that the capacity can be reallocated to other users). However, even if the file system is not thin-friendly, thin provisioning may still be used for the other advantages that come from the ability to dynamically provision a volume, or to automatically stripe data across volumes to increase performance. No matter which of these solutions you use, you still have to employ additional capacity to back up and eventually archive this data.
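To make the volume-level tiering point concrete, here is a minimal sketch of the arithmetic in Python. The function name, the 10 TB volume, and the three tiers are hypothetical values chosen for illustration, and the model simply assumes that volume-level tiering sets aside room for the whole volume on every tier while finer-grained tiering does not.

```python
def reserved_capacity_tb(volume_tb: float, tiers: int, volume_level: bool) -> float:
    """Capacity set aside for one volume across all tiers (illustrative model).

    Assumption: with volume-level tiering, each tier involved must be able to
    hold the entire volume; with finer-grained tiering, the volume's data is
    spread across tiers and only one volume's worth of capacity is needed.
    """
    if tiers < 1:
        raise ValueError("at least one tier is required")
    return volume_tb * tiers if volume_level else volume_tb


# Hypothetical example: a 10 TB volume that may live on any of 3 tiers.
print(reserved_capacity_tb(10, 3, volume_level=True))   # whole volume on each tier
print(reserved_capacity_tb(10, 3, volume_level=False))  # data spread across tiers
```

Under these assumptions the volume-level case sets aside three times the capacity of the volume, which is why tiering on its own should not be counted as a capacity reduction.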
The most effective way to optimize capacity is rarely mentioned: the use of a content platform to store static data. Static data is data that is not going to change, and it can be active or inactive. While it is obvious that inactive data should be archived, active data that is static should also be considered for a move to a content platform. The point is that static data will not change, and thus it does not need to be backed up over and over again. You just need two or three copies of the data, depending on your tolerance for risk, which helps reduce the capacity and management resources wasted on backup.
If one were to look at a data center, the bulk of the data (60% – 80% or more) is static. What we need to do is move that data off of the active systems, which we back up every night, and onto a content platform like Hitachi Content Platform (HCP). HCP can replicate the content to another HCP and eliminate the need for backup. By doing this we also reduce the working set of active data and make the primary storage systems more efficient. Ingesting this data into an HCP does take software, however. For files, that policy management and data migration software is available in our Hitachi NAS and Hitachi Data Ingestor. Other applications need middleware to do the ingestion, although more and more applications like Microsoft SharePoint, SQL Server, and SAP NetWeaver ILM have interfaces that ingest directly into HCP.
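A rough back-of-the-envelope model shows why this matters. The Python sketch below is only an illustration under stated assumptions: the 100 TB data center, the 70% static share (picked from the 60% – 80% range above), the four retained full backup copies, and the two content platform copies are all hypothetical numbers, and the model ignores compression, dedupe, and incremental backups.

```python
def capacity_all_backed_up_tb(total_tb: float, retained_backups: int) -> float:
    """Primary copy plus every retained full backup copy of all the data."""
    return total_tb * (1 + retained_backups)


def capacity_with_content_platform_tb(
    total_tb: float,
    static_percent: int,
    retained_backups: int,
    static_copies: int,
) -> float:
    """Active data is still backed up; static data is kept as a fixed number of copies."""
    static_tb = total_tb * static_percent / 100
    active_tb = total_tb - static_tb
    return active_tb * (1 + retained_backups) + static_tb * static_copies


# Hypothetical data center: 100 TB, 70% static, 4 retained full backups,
# and 2 copies of the static data on replicated content platforms.
before = capacity_all_backed_up_tb(100, retained_backups=4)
after = capacity_with_content_platform_tb(
    100, static_percent=70, retained_backups=4, static_copies=2
)
print(f"everything backed up: {before:.0f} TB")
print(f"static data on a content platform: {after:.0f} TB")
```

With these made-up inputs the total falls from 500 TB to 290 TB. The exact figures will differ in any real environment, but the shape of the result holds: the larger the static share and the more backup copies retained, the more capacity is freed by taking static data out of the backup cycle.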
So if you are looking for ways to optimize capacity and reduce costs, consider the use of a content platform to store your static data.




