“Only 12% of my storage contains production data.”
by Hu Yoshida on Jan 23, 2013
I recently visited some large customers in New York, each with multiple petabytes of data, and heard consistently that very little of their storage was used for production data. One customer with close to 100 petabytes of storage made the bold statement that only 12% of their storage contained production data. Another customer, who employs all the current technologies such as multiple tiers of storage and de-duplication, acknowledged that 12% was similar to what they had in their environment.
If you look at that percentage closely, a lot of additional storage is supporting the 12% that holds production data, including copies and space for expansion. That does not mean the other 88% is being wasted. Copies are needed for backup, business continuity, data recovery, data analysis, and development test. While copies are needed, it is recognized that the life cycle of those copies is very poorly managed.
Today there are many tools available to reduce the need for physical copies and the capacity those copies require. The first thing that comes to mind is de-duplication, and most backup vendors provide it. However, if you are backing up the same data and de-duping it over and over again, it may be better to archive the data that does not change and reduce the working set that you copy and back up so frequently.
Thin provisioning is a tool that not only eliminates allocated but unused space, it also eliminates the need to copy and back up that unused space.
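To make the idea concrete, here is a minimal sketch of a thin-provisioned volume in Python. It is a toy model, not how Hitachi Dynamic Provisioning is implemented: the class name ThinVolume, the tiny PAGE_SIZE, and the method names are all assumptions made for illustration. Pages consume physical space only once they are written, a backup copies only allocated pages, and a zero page reclaim pass frees pages that contain nothing but zeros.

```python
PAGE_SIZE = 4  # hypothetical, unrealistically small page size for the example


class ThinVolume:
    """Toy thin-provisioned volume: physical pages exist only once written."""

    def __init__(self, virtual_pages):
        self.virtual_pages = virtual_pages  # capacity promised to the host
        self.pages = {}                     # page number -> bytes actually stored

    def _check(self, page_no):
        if not 0 <= page_no < self.virtual_pages:
            raise IndexError("page outside the virtual volume")

    def write(self, page_no, data):
        self._check(page_no)
        if len(data) != PAGE_SIZE:
            raise ValueError("data must be exactly one page")
        self.pages[page_no] = bytes(data)   # allocate on first write

    def read(self, page_no):
        self._check(page_no)
        # Unallocated pages read back as zeros without consuming any space.
        return self.pages.get(page_no, bytes(PAGE_SIZE))

    def reclaim_zero_pages(self):
        """Free allocated pages that hold only zeros; return how many were freed."""
        zero = bytes(PAGE_SIZE)
        freed = [n for n, d in self.pages.items() if d == zero]
        for n in freed:
            del self.pages[n]
        return len(freed)

    def backup(self):
        """Copy only the allocated pages, never the unused virtual space."""
        return dict(self.pages)
```

In this model a volume that presents 1,000 pages to the host but has only a handful written backs up only that handful, which is the saving described above.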
Microsoft SharePoint Server has become hugely popular with users and a standard in many data centers because it eliminates duplication of attachments to emails and it hosts essential corporate information including a growing number of applications, workflows, and documents. SharePoint relies on many backend Microsoft SQL databases to store content and as more content is added, the maintenance and service level of those databases degrade until they are resized. Hitachi Data Discovery for Microsoft® SharePoint® (HDD-MS) solves the problem of continually expanding SharePoint databases with integrated SharePoint workflows that archive content to Hitachi Content Platform (HCP).
Hitachi Thin Image snapshot software provides virtual copies that allow you to make your copies and still save space. In other words, you can have your cake and eat it too. You can make up to 1024 virtual, point-in-time copies of data as it is being created without having to physically add space. Instead of creating physical clones as we do with Hitachi ShadowImage Replication, we copy incremental changes to the primary volume into a data pool and create virtual volumes that can be split and used for purposes like backup or development test.
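The general technique behind space-saving snapshots like this can be sketched in a few lines of Python. This is a generic copy-on-write model under simplifying assumptions, not the Thin Image implementation: the Volume and Snapshot classes are hypothetical, blocks are plain Python values, and each snapshot keeps its own small pool rather than sharing one data pool. The point it illustrates is that a snapshot stores only the blocks that changed after it was taken and reads everything else straight from the primary volume.

```python
class Snapshot:
    """Virtual point-in-time copy that stores only blocks changed since creation."""

    def __init__(self, volume):
        self.volume = volume
        self.pool = {}  # block index -> block contents at snapshot time

    def preserve(self, index, old_data):
        # Keep the first (original) version only; later overwrites are ignored.
        self.pool.setdefault(index, old_data)

    def read(self, index):
        # Changed blocks come from the pool, unchanged ones from the primary.
        if index in self.pool:
            return self.pool[index]
        return self.volume.blocks[index]

    def pool_blocks(self):
        return len(self.pool)


class Volume:
    """Toy primary volume modelled as a list of blocks."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Save the old block for every snapshot before overwriting it.
        old = self.blocks[index]
        for snap in self.snapshots:
            snap.preserve(index, old)
        self.blocks[index] = data
```

For example, after taking a snapshot of a four-block volume and overwriting one block twice, the snapshot still returns the original contents of all four blocks while holding only one block in its pool.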
The biggest creator of copies is data protection. Sean Moser, our vice president, Software Platforms Product Management, posted a blog where he says:
“Data protection is straining under the weight of big data. Exploding data volumes, stringent restore requirements, and shrinking OPEX and CAPEX budgets all mean that traditional data protection solutions aren’t cutting it any more. IT teams are relying on disparate tools for operational recovery, disaster recovery, long term archive, performance, migration, hardware refresh, etc. Each of these use cases involves capturing a copy of the data, yet there is no integration, sharing or reuse. The result is multiple, redundant, and often poorly tracked copies, leading to increased management overhead and excess use of key resources.”
Read Sean’s blog to see how he plans to address this.
I identify this explosion in data replication in my blog on storage trends for 2013.
There are tools to reduce this replication trend. For more information, ask your Hitachi Data Systems representative or reseller about:
Hitachi Content Platform (HCP) with Hitachi Data Ingestor (HDI)
Hitachi Dynamic Provisioning and zero page reclaim
Hitachi Storage Optimization for Microsoft® SharePoint®
Hitachi Data Protection Suite, powered by CommVault
Hitachi Thin Image