Capacity Efficiency: Really, what is Storage These Days!!??
by Claus Mikkelsen on Mar 1, 2010
There’s this idea (OK, mild frustration) that’s been floating around in my brain for the past couple of years, and lately it’s becoming more and more a part of my conversations with customers: “Capacity Efficiency”. OK, so I just made up the term a few months ago, but I think it captures the concept.
A couple of days ago I was quoted by one of my “buddy-bloggers in crime”, David Merrill (yes, he asked me first), for an outrageous comment I made a few years ago and continue to make: as far as storage capacity (disk drives) is concerned, our customers treat capacity as “free” and “infinite”. Why else would someone buy an additional 50TB when they have a 30% utilization rate? Seriously!
But the new question is becoming: what is capacity? We’ve long known that what an application “thinks” it has in capacity rarely equates to the hardware actually installed these days (see CoW snapshots, thin provisioning, etc.). What’s going on is that we storage vendors really are trying to reduce the amount of hardware that actually needs to be purchased, and that’s a very good thing. But there is no single “solution” in the list of “Capacity Efficiency” technologies. Rather than engage in arguments over who has the more effective thin provisioning or the better de-dupe technology, it’s time to “raise the discussion” to a level that includes all of the CE technologies offered by the vendors, since the combined effect of those technologies is what really matters, not any single one. And the CE list will definitely vary by vendor. I’m not here to bash the other guys (this time, anyway), but merely to provide a thoughtful list of functions that need to be evaluated as a whole.
So I’ve started compiling a list of all the CE technologies I can think of and have come up with the following (there may be more):
- Thin Provisioning – I think all vendors have some form of TP these days, but some are surprisingly more efficient than others. Did you know that HDS’ Dynamic Provisioning (HDP) has a guarantee of 50% reduction in capacity? Pretty cool. Also, as part of HDP we have:
  - Zero Page Reclaim (ZPR), a background task that returns unused (all-zero) pages to the free space pool. This is great for existing data that has been (non-disruptively, by the way) moved into an HDP pool. (There’s a small conceptual sketch of how this works after the list.)
  - Write-Same – Using Symantec’s VxFS file system, we can now free space that was occupied by deleted files. The file system notifies us when a file is deleted and, bang, we return that range to the free space pool. This keeps space consumption tight over time; otherwise the effective utilization rate would continue to get worse. I think XIV also supports this, but EMC seems to be waiting for a ratified standard.
  - Thin provisioning of externally virtualized storage. That is, we can do TP on storage that doesn’t even support TP!!
  - Another advantage of our Dynamic Provisioning is that it can dramatically improve throughput. You might think a performance boost isn’t a CE feature, but it certainly is, since one might decide not to do TP at all if it comes with a performance degradation, which some implementations do. Another benefit is that it can hopefully eliminate “short stroking” (limiting how much data is placed on a drive), which is common practice when provisioning databases.
- “Spaceless” CoW Snapshot – We all have that. But it makes a difference, so think about the flexibility, the number of copies, and the manageability of those copies. (A toy sketch of the copy-on-write idea also follows the list.)
- “Thick to thin” and “thin to thin” replication – Even though the original copy of the data is not TP’d, the replicas can be. This applies to synchronous remote replication, asynchronous remote replication, and in-system cloning.
- De-dupe during backup – I think this is becoming standard.
- Passthru 3-Datacenter replication (actually, for some reason we’ve decided to call this 2DC Passthru). Multi-datacenter replication is becoming more common, but it normally requires more than two copies of the critical data. Passthru gives you “no data loss” DR at long distances with only two copies. That’s pretty unique, I think.
- Archive stale data – We’ve got an excellent product in our Hitachi Content Platform. One thing that, although not directly related to CE, deserves a mention is its great data discovery capability. Intelligent archiving is a very powerful CE tool, since the percentage of archived data relative to the total data stored continues to grow at a healthy rate, which is good: get the old stuff off of expensive storage, put it where it belongs, and find it again when you need it. Also, with HCP we have Single Instance Store, a capability within HCP to remove duplicate objects. This significantly improves CE.
- Here’s a big one. Because of our distributed parity and outboard processing of the RAID function, our RAID 5/6 outperforms other vendors’ RAID 10 schemes. That 50 TB of usable capacity turns into 100 TB of raw disk with RAID 10, rather than 62.5 TB with RAID 5 (4D+1P), and even less with 7+1. (The arithmetic is spelled out in the example after this list.)
- Metadata overhead – A lot of cool storage technology comes with metadata overhead. Ask your vendor how much additional metadata is being stored to handle TP, for example.
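To ground a couple of the thin provisioning items above, here is a minimal conceptual sketch in Python. This is not HDP’s implementation – the page size, pool structure, and reclaim triggers are simplified assumptions purely for illustration – but it shows the basic mechanics: physical pages are consumed from the shared pool only when something is actually written, a background zero page reclaim scan returns all-zero pages, and a file-system delete notification (the Write-Same path) can release a range directly.

```python
# Conceptual sketch of a thin-provisioned volume with Zero Page Reclaim and a
# deleted-file reclaim path. NOT any vendor's actual implementation; the page
# size, pool, and reclaim logic are simplified assumptions for illustration.

PAGE = 1024  # assumed page size in bytes (real pools use much larger pages)

class ThinVolume:
    def __init__(self, pool):
        self.pool = pool      # shared free-page counter
        self.pages = {}       # page index -> bytearray, allocated on first write

    def write(self, page_index, data):
        if page_index not in self.pages:          # allocate only on first write
            if self.pool["free"] <= 0:
                raise RuntimeError("pool exhausted")
            self.pool["free"] -= 1
            self.pages[page_index] = bytearray(PAGE)
        self.pages[page_index][: len(data)] = data

    def zero_page_reclaim(self):
        # Background scan: pages holding nothing but zeroes go back to the pool.
        for idx, buf in list(self.pages.items()):
            if not any(buf):
                del self.pages[idx]
                self.pool["free"] += 1

    def reclaim_range(self, page_index):
        # What a file-system notification (e.g. on file delete) could trigger:
        # the page is simply released instead of lingering as dead space.
        if page_index in self.pages:
            del self.pages[page_index]
            self.pool["free"] += 1

pool = {"free": 100}
vol = ThinVolume(pool)
vol.write(0, b"live data")     # pool drops to 99
vol.write(1, b"\x00" * PAGE)   # a zeroed region, pool drops to 98
vol.zero_page_reclaim()        # page 1 is returned, pool back to 99
vol.reclaim_range(0)           # "file deleted" notification, pool back to 100
print(pool["free"])            # 100
```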
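Along the same lines, here is a toy sketch of what a “spaceless” CoW snapshot buys you. Again, this is not any particular vendor’s design; it just illustrates that the snapshot costs essentially nothing when it is taken and only consumes space for blocks that are overwritten afterwards.

```python
# Conceptual sketch of a "spaceless" copy-on-write snapshot. Block sizes and
# structures are illustrative assumptions only.

class CowVolume:
    def __init__(self, blocks):
        self.blocks = dict(blocks)   # block index -> current data
        self.snapshots = []          # each snapshot: {index: original data}

    def take_snapshot(self):
        self.snapshots.append({})    # empty at creation: no extra space used
        return len(self.snapshots) - 1

    def write(self, index, data):
        # First overwrite after a snapshot copies the old block aside.
        for snap in self.snapshots:
            if index not in snap:
                snap[index] = self.blocks.get(index)
        self.blocks[index] = data

    def read_snapshot(self, snap_id, index):
        snap = self.snapshots[snap_id]
        return snap[index] if index in snap else self.blocks[index]

vol = CowVolume({0: b"alpha", 1: b"beta"})
s0 = vol.take_snapshot()          # costs (almost) nothing at creation time
vol.write(1, b"gamma")            # only now is the old copy of block 1 kept
print(vol.read_snapshot(s0, 1))   # b'beta'  -- the point-in-time view
print(vol.read_snapshot(s0, 0))   # b'alpha' -- unchanged block, no extra space
```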
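And to put the RAID numbers above on paper, here is a quick back-of-the-envelope calculation (plain data-to-overhead ratios only; it ignores spares, drive formatting, and metadata overhead):

```python
# Back-of-the-envelope raw capacity needed for a given usable capacity.
# Ignores spares, formatting, and metadata overhead.

def raw_needed(usable_tb, data_drives, overhead_drives):
    """Raw TB required for a RAID group with the given data:overhead ratio."""
    group = data_drives + overhead_drives
    return usable_tb * group / data_drives

usable = 50  # TB the application actually needs

print("RAID 10 (1+1):", raw_needed(usable, 1, 1), "TB raw")            # 100.0
print("RAID 5 (4+1): ", raw_needed(usable, 4, 1), "TB raw")            # 62.5
print("RAID 5 (7+1): ", round(raw_needed(usable, 7, 1), 1), "TB raw")  # 57.1
print("RAID 6 (6+2): ", round(raw_needed(usable, 6, 2), 1), "TB raw")  # 66.7
```

Run it and the gap is obvious: compared with RAID 10, RAID 5 at 4+1 cuts the raw requirement by 37.5%, and 7+1 by roughly 43% – provided, of course, that the RAID 5/6 implementation can deliver the performance.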
So my point is this: if you’re interested in reducing the capacity you buy, and who isn’t, look at the impact of ALL of the CE technologies offered by a particular vendor, not just the few that dominate the PowerPoint decks. Ask the right questions and don’t be fooled.
Comments (3)
Marc Farley here, from 3PAR. Your list starts off pretty well, but it spins out of control after the first four items – mixing technologies, benefits, architectures and marketing together. If you intended to provide a list of Capacity Efficient technologies, here are some suggestions for improving it.
Zero Page Reclaim should probably be replaced with something more generic, such as Zeroed Block Reclamation. The important detail being the definition of a page – which is not necessarily a common concept or term between vendors.
A closely related, but distinct, technology is Zero Detection, which identifies an opportunity to reclaim zeroed blocks. This technology applies both to reads from storage (scanning for zeroes) and to incoming writes. Identifying long writes of zeroes with Zero Detection – and not writing them to disk in the first place – is a much more efficient process than writing them, finding them later with a storage scan, and then reclaiming them. BTW, 3PAR’s Thin Conversion and Thin Persistence products both identify long incoming writes of zeroes.
Write Same. This is the SCSI command used to execute reclamation, but the two words are not exactly intuitive. As an industry we should try to find something better. Maybe “File Space Reclaim”? Also, were you trying to avoid mentioning 3PAR as a vendor that offers this technology? As you probably know, we did a fair amount of development work with Symantec on this functionality.
Snapshot replication should be on the list. The ability to return snapshot space to primary storage space.
Is ZPR going to be fully automated in the near future? Is the performance boost you are referring to the wide striping of Dynamic Provisioning? Is there integration with file systems other than VxFS for deleted files?
How about some links to more information about your points?
Claus, you bring up some interesting points, and I especially like the point about our RAID 5/6 approaches being as reliable and performant as our competitors’ RAID 10. This is a great form of capacity efficiency. Also, as to HCP: beyond SiS there is a compression policy which, like SiS, is compliance-safe, so we have another capacity efficiency or optimization technology in HCP to talk about. File system overhead is another area we can talk about, and I believe that HNAS has a far more efficient file system than NetApp. In fact, I think that one of NetApp’s dirty little secrets is tied to the metadata argument you point to above. Specifically, for all of the things needed to make a NetApp system reliable and efficient, there can be as much as 40% overhead. So what was that A-SIS thing for anyway? Ah, to get them back to zero utilization. Here is a post on the same topic that highlights more of our capacity efficiency and optimization approaches: http://blogs.hds.com/michael/2009/07/capacity-optimization-for-hitachi-file-and-content-services.html.