Capacity Efficiencies: Allocation vs Utilization
by Hu Yoshida on Feb 3, 2012
As noted in previous posts, capacity efficiency has two dimensions: allocation efficiency and utilization efficiency.
Allocation efficiency is what most people think of first: eliminating the waste of over-allocation. In open systems this has been a major problem since we may not know ahead of time how much capacity an application requires, but we don’t want to run out of capacity and we know the operational difficulties of expanding it. So, the usual practice is to over-allocate by a wide margin.
After all, disk is cheap, isn’t it?
The problem with over-allocation is that we don’t just make one copy. As Claus Mikkelsen noted in his blog, there may be 10 to 15 copies of that allocation for many valid requirements, like data analysis, data sharing, development test, or point-in-time snapshots. The most efficient way to eliminate this over-allocation is to use thin provisioning, where you provide virtual space for the requested allocation and only provision the capacity that is actually being used. It also helps to support the APIs of file systems, like VMFS and Symantec file systems, that can notify the storage system when files are deleted so that the allocation for those files can be reclaimed by the storage system.
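To make the idea concrete, here is a minimal sketch of thin provisioning with space reclamation. It is not how HDP is implemented; the class name `ThinVolume`, the `pool` dictionary, the `reclaim` method, and the page size are all assumptions made up for illustration.

```python
PAGE_SIZE_MB = 42  # assumed page size, for illustration only


class ThinVolume:
    """Hypothetical thin volume: the virtual size is promised up front,
    physical pages are taken from a shared pool only on first write."""

    def __init__(self, virtual_mb, pool):
        self.virtual_pages = virtual_mb // PAGE_SIZE_MB
        self.pool = pool      # shared dict holding a "free_pages" counter
        self.mapped = set()   # virtual page numbers backed by real capacity

    def write(self, page_no):
        if not 0 <= page_no < self.virtual_pages:
            raise IndexError("write beyond virtual size")
        if page_no not in self.mapped:
            if self.pool["free_pages"] == 0:
                raise RuntimeError("pool exhausted")
            self.pool["free_pages"] -= 1
            self.mapped.add(page_no)

    def reclaim(self, page_no):
        """Stand-in for the file system reporting that a page's files were deleted."""
        if page_no in self.mapped:
            self.mapped.remove(page_no)
            self.pool["free_pages"] += 1

    def used_mb(self):
        return len(self.mapped) * PAGE_SIZE_MB


pool = {"free_pages": 1000}
vol = ThinVolume(virtual_mb=420_000, pool=pool)  # 10,000 virtual pages promised
for p in range(50):   # the application touches only the first 50 pages
    vol.write(p)
vol.reclaim(3)        # one page is freed after a file deletion
print(vol.used_mb(), pool["free_pages"])  # prints: 2058 951
```

The volume advertises 10,000 pages but consumes only 49 from the pool, and any copy of the volume would only need to carry those 49 pages.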
The capacity and time needed to make copies are also reduced by eliminating allocated but unused space. Copy capacity can be reduced further with copy-on-write, so that only the new changes are replicated. HDS storage supports these functions, and can map them to legacy storage systems through virtualization.
The thin provisioning software that Hitachi provides in Hitachi Dynamic Provisioning (HDP) also increases allocation efficiency by providing a pool of preformatted pages. This eliminates the need for the storage administrator to format the drives, carve out the LUNs, and concatenate LUNs for striping performance. HDP will automatically stripe a LUN across the width of the HDP pool. Allocation software in our Hitachi Command Suite will enable a user to allocate storage with five clicks of a mouse.
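As a rough illustration of striping a LUN across the width of a pool, the sketch below spreads a LUN's pages over the pool's RAID groups. The round-robin placement, the function name `stripe_pages`, and the RAID group labels are assumptions for the example; the actual HDP placement algorithm is not described here.

```python
def stripe_pages(num_pages, raid_groups):
    """Assign each page of a LUN to a RAID group in round-robin order
    (an assumed placement policy, used only for illustration)."""
    layout = {rg: [] for rg in raid_groups}
    for page in range(num_pages):
        layout[raid_groups[page % len(raid_groups)]].append(page)
    return layout


layout = stripe_pages(12, ["RG-0", "RG-1", "RG-2", "RG-3"])
for rg, pages in layout.items():
    print(rg, pages)
# RG-0 [0, 4, 8]
# RG-1 [1, 5, 9]
# RG-2 [2, 6, 10]
# RG-3 [3, 7, 11]
```

Because consecutive pages land on different RAID groups, I/O to the LUN is spread over all the disks in the pool without the administrator concatenating LUNs by hand.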
Utilization efficiency is about using the capacity in an efficient manner so as to reduce costs and increase performance/availability. The primary Hitachi tool for doing this is Hitachi Dynamic Tiering, where we can dynamically tier pages within a volume across multiple tiers of cost/performance storage capacity. When most people think of tiering they think of volume-level tiering, where a volume is moved between tiers of cost/performance storage. While this can match volumes with the right performance tier of storage, it can actually use more storage, since you need to have space for the whole volume in all the tiers that are involved. With page-level tiering, only the hot pages need to reside on the higher performance tiers. Since only a small number of pages are hot at any time, you will only need enough high-performance capacity for 5% to 10% of your volume rather than for the full 100% of your volume. That is utilization efficiency.
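The arithmetic behind that comparison can be sketched in a few lines. The 2,000 GB volume size and the function name `fast_tier_needed_gb` are assumptions chosen for the example; the 5% and 10% hot fractions come from the text above.

```python
def fast_tier_needed_gb(volume_gb, hot_percent, page_level):
    """High-performance capacity that must be available for one volume.

    Volume-level tiering: the whole volume has to fit in the fast tier.
    Page-level tiering: only the hot pages have to fit.
    """
    if page_level:
        return volume_gb * hot_percent / 100
    return float(volume_gb)


volume_gb = 2000  # assumed volume size
for hot_percent in (5, 10):
    page = fast_tier_needed_gb(volume_gb, hot_percent, page_level=True)
    whole = fast_tier_needed_gb(volume_gb, hot_percent, page_level=False)
    print(f"{hot_percent}% hot: page-level {page:.0f} GB, volume-level {whole:.0f} GB")
# 5% hot: page-level 100 GB, volume-level 2000 GB
# 10% hot: page-level 200 GB, volume-level 2000 GB
```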
Utilization efficiency also depends on the efficiency of the paging process. Paging is the most efficient method of dynamic tiering since it is calculated on a page basis. Chunk/Chunklet methods for paging require the definition of a chunk and then an index into the chunklet. Dynamic tiering requires the handling of more metadata and more processing power within the storage system. VSP was designed with a separate control store for the metadata and a separate pool of Intel quad core processors to offload this processing from the I/O processors.
This function can also be extended to external storage through the storage virtualization in VSP.
These are the primary tools for storage capacity and utilization efficiencies in Hitachi storage systems. I would be interested in hearing about other functions that can be used to enhance capacity efficiencies.
For other posts on maximizing storage and capacity efficiencies, check these out: http://blogs.hds.com/capacity-efficiency.php
There is a concept of “FLATTENING” in Hitachi storage arrays: after a newly created RAID group is assigned to an existing pool, the pool pages are also striped across the newly added RAID group’s disks, which increases the total number of disks that participate in IOPS.
Under which category does this flattening fall?
Allocation Efficiency: Since the new RAID group is added to a dynamically provisioned pool, which formats pages out of the pool, can it be labelled as allocation efficiency?
Utilization Efficiency: The newly added RAID group is used more efficiently by adding it to the existing pool and striping the existing data across all the available disks, including the new RAID group’s disks, rather than creating a new pool out of those disks, which would decrease the number of disks participating in IOPS and thereby affect application performance. On that basis, can we classify this as utilization efficiency?
Please let me know your thoughts on this.
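The flattening behavior described in this comment can be pictured with a small sketch. This is only an illustration under assumed rules (pages are moved until every RAID group holds an equal share); the function name `rebalance` and the RAID group labels are hypothetical, and the array's real rebalancing logic is not described in this post.

```python
def rebalance(layout, new_group):
    """Return a new layout in which pages are moved from existing RAID
    groups to a newly added one until all groups hold an equal share.
    The equal-share target is an assumption made for this sketch."""
    layout = {rg: list(pages) for rg, pages in layout.items()}  # copy, keep input intact
    layout[new_group] = []
    total = sum(len(pages) for pages in layout.values())
    target = total // len(layout)
    for rg, pages in layout.items():
        if rg == new_group:
            continue
        while len(pages) > target and len(layout[new_group]) < target:
            layout[new_group].append(pages.pop())
    return layout


before = {
    "RG-0": list(range(0, 8)),
    "RG-1": list(range(8, 16)),
    "RG-2": list(range(16, 24)),
}
after = rebalance(before, "RG-3")
print({rg: len(pages) for rg, pages in after.items()})
# {'RG-0': 6, 'RG-1': 6, 'RG-2': 6, 'RG-3': 6}
```

After the new RAID group joins the pool, the same 24 pages are served by four groups instead of three, which is the increase in participating disks that the comment refers to.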