Chunk size matters
by Hu Yoshida on Jul 30, 2009
The most touted benefit of thin provisioning is the ability to save storage capacity that is allocated but unused, and there have been wars over which vendor’s approach is more thin-friendly. Often these arguments revolve around whose “chunklet” size is smaller, since the conventional thinking is that the smaller the chunklet size, the more efficient the thin provisioning.
However, thin provisioning has many more benefits that have to do with ease of provisioning, space reclamation, and data dispersion (wide striping) performance. While one might think that these benefits are less affected by the “chunklet” size, there can be major differences based on the size and the way these chunklets are mapped. The smallest chunklet size may not be the best for these benefits, and it may not be the most efficient use of the thin provisioning pool capacity either.
Each chunklet or unit of thin provisioning has to have mapping table data associated with its location in the thin provisioning pool and its relation to the address space that the server sees. This mapping table has to be stored somewhere, and if there is not enough memory in the storage system cache, it has to be stored on disk within the storage system. Of course, accessing mapping table data on disk each time a chunk is accessed could have a large performance impact. The best way to reduce this impact is to minimize the size of the mapping table so that it can be kept in the storage controller memory. The size of the mapping table can be minimized by having fewer large chunks rather than a large number of small chunks, and/or by providing an indexing table to minimize the search.
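To make the trade-off concrete, here is a back-of-the-envelope sketch of how chunk size drives mapping-table size. The pool size and the bytes-per-entry figure are illustrative assumptions, not any vendor’s actual numbers:

```python
# Back-of-the-envelope: mapping-table entries needed for a thin
# provisioning pool at two chunk sizes. The 10 TiB pool and the
# 16-byte entry size are illustrative assumptions only.
KIB = 1024
MIB = 1024 ** 2
TIB = 1024 ** 4

pool_bytes = 10 * TIB      # hypothetical 10 TiB pool
entry_bytes = 16           # assumed metadata bytes per chunk entry

for label, chunk_bytes in [("16 KiB chunklet", 16 * KIB),
                           ("42 MiB chunk", 42 * MIB)]:
    entries = pool_bytes // chunk_bytes
    table_mib = entries * entry_bytes / MIB
    print(f"{label}: {entries:,} entries, ~{table_mib:,.1f} MiB of mapping data")
```

Under these assumptions the 16 KiB chunklet needs hundreds of millions of entries (gigabytes of mapping data), while the 42 MiB chunk needs only a few MiB — small enough to pin in controller memory, which is the point the paragraph above makes.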
In the USP V we use a 42 MB chunk size, which is optimized to our RAID formats. The mapping data for these chunks is entirely contained within our control store memory, which is separate from our data cache, so access to the mapping has no impact on performance.
Last month we announced HDP on the AMS 2000 with availability on August 3. In the AMS 2000, we do not have a separate control store memory, so we use a combination of large chunklets and an indexing approach to minimize the size of the table and make it entirely resident in cache. When we assign thin provisioning space in the AMS 2000, we reserve a 1 GB chunk, and out of that chunk we provision 32 MB chunklets or pages. The 32 MB size is chosen because it maps directly into the 1 GB chunk (32 x 32 MB) and is efficient in both storage layout and cache memory. The use of a large chunk and smaller sub-allocation chunklets is similar to thin provisioning implementations from other vendors. When they pre-allocate space for a thin provisioned volume, they grab it in large allocation chunks and then provision it in smaller “chunklets”. So even if an implementation uses a chunklet unit of 16 KB, the much larger chunk which contains that chunklet is not available to other users. The difference on the AMS is that provisioning in 32 MB pages requires less processor overhead and memory than processing 16 KB at a time. The amount of capacity, or chunk, that is actually reserved by most thin provisioning implementations is far larger than the sub-allocation of a single 16 KB chunklet.
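The two-level scheme described above can be sketched as follows. This is a minimal illustration of the idea, not Hitachi’s implementation; the class and method names are invented for the example:

```python
# Sketch of two-level thin provisioning allocation: capacity is
# reserved from the pool in large 1 GB chunks, and 32 MB pages are
# provisioned out of the current reserved chunk.
CHUNK_MB = 1024                          # 1 GB reservation unit
PAGE_MB = 32                             # provisioning unit
PAGES_PER_CHUNK = CHUNK_MB // PAGE_MB    # 32 x 32 MB = 1 GB

class ThinPool:
    def __init__(self, capacity_gb):
        self.free_chunks = capacity_gb   # whole 1 GB chunks remaining
        self.free_pages_in_chunk = 0     # pages left in current chunk

    def allocate_page(self):
        """Provision one 32 MB page; return False if the pool is full."""
        if self.free_pages_in_chunk == 0:
            if self.free_chunks == 0:
                return False
            self.free_chunks -= 1        # reserve a fresh 1 GB chunk
            self.free_pages_in_chunk = PAGES_PER_CHUNK
        self.free_pages_in_chunk -= 1
        return True

pool = ThinPool(capacity_gb=2)
allocated = sum(pool.allocate_page() for _ in range(70))
print(allocated)   # only 64 pages (2 chunks x 32 pages) fit
```

Note how the whole 1 GB chunk is taken from the pool the moment its first page is needed — which is why, as the paragraph above says, a tiny chunklet unit does not by itself make the reservation any smaller.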
This approach has an impact on the ability to reclaim space. With the USP V/VM the 42 MB page is the chunk. When we move a normal over allocated (fat) volume into a Dynamic Provisioning pool of storage, we move a 42 MB page at a time. After the move we can scan the pages, and where we see a page of all zero format, we can reclaim that page and return it to the pool for other allocations. This is known as Zero Page Reclaim. With other types of thin provisioning where the page or chunklet is a subset of a larger chunk, the chunk cannot be reclaimed and returned to the pool until all of its chunklets are zero format.
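The Zero Page Reclaim scan described above reduces to a simple idea: after migrating a fat volume page by page, any page that contains only zeros can be returned to the pool. A minimal sketch, with tiny demo buffers standing in for real 42 MB pages:

```python
# Sketch of the Zero Page Reclaim scan: keep pages with data,
# return all-zero pages to the free pool. Data structures are
# illustrative; real pages would be 42 MB, not 8 bytes.
def zero_page_reclaim(pages):
    """Split pages into (kept, reclaimed_count) by zero content."""
    kept, reclaimed = [], 0
    for page in pages:
        if any(page):            # any nonzero byte -> page stays allocated
            kept.append(page)
        else:
            reclaimed += 1       # all-zero page goes back to the pool
    return kept, reclaimed

# tiny stand-in pages: two all-zero, one with real data
pages = [bytes(8), b"\x00\x07" + bytes(6), bytes(8)]
kept, reclaimed = zero_page_reclaim(pages)
print(len(kept), reclaimed)      # 1 page kept, 2 pages reclaimed
```

When the page is itself the reclamation unit, each zero page frees pool capacity immediately; when it is a sub-allocation of a larger chunk, the chunk is freed only after every one of its chunklets scans as zero, which is the contrast the paragraph draws.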
So HDP for the AMS 2000 works differently than on the USP V/VM. Currently we do not support Zero Page Reclaim on the AMS 2000 HDP. While the AMS 2000 HDP does not have all the functionality and performance of the USP V/VM HDP on its own, the AMS 2000 can attach behind the USP V/VM and be part of that HDP pool. On its own, the AMS 2000 HDP is very competitive with other thin provisioning products. While there is additional overhead, system performance is increased due to the wide striping effect of HDP. A volume that is wide striped across multiple RAID groups in an HDP pool will perform better than a volume that is striped across one RAID Group.
So when it comes to thin provisioning, smaller chunklets mean more overhead. To minimize that overhead, you may need to architect a storage controller with a separate control store memory to contain it, or use indexing techniques, which require you to reserve a large chunk of storage to contain the mapping for your chunklets and to give up some of the advantages of direct one-to-one mapping.
Comments (12)
Hu, the current fascination with comparing chunk size is great to keep the techies distracted – they all know in their hearts that vendors (at least HDS, EMC and IBM) make compromises here based on adapting their underlying “fixed provisioning” legacy architecture…as long as it works, scales, performs and meets business continuance objectives, so what?
The real revolution is the end of the need (and the ability!) to worry where to put the data. True storage consolidation at last!
“Simple Provisioning” takes a huge chunk of the cost, effort and risk out of storage administration while also improving availability, reducing cost and increasing agility.
The benefits to the business from “Simple Provisioning” need the CIO to have a light-bulb moment as they require organisation, process and budget/chargeback changes to fully deliver the huge potential benefits.
“Simple Provisioning” means that, at last, IT can respond properly and with lessened cost AND risk to the NEEDS of the business:
If we look at a typical large IT shop, there are (at least) 3 layers – the business who need apps to deliver, the guys who plan and run the apps and then finally the guys who plan, buy and implement the infrastructure to run them.
Your customers have ingrained the old, labour intensive, SLOW forecasting/procurement/implement/provisioning/manage methods into their organisation, their processes, their budgets/business cases and their chargeback.
The apps guys fight to justify buying far too many “disks” so they can (micro)manage where the data goes to be able to build-in guesstimated growth and (hopefully) ensure performance for their users/web customers/business partners. Just like when they had their “own” arrays.
Even though they would like to and especially for new applications, the business CANNOT forecast capacity and performance needs – now with REAL pooled/on demand storage resources, IT can show that, with changes to budgets and chargeback, they don’t need to.
The business NEEDS their mission critical databases to perform and be able to grow dynamically to meet demand near-instantly while containing cost – now with the ability to integrate (for example) Oracle ASM with “Simple Provisioning” IT can sweep away expensive and scarce “fire fighting” DBA costs and hugely improve cost, risk and agility.
Forget chunk size and, instead, provide application proof points to show that “Simple Provisioning”:
Is reliable, scales, performs, has management tools/reporting and meets all business continuance needs – it can therefore become the standard for all new implementations.
Has migration planning tools proven and robust enough to support the ROI/IRR business case for mass migration (cost/savings/risk/timescales/impact)
Can therefore allow a revolution in IT processes that tears out masses of man-hours of laborious, error-prone, expensive effort (from storage admin, change management, DBA, procurement etc) while finally delivering on the promises of storage consolidation onto large arrays.
SteveC, thank you for your excellent comments and for bringing the focus back to the real benefit, which is simple provisioning. That is why Hitachi named our product Dynamic Provisioning. Many of our customers are discovering that it is easier to provision out of a few HDP pools than it is to manage hundreds or even thousands of LUNs with different capacity requirements.
You also make a great point about the need for management and reporting tools for this new paradigm in provisioning and migrating storage. We have developed enhancements to our integrated Hitachi Storage Command Suite of software to manage, monitor, meter, and report on this new approach to provisioning.
And to your point on ROI/IRR, David Merril, our Chief Economist, has developed a practice to help organizations develop these financial metrics and others. Please see his blog at http://blogs.hds.com/david
Your comments will help to elevate the discussion on Dynamic Provisioning and have triggered some forthcoming blogs about the ideas that you brought up.
Hu, good to hear the tools are there to allow used/free capacity, used/remaining IOPS performance and space usage/chargeback to be managed at the “macro” pool level. These are key to allow organisations to quickly and safely move to the new model.
Also good to see the zero page reclaim functionality for non-disruptive fat->thin conversion online. When combined with the load balancing/re-striping functionality, this is ideal to incrementally move existing apps to simple provisioning without the need to buy 100% of the storage a second time.
Two questions on future plans please? (I see these as icing on the cake)
1. Can/will the pool load balancing be able to be scheduled/triggered to be run on existing pools – i.e. when no new capacity has been added?
This would automatically (by time or management-tool trigger) allow any hot spots (potentially caused by the fixed allocation algorithm) to be dispersed across the pool – allowing a further reduction in both risk and management effort.
2. Will space release standards (for example support for the forthcoming Windows TRIM functionality) be supported to allow space to be released from volumes back into the pool where pages are NOT zero but are no longer needed by the apps/OS?
I think all vendors are guilty of complicating issues to such an extent that end users are left poring through unnecessarily complicated documents.
What I’m proposing is a very simple approach … The XYZ factor.
1. Assume that a 1024 x 146GB 15K RPM USP V achieves 200,000 IOPS with maximum configured cache and no thin provisioning. The raw disk capability is 1024 x 150 IOPS = 153,600 IOPS. The X factor for the storage array is 200,000/153,600 = 1.30. This X factor is in effect the efficiency of the architecture for a given workload.
2. Redo the workload in 1. above with thin provisioning. Let’s say the IOPS this time is 300,000. The Y factor is 300,000/153,600 = 1.95. This is the efficiency of the thin provisioning algorithms/architecture.
3. If U is the % utilization, then the Z factor is simply dY/dU. This indicates the change in the thin provisioning efficiency with capacity utilization.
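The commenter’s X and Y factors can be computed directly; the IOPS figures below are the comment’s own hypothetical numbers, not measured results:

```python
# Worked computation of the proposed XYZ factors, using the
# comment's illustrative IOPS figures.
drives = 1024
per_drive_iops = 150
baseline = drives * per_drive_iops      # 153,600 IOPS raw disk capability

array_iops = 200_000                    # hypothetical, no thin provisioning
thin_iops = 300_000                     # hypothetical, with thin provisioning

x_factor = array_iops / baseline        # architecture efficiency
y_factor = thin_iops / baseline         # thin provisioning efficiency
print(f"X = {x_factor:.2f}, Y = {y_factor:.2f}")
# Z would be dY/dU: the slope of Y measured across utilization levels,
# which requires repeating the Y measurement at several values of U.
```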
So why don’t so called independent analysts publish something simple that end users can digest ?
Hello Steve, I cannot answer specific questions about future plans. However, this concept of paging is game changing in that we can now manage storage at the page level rather than at the volume or LUN level. You can expect to see additional enhancements in provisioning and tuning at the page level.
Page release standards will be effective in increasing utilization as information is communicated between the file systems and the storage system. It is safe to assume that storage vendors will take advantage of these standards.
Vinod, thank you for your suggestions on measuring performance and utilization improvements with dynamic provisioning. We have case studies that do this on our systems. However, as a way of comparing vendors, it becomes much harder since it is difficult to standardize the parameters.
As you know the effectiveness of thin provisioning depends on the operating system, file system, and the way that the data is accessed.
Perhaps we can standardize on Jetstress for Exchange. I would like your thoughts on this.
[...] He proposed a simple XYZ factor approach, where X is the efficiency for a given storage architecture without thin provisioning and Y is the efficiency of the architecture with thin provisioning divided by X. You can read his full comment in my preceding post. [...]
[...] requested, so here is an update on the topic of thin provisioning in the AMS2000: As Hu Yoshida explains in his blog, the AMS2000 does not support “Zero Space Reclaim”, so it does not detect [...]
I’m sure there would be many variables, including the number of LUNs, the size of the pool, other load(s) on the array, etc., but could you please discuss the effects on performance, both for the LUNs being processed and for other LUNs within the same DP pool, when Zero Page Reclaim is activated for a set of LUNs?
With the new wide stripe pools, is there any need to stripe the filesystems on the host side? Seems the storage subsystem could do a much better job of it, and the host might just be ‘messing with’ the hardware’s plans. Thanks
Bill, there would be no need for the file systems to do wide striping on the host side. With HDP the striping is done automatically across the width of the pool. Also when more storage is added to the pool, the wide stripe is redistributed to include the additional storage.
That is what I figured, thank you very much Sir, for confirming!