The XYZ Factor for Dynamic Provisioning
by Hu Yoshida on August 4, 2009
In response to my last post, Chunk Size Matters, Vinod Subramaniam left the following comment:
“I think all vendors are guilty of complicating issues to such an extent that end users are left poring through unnecessarily complicated documents.”
He proposed a simple XYZ factor approach, where X is the efficiency for a given storage architecture without thin provisioning and Y is the efficiency of the architecture with thin provisioning divided by X. You can read his full comment in my preceding post.
The Enterprise Strategy Group ran this test for us in early 2008, so I decided to post their results to show you our XY factor. You can go to this link to read the full PDF report.
Their test case was run on a Windows server connected over a single 4 Gb/s FC interface to a USP V. The test workload was 100 percent random, 75 percent reads, with an 8 KB block size.
First, the workload was run on a LUN carved from four 146 GB 15K RPM FC disk drives configured as one RAID 10 array group.
Then, with Hitachi Dynamic Provisioning (HDP), they striped the LUN across thirty-two 146 GB 15K RPM FC disk drives configured as 8 RAID 10 array groups and ran the same workload. The results are shown below:

From this result, our Y factor in terms of IOPs is 7.16, or a 716% improvement, at a response time of 15 ms or less.
While this result took 32 spindles versus 4, there was additional capacity to run 8 more workloads. Obviously, the additional workloads would cause arm contention and the increase in IOPs would not be linear, but the cumulative IOPs of 8 workloads which are wide striped across 8 array groups would still be better than running the 8 workloads on separate array groups.
Wide striping can be done without Dynamic Provisioning. Many database administrators do that today, and short-stroke the drives to increase random IOPs performance. Unfortunately, that leaves a lot of capacity unused. Hitachi Storage Command Suite software can identify that capacity and use it for lower-tier or lower-SLO requirements through partitioning and port priority processing, even though that capacity is in the same Dynamic Provisioning pool.
By the way, Vinod’s Z factor is the change in the efficiency Y divided by the change in capacity utilization: dY/dU, where U is the capacity utilization factor. This factor may not be useful, since not all file systems are thin-provisioning friendly; some, like NTFS, don’t stay thin provisioned very long if there is a lot of updating being done. I would say that nearly all file systems will benefit from the ease of provisioning and the wide striping that come from Dynamic Provisioning. Many of our customers tell us that thin provisioning is the least of the benefits that come from Dynamic Provisioning.
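Since dY/dU would in practice be estimated from measurements at two utilization levels, here is a minimal sketch of that finite-difference calculation. The function name and the sample inputs are hypothetical placeholders, not measured results from any test.

```python
def z_factor(y_low: float, y_high: float, u_low: float, u_high: float) -> float:
    """Finite-difference estimate of dY/dU between two capacity utilization levels.

    y_low, y_high: efficiency Y observed at utilization u_low and u_high.
    u_low, u_high: capacity utilization as fractions (for example 0.40 and 0.80).
    """
    if u_high == u_low:
        raise ValueError("the two utilization levels must differ")
    return (y_high - y_low) / (u_high - u_low)


# Placeholder inputs for illustration only; these are NOT measurements.
example_z = z_factor(y_low=1.2, y_high=1.0, u_low=0.40, u_high=0.80)
print(f"Z = {example_z:.2f}")
```

A negative Z would mean that efficiency drops as the pool fills up; a value near zero would mean utilization has little effect on Y.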
Comments (4)
Charlie Dellacona on 05 Aug 2009 at 5:34 am
Hu,
You say “…but the cumulative IOPs of 8 workloads which are wide striped across 8 array groups would still be better than running the 8 workloads on separate array groups.”
Are you claiming this for all transaction loads or just the test case you cite? A simple counterexample is a set of sequential loads; aggregating these would just cause arm contention and increase average service times.
Also, a nit: if the Y factor (Y/X) is 7.16, then that is a 616% improvement over X, not 716%.
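The conversion between a ratio and a percent improvement can be checked with a short sketch. The function names are made up for illustration; the only inputs are the 7.16 and 716% figures quoted above.

```python
def improvement_percent(factor: float) -> float:
    """Percent improvement implied by a new/old ratio (a factor of 1.0 means no change)."""
    return (factor - 1.0) * 100.0


def factor_from_improvement(percent: float) -> float:
    """New/old ratio implied by a percent improvement."""
    return 1.0 + percent / 100.0


# A ratio of 7.16 corresponds to a 616% improvement.
print(round(improvement_percent(7.16)))           # 616
# Conversely, a 716% improvement would correspond to a ratio of 8.16.
print(round(factor_from_improvement(716.0), 2))   # 8.16
```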
Vinod Subramaniam on 06 Aug 2009 at 8:31 am
Hu
I think I need to explain more clearly.
X = ( IOPS without HDP ) / ( 150 * No. of 15k rpm spindles )
  = 800 / ( 150 * 4 ) = 1.33
Y = ( IOPS with HDP ) / ( 150 * No. of 15k rpm spindles )
  = 5000 / ( 150 * 32 ) = 1.04
The reasoning behind this is as follows:
1. If one were to use the 4 * 15k rpm spindles in a JBOD and use a host-based Volume Manager, one would get in theory 600 IOPS.
2. The point behind adding CHAs, DKAs and cache is to improve the performance (and availability) that one would get over using the drives in a JBOD. This improvement in performance is what X indicates.
3. Similarly, if one were to use 32 * 15k rpm spindles in a JBOD and use a host-based Volume Manager (assume that the host is not CPU bound or memory bound), one would get in theory 4800 IOPS. The point behind using HDP is to achieve at least a multiplier of 1.5. This is what Y indicates.
4. In the example above, ESG has used 146GB drives. What if we use the current 300GB drives or 450GB drives? I assume that the Y factor would be different at a capacity utilization of 80% as compared to a capacity utilization of 40%. This is what Z indicates.
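The X and Y arithmetic above can be reproduced with a small sketch. The function and constant names are hypothetical; the 150 IOPS per 15k rpm spindle is the rule-of-thumb figure used in this comment, and 800 and 5000 are the IOPS values quoted above.

```python
# Rule-of-thumb theoretical IOPS for one 15k rpm spindle, as assumed in this comment.
NOMINAL_IOPS_PER_15K_SPINDLE = 150


def efficiency(measured_iops: float, spindles: int,
               per_spindle: float = NOMINAL_IOPS_PER_15K_SPINDLE) -> float:
    """Measured IOPS divided by the theoretical JBOD IOPS for the same spindle count."""
    return measured_iops / (per_spindle * spindles)


x = efficiency(800, 4)     # without HDP: 4 spindles, 600 theoretical IOPS
y = efficiency(5000, 32)   # with HDP: 32 spindles, 4800 theoretical IOPS
print(f"X = {x:.2f}, Y = {y:.2f}")  # X = 1.33, Y = 1.04
```

Against the 1.5 multiplier mentioned in point 3, the Y computed here comes out at about 1.04.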
As for Jetstress, I have no experience with it and so cannot comment.
P.S. One of the bravest things I have done was stand up before a crowd of 500 Unix geeks at Lockheed Martin and sell (train) Microsoft .NET. You can guess how that one went!!!
Hu Yoshida on 06 Aug 2009 at 8:30 pm
Charlie, thanks for the comment and clarification. The ESG report showed 716%; I just converted it to a decimal factor.
Hu Yoshida on 06 Aug 2009 at 8:33 pm
Vinod, thanks for the clarification. I know how it feels to be in front of technology experts!!!




