When you say Tiering, do you mean Tiering?
by Hu Yoshida on Apr 28, 2011
Randy Kerns of Evaluator Group noted in a recent article on SearchStorage.com that most IT people still view storage tiering as external tiering, where an application's data is stored on a tier of storage depending on its storage requirements. I have run into that perception as well, especially with non-Hitachi customers who have not seen the benefit of storage virtualization. Here are my definitions of tiering.
Data centers that have not had storage virtualization would classify their data in some manner and allocate it to a storage system based on the cost and features of that system. For instance, they may have classified an application as tier 1 if it required replication to a DR site, and allocated its data to an enterprise-class storage system that could do distance replication. A tier 2 application may not have needed 24×7 operations and could be allocated to a lower cost, two-controller, modular storage system. And tier 3 data may have been a disk backup, which could be allocated to a low-cost modular storage system with lower cost SATA disks. These tiers were static, since it was difficult to move volumes from one storage system to another without disruption to the application. Once an application's data was assigned to a storage system, it stayed there even if the requirements of the application changed. While an application may have needed expensive tier 1 storage for its primary data, it did not need it for all the copies used for development test, data distribution, backup, etc.; but these copies remained on tier 1 since there was no easy way to snap them off to lower cost tiers. Some IT people set out to classify all their applications and data into three to six tiers of storage to reduce costs, and ended up in frustration after several years of pursuing a moving target.
Dynamic Tiering Across Internal and External Storage Systems
Hitachi storage systems, starting with the USP in 2004, provided dynamic, non-disruptive tiering across multiple storage systems through the introduction of storage virtualization. The USP could virtualize different external storage systems into one pool of storage resources. This provided a dynamic way to move and snap copies of an application's data between tiers of storage without disruption to the application. If you made a wrong decision in classification, or the classification changed, it was easy to correct it and move the data to the appropriate tier. In fact, you could set policies and automate the movement of data across tiers based on time or on events generated by a tuning manager. Our competitors, who did not have storage virtualization, could only do tiering within their internal storage systems. And because of the static nature of mapping storage configurations to cache with BIN files, moving application data between tiers still disrupted the application. The economic crossover point between internal and external tiering was about 100 TB, but other factors, like extending the life of existing assets, could make external tiering viable even for smaller configurations. Hitachi also makes dynamic, non-disruptive tiering available on the standalone AMS modular system.
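The policy-driven movement described above can be sketched as a toy rule engine. The tier names, the 30-day rule, and the `choose_tier` function are invented for illustration; the real product drove this from policies and events reported by a tuning manager:

```python
# Hypothetical sketch of time-based tiering policies (names and thresholds
# are invented for illustration, not Hitachi's actual policy language).
from datetime import date

def choose_tier(volume, today):
    """Pick a storage tier for a volume from simple, illustrative rules."""
    if volume["needs_dr_replication"]:
        return "tier1-enterprise"     # distance replication to a DR site
    if (today - volume["last_used"]).days > 30:
        return "tier3-sata"           # idle copies demoted to low-cost SATA
    return "tier2-modular"

today = date(2011, 4, 28)
dev_copy = {"needs_dr_replication": False, "last_used": date(2011, 2, 1)}
print(choose_tier(dev_copy, today))   # tier3-sata: idle for over 30 days
```

With virtualized external storage behind the controller, acting on such a policy is a non-disruptive data move rather than an application outage.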
Dynamic Tiering with Dynamic Provisioning
In 2008 Hitachi introduced Dynamic Provisioning (HDP) on the USP V, and in 2009 on the AMS. This provided the ability to dynamically provision new tiers of storage and to thin provision those tiers, eliminating the waste of allocated-but-unused storage. Dynamic Provisioning also provided automatic wide striping of a volume across the width of an HDP pool, which increased performance by spreading the I/O across many more disks than a single RAID group. This simplified operations by eliminating the need to manually stripe volumes and concatenate LUNs to achieve desired LUN sizes. With the USP V, HDP pools could also be created on external storage and participate in tiering across internal and external HDP pools.
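To make the thin provisioning and wide striping mechanics concrete, here is a minimal sketch. The `ThinPool` class, the RAID-group names, and the simple round-robin placement are assumptions for illustration, not the actual HDP algorithm:

```python
# Minimal sketch of thin provisioning with wide striping (illustrative only).
PAGE_MB = 42  # HDP allocates capacity in fixed-size pages

class ThinPool:
    def __init__(self, raid_groups):
        self.raid_groups = raid_groups  # e.g. ["RG-1", "RG-2", ...]
        self.next_rg = 0                # round-robin cursor for wide striping
        self.page_map = {}              # (volume, page_index) -> RAID group

    def write(self, volume, page_index):
        """Allocate a physical page on first write only (thin provisioning)."""
        key = (volume, page_index)
        if key not in self.page_map:
            # Wide striping: spread consecutive allocations across all RAID
            # groups, so one volume's I/O hits many more spindles than a
            # single RAID group could provide.
            self.page_map[key] = self.raid_groups[self.next_rg]
            self.next_rg = (self.next_rg + 1) % len(self.raid_groups)
        return self.page_map[key]

pool = ThinPool(["RG-1", "RG-2", "RG-3", "RG-4"])
placements = [pool.write("vol0", i) for i in range(8)]
print(placements)          # consecutive pages rotate across RG-1..RG-4
print(len(pool.page_map))  # only pages actually written consume capacity: 8
```

The two properties the text describes fall out directly: capacity is consumed only on first write, and a single volume's pages land across every RAID group in the pool.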
Page-level Dynamic Tiering with HDT
In 2010 Hitachi introduced the VSP (Virtual Storage Platform) and took the concept of HDP one step further by providing Dynamic Tiering on a 42 MB page basis. Now, instead of moving an entire volume up and down tiers of storage, we could move individual pages of a volume across tiers based upon the I/O activity against each page. Unlike volume-level tiering, where space for the whole volume has to be available both in the tier it is moving to and in the tier it is moving from, page-level tiering only needs room for the page being moved. With page-level tiering we do not have to classify our application data, since the pages in the volume are placed on a tier of storage based upon page activity. Of course, this puts a heavy load on the storage controllers. The VSP was architected to address this additional load with a global pool of quad-core Intel processors that is tightly coupled across an internal switch matrix to a global cache and front-/back-end processors. Storage systems that do not have this extra processing power will suffer some performance degradation when they do sub-LUN-level tiering.
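A toy version of the page-placement decision might look like the following. The function, the tier names, and the ranking scheme are assumptions for illustration, not the actual HDT algorithm:

```python
# Hypothetical sketch of page-level tiering: rank 42 MB pages by I/O
# activity and place the hottest ones on SSD (illustrative only).
def place_pages(io_counts, ssd_pages):
    """io_counts: {page_id: I/O count}; ssd_pages: SSD capacity in pages.
    Returns {page_id: tier}. Note that moving one page only needs room for
    that page, unlike volume-level tiering, which needs space for the
    whole volume in both the source and target tiers."""
    ranked = sorted(io_counts, key=io_counts.get, reverse=True)
    return {page: ("SSD" if rank < ssd_pages else "SAS")
            for rank, page in enumerate(ranked)}

counts = {"p0": 900, "p1": 15, "p2": 480, "p3": 3, "p4": 220}
tiers = place_pages(counts, ssd_pages=2)
print(tiers)   # the two hottest pages (p0, p2) land on SSD, the rest on SAS
```

No up-front classification of the application is needed; the per-page activity counters drive placement, which is exactly the controller overhead the VSP's processor pool was sized to absorb.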
Page-level Tiering with SSD
Page-level tiering suddenly makes the use of SSDs more affordable. Using the 80/20 rule, we only need a small amount of SSD to hold the hot pages, while the rest of the pages can sit on lower cost disks. This gives us SSD performance for 80% of the I/O while using SSD for less than 20% of the capacity. In actuality the SSD share could be even smaller if we include high-speed disks as part of that 20%. Here is a 36 TB comparison of a single tier of high performance disk versus a three-tier solution that includes SSD, high performance disk, and large-capacity SATA disks. Even with the additional cost of the page-level Dynamic Tiering software, the total costs are lower while I/O performance increases more than fourfold.
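The 80/20 sizing arithmetic can be sketched as follows. The fractions are the rule-of-thumb values from the text; they are not measured figures from the chart:

```python
# Back-of-the-envelope 80/20 tier sizing (rule-of-thumb fractions only).
total_tb = 36.0
ssd_fraction = 0.20      # capacity assumed sufficient to hold the hot pages
hot_io_fraction = 0.80   # share of total I/O those hot pages absorb

ssd_tb = total_tb * ssd_fraction   # roughly 7.2 TB of flash...
hdd_tb = total_tb - ssd_tb         # ...fronting ~28.8 TB of cheaper disk
print(ssd_tb, hdd_tb, hot_io_fraction)
```

In other words, a small flash tier serves the large majority of the I/O, which is why the blended cost of the three-tier configuration can come in below a single tier of high performance disk.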
There can be cost benefits to implementing tiering at any level. However, the greatest benefit comes from page-level tiering with SSD drives, in a storage system that has the processing power to support page-level tiering without impact to overall performance.
Comments (6)
Your post implies that the VSP architecture handles the added overhead of auto-tiering more efficiently than (unnamed) other arrays.
I can demonstrate that your claim is not true – in fact, quite the contrary:
Thanks for the comment on Randy’s blog with Storage Soup.
One correction: Randy Kerns @rgkerns is with Evaluator Group (EGI), not Steve Dupe's group. I am sure most of the vendors think customers understand all this, but our engagements clearly say otherwise. We firmly believe the confusion starts at an even higher level, around data management and the "Sibling Rivalry" of tiering, data protection, and archiving. Then you throw in cloud. I just talked to an IT exec who was looking at a general cloud offering for archiving, even though after some discussion it was clear the data had significant legal issues attached and needed the ability to be managed by time, if not more.
For more on the Siblings: http://evaluatorgroup.com/sibling-rivalries-integrating-data-management-technologies
My post explains how Hitachi architected the VSP to handle the additional overhead of page-level auto-tiering, and how it is positioned to handle additional processing as more and more functions are offloaded to or created in intelligent storage subsystems.
I have taken a look at your post, which shows a vendor-generated chart without any information on configurations or workloads. I believe EMC has claimed in the past that published benchmarks are meaningless since they are artificial and do not reflect real workloads.
What would be more helpful is if you could explain what architectural changes EMC has developed to address the additional workload being required of next-generation storage systems.
Thank you for the correction.
[...] a recent post I described the different forms of tiering. Today I’d like to discuss another form of tiering that is used in the mainframe: Hierarchical [...]
I’m a bit confused about your charts. I agree that the 300GB HDD lists at about $1,200 apiece, but I question the others. HDS lists the 200GB SSD drive at about $25,000 per drive, not $10,000. So the price tag is about $100,000 for the SSD, not $40,000. Additionally, VSP does not support the 1TB SATA drives (according to your spec sheet). If you are assuming that the 1TB drives are in an external array, then you also need to add the costs of that array, plus sw, maint, etc. Also, the list price for the 1TB drive is $1,200, so that’s about $36,000, not $28,000. If you simply erred and meant internal 2TB drives, those list at about $2,520 for a total of $80,000, not $28,000. And since the 3.5″ 2TB drives require a different chassis than the 2.5″ SSD and 300GB drives, that cost also needs to be included. I believe that’s another $45,000.
So as you can see, there’s some confusion here as to your comparison of costs vs IOPS.