
Hu Yoshida's Blog - Vice President | Chief Technology Officer


Busting 6 Myths About Tiered Storage

by Hu Yoshida on May 16, 2011

The tiering of storage is a valuable tool for reducing costs and improving efficiency. As I mentioned in my previous post, there are different levels of tiering, and the value we derive from tiering varies with the type of tiering involved. There are also several myths about the tiering of storage that I would like to address.

Myth number 1: Tiered storage is information lifecycle management

Information lifecycle management (ILM) was marketing hype that was popular several years ago. The movement of data across tiers of storage is not ILM. The management of information must be done by an application or a records manager, not by the storage infrastructure. Tiered storage may be a tool for ILM, but it does not make the information lifecycle decisions.

Myth number 2:  Tiering of storage is based on the value of the data

All data that we store is valuable; otherwise it should be deleted and scrubbed from the storage. The notion that some data has less value and can be stored on cheaper, less reliable storage is not acceptable: storing data is useless if you cannot retrieve it. Data will have different response time and availability requirements during its life, so some data may need to be on flash drives while other data could be stored on lower cost modular storage systems with 99.9% availability rather than 99.999%, but all data must be stored on reliable storage systems that are protected.
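To put those availability figures in perspective, the difference between 99.9% and 99.999% is easiest to see as downtime per year. A quick sketch of the arithmetic (this is just the math implied by the percentages, not a measured figure for any product):

```python
# Expected downtime per year implied by an availability percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600

def downtime_minutes_per_year(availability_pct: float) -> float:
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for pct in (99.9, 99.999):
    print(f"{pct}% availability -> {downtime_minutes_per_year(pct):.1f} min/yr")

# 99.9%   -> 525.6 min/yr (almost nine hours)
# 99.999% -> 5.3 min/yr
```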

Myth number 3: The most economical tiering is done within a storage frame

This argument is made by vendors who do not have the ability to tier storage across multiple storage frames to optimize cost. It can be true if you have no need for tier 1 performance, availability, and scalability, and can fill all your requirements with a modular tier 2 storage frame holding a mix of performance disks.

But if you need tier 1 storage, you will need a tier 1 storage frame with multiple processors that can access a global cache. A tier 1 storage frame can also contain a mix of disk types; however, the frame cost of housing lower cost disks such as SATA is higher than housing them in a lower cost modular frame. Adding another storage frame, even a lower cost one, does add cost, but there is a crossover point beyond which adding external storage for the lower tiers becomes more economical than adding disks to the tier 1 frame. Having the option of tiering across internal and external storage gives you the most flexibility to reduce costs.
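To make the crossover point concrete, here is a minimal cost sketch. Every number in it is a hypothetical placeholder (not a Hitachi price); the point is only that the fixed cost of a second frame is amortized as the lower tier grows:

```python
# Hypothetical crossover model: SATA capacity housed in the tier 1 frame
# versus the same capacity in a lower cost external modular frame.
# All figures are made-up placeholders for illustration.

COST_PER_TB_TIER1_FRAME = 3000    # $/TB when housed in the tier 1 frame
COST_PER_TB_MODULAR     = 1500    # $/TB in the lower cost modular frame
MODULAR_FRAME_FIXED     = 60000   # controller, ports, floor space, power

def internal_cost(tb: float) -> float:
    return tb * COST_PER_TB_TIER1_FRAME

def external_cost(tb: float) -> float:
    return MODULAR_FRAME_FIXED + tb * COST_PER_TB_MODULAR

# Costs meet where 3000*tb = 60000 + 1500*tb, i.e. tb = 40.
crossover_tb = MODULAR_FRAME_FIXED / (COST_PER_TB_TIER1_FRAME - COST_PER_TB_MODULAR)
print(f"External tiering becomes cheaper beyond {crossover_tb:.0f} TB")
```

With these placeholder numbers, below 40 TB of lower-tier capacity the internal frame wins; above it, the external frame does. Real crossover points depend on actual frame and drive pricing, which moves over time.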

Myth number 4: There is such a thing as Tier 1.5 storage

Tier 1.5 is a term that some analysts coined to describe scale-out storage. The 1.5 rating is meant to imply that this type of storage has tier 1 scalability, performance, and availability with the cost and ease of use of tier 2 storage. This type of storage consists of a loose coupling of dual-controller, tier 2 storage frames across an external switch, using a protocol such as Ethernet or RapidIO. The basic component here is a tier 2 storage system, and no matter how many other tier 2 storage systems are clustered together, the clustering does not increase the scalability, performance, or availability of an application that is running on a particular storage frame. Applications can only access the storage resources, including tiers of storage, that are tightly coupled behind the cache they are connected to. The loose coupling of storage frames makes it easier to sell more storage; it does not help an application that needs more storage resources.
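A toy model may make the loose-coupling point clearer: clustering frames grows the aggregate numbers a vendor can quote, but a single application still sees only the one frame its volumes sit behind. The throughput figure below is arbitrary:

```python
# Toy model of a loosely coupled scale-out cluster of tier 2 frames.
FRAME_IOPS = 100_000  # arbitrary per-frame throughput, for illustration only

def cluster_aggregate_iops(num_frames: int) -> int:
    # The cluster-wide total grows with every frame added...
    return num_frames * FRAME_IOPS

def single_application_iops(num_frames: int) -> int:
    # ...but an application coupled to one frame's cache still gets
    # only that frame's resources, no matter how big the cluster is.
    return FRAME_IOPS

for n in (1, 4, 16):
    print(f"{n:>2} frames: cluster={cluster_aggregate_iops(n):>9,}  "
          f"one app={single_application_iops(n):,}")
```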

Myth number 5: There is such a thing as Tier 0

Since the introduction of flash disks, many people refer to this type of media as tier 0, since it has higher performance than the highest performing spinning disk. In my view, tier 1 is the highest performing tier of media, whether it is a flash drive or a hard disk. Tier 0 means having no tier, nada, in the storage, and should be reserved for performance tiers that do not reside in storage frames, such as flash cache products like Fusion-io. While this is my perspective, I doubt my opinion will change this myth.

Myth number 6: Hierarchical Storage Management is the same as storage tiering

See my post on HSM for why I think they are different.

Do you agree? Disagree? Are there any other myths you can think of?


Comments (8)

Steven Ruby on 16 May 2011 at 2:02 pm

Regarding Myth number 3:
On the VSP this is true. It costs more to virtualize an AMS with SAS and use HTsM or even HDT than it does to just use internal disks.

the storage anarchist on 17 May 2011 at 8:28 am

Regarding Myth #3

FACT: Tiering to external storage is nothing more than a convoluted HDS excuse to make up for well-known VSP deficiencies.

FACT: It should be obvious that adding an AMS with SATA or “fat SAS” behind a VSP is more costly than simply putting those same drives into the VSP. You still need sufficient VSP hardware (CPU, ports, memory) to support the I/O to the drives. To that you must add the cost of the AMS controller hardware (CPU, ports, memory), the additional floor space, power, and cooling – not to mention the cost of managing TWO arrays instead of one.

FACT: The REAL reason VSP wants SATA/fat SAS outside of the array is that SATA support carries so much overhead on the VSP. When used with Hitachi Dynamic Tiering, SATA drive pools must be configured with write-read-verify to ensure that the correct data lands on the SATA drives…this obviously doubles the response time of writes to SATA. Moving these drives out to an AMS allows the VSP to transfer that workload to another array, and to capitalize on the AMS’s write caching to reduce the overhead of SATA I/O.

FACT: And by the way, write-read-verify doesn’t ensure the data will be accurate when read later. In fact, the #1 failure mode of a SATA drive is a block misread – returning the data for a different block than the one requested. Neither the VSP nor the AMS has any protection against this error. Some arrays (including the one my employer EMC makes) use advanced data integrity checks to protect against misreads while minimizing the write overhead that Hitachi arrays must endure.

FACT: Neither the VSP nor the USP/USP-V performs any data integrity checks on externally virtualized storage, trusting instead that the target array provides sufficient protection against data corruption or loss. Unfortunately, not all arrays provide sufficient protection against silent data corruption. Some might consider that this risk window makes external storage virtualization more appropriate for migrations than for tiering.

FACT: Since HDT is so inefficient, customers should be very concerned about adding latency to data residing on external tiers. The added 1 ms of response time for external I/O effectively negates most of the value of having a flash tier. Sure, the write response times to external SATA may be better than to SATA internal to the USP, but either way the response times are always longer than those of competitive offerings.
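The arithmetic behind both sides of this response-time argument is a weighted average across tiers. A back-of-envelope sketch with assumed latencies and hit rates (none of these are measured figures for VSP, AMS, or any other array):

```python
# Blended response time = sum(fraction_of_ios * latency) across tiers.
# All latencies and hit rates below are assumptions for illustration only.

def blended_latency_ms(tiers: list[tuple[float, float]]) -> float:
    """tiers: (fraction_of_ios, latency_ms) pairs; fractions sum to 1.0."""
    return sum(frac * lat for frac, lat in tiers)

FLASH_MS, SATA_MS, EXTERNAL_OVERHEAD_MS = 0.5, 10.0, 1.0

# Assume 90% of I/Os hit the internal flash tier and 10% miss to an
# external SATA tier carrying an extra 1 ms of virtualization overhead.
avg = blended_latency_ms([(0.90, FLASH_MS),
                          (0.10, SATA_MS + EXTERNAL_OVERHEAD_MS)])
print(f"{avg:.2f} ms average")  # 1.55 ms; the SATA misses dominate
```

On these assumptions, the extra 1 ms of external overhead adds only 0.1 ms to the average; whether it matters depends almost entirely on how much of the working set actually misses down to the SATA tier.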

Hu Yoshida on 18 May 2011 at 11:51 am

Hi Steven,

The costs of VSP have come down relative to USP V, so the crossover point for using an external AMS with SAS drives is higher. This is a moving target as we get new hardware and software in the VSP and AMS product lines. I believe you will agree that using an existing AMS will cost less than buying additional internal storage for tiering in the VSP. The ability to tier to external storage enables the use of existing assets, which means there are no additional capital costs for the external storage other than the cost of the tiering software.

Hu Yoshida on 18 May 2011 at 2:11 pm

Proclaiming opinions as facts is misleading and unproductive.

Since the VSP can support external storage as a tier in our HDP and HDT tiering, this enables the use of existing external storage as a lower cost tier of storage. In this case there are no additional costs other than the cost of the software. Using external storage also makes it easier to do technology refreshes. Using internal storage capacity for data that is hardly accessed wastes high performance capacity and cache memory, and customers may artificially inflate the cost of their software and maintenance licenses for little or no benefit to the application. As for your other claims, I will do a follow-on post on our use of SATA drives. HDT is being used by hundreds of our users with no complaints. The use of internal or external SATA should have no effect on the performance of a flash tier, which should be used as tier 1 in internal storage.

the storage anarchist on 18 May 2011 at 8:10 pm

My proclamations are indeed facts…especially the fact about the overhead of write-read-verify, the fact that it is required by HDT for internal SATA on the VSP, and the fact that this approach is unable to detect and recover from the very real and frequent occurrence of SATA misreads, putting customer data at risk on both the VSP and the AMS.

I must say that I am looking forward to some of that good old Hitachi Math, where you show how putting X SATA drives internal to the VSP costs the same as putting those same X SATA drives external in a separate AMS. This combo of a new VSP + a new AMS is a configuration that your sales teams around the world will nearly always bid; the smart customer insists you offer the SATA drives in the VSP for the same price as you do in the AMS (obviously unaware of the performance implications).

I mean, why else would Hitachi take such a significant margin hit on new SATA capacity?

It will be even more interesting to see you demonstrate identical SATA performance for a reasonable workload on internal vs. external storage – unless, of course, you artificially limit the demonstration to an application working set that unrealistically fits entirely within the flash tier and doesn’t change.

Given the dynamic nature of real-world applications, the inability of HDT to react quickly to workload changes, and the massive overhead and performance impact incurred to relocate data across tiers (not to mention the inefficiency of using 42 MB pages), a significant portion of typical production application I/Os are inevitably going to be destined for the SATA tier, where the inefficiencies of write-read-verify will be quite obvious. It may not change the performance of the flash tier, but a miss to SATA is definitely going to change the TOTAL performance of the application.

Then again – maybe not…since HDT performs so poorly overall, maybe nobody will even notice that your SATA drive performance is so bad.

Are you willing to concede that your VSP performance is better to external AMS SATA than it is to internal SATA?

As to tech refreshes – word on the street is that HHAM still doesn’t work. So you’ll have to explain to us again exactly how one is supposed to non-disruptively tech refresh from a USP-V to a VSP?

Finally, as to your accusation that I am asserting opinions as facts without substantiation: I find it hard to believe that hundreds of users of ANYTHING have no complaints. In fact, I know personally from several of your customers that your assertion is blatantly false.

Steven Ruby on 20 May 2011 at 3:43 pm

Thanks Hu, I am looking forward to testing the price of an AMS as tier 3 in HDT versus internal VSP disk. My biggest complaint about UVM has been that HDS gave it away when a customer wanted to virtualize another vendor's array but charged for it to virtualize an HDS array. We personally use UVM (BOSV) to lower our costs and align performance needs with the right-priced disks, all the while still providing non-disruptive migration and scalable DR (HUR) and backup solutions (SI/CoWS).

Thanks for your openness and candidness in this blog.

It is my opinion that VSP is Da’Bomb! (just need to align the pricing model a bit)

[...] seemed to have touched a nerve with Barry Burke in my last post on tiered storage myths, where I talked about the advantages of tiering across internal and external tiers of storage. [...]

Hu Yoshida on 23 May 2011 at 12:48 pm

Barry, although your comments are very antagonistic and presumptive, I will continue to post them, as they reflect on the credibility and professionalism of you and your company. Please see my latest post on the use of SATA and capacity-enhanced SAS.
