Misunderstandings About Storage Virtualization
by Hu Yoshida on Apr 7, 2006
An April 6 article on SearchDataCenter.com by Rick Cook introduced several misunderstandings about controller based virtualization which I would like to correct.
In this article Rick compares three types of storage virtualization: server based, SAN based, and controller based. While he does acknowledge that controller based virtualization is "intimately connected to the storage arrays" and does an excellent job of working with storage in the event of errors and write failures, he makes two assumptions about HDS controller based virtualization that are flat out wrong. The two assertions he makes are that storage controller based virtualization:
1. Is a vendor lock in
2. Has the narrowest view of the SAN
First, let’s make a further distinction between storage controller based virtualization and storage array based virtualization. The HDS USP and NSC55 products do controller based virtualization as part of their controller functionality, not through an appliance as Rick may have suggested. This enables them to attach standard FC external storage arrays and represent them in their controller caches in the same manner as their own internal FC disks. The external storage arrays can be any vendor’s disks that have standard FC ports. There is no vendor lock in here. In addition, if you don’t like the added functionality that the USP or NSC55 brings to the external array, you can easily disconnect and go back to native mode. If the LUNs are mapped one to one between the external array and the USP/NSC55 cache, it is simply a matter of disconnecting and reconnecting. If the LUNs had been combined, we would do a volume migration to a single LUN image on the external array before we disconnect.
The other methods of virtualization are proprietary in that the mapping is controlled by the host software or the network virtualization tool. If you lose the mapping tables, you are out of luck. It is also not as easy to disconnect and go back to native use as it is with the USP/NSC55, since the virtual LUNs may be mapped across extents on multiple disks.
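To make the disconnect scenario above concrete, here is a minimal sketch in Python. The mapping structure and function name are invented for illustration — this is not an actual HDS interface — but it captures the rule: a virtual LUN can be handed back to native mode directly only when it fronts exactly one external LUN.

```python
# Hypothetical sketch of the one-to-one vs. combined LUN mapping cases.
# A mapping is a dict of virtual LUN -> list of backing external LUNs.

def can_disconnect_cleanly(mapping):
    """True if every virtual LUN is backed by exactly one external LUN,
    so the array can simply be disconnected and reattached natively."""
    return all(len(backing) == 1 for backing in mapping.values())

# One-to-one: each virtual LUN fronts a single external LUN.
one_to_one = {"vlun0": ["ext_lun0"], "vlun1": ["ext_lun1"]}

# Combined: vlun0 concatenates two external LUNs, so a volume migration
# to a single LUN image is needed before disconnecting.
combined = {"vlun0": ["ext_lun0", "ext_lun1"]}

print(can_disconnect_cleanly(one_to_one))  # True
print(can_disconnect_cleanly(combined))    # False
```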
Storage array based virtualization is different from controller based virtualization. These products create virtual LUNs out of their own proprietary array formats. Two such products are the HP EVA and 3ParData. They do not attach external storage arrays and are proprietary, but they do provide flexibility in managing LUNs.
The second misunderstanding is the claim that storage controllers have the narrowest view of the SAN. Actually, the storage controller transcends the SAN in that it can not only interface to multiple SANs, but also support direct connect, mainframe connect, NFS/CIFS, and any other protocol, like InfiniBand, that can be bridged to a standard FC or NAS protocol. The storage controller is the consolidation point for all these connections.
A storage controller does not need to have visibility into the SNS tables of a switch or director, since it is not in the business of managing a SAN. That is the business of a switch or director, and that business is a very complex one. Cynthia Gumbert of Disaster-Resource.com published a study that identifies SANs as the third leading cause of email outages, with an average outage of 25.5 hours. These outages also contributed to "significant data corruption that was replicated to email backup system". These outages were due not so much to hardware failures as to misconfigurations of a very dynamic and constantly changing network environment: outdated drivers, LUN zoning, buffer credit congestion, LUN masking, state change notifications, ISL chatter, and alternate pathing, to cite a few challenges. This is not the type of environment into which you want to insert a storage virtualization solution that will require more SAN ports for fan in and fan out and add significantly more complexity in order to do functions which were never intended to be done in the network.
SAN implementations like the IBM SVC not only add complexity to the management of SANs, they also add a whole new layer of management complexity. The SVC requires a separate SAN to interface to the external storage arrays and adds the configuration and management of MDisks (managed disks), which map to external LUNs. It has another SAN that interfaces to the hosts and requires the configuration and management of VDisks, virtual disks which are carved out of extents on multiple MDisks. The USP/NSC55 requires no new management except to identify external LUNs as ELUNs, which is done automatically when the LUNs are discovered over the FC link.
I would turn the second point around and say that SAN based virtualization has the narrowest view of the host server and the storage. The host server is the source, and it tells the target, the storage, what to do: read, write, etc. The storage controller also knows which cache slots map to which track table, and, as Rick points out, is "intimately aware" of the storage arrays. A SAN switch or director does not have this knowledge and was never intended to have it. The FC switch’s job is to manage the secure and efficient flow of data between source (host server) and target (storage controller). Storage virtualization in the SAN network requires unnatural acts. In order to know what the host wants to do, the network virtualization tool must intercept the FC packet and either act as a proxy in order to redirect it or crack open the packet to remap the command descriptor blocks. The FC developers never intended this to happen. Frame headers were put in front of the data payload for use in steering the data to the designated target. The content of the data packet was not intended to be seen by the network.
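To illustrate the "crack open the packet" step, here is a hedged sketch of what an in-network virtualizer has to do to a SCSI command: parse the command descriptor block (CDB) and rewrite its logical block address to point at the backing extent. The field layout follows the standard 10-byte READ(10) CDB; the offset-based remap is an invented example, not any vendor's actual implementation.

```python
import struct

READ_10 = 0x28  # SCSI READ(10) opcode

def remap_read10(cdb: bytes, lba_offset: int) -> bytes:
    """Rewrite the LBA in a READ(10) CDB to address the backing LUN."""
    if cdb[0] != READ_10:
        return cdb  # pass through commands we don't virtualize
    lba = struct.unpack(">I", cdb[2:6])[0]   # bytes 2-5: big-endian LBA
    new_lba = lba + lba_offset               # shift into the backing extent
    return cdb[:2] + struct.pack(">I", new_lba) + cdb[6:]

# READ(10) for LBA 0x1000, transfer length 8 blocks.
cdb = bytes([READ_10, 0, 0, 0, 0x10, 0, 0, 0, 8, 0])
print(remap_read10(cdb, 0x2000)[2:6].hex())  # 00003000
```

Every frame carrying a command must pass through this parse-and-rewrite path, which is precisely the work the FC frame header was designed to make unnecessary for the network.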
The network virtualization approaches also do not know about storage array cache slots and track tables, so in order to do functions like replication, they must do a read from one device and then a write to the other device, check status on the way back, retry any errors, and maintain a log for any updates that occurred in the meantime. This takes a lot of cycles and a lot of cache, which is not available in the SAN. All this information is available to the storage controller, and it can transfer the data non disruptively between cache images that reside in the same controller cache. Replication functions in the storage controller have been implemented, proven, and enhanced for many years by all the major storage vendors. Network virtualization will require that we start from the beginning and reinvent these functions for implementation on a platform that does not have the availability and cache capacity of an enterprise storage controller.
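The read-write-check-retry loop described above can be sketched in a few lines. This is a minimal illustration with in-memory dicts standing in for the two devices and an injected-failure table standing in for write errors; a real implementation would issue SCSI reads and writes and would also keep the update log for blocks dirtied mid-copy, which this sketch omits.

```python
def replicate(source, target, blocks, flaky=None, max_retries=3):
    """Read each block from source, write it to target, retry errors.

    `flaky` maps block -> number of write failures to simulate.
    Returns the number of retries that were needed.
    """
    flaky = dict(flaky or {})
    retries = 0
    for block in blocks:
        for attempt in range(max_retries):
            data = source[block]              # read from one device
            if flaky.get(block, 0) > 0:       # simulated write error
                flaky[block] -= 1
                retries += 1
                continue                      # check status, retry
            target[block] = data              # write to the other device
            break
        else:
            raise IOError(f"block {block} failed after {max_retries} retries")
    return retries

src = {0: b"a", 1: b"b", 2: b"c"}
dst = {}
print(replicate(src, dst, [0, 1, 2], flaky={1: 1}))  # 1 retry was needed
print(dst == src)  # True
```

Even in this toy form, every block crosses the network twice (read then write) and the copier must buffer it in between — the cycles and cache the paragraph above says the SAN does not have.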
I have blogged about this before, so you can check out Where Should Intelligence Reside?
Comments (5)
In regards to vendor lock-in: to reap most of the benefits of virtualization, the configuration would never be as clean as you portray it to be. One of the advantages is simpler management, where you assign all external storage to the virtualization layer at one time and manage all storage at the virtualization layer from that point on. This would not be possible in your example.
So to put it more clearly: to get all the benefit of controller virtualization, you would be locked into a vendor’s controller until someone went through the arduous task of migrating away from it. I say arduous because you couldn’t rely on virtualization to get you there.
As I’ve said in the past, the only way to get away from hardware vendor lock-in is through software vendor lock-in. Which, for whatever reason, companies are a lot more tolerant of.
The “V” word is likely to generate a lot of debate because it is poorly defined and clouded in marketecture rather than architecture. The point, as I see it, of virtualization is to reduce complexity, not to add it.
Virtualization also provides an important location for splitting writes to multiple targets without impact on the application doing the writing. This is not part of the functionality of EMC’s Invista, by the way, probably (and I’m speculating here) because it would gore their SRDF ox, which replicates data synchronously or asynchronously through the back door between EMC frames.
What this really gets back to is the question of whether the industry would consider too disruptive a technology that would front end storage frames and remove their value-add from the vendor’s perspective. One fellow remarked to me recently, “If that happened, we’d all be selling JBOD for a living.”
For now, the benefit of a Tagma approach is the flexibility it affords in being able to map initiators and targets in one-to-many or many-to-many configurations. This was the initial promise of FC fabrics, but the industry went back to one target-one initiator models because of numerous problems caused in other configurations. “SAN” roll-outs followed a pattern: aggregate, then segregate, then aggravate. Tagma solves this with its virtualized ports.
You got a lot of pushback from EMC, as I recall on the issue of “gutting the value add” provided by arrays from third parties slung off the back of Tagma. The guys who are trying to virtualize at the switch, in an appliance, or even on host based software have always gotten the same treatment.
Jon, thanks for pointing out the need for virtualization to reduce complexity.
Control units already do virtualization. All we did with TagmaStore control units was to allow them to attach external disks so that they could be virtualized just like internal disks. We saw no need to create a whole new industry for the development of network based storage virtualization appliances, switches, and software.
In a DrunkenData blog post last year, you referred to a study by Disaster-Resource.com which found that the third leading cause of application failures was the complexity of configuring SANs, and that the average application outage was 25.5 hours! SANs are complex enough without adding the additional complexity of clustered virtualization appliances and switches to remap I/O and reinvent replication software so that it can run in the network rather than in an established control unit architecture with a lot of cache.
Hu, I have to disagree. From your perspective with HDS, I’m sure things are different, but from where I sit, both vendor lock-in and the narrow view of the SAN are real issues with controller-based storage virtualization.
(This doesn’t mean there aren’t issues with the other forms of SAN virtualization, as I pointed out in my article.)
Part of the difference is that we’re talking somewhat at cross-purposes. For example, I did not mean to imply that controller based virtualization relied on a separate appliance to handle virtualization jobs. Not using such an appliance is a strength in some areas, but a weakness when it comes to lock-in.
One problem is that we’re using ‘lock in’ in different senses. To you it apparently means that the virtualization scheme won’t support any other kind of storage — hence the concern about the ability to represent external storage arrays. (Although I have to wonder about the performance impact of virtualizing those external arrays through the controller.)
However I’m much more concerned with what happens when the time comes to replace the storage system — including the controller. If your SAN’s storage virtualization is being run by that controller, I contend that you do have a lock-in problem. If you move to another vendor’s controller at the very least you’ve got some serious work to do. There are ways around this, of course, but storage virtualization based in the controller does mean vendor lock-in, at least with today’s products.
Similarly, you may not see the lack of visibility of the SAN details as constituting a ‘narrow view’ for a virtualization product, but I do — especially when you’re looking from the controller side.
You’re quite right that virtualization products aren’t in the business of managing SANs, but that’s a very, very different thing from having a detailed picture of what’s going on in the SAN. You can argue (as you apparently are) that such a view is not useful for the controller but I’d disagree. And of course that in effect admits that the controller’s view is narrower.
Meanwhile, I want to add that I enjoy your blog and your insights on virtualization. I realize you’re coming from a particular perspective, but you do a good job of representing that and I find it very useful. Thanks.
Rick thanks for taking the time to respond. I find your articles very useful.
All approaches to virtualization involve some level of vendor lock-in. My view is the same as yours: how do you go back to native mode or to another virtualization solution? Our view is that the controller based approach has the least vendor lock-in, since the state of the LUN configuration is kept with the external storage. This means that the external storage can be disconnected and attached to another virtualization tool or attached in native mode. While we allow concatenation of LUNs and splitting of LUNs, we can non disruptively migrate them to a single LUN image on external storage so that they can be detached and reacquired. With other approaches, which take extents and recombine them into LUNs, the state of the LUN configuration is external to the storage array, in some mapping table in the network. That makes it much more difficult to restore the external array and use it on some other system.
While I said that the storage controller is not in the business of managing the SAN, I agree with you that a virtualization solution must have knowledge of the total environment. The storage controller is only part of our virtualization solution. We provide three software products to complete this solution. One is an end to end storage area resource manager which gives us visibility into the physical configuration behind the virtualization. This product is HSSM. The second product is the Tuning Manager, which gives an end to end view of capacity and performance utilization of the physical configuration. And the third is a Tiered Storage Manager that can move data between tiers of storage, based on policies that are triggered by time or events, which may be output from the Tuning Manager.
Whether you are sitting inside or outside the SAN, you need these types of tools to simplify, optimize, and automate the virtualization solution. While a controller based solution can work with a SAN, it is not dependent on the SAN. Our solution can support DAS, mainframe, and NAS with embedded NAS blades. This gives us a much broader virtualization capability than a SAN based virtualization solution.
Thanks again for your comments.