Mark Lewis’ Virtualization Principle
by Hu Yoshida on May 6, 2007
I appreciate the post that Mark Lewis made on The Virtualization Principle since it finally gives me the opportunity to understand his thoughts around storage virtualization.
First, I am happy to hear that we agree that “virtualization needs to enable heterogeneity, scale, utilization, simplicity, and dynamic operations and do so in a way that provides transparency to the application and information systems functions” (Mark Lewis).
Mark goes on to say that “the key tenant to virtualization is operational transparency meaning that the virtualization function runs within existing environments without changing the transactional interaction between applications and storage”. He goes on to use the example of VMware where “the base architecture is able to run an x86 application just EXACTLY as it was running directly on the hardware.”
I am not sure that I completely understand this statement. If you want to run everything exactly the same, then what do you gain by virtualization? In the case of server virtualization, where you are trying to make one physical server look like many, I can sort of understand that statement. However, in storage virtualization we are really trying to make many storage systems work together and act like one. In fact, we want to mask the storage from the application and its data.
The key is preserving the transactional interaction between the application and its data, not between the application and the storage, as Mark says. Storage virtualization should be able to change the storage without disrupting the application’s access to data. It should be able to nondisruptively move the data to lower or higher performance storage according to the needs of the application; create clone copies for backup, data mining, development and test, business continuance, data distribution, etc.; move the data to lower cost tiers of storage as activity declines; migrate the data to new systems for technology refresh or lease expiration; and add capacity as required. For Hitachi, the key tenet of storage virtualization is data mobility that is nondisruptive to the application and can dynamically optimize the interaction between the application and its data.
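To make the idea of nondisruptive data mobility concrete, here is a minimal sketch in Python. It is a toy model, not a description of how any Hitachi product is implemented: the names `Controller`, `VirtualVolume`, `migrate`, and the tier labels are all hypothetical. The point it illustrates is that the host addresses a stable volume ID while the virtualization layer is free to change the backing tier underneath it.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualVolume:
    """Host-visible volume. The host only ever refers to `volume_id`."""
    volume_id: str
    tier: str  # hypothetical tier label, e.g. "tier1" or "tier3"
    blocks: dict = field(default_factory=dict)  # block number -> bytes


class Controller:
    """Toy virtualization layer that maps volume IDs to backing tiers."""

    def __init__(self):
        self.volumes = {}

    def create(self, volume_id, tier):
        self.volumes[volume_id] = VirtualVolume(volume_id, tier)

    def write(self, volume_id, block, data):
        self.volumes[volume_id].blocks[block] = data

    def read(self, volume_id, block):
        return self.volumes[volume_id].blocks[block]

    def migrate(self, volume_id, new_tier):
        """Copy the data to a new tier, then swap the mapping.

        The volume ID the host uses does not change, so reads and
        writes issued after the swap need no reconfiguration.
        """
        old = self.volumes[volume_id]
        self.volumes[volume_id] = VirtualVolume(
            volume_id, new_tier, dict(old.blocks)
        )


controller = Controller()
controller.create("vol-001", "tier1")
controller.write("vol-001", 0, b"payroll records")

# Activity has declined, so the data is moved to a cheaper tier.
controller.migrate("vol-001", "tier3")

# The application still reads the same volume ID and gets the same data.
print(controller.read("vol-001", 0))
print(controller.volumes["vol-001"].tier)
```

A real controller would also have to handle writes that arrive while the copy is in flight; this sketch leaves that out and only shows the stable mapping that the application depends on.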
It should be noted that there are some operations, like archive, where the data is separated from the application. Long term preservation of data may not be tied to the application or sensor that created it. Here the data is important in and of itself, and it may be repurposed for compliance or other business requirements. Metadata and policies have to be associated with the data objects and stored so that the objects can be searched and retrieved, with some way to prove immutability. Storage virtualization can support this as well by providing services for ingestion of objects and metadata, indexing, searching, hashing for immutability, deduplication, remote vaulting, technology refresh, etc., across heterogeneous storage systems and heterogeneous records management platforms.
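As an illustration of how hashing can serve both immutability and deduplication, here is a small Python sketch of a content-addressed archive. It is an assumption-laden toy, not a description of any vendor's archive product: the class `ObjectArchive`, its methods, and the metadata keys are hypothetical, and SHA-256 is chosen only because it is a widely available hash in Python's standard `hashlib` module.

```python
import hashlib


class ObjectArchive:
    """Toy content-addressed archive: objects are keyed by their SHA-256 digest."""

    def __init__(self):
        self._objects = {}   # digest -> object bytes
        self._metadata = {}  # digest -> list of metadata dicts

    def ingest(self, data, metadata):
        """Store an object with its metadata and return its digest."""
        digest = hashlib.sha256(data).hexdigest()
        # Identical content hashes to the same digest, so it is stored once.
        self._objects.setdefault(digest, data)
        self._metadata.setdefault(digest, []).append(dict(metadata))
        return digest

    def verify(self, digest):
        """Recompute the hash to check that the stored object is unchanged."""
        return hashlib.sha256(self._objects[digest]).hexdigest() == digest

    def search(self, key, value):
        """Return digests of objects with a metadata entry matching key=value."""
        return [
            digest
            for digest, entries in self._metadata.items()
            if any(entry.get(key) == value for entry in entries)
        ]

    def stored_count(self):
        return len(self._objects)


archive = ObjectArchive()
first = archive.ingest(b"Q1 financial report", {"department": "finance"})
second = archive.ingest(b"Q1 financial report", {"department": "audit"})

print(first == second)          # same content, same digest
print(archive.stored_count())   # only one physical copy is kept
print(archive.verify(first))    # digest still matches the stored bytes
print(archive.search("department", "audit"))
```

Because the digest is derived from the content, any later change to the stored bytes makes `verify` fail, which is the basic mechanism behind proving that an archived object has not been altered.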
This may sound like a lot for a storage virtualization platform to provide, but it is possible if that virtualization platform has all the processing power and scalability of a multiprocessor control unit with a large, dynamic global cache, a switched back plane, and proven storage and data services that can be extended to externally attached, heterogeneous storage systems.
From Mark’s description of virtualization, it appears that he is trying to solve a routing problem. He is rehashing the old in-band versus out-of-band discussion of five years ago, when the goal of storage virtualization was to move the volume management function from the servers onto the network for volume pooling. Storage virtualization has moved way beyond volume pooling to the enablement of common storage and data services for block, file and object storage.
When Mark talks about arrays and their limited scalability, he is speaking from his frame of reference, the DMX, which has a static cache architecture that requires BIN file changes whenever the storage configuration is changed. If the DMX could support virtualization, then EMC could leverage its tools like TimeFinder and SRDF as virtualization services. Instead, EMC must add another layer of complexity, management, and mapping tables with clustered servers and special switches with port ASICs to accomplish what a volume manager already does. You can’t do anything with an Invista without adding a few storage controllers.
The storage controller virtualization approach is simple. It does not require disruption to the SAN. In fact, it does not depend on the SAN, and it can support DAS, NAS, iSCSI, ESCON, FICON, or any other standard protocol for storage access. The storage controller is the legitimate target to the host initiator. It doesn’t need special port ASICs to crack data packets and remap content directory blocks. The storage controller can make all its storage and data services, like clone copies and replication, available to external storage and to applications on the server side.
Mark ends his post by saying, “You can dress up an array with a network storage virtualization moniker but it is still an array.”
I don’t think Mark was talking about the Hitachi Universal Storage Platform, because we do not claim to be a network storage (SAN) virtualization platform. The USP is controller based virtualization and has no aspirations to be SAN based. We have also separated our storage controller from the disk array so that any vendor’s FC storage array can be attached to our storage controller and use the enhanced enterprise features of the USP.
Mark may have been confused by my previous posts on the need to network storage to storage. What I mean by networking storage to storage is the ability for storage systems to work together to support the movement of data across heterogeneous storage arrays without disruption to the application. In order to work together, storage systems must be able to share a common controller, with a common set of services and a common cache. That is not available in a cluster of appliances that sit in the SAN. Although SAN stands for Storage Area Network, it does not network storage to storage. It networks servers to storage. With network storage (SAN) virtualization, storage systems cannot communicate and share resources and services with other storage systems.
Comments (3)
[...] In his most recent blog, Hitachi CTO Hu Yoshida says that the key function of storage virtualization should be data mobility, meaning it should be non-disruptive to applications and, in fact, enhance the interaction between application and data. You can get there, he says, with a virtualization platform featuring a powerful microprocessor control unit, a dynamic global cache, a switched back plane and storage services that can be extended to external, heterogeneous systems. [...]
[...] 3. Storage virtualization. This exchange between EMC’s Mark Lewis and Hitachi Data Systems’ Hu Yoshida shows that there’s fertile grounds for debate in this arena as well. [...]
[...] There is plenty of technical and operational benefits of storage virtualization, some of these links can take you to these sites. [...]