Virtualization: many to one and one to many
by Hu Yoshida on Sep 24, 2006
Virtualization is a way to make many things look like one or one thing look like many.
Partitioning is a way to make one system look like many virtual systems. Partitioning is usually associated with server virtualization. Think LPARs on mainframes and virtual machines in VMware. Partitioning is a virtualization technique that lets multiple platforms run on the same physical server, increasing utilization of server resources.
Storage virtualization is usually thought of as the opposite: it makes many storage systems look like one for ease of management and utilization. In the case of the USP, we attach many external, even heterogeneous, storage systems behind the USP and make them look like one USP storage system with all the functionality of the USP. However, we also do partitioning as another form of storage virtualization, with the objective of making one USP look like many USPs.
Why would you want to do this? All forms of virtualization provide the benefit of consolidation: you simplify management, enable better utilization of capacity, and reduce power, cooling, and software licenses. While this may be great for the IT budget, application users may not be happy with consolidation, particularly if they are consolidated onto the same storage frame with other application users. Many users are now competing for the same storage resources, which leads to performance and security conflicts between them. This is what killed the Storage Service Provider market in the early 2000s: SSPs could not leverage their capital investment because every user they signed up wanted their own stand-alone storage system. While performance conflicts are an obvious concern, security is a greater concern today. In a global economy, you may be serving storage to different business units or companies that are in competition, or that reside in countries which are politically at odds or even at war with each other. So if you are going to consolidate onto a storage frame that has a common cache and a common set of disk arrays, you need to choose a storage architecture that can provide partitioning to ensure a QoS level of performance and security for the applications that need to be separate.
The TagmaStore USP and NSC provide partitioning in addition to the virtualization of external storage systems. We call it Logical Partitioning: we can carve a USP into up to 32 virtual storage partitions, each with its own partition of cache, disk array groups, virtual ports, and administrative controls. Each partition has its own serial number for billing purposes. To the partition administrator and the application user, it looks and feels like a dedicated USP storage system.
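There is no public programming interface for this, but the logical model is easy to sketch. The following toy Python sketch, with entirely hypothetical names and numbers (only the 32-partition limit comes from the text above), shows the kind of bookkeeping involved: a frame holds at most 32 partitions, each with its own serial number, cache slice, disk array groups, and virtual ports.

```python
from dataclasses import dataclass, field

@dataclass
class Partition:
    """One logical storage partition, as its administrator would see it."""
    serial_number: str   # unique serial, usable for billing
    cache_gb: int        # dedicated cache slice
    array_groups: list = field(default_factory=list)   # physical disk array groups
    virtual_ports: list = field(default_factory=list)  # virtual host ports

class StorageFrame:
    MAX_PARTITIONS = 32  # the USP limit described in the post

    def __init__(self):
        self.partitions = {}

    def create_partition(self, serial_number, cache_gb, array_groups, virtual_ports):
        if len(self.partitions) >= self.MAX_PARTITIONS:
            raise RuntimeError("partition limit reached")
        p = Partition(serial_number, cache_gb, list(array_groups), list(virtual_ports))
        self.partitions[serial_number] = p
        return p
```

The key point the sketch captures is that every resource is owned by exactly one partition; nothing is pooled across them.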
From a performance perspective, each storage partition is assigned a cache partition which cannot be impacted by another partition. A database reorg that is rippling through cache cannot steal cache slots from a logical partition's cache. Partitions are also assigned their own physical disk array groups, so that disk performance is not impacted by another partition's disk activity stealing the access arm or blocking data transfer. Performance of virtual host port connections is set by Priority Port Control assignments. These partitions can be dynamically altered by the TagmaStore storage administrator: during peak periods a partition may be given access to more cache, and during off-peak periods its cache allocation may be reduced and made available to other partitions, without disruption to the application.
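That dynamic cache reallocation can be sketched the same way. This is a toy model, not Hitachi's implementation, and the floor value and method names are my own invention; the point is the invariant that moving cache to a busy partition must never starve the donor.

```python
class CacheScheduler:
    """Toy model of moving cache between partitions without starving anyone."""

    def __init__(self, allocations, floor_gb=4):
        self.allocations = dict(allocations)  # partition name -> cache GB
        self.floor_gb = floor_gb              # hypothetical minimum per partition

    def shift(self, donor, recipient, gb):
        # Refuse any move that would drop the donor below its floor.
        if self.allocations[donor] - gb < self.floor_gb:
            raise ValueError("would starve donor partition")
        self.allocations[donor] -= gb
        self.allocations[recipient] += gb
```

A peak-period move is then `shift("P1", "P2", 8)`, and the off-peak move is the same call in reverse.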
Hitachi has put a lot of thought into the security of Logical Partitions, since this is the main concern for sharing storage resources. Besides the obvious separation of cache and disk array groups, which guarantees no data leakage between partitions and no denial of service by activity in another partition, new functions have been added to ensure management security. The TagmaStore storage administrator assigns a storage partition administrator role to each partition. These partition administrators cannot escalate their privileges to see or administer other partitions. Moreover, Role Based Access Controls (RBAC) can be defined to limit the roles of these administrators. These roles include combinations of Account Administrator, Audit Administrator, and Storage Administrator. For instance, the Audit Administrator role may not be appropriate for a storage partition manager, and you may choose to disable it for that partition.
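A minimal sketch of that access model, in Python with invented names (the real USP roles and interfaces will differ): a partition administrator carries a set of roles and is scoped to exactly one partition, so any check against another partition fails by construction, which is what "cannot escalate" means here.

```python
# The three role combinations mentioned above, under hypothetical names.
ROLES = {"account_admin", "audit_admin", "storage_admin"}

class PartitionAdmin:
    def __init__(self, partition_id, roles):
        unknown = set(roles) - ROLES
        if unknown:
            raise ValueError(f"unknown roles: {unknown}")
        self.partition_id = partition_id
        self.roles = set(roles)

    def can(self, target_partition, required_role):
        # Scoped to one partition: a request against any other partition
        # is denied regardless of roles held.
        return target_partition == self.partition_id and required_role in self.roles
```

Disabling the Audit Administrator role for a partition manager is then just omitting `"audit_admin"` from that administrator's role set.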
Having an Audit Administrator role implies that we do have an auditing capability in the TagmaStore. I find that this feature is not well known. It formats transactions, time-stamps them, and can download them to a syslog server for easy access by authorized auditors. It is also helpful in diagnosing intermittent problems.
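As a rough illustration only (the actual TagmaStore record format is not public, and the field names here are made up), an audit trail of this kind boils down to time-stamping each management transaction and emitting it as a line a syslog server can store:

```python
import datetime

def format_audit_record(user, action, target, when=None):
    """Render one management transaction as a timestamped, syslog-friendly line."""
    when = when or datetime.datetime.now(datetime.timezone.utc)
    stamp = when.strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{stamp} user={user} action={action} target={target}"
```

In Python such lines could be shipped off-box with the standard library's `logging.handlers.SysLogHandler`; an auditor then just greps the syslog archive.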
As far as I know, Hitachi is the only vendor that provides storage partitioning. IBM claims to have two partitions, but that is because they have a two-controller architecture. I am not sure they can provide QoS and security to the level that storage partitioning demands. I would be interested in hearing comments about storage partitioning as a requirement for storage virtualization.
I believe large-scale storage consolidation requires both types of storage virtualization: many to one and one to many.
Comments (3)
In our implementation of virtualisation (200TB AMS behind USP) I chose separate cache partitions for each AMS system (there are 3). The logic for this was to ensure that an issue with a single AMS didn’t cause problems which depleted cache across the whole array. If I was virtualising multiple tiers of storage behind the same USP (something I am working on), I would follow the same approach.
Thanks for the insight. I normally think about this from the server side, but as you point out, partitioning can also keep the back-end arrays from impacting resources in the USP. Let me know how the multiple tiers work out for you.
Creating a CLPR to mitigate "an issue with a single AMS" doesn't make sense to me. It is the hosts that issue the I/Os, and any host issuing a large number of sequential writes can potentially deplete the cache. It may work if a single host sees storage from only one AMS and it is acceptable that all the other hosts with storage on the same AMS suffer.