Future of Storage Controllers
by Hu Yoshida on Feb 20, 2006
Brian Garrett of ESG believes that future storage controllers will be cluster-based rather than the enterprise-class storage controllers from EMC, HDS, and IBM. He observes that many emerging storage controllers use commodity Intel servers clustered together, running Linux instead of proprietary operating systems.
I don’t agree. This recalls the conversation I have been having with Mike Linett of ZeroWait about virtualization and commoditization. Clustered controllers, with their lower costs, are fine for tier 2/3 storage requirements, where maintenance windows are acceptable, the number of hosts served is limited, or write updates are not a factor, as in the case of archive or backup.
The biggest problem with clustered solutions is that each node in a cluster has its own cache, and write data has to be replicated across the cluster of caches in order to maintain write consistency. LUNs are usually assigned to one of the nodes, so that all read/write activity to a LUN goes through the same cache. A redundant copy of write data may be sent to another, passive node to avoid data loss in the event of a cache failure in the primary node.
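The clustered approach described above can be sketched as a toy model (purely illustrative, not any vendor's actual design; the class and method names are my own): each node holds a private cache, every LUN is owned by one node, and each write is mirrored to a partner node so a single cache failure cannot lose dirty data.

```python
class ClusterNode:
    def __init__(self, name):
        self.name = name
        self.cache = {}          # this node's private cache

class ClusteredController:
    def __init__(self, nodes):
        self.nodes = nodes
        self.owner = {}          # LUN -> index of its owning node

    def assign_lun(self, lun, node_index):
        self.owner[lun] = node_index

    def write(self, lun, block, data):
        # All I/O to a LUN funnels through its owning node's cache...
        primary_index = self.owner[lun]
        self.nodes[primary_index].cache[(lun, block)] = data
        # ...and a redundant copy crosses the inter-node link to a
        # partner node -- the replication overhead the post points at.
        partner_index = (primary_index + 1) % len(self.nodes)
        self.nodes[partner_index].cache[(lun, block)] = data

    def read(self, lun, block):
        # Reads are served only from the owning node's cache.
        return self.nodes[self.owner[lun]].cache.get((lun, block))

nodes = [ClusterNode("node0"), ClusterNode("node1")]
ctl = ClusteredController(nodes)
ctl.assign_lun("LUN7", 0)
ctl.write("LUN7", 42, b"payload")
print(ctl.read("LUN7", 42))   # served by node0, mirrored on node1
```

Note that every write costs an extra copy over the inter-node links, which is exactly the consistency overhead that grows as the cluster does.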
Enterprise storage controllers have many processors, up to 128 in our USP, all of which see one cache image of data. A host can access data across almost any combination of the USP’s 192 FC ports, because they all see the same cache image. EMC and IBM provide similar implementations of a global cache in their enterprise arrays. A global cache enables scalable connectivity, non-stop availability, data mobility functions like distance replication, and load balancing across multiple processors. The way each of us (EMC, IBM, HDS) connects all these processors to a global cache varies by vendor: direct matrix, proprietary buses, or crossbar switch. While I could debate the advantages of these different approaches, suffice it to say that a global cache is key to a scalable, enterprise storage controller. You will not be able to achieve this by clustering together a number of commodity Intel platforms. Clustering is fine for tier 2/3 storage requirements, especially if you virtualize it behind a tier 1 controller like the USP.
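A toy sketch of the global-cache design (again purely illustrative, not the USP's actual internals; names are my own): every front-end processor references one shared cache image, so any port can serve any LUN with no cross-cache replication at all.

```python
class GlobalCacheArray:
    def __init__(self, num_ports):
        self.cache = {}                      # one shared cache image
        self.ports = list(range(num_ports))  # e.g. 192 FC ports

    def write(self, port, lun, block, data):
        # Any port writes into the same address space; no copy to a
        # partner cache is needed to keep the nodes consistent.
        self.cache[(lun, block)] = data

    def read(self, port, lun, block):
        # Any other port immediately sees the same cache image.
        return self.cache.get((lun, block))

array = GlobalCacheArray(num_ports=192)
array.write(port=0, lun="LUN7", block=42, data=b"payload")
print(array.read(port=191, lun="LUN7", block=42))
```

The hard engineering problem, of course, is the interconnect (direct matrix, proprietary buses, or crossbar switch) that lets 128 real processors share that one address space at speed; the sketch only shows the behavior hosts see.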
Doing this with an off-the-shelf operating system like Linux would also be difficult. My friend at HDS, Carlos Soares, used to work at Data General, where they built the NUMA server by connecting low-cost server boards from Intel. DG had to develop a very sophisticated operating system, DG/UX, to manage the distributed cache as a single image for servers. Managing a distributed cache for storage is even more difficult, due to the mechanical nature of disks: the penalty for a missed window is a full disk rotation.
I agree with Brian that users should care whether storage controllers are clustered for price performance or are enterprise class. There is a need for both, but I doubt that clustered storage controllers will be able to meet all the requirements of the enterprise. Customers care about simplifying the provisioning and management of their exploding data requirements, and will require a combination of technologies to achieve that.
Comments (6)
Discovered your blog today through Jeremiah Owyang. Kudos – it’s terrific. And yes you’ve developed a great blogging “style.” I’ve added your blog to my list of CEO (meaning senior executive) bloggers at http://www.BlogWriteForCEOs.com (click my name).
Can you clarify the clustered global cache idea for me? Perhaps a solid state drive would do the trick, something like the old Imperial Solid State drives? I could see it working like a network load balancer that is constantly updated with storage address information. The question then becomes: could it be made as reliable as the old NUMA cluster cards? A storage load balancer would seem like a logical solution.
Mike, cluster and global to me are opposites. A cluster consists of two or more separate cache address spaces that are kept in sync over external links. A global cache is one cache address space that is accessed directly by two or more processors.
A solid state drive could be considered a global cache. However, this leads us into another requirement, which is dynamic scalability. I believe a solid state drive has a fixed format and a fixed capacity.
Our storage controller cache consists of a metadata cache that describes the data configuration in a separate data cache. This enables the data cache to be dynamically configured by changing bits in the metadata cache. Separating control and data eliminates contention between them as activity increases, and provides a level of security, since service is provided through access to the metadata cache and never exposes the user’s data.
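The control/data separation described above can be sketched roughly like this (a hedged toy model under my own assumptions; the real controller internals are far more involved): a metadata cache maps logical locations to slots in a separate data cache, so the layout can be reconfigured just by rewriting metadata entries, and the data itself never moves or gets exposed.

```python
class ControllerCache:
    def __init__(self, slots):
        self.meta = {}                 # (lun, block) -> data-cache slot
        self.data = [None] * slots     # the separate data cache
        self.free = list(range(slots)) # unused data-cache slots

    def write(self, lun, block, payload):
        slot = self.meta.get((lun, block))
        if slot is None:
            slot = self.free.pop()     # allocate a data-cache slot
        self.meta[(lun, block)] = slot
        self.data[slot] = payload

    def remap(self, old_lun, old_block, new_lun, new_block):
        # Reconfiguration is purely a metadata change: the pointer
        # moves, the payload in the data cache stays put.
        self.meta[(new_lun, new_block)] = self.meta.pop((old_lun, old_block))

    def read(self, lun, block):
        slot = self.meta.get((lun, block))
        return None if slot is None else self.data[slot]

cc = ControllerCache(slots=8)
cc.write("LUN1", 0, b"abc")
cc.remap("LUN1", 0, "LUN2", 0)   # metadata-only move
print(cc.read("LUN2", 0))
```

The security point falls out of the structure: management operations like `remap` only ever touch `meta`, never the contents of `data`.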
Hu, I appreciate the thoughtful response to my blog entry on the storage controller of the future, but disagree on a few points which will be addressed in my next blog entry.
What is your definition of a Tier 2 storage solution, and how would you define service levels to support it?
There are many ways to define tiers of storage. In general they all have to do with different levels of cost.
I define tiers of storage based on architecture, cost and availability. Service Levels would be based on their availability and performance characteristics.
Tier 1 storage consists of a large number of processors which access a global cache. All the processors have the capability to access the same cache image of a LUN. This type of storage can support 7×24 availability even if a number of the processors are down for planned or unplanned maintenance. The HDS USP/NSC55, IBM DS8000 and EMC DMX are in this category. They also provide a robust distance replication capability. They are designed to serve many heterogeneous users.
Tier 2 storage is an FC modular storage array with two controllers and separate caches. LUNs are assigned to one controller or the other. Writes to a LUN image in one controller’s cache are mirrored to the other controller’s cache for data protection. This type of storage is about half the cost of Tier 1 storage due to its simpler architecture, but it requires maintenance windows since it only has two controllers. With two controllers it is limited in connectivity, but its lower overhead enables it to serve a few users with very high performance. The HDS AMS, IBM DS4000, and EMC CLARiiON fall in this category.

Tier 3 would be a dual-controller storage system like Tier 2, but with lower cost, higher capacity SATA disks in place of FC disks. Since SATA disks are bigger and slower than FC disks, access performance is lower. Their availability is also less than FC, since the larger disk capacities take longer to provision and rebuild, and their lower MTBF statistically means less time between failures than FC disks.
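The rebuild-time point can be illustrated with back-of-the-envelope arithmetic (the capacities and sustained rebuild rates below are my own illustrative assumptions, not measurements of any product): rebuild time scales with capacity divided by rebuild rate, so a bigger, slower SATA drive leaves the array degraded several times longer than an FC drive.

```python
def rebuild_hours(capacity_gb, rebuild_mb_per_s):
    # Time to stream the whole drive at a sustained rebuild rate.
    return capacity_gb * 1024 / rebuild_mb_per_s / 3600

# Assumed example drives of that era (hypothetical figures):
fc_hours   = rebuild_hours(146, 50)   # 146 GB FC disk at 50 MB/s
sata_hours = rebuild_hours(500, 25)   # 500 GB SATA disk at 25 MB/s

print(f"FC rebuild:   {fc_hours:.1f} hours")
print(f"SATA rebuild: {sata_hours:.1f} hours")
```

Under these assumptions the SATA rebuild takes roughly seven times as long, and the whole of that window is time spent exposed to a second failure, which is why larger, lower-MTBF disks reduce availability.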
Some people may define a tier 0 for solid state disk or LUNs that are resident in cache, and a tier 4 for tape or optical.