A cluster of challenges
by Hu Yoshida on May 14, 2009
How scalable is a cluster of cache controllers when you need performance?
The initial description of V-Max describes it as a cluster of two-controller systems that can in turn be clustered into very large configurations. The challenge with a cluster of storage controllers is maintaining cache consistency.
In modular storage arrays with two controllers, LUNs are assigned to one controller or the other to avoid thrashing between the controller caches. The Hitachi AMS 2000 avoids this by switching the I/O request to the controller cache that contains the data and load balances the activity between the two caches.
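The LUN-ownership approach described above can be sketched in a few lines. This is an illustrative model only, not the AMS 2000 firmware: each LUN is pinned to one controller's cache, new LUNs are placed on the less-loaded controller, and every I/O is switched to the owning cache so data never thrashes between the two. All names here are invented.

```python
class DualControllerArray:
    """Toy model of dual-controller I/O routing with per-LUN cache ownership."""

    def __init__(self):
        self.owner = {}     # LUN -> owning controller (0 or 1)
        self.load = [0, 0]  # I/O count per controller, used for placement

    def assign_lun(self, lun):
        # Place a new LUN on the less-loaded controller cache.
        ctrl = 0 if self.load[0] <= self.load[1] else 1
        self.owner[lun] = ctrl
        return ctrl

    def route_io(self, lun):
        # Switch the request to the controller whose cache holds the data;
        # ownership is sticky, so the data stays in one cache.
        if lun not in self.owner:
            self.assign_lun(lun)
        ctrl = self.owner[lun]
        self.load[ctrl] += 1
        return ctrl
```

The point of the sketch is the invariant: repeated I/O to the same LUN always lands on the same cache, while placement of new LUNs balances work across the pair.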
So perhaps the V-Max is doing what the AMS 2000 has been doing for some time. That works for one V-Max node of two controllers. What gets harder is clustering multiple V-Max nodes and still presenting a global cache data image. Namely, have an I/O come in on one V-Max node and be routed to another V-Max node where the data resides in cache. There needs to be some sort of locking for cache consistency, and some sort of arbitrator to determine which node has the correct status if the two become out of sync. That takes overhead if you have to go over an external wire instead of a backplane. Performance will be great if you happen to access the node where your data resides, but there will be a performance impact if you don’t.
If you happen to be going to EMC World next week, see if you can find out how this is done.
Comments (7)
This post is nothing more than fictional Fear Uncertainty and Doubt (FUD).
By the way – nobody asked…they all learned the facts in the EMC World open sessions on the V-Max architecture.
You obviously have not taken the time to understand how Symmetrix global memory works in general, much less in the Virtual Matrix. And you seem to still be locked into an archaic notion of how to efficiently manage memory in a scale-out architecture – a fact that Brian Garrett noted nearly 3 years ago (and I reiterated on V-Max launch day).
In most of the emerging clustered storage architectures that I have been exposed to, the concept of a logically isolated cache per node has been scrapped. Instead a distributed service-oriented approach with a universally agreed upon, but truly distributed, addressing and lock management scheme has been implemented.
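The "universally agreed upon, but truly distributed, addressing scheme" the commenter describes can be illustrated with a hash-based placement function. This is a generic sketch, not the Virtual Matrix internals: because every node applies the same deterministic function, all nodes agree on which node manages a given address without consulting any central directory. The function name and address format are invented.

```python
import hashlib

def address_owner(address: str, nodes: list) -> str:
    """Map an address to its managing node deterministically.

    Any node evaluating this function gets the same answer, so the
    cluster shares one addressing scheme with no central coordinator.
    """
    digest = hashlib.sha256(address.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]
```

Using a stable hash (rather than Python's per-process `hash()`) matters here: the mapping must be identical on every node and across restarts.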
If you’d like, I’d be happy to spare you further embarrassment. Just give me a call, and we’ll schedule some time so that I can explain Symmetrix global memory to you. Or better yet, let’s schedule some time together on your next visit to New England – it might be fun to do this face to face.
Meanwhile, I’d advise against sending your sales teams (and those of your OEM/reseller partners) out to do battle with this misinformation. It is likely to be embarrassing for them as well.
Not that HP is likely to make such a silly mistake…
Wait – On second thought…
Thanks for the clarification. HDS has some very nice products. What they don’t have is a sense of reality. Truth be told, they are a reseller of Hitachi product. People need to understand that: they are not the manufacturer. They buy it, uplift the price, distribute it to Sun etc., who uplift it tremendously, and maintain a shell of a support organization. They, HDS, have no R&D; they “partner” to integrate new and emerging technologies (Diligent, Data Domain, etc.), which has backfired, thanks to IBM and NetApp. Oops. Truth is, they do not report to the street like EMC or most of their competition. So please go easy on these people.
Hitachi has been using a distributed lock mechanism for the global cache in their enterprise storage since the introduction of the 7700 in 1995. This is kept in a pair of mirrored control store caches which are accessed separately from the global data cache. This has enabled us to dynamically change configurations without the need for disruptive BIN file changes. It also enables us to do dynamic tiering with internal and external storage. That is how we enable 128 processors to access one cache image of data, in a global cache that spans multiple data cache modules. This also enables the Hitachi storage to efficiently mirror only the write data rather than mirroring the entire cache.
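The write-mirroring point above can be sketched briefly. This is a hedged illustration of the general technique, not the 7700/USP implementation: clean data staged from disk needs only one cached copy (it can always be re-read after a failure), while dirty write data would be lost and so is duplicated across cache modules. All names are invented.

```python
class GlobalCache:
    """Toy model: mirror only write (dirty) data, not the entire cache."""

    def __init__(self):
        self.primary = {}  # data cache: clean and dirty blocks
        self.mirror = {}   # second cache module: dirty blocks only

    def read_fill(self, block, data):
        # Clean data staged from disk: one copy suffices, since it
        # can be re-read from disk if a cache module fails.
        self.primary[block] = data

    def write(self, block, data):
        # Dirty data exists only in cache until destaged, so it is
        # mirrored to survive a module failure.
        self.primary[block] = data
        self.mirror[block] = data

    def mirrored_bytes(self):
        return sum(len(v) for v in self.mirror.values())
```

The payoff is capacity: mirror space is consumed only in proportion to write traffic, not total cache size.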
You are correct in that I do not know how you manage your cache consistency. Please explain how you do it.
Mark, your statements are totally inaccurate and misinformed. Firstly, HDS does have R&D. Specifically, there are two product engineering teams: the HCAP team in Waltham and the team behind the Hitachi Storage Command Portal in California. Further, at BlueArc we have resident engineers with source code access who are implementing Hitachi-specific features for HNAS. (Note, I pointed this out here: http://blogs.hds.com/michael/2008/03/hitachi-file-storage-platforms.html.) That’s just the local engineering talent.
From an R&D perspective we are Hitachi! This means we do things that EMC cannot even dream of coming close to. Nuclear power, trains, batteries, telecommunications, and storage represent a subset of the R&D muscle that Hitachi wields. In fact, I believed in this so much that I relocated to Japan as of November last year to be even more closely coupled to Hitachi.
As to storage specific R&D, Hitachi employs a User Centered Design methodology which takes input from our customers to impact our roadmaps. Hitachi Data Systems (HDS) is responsible for generating many of the user centered product designs collaboratively with engineering and then transitioning said designs to engineering. That is to say we are a single Hitachi family and we are all passionate about meeting our customers’ needs and implementing innovative technologies. Note I’m not just saying this to be coy, but as someone who actually does this on a daily basis in Japan, it is fact.
If you are asking a question on deeper innovation, we have the ability to insert innovative ideas by also mining fundamental R&D coming from Hitachi’s labs around the world (http://www.hitachi.com/rd/). These labs are more closely aligned in structure to IBM than to EMC or even NetApp. I’d personally rather have both the ability to mine the deep innovation from the labs and build user centered requirements to create world class products!
Why is everybody making this so difficult about V-Max etc.? It looks like only IBM has got the answer to all that has been described. It is called XIV, IBM grid storage, made by Moshe Yanai (ex-EMC godfather of the Symmetrix). The only way to scale is grid; look at other grid architectures with CPUs, the CERN grid, etc. Unlimited power, unlimited boundaries, this stuff is great. Just wait for the new storage trend to be deployed. IBM has the power now; others will follow and try to be a me-too vendor.
The XIV validates the switch architecture that Hitachi introduced in 2000 with the 9900. XIV uses an ethernet switch. If you look at the XIV patents, you will see that some of those patents refer to prior work in patents that were filed by Hitachi engineers. As far as unlimited scaling goes, there are only two models of the XIV: one with 72 disks and the other with 180 disks. And by the way, the usable capacity for each system is 27 TB and 79 TB respectively, less than half the raw capacity available with 1 TB disks. ftp://ftp.software.ibm.com/common/ssi/pm/sp/n/tsd03055usen/TSD03055USEN.PDF
Hu, we have deep respect for Hitachi innovation.
And you have asked the question that has every computer scientist scratching their head: why on Earth did EMC choose a cluster architecture over grid in 2009 for their scale-out computing? Why put a RapidIO backplane between CPUs and most of the system’s cache memory? How will you scale out as more and more of this memory becomes non-local?
Why not eliminate cache coherency and distributed locking altogether?
Instead, consider writing a software layer that lets a large number of small autonomous computers work in parallel to solve complex problems.
Such a model worked very well for Google and so far is working well for XIV.
Of course it is not the only solution. At about the same time Google was disrupting the computation space, I laid my hands on my first glorious Silicon Graphics Onyx2 ccNUMA shared-memory cluster. It was pretty cool. But I hope for competition’s sake that Enginuity fares better than IRIX…