AMS 2000 Dynamic Load Balancing Controller and SAS
by Hu Yoshida on Oct 15, 2008
On Monday, Chuck Hollis of EMC had a little time on his hands so he spent a few hours composing a lengthy blog post describing what he considered to be a “non competitor bashing discussion” of Hitachi’s announcement of the AMS 2000 modular storage system. While he said that he was “underwhelmed” by this announcement he went to a great deal of effort to blog about its capabilities.
Chuck appeared to have misread the press release, or was confused about some of the performance and architectural statements made in the announcement letter. This gives me the opportunity to provide some clarification.
Our claim that the AMS 2500 delivers up to four times the performance of our previous product was our "non competitor bashing" way of saying it is faster than our competitors' similar midrange products. The AMS 2500 can achieve 900,000 IOPS. Chuck can dispute this claim by publishing the performance numbers for the CX4.
Chuck disputes our claim that the AMS 2000 is "the industry's first Dynamic Load Balancing Controller". He claims that the EMC and HDS high-end arrays have been dynamically load balancing for years. Unfortunately, he has his architectures mixed up. The EMC and Hitachi high-end arrays are not midrange modular arrays. They are multi-pathing arrays, but they are not dynamically load balancing arrays. They are high-end arrays with a global cache that can be accessed by any path on the array. This allows software like MPIO on the host servers to load balance the I/O across multiple paths to a single cache image in the high-end array. In our high-end array, the USP V, we can set port priority or change the queue depth on a port to manage that port's performance, but the port processors do not dynamically load balance the I/O between each other.
In midrange modular storage arrays, there are two controllers with separate caches. Until the AMS 2000 modular storage, LUNs had to be assigned to one controller cache or the other so that the controllers would not thrash while trying to maintain control of a consistent cache image. The Dynamic Load Balancing Controller (yes, we capitalize it, since we will be branding this as DLBC) takes over the complex task of LUN ownership and dynamically balances the I/O to the appropriate controller cache, so that the storage administrator does not have to manage it and optimum performance can be maintained. This is one of the ways customers can improve their operational efficiency.
The AMS2000 doesn’t care which paths are used and so additional path management software is not required. Just use the native path management software in the operating system to ensure that HBA failover is handled and you’re good to go.
The benefit of a Dynamic Load Balancing Controller is that administrators don't need to assign LUNs to controllers and, therefore, don't need to analyze the projected workload on each LUN to ensure a balanced system. They don't need to monitor utilization rates and rebalance their systems as workloads change over time. Whether these imbalances evolve slowly or spike rapidly doesn't matter; the AMS2000 will adjust on the fly. No other storage system has this functionality.
We haven't formally published the thresholds for automatic load balancing, but they are 70/40. In other words, one controller must be at or above 70% utilization while the other is at or below 40% utilization.
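To make the rule concrete, here is a minimal sketch of how such a threshold check might look, using the 70/40 figures above. The function name and structure are my own illustrative assumptions, not the actual AMS firmware logic:

```python
# Hypothetical sketch of the 70/40 rebalancing rule described in the post.
# Names and structure are illustrative only, not Hitachi's implementation.

REBALANCE_HIGH = 0.70  # the busy controller must be at >= 70% utilization
REBALANCE_LOW = 0.40   # the other controller must be at <= 40% utilization

def should_rebalance(util_a: float, util_b: float) -> bool:
    """Return True when the utilization gap between the two controllers
    crosses both thresholds, i.e. one is >= 70% while the other is <= 40%."""
    high, low = max(util_a, util_b), min(util_a, util_b)
    return high >= REBALANCE_HIGH and low <= REBALANCE_LOW

print(should_rebalance(0.85, 0.30))  # True: 85% vs 30% triggers a move
print(should_rebalance(0.85, 0.55))  # False: the quieter side is above 40%
```

Requiring both conditions at once is what keeps LUN ownership from ping-ponging between controllers on small, transient utilization swings.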
Apparently Chuck does not believe that SAS is a big deal, but the majority of industry analysts believe it is, and most disk vendors are moving to SAS. While EMC has a 3Gb SAS backplane on its low-end AX4, it provides at most four paths in its SAS implementation. The AMS 2000 series has a SAS backplane with expanders that can provide 32 concurrent paths to the back-end disks. This increase in back-end paths provides increased performance and improved fault isolation for disk failures.
Application support is very important and we provide more information about this on our website. For instance, the AMS2000 has the following benefits in VMware environments:
* The MRU policy in ESX creates problems for modular controllers whenever disk ownership changes due to errors in the primary path. There aren’t any primary paths in the AMS2000 so there are no issues with disk ownership changes.
* Use of temporary MRU paths can create a workload imbalance that the AMS2000 will adjust to.
* No need to set path management when installing a server farm.
* No need to worry about physical host connections and path settings when using VMotion to transfer virtual machines.
I think I have covered most of the issues Chuck brought up. If you have any questions, please drop me a comment and I will get you an answer.
Comments (8)
Hello Hu (at least, I think it’s Hu … )
Nice to see you join the conversation.
I can’t respond to everything here (sorry!), but a few areas I’d like to comment on?
First, the 900,000 number has no useful context that I can discern. I remember back a few decades when we interpreted MIPS as a Meaningless Indication of Processor Speed.
More specifically, are you stating that customers can experience 900,000 IOPS in a real-world workload and configuration? Or are we just talking engineering bragging rights here?
Second, I do understand the difference between high-end and mid-tier architectures. I think most people who follow my blog could discern that. I was just questioning the practical usefulness of such a "load balancing" feature in a world where MPIO is very common.
As you know, we sell a reasonable number of mid-tier arrays (#1 according to IDC), and — frankly — all the “problems” and “challenges” you cite appear to be — well — somewhat synthetic in nature.
Thirdly, no argument that you've got more wires on your SAS implementation than an AX4. If you read carefully, I was commenting on your claim of "first", which I would offer needs a bit of qualification on the part of your PR team.
I also think you’re going to have a tough time trying to convince people that “SAS is universally better” given the FC design of the USP-V.
My sense of this is that HDS is trying to create a "problem" where none really exists. Fine, we all work with the hands we're dealt.
Best of luck to you all!
Care to quantify "900,000 IOPS"? Of what? Super read cache hits, as usual? A claim of this type is unrealistic unless you provide details of what it refers to.
There are many ways to cite performance, as you note. The only way I know to do any kind of apples-to-apples comparison of different architectures is to cite the maximum read cache hits. It is not realistic from an application view, but it does show the engineering capability.
Care to give us yours so that we can compare apples?
Hu, that's an interesting take. Read cache hits only show the "top" end of the box, into memory, not out to disk, i.e. the capability of the algorithms for caching, destage, prefetch, etc. I'd argue that a 100% write miss actually shows more about a box (technically) than read cache hits.
Also, is it not the case that (if this is the same as the USP) these are "super read cache hits", so they don't even leave the buffers of the fibre ports? I.e., there is no DMA involved, just an FC buffer to FC frame conversion. If that is the case, this is NOT AT ALL representative, even at a technical level.
It is read cache hits. We will soon have SPC benchmarks to show.
It seems the SPC results weren’t so good after all because they’re nowhere to be seen.
Hi, I found an installation example of the AMS2000 series at http://www.fcoe.ru/index.php?option=com_content&task=view&id=213&Itemid=54