Four years is a long time
by Hu Yoshida on Mar 8, 2006
The 2006 Winter Olympics are over and the next ones will be in Vancouver in 2010. Some of the athletes who competed this year will return, but very few will be able to compete in a third Olympics. One of the most famous skiers in the world is Rosi Mittermaier of Germany, who competed in three Olympics and won two golds and a silver at the 1976 Winter Olympics in Innsbruck. I met Rosi and her husband Christian Neureuther (6-time World Cup champion) at an HDS event for customers and partners from Bavaria. She and Christian were there to coach the customers on how to do the slalom! Now I know why the Germans won so many gold medals at these Olympics. Almost every one of these executives went down the slalom, except for the American (old football injury, you know). I spoke to Rosi at dinner and she shared with us the challenges and sacrifices of preparing for each Olympics, and the difficulty of peaking every four years.
Four years is a lifetime in the storage industry. Many customers who had 10TB four years ago are now close to 100TB, and it is easy to imagine that these same customers will be approaching 1000TB in the next four years. Four years ago, customers debated whether to buy storage arrays with more than 8TB; now they routinely buy arrays with 40 to 80TB. In the past four years all the major storage vendors introduced new product cycles, and virtualization has finally arrived. It will be exciting to see where we will be as an industry when the 2010 Winter Olympics come around.
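As a rough back-of-the-envelope check, growing from 10TB to 100TB in four years works out to roughly 78% compound growth per year. The sketch below only restates that arithmetic; the function name is illustrative, and the projection assumes the same rate simply continues, which nothing guarantees.

```python
def annual_growth_rate(start_tb, end_tb, years):
    """Compound annual growth rate implied by growing from start_tb to end_tb."""
    return (end_tb / start_tb) ** (1.0 / years) - 1.0

# Figures from the paragraph above: 10TB -> 100TB over four years.
rate = annual_growth_rate(10, 100, 4)
print(f"implied growth: {rate:.1%} per year")

# Assumption: the same rate simply continues for another four years.
projected_tb = 100 * (1 + rate) ** 4
print(f"projected capacity: {projected_tb:.0f}TB")
```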
I would like to get your views on what HDS needs to do to meet your requirements over the next four years. Are we on the right track with the disaggregation of the intelligent controller from the disk arrays? I had some previous comments from SNIG about improvements to our replication solutions. Are there other areas that you would like to see us improve?
Comments (9)
Well, we just bought a USP1100, a USP600, and an AMS200 (about 84TB usable), so I’ll have lots of feedback for you over the next few months…
LU management is still not convenient. For example, there is no space defragmentation in the Thunder and AMS models. Of course it’s possible to create two LUs and unify them later, but from a management point of view it’s much easier to have a smaller number of LUs. AMS is the more flexible storage and it unifies free space after LU deletion, but only if the LUs were placed one after another. Full space defragmentation has been provided by EMC and IBM midrange storage for at least 2 years.
I like the paradigm of HP and 3PAR storage, where information is placed on all disks. But I still prefer standard RAID groups and LUs because of the risk that a DSS will impact the performance of an OLTP system. It’s possible to create independent disk groups in the case of HP, but this solution is not flexible either (the smallest LU size is 1GB). Thus there is no storage with both full utilization of disk performance and good management. In my opinion an ideal storage should have the paradigm of HP or 3PAR plus policy-based management. Policies should contain information about which systems may be placed on the same disks/controllers. For example, it’s OK for me to combine several OLTP systems on the same disks, or to combine e-mail and data warehouse systems, because one of them requires resources during daytime user activity and the other during the nightly data import.
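The placement policy described above could be written down as a small compatibility table. This is only a sketch of the idea: the workload class names, the COMPATIBLE table and the can_share_disks function are hypothetical, and no HDS, HP or 3PAR product is claimed to work this way.

```python
from itertools import combinations

# Hypothetical policy table: each entry is a set of workload classes
# that are allowed to share the same disks/controllers.
COMPATIBLE = {
    frozenset({"oltp"}),                # several OLTP systems together
    frozenset({"email", "warehouse"}),  # daytime load vs. nightly import
}

def can_share_disks(workloads):
    """Return True if every pair of workloads is allowed to be co-located."""
    return all(
        frozenset({a, b}) in COMPATIBLE
        for a, b in combinations(workloads, 2)
    )

print(can_share_disks(["oltp", "oltp", "oltp"]))
print(can_share_disks(["email", "warehouse"]))
print(can_share_disks(["oltp", "dss"]))
```

Anything not listed in the table is refused, so an OLTP system and a DSS stay apart unless someone explicitly allows the combination.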
Thunder and AMS storage actually have three tiers: FC disks, SATA disks and cache memory. It’s possible to create an LU and cache it completely. Why not just add a drawer full of memory, backed up by separate disks and batteries?
A drawer full of memory can be pretty expensive. We believe it is better to put it in cache where it can be shared across all drawers. As you noted we can lock a LUN in cache for peak use.
(Sorry to revive a half-year-old thread, just stumbled upon
a somewhat frisky post http://blogs.rupturedmonkey.com/?p=50
linking here and cannot resist the temptation)
We’ve just finished a two-week evaluation of the AMS200, and I’m
pretty impressed with its performance, manageability and
Ideas of neat features to add are abundant, too
In no particular order:
– Ability to explicitly specify mirrored pairs location with
raid0+1, or some explicit guarantee of enclosure loss protection.
Now it is unclear if raid group will survive in case of complete
enclosure outage even if only half of member disks are there.
– Ability to coalesce free space in RG, maybe something like
transparent LUSE for unused space
– Some more online storage manipulation may be handy
– Simpler procedure for migrating RG to different set of disks
without interrupting access. This is possible with ShadowImage but
with some hassle and, as far as I know, not without access
– Addition of disks to RG and then rebalance data later.
– More fine-grained LUN parameter setting, e.g. disabling read
– One shell-like commandline front-end to management functions
instead of zillion small commands will be great. Scripting
many actions in one batch (e.g. create RG, create LUNs, format
them, map to hosts) with optional dry-run and syntax checking
is superior to calling ‘auXXX, auYYY, auZZZ’ and entering
the password 100 times. Hopefully adding this will not be too hard,
as all of the utils are actually calling single monolithic
– Having SNMP support is great, but to make it really useful
it would be nice to export full performance data (‘auperform’)
and full status data (e.g. status of TrueCopy/ShadowImage
pairs, which is now impossible to monitor out-of-band).
It will greatly simplify integration to NMS like OpenView,
and writing custom performance monitoring scripts.
– auperform itself does not support ‘one-shot’ mode well,
requiring either a 1-minute interval between outputs or a key-press
to get next dataset. Some kind of CSV or XML output will be
nice too for custom performance monitoring scripts.
– unix syslog logging in controllers is another nice feature to
add. It is fairly simple to implement and will save a great
deal of polling for simple status monitoring and events audit.
– Current implementation of Password Protection is not very
satisfactory security-wise and does not provide any actual
protection from determined attacker.
I will not go into further detail, but it will be nice to
– implement some clearly documented password reset mechanism,
probably requiring some physical actions like pressing
buttons on controller
– log failed login attempts to some separate audit log (not
main controller log to prevent overflowing it with garbage
Syslog will suit this purpose extremely well.
– add some simple IP access list to restrict management access
to controller without external firewall
– implement delay between password retries to complicate bruteforce
attacks on passwords
Security is complex field in itself, and those are only basic
requirements for sane access control system as I see them.
– External authentication via RADIUS will be nice too. Even APC
UPSes can talk RADIUS nowadays, so why can’t storage arrays?
More elaborate authentication mechanisms like ldap, kerberos or
ntlm are probably not worth it, but RADIUS is de-facto standard
in telecom, allows some niceties like specifying access levels,
provides an audit trail in server logs and is sufficiently secure.
– First, open access to latest version of HiSource documentation
on http://www.hds.com will be most welcome, as well as customer-accessible
release notes on latest firmware, caveats/open bugs etc (maybe
requiring a support contract or such).
– Manuals, although usually deep and detailed, sometimes lack
proof-reading and are unclear and downright funny even for me, whose
English is far from perfect. QA is certainly improving here,
e.g. my favorite passage referring to user names as consisting
of “one to eight digits in half size alphanumeric character”
disappeared from the April ’06 edition of the GUI user’s guide, but
further work would be appreciated (the TrueCopy guide still
states on page 84: “MC/Serviceguard needs to evaluate with
5-node configuration. We will treat as individual if
required” – go figure).
– Explanations like “option -AAA: turns on AAA” are not very
instructive, especially when AAA is mentioned only here.
This translates to either “if you do not know what AAA is, why
are you reading this doc” or “I don’t know what AAA is and still
writing this doc”, or maybe “this feature is long time unused but
we’ll cite it here to make your inferiority complex worse”
Whole section of tuning parameters in StorageNav CLI reference
(autuningprefetch, autuningmultistream etc) is incomprehensible
pile of such statements (“-read enable: enables the specification
of read mode” – wtf?).
– There are some options completely missed from manuals: e.g
‘auonlineverify -skipverify’ is undocumented but present in
‘-help’ output of command.
– Some book like ‘AMS Storage Concepts’ accompanied by ‘Glossary’
will be one (actually, two) most welcome addition(s) to HiSource
For example, ‘ausystuning -cachecontrol FIFO|LRU’: why this
option is present? What is better for me and what trade-offs
present in LRU vs FIFO? What performance gain is expected
from tweaking this option? Even maintenance manual is silent
about it, all occurrences of FIFO in it are for some scary
If this option makes sense for customer (or support engineer
for that matter), describe it; if it does not, remove it.
- Hardware design
– More elaborate cache mirroring strategy: even if ctrB does
not write at all, mirrored cache partition always occupies
half of ctrA memory, thus effectively halving read cache
size. I was slightly surprised to find that having total of
4GB cache memory array is only able to cache 750MB of one
– 4GB back-end connectivity (no-brainer)
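To make the request for a single batch front-end with dry-run and syntax checking (from the list above) a bit more concrete, here is a rough sketch of what such a wrapper could look like. Everything in it is an assumption for illustration: the command names create-rg, create-lun and map-lun and their argument counts are invented and are not real array CLI commands, and the `execute` callback stands in for whatever would actually talk to the controller.

```python
import shlex

# Hypothetical command set (name -> expected argument count).
# These are NOT real array CLI commands.
KNOWN_COMMANDS = {"create-rg": 2, "create-lun": 3, "map-lun": 2}

def check_batch(script):
    """Syntax-check a batch script; return (parsed_steps, errors)."""
    steps, errors = [], []
    for lineno, line in enumerate(script.splitlines(), start=1):
        words = shlex.split(line, comments=True)  # '#' starts a comment
        if not words:
            continue
        name, args = words[0], words[1:]
        if name not in KNOWN_COMMANDS:
            errors.append(f"line {lineno}: unknown command {name!r}")
        elif len(args) != KNOWN_COMMANDS[name]:
            errors.append(
                f"line {lineno}: {name} expects "
                f"{KNOWN_COMMANDS[name]} args, got {len(args)}"
            )
        else:
            steps.append((name, args))
    return steps, errors

def run_batch(script, execute, dry_run=True):
    """Validate the whole script first, then print or execute each step."""
    steps, errors = check_batch(script)
    if errors:
        raise ValueError("; ".join(errors))
    for name, args in steps:
        if dry_run:
            print("would run:", name, *args)
        else:
            execute(name, args)
    return steps
```

The point of the design is that nothing touches the array until the whole script has parsed cleanly, and credentials would need to be supplied once per batch rather than once per command.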
Some of those ideas (e.g. rants about cache mirroring) may be from
gross misunderstanding of controller software internals, some could
lead to higher complexity and lower reliability of this software
(e.g. online manipulations with RGs).
Drop those and implement at least syslog, full SNMP monitoring
of performance/health and free space coalesce, and I’ll be one
happy AMS user.
Thanks for your excellent products and keep up the good work.
I personally believe that the separation of the controller and the commodity disks is a move in the right direction. As for meeting the other storage demands of the future I would like to see the following –
Improved management and reporting tools
Improved, more open and more accessible documentation (see IBM Redbooks)
Continue growing the online user community and keep up the good interaction with the community
An ask the experts type blog (see storageadvisors.adaptec.com)
Simplified licensing models (see Pillar)
More flexible array based migration tools (i.e. between LDEVs of different sizes). Currently host based tools are the only option.
More use of SSD (cheaper than cache but faster than disk)
A more aggressive push into the space currently dominated by HP EVA and EMC Clariion (I often see proposals that don’t take the HDS offerings seriously)
Any other ideas will be blogged about at blogs.rupturedmonkey.com
Finally, I’m also not sold on the idea of partnering with companies rather than buying them. I’m not sure you were asking for input on this, but I think it’s important, so here are some of my thoughts on the strategy –
It’s a very scary thought to bet the future roadmap of the business on other companies that you have very little control over. What happens if your partners are bought out by your competitors such as EMC, HP and IBM? After all, most venture capitalists will sell for the right price and many of the potential partners are backed by VCs. I remember chatting with some colleagues a while ago about what SUN would do if Microsoft bought Veritas – after all SUN rely heavily on VxFS, VxVM, VCS….
Also, from my viewpoint as a contractor, I see a lot of different vendor environments. The HP storage environments usually have other HP storage technologies such as Data Protector, IBM shops usually also have Tivoli, and no doubt the EMC shops will now start using more and more EMC apps. The HDS storage environments, on the other hand, are a mish-mash of technologies from whoever the preferred partner is at the time. From my end of the spectrum, a lowly admin, this makes it much easier for companies to hire storage people for other vendors’ kit, as they often have a bundle of vendor-related skills: an HP storage person often also has Data Protector on their resume. As for HDS admins, you normally have to hire two people, one for the storage box and another to look after the “partner” technologies. Storage skills are not cheap yet either.
Thanks for hearing me out and keep up the good work!
One more idea regarding tiers… It may be useful to use FC drives as a cache for SATA disks, to translate random operations into sequential ones.
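A toy illustration of that idea, assuming nothing about how any real array works: random writes land in a small staging area (standing in for the FC tier) and are later destaged to the slow tier in address order, so the SATA disks see one ordered pass instead of scattered seeks. The class and method names are made up for the example.

```python
class StagingBuffer:
    """Toy write-staging buffer: absorb random writes, destage in address order."""

    def __init__(self, capacity):
        self.capacity = capacity  # number of distinct blocks held before destaging
        self.pending = {}         # block address -> latest data for that block

    def write(self, lba, data):
        """Stage one write; return the destaged batch if the buffer filled up."""
        self.pending[lba] = data  # rewrites of the same block coalesce
        if len(self.pending) >= self.capacity:
            return self.flush()
        return []

    def flush(self):
        """Return all staged blocks sorted by address and empty the buffer."""
        batch = sorted(self.pending.items())
        self.pending.clear()
        return batch


buf = StagingBuffer(capacity=4)
for lba, data in [(90, b"a"), (7, b"b"), (42, b"c"), (7, b"b2")]:
    buf.write(lba, data)
destaged = buf.write(13, b"d")  # fourth distinct block triggers the destage
print([lba for lba, _ in destaged])
```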
Hello Alexev and Nigel. Thanks for investing the time and effort to give me this feedback. It is very valuable and timely. I will take some time to digest this input and discuss it with my product management teams. I may need to bother you with some additional questions. Please send me any other thoughts you may have, not only on products but also your views on our business approach, as Nigel has done.
Have a great Thanksgiving!
Thank you for your prompt reply. Do not hesitate to ask me questions if any of my remarks seem to be of interest. Also, sorry for the garbled formatting and, eh, lack of sonkeigo in my initial post.
Obviously it is too early for me to draw sensible conclusions on such a complex and subtle thing as Business Approach.
Noticing particular technical shortcomings is easier, and a short exposure to HDS products may turn out to be an opportunity for some open-mindedness here.
Having said that, there is one point I cannot stress enough: please, do not keep technical product information secret from customers, go forward and publish manuals, news and release notes on http://www.hds.com.
One recent example: in my study of the comparative merits of mid-range arrays I noted to myself that AMS is the only product in its class not supporting async replication (IBM DS4xxx, NetApp FC, EMC Clariion, HP EVA etc. all have it in some form or shape). All data sheets and whitepapers on http://www.hds.com seem to support this, citing async TrueCopy as a future product without a clear availability date or completely failing to mention its existence, like http://www.hds.com/pdf/ds_ams1000_580.pdf . I mentioned this to a rather knowledgeable guy in support, and to my great surprise it turns out that TrueCopy Extended Distance is already shipping in the new AMS firmware 750, along with multiple S-VOLs per P-VOL in ShadowImage and probably other useful features I know nothing about.
Another one: http://www.hds.com/pdf/dmg_report_040306.pdf mentioned ‘LUN Migration’ as ‘available late in 2006’, and this feature seems to be just what I want, as I mentioned in my previous post here. I’m glad to see that your R&D thought of it too (no surprise, great minds think alike). But there is no way for me to find out whether it is available yet without polling the tech support guys on a monthly schedule.
Another one: studying the connectors on the AMS controllers, I found a small one labelled ‘UPS’ and made a mental note of it. Later it took me significant effort to find out what models of UPS are supported, how to configure the array to interface with them and so on. As far as I remember, it is documented only in the Maintenance Manual (a hefty 3000-page book normally inaccessible to customers) and on the APC site (the vendor of compatible UPSes, brave enough to publish the relevant documentation). Why this unique and convenient feature is so carefully hidden from its potential users is beyond me.
Another one: on http://community.hds.com/ some guy asks where he can find the user manual for his 9500, and a Hitachi representative (being helpful) sends him the manual by mail. forums.hds.com is a good idea, but the forums must complement online manuals/best practices/references, not replace them.
It seems that no one benefits from restricting customer/end-user access to technical information that may be important for their business. On the HP site I have full access to manuals, release notes, quickspecs etc. (access to the bugs/patches DB requires login), EMC has Powerlink, IBM sports an extensive library of Redbooks, and SUN has placed most of its manuals online in HTML form. HDS has a site for partners, but mere customers and potential technical decision makers are presented with dated Thunder manuals and a bunch of business briefs short on technical details.
It is a real shame to have every right to brag about shiny new features and still not mention them to anybody except resellers and service personnel. I’m not asking for full-page ads in business newspapers, but a customer-accessible ‘Support’ section on http://www.hds.com with links to the latest release notes and current versions of the product manuals will be just fine.
Thank you for your valuable attention and have a good time.
No probs, that’s fine