Right Size Storage for Virtual Server Growth
by Hu Yoshida on Sep 6, 2011
A common problem that I run into with virtual server users is the overcommitment of modular storage, which results in poor performance and outages.
By modular storage, I am referring to two-controller storage systems that were originally designed for use with open systems workstations. While these types of systems can scale in capacity to hundreds of disk drives and support scores of servers with Fibre Channel switch connections, they were never designed to scale up like an enterprise storage system with multiple processors that share a global cache. Modular storage is a very cost-efficient architecture that was designed to provide great performance for a limited number of servers. With only two controllers, these systems are not designed for high availability: when one controller is down for scheduled or unscheduled maintenance, an outage on the other controller will result in data loss. A two-controller system therefore requires maintenance windows.
When virtual servers are introduced into the data center, they are first used to consolidate the low-hanging fruit: servers that idle most of the time and run applications like test and development, or print servers. These types of servers usually run on modular storage, so that is the type of storage most often used when they are consolidated onto a virtual server.
This may work fine when you consolidate ten or even twenty low-utilization servers onto two or four virtual servers, and everyone is elated over the cost savings. But before you know it, you are consolidating several hundred virtual machines onto 20 or more virtual servers, and performance starts to suffer. Some even migrate their tier 1 applications onto this configuration, expecting 7 x 24 uptime. You can see where this is leading.
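To make the scaling problem concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it is a made-up assumption for illustration (the average IOPS per virtual machine, the array's IOPS ceiling, and the idea that losing one of two controllers halves that ceiling); none of them describe a real product, and real sizing should be based on measured workloads.

```python
# Back-of-the-envelope sizing sketch. All figures are HYPOTHETICAL
# assumptions chosen for illustration, not measurements of any array.

def aggregate_iops(vm_count: int, avg_iops_per_vm: float) -> float:
    """Total I/O demand if every VM generates the same average load."""
    return vm_count * avg_iops_per_vm


def utilization(vm_count: int, avg_iops_per_vm: float,
                array_iops_ceiling: float) -> float:
    """Fraction of the array's assumed IOPS ceiling consumed by the VMs."""
    return aggregate_iops(vm_count, avg_iops_per_vm) / array_iops_ceiling


def fits_with_one_controller_down(vm_count: int, avg_iops_per_vm: float,
                                  array_iops_ceiling: float,
                                  controllers: int = 2) -> bool:
    """Check whether demand still fits when one controller is offline.

    Assumes (simplistically) that the ceiling scales linearly with the
    number of working controllers.
    """
    degraded_ceiling = array_iops_ceiling * (controllers - 1) / controllers
    return aggregate_iops(vm_count, avg_iops_per_vm) <= degraded_ceiling


if __name__ == "__main__":
    AVG_IOPS_PER_VM = 50      # hypothetical average per virtual machine
    ARRAY_CEILING = 20_000    # hypothetical two-controller array limit

    for vms in (20, 300):
        print(vms,
              utilization(vms, AVG_IOPS_PER_VM, ARRAY_CEILING),
              fits_with_one_controller_down(vms, AVG_IOPS_PER_VM, ARRAY_CEILING))
```

Under these assumed figures, twenty virtual machines barely register on the array, while three hundred consume most of its headroom and no longer fit once a controller is taken down for maintenance.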
If you are expecting to consolidate a large number of servers or desktops onto a virtual server platform, you need an enterprise storage system that can scale up to meet the demands of hundreds of servers and thousands of desktops. If you are expecting enterprise-class availability and performance, there is no substitute for an enterprise storage system like Hitachi Virtual Storage Platform (VSP).
You can still take advantage of the cost efficiencies of a modular storage system, as long as you front-end it with an enterprise storage system like VSP.
Virtual servers can be like a drug. Be sure you balance it out with the right storage system. The virtual server folks need to be talking with the storage folks to make sure they are in sync.
The acquisition cost of enterprise storage like VSP may be higher than that of modular storage, but don’t let that lead you down the wrong path. See David Merrill’s most recent blog post on TCO, in which he explains that acquisition cost is only a fraction of the total cost.
Interesting read. IMHO there’s much truth in your quote “Virtual servers can be like a drug,” and I think you are also right with your observation about Tier 1 applications being virtualized. From a support perspective this could lead to nightmares. But to be honest, I don’t get why the storage system should be the limiting factor here. The number of servers (in terms of OSes running) doesn’t change in your picture, and neither does the total workload on the storage array. They were physical servers before; now they are virtual servers (VMs) on a few physical ones. In my eyes the requirements for the storage environment didn’t change much, but of course you have to check carefully whether your physical servers with their SAN connectivity could turn into a bottleneck themselves, as I pointed out in my latest blog post (http://ibm.co/mY5PnH).
Additionally, just a minor point about dual-controller arrays: why should an outage of the remaining controller lead to data loss? Usually the write cache of such arrays is disabled when one controller is down, because it can’t be mirrored anymore. On one hand this means decreased performance during such maintenance, but on the other hand it means that the host gets the SCSI good status only once the I/O is really written to disk. So there would be loss of access, of course, but no data loss.