It Rolls Downhill
by Hu Yoshida on Oct 12, 2010
The “It” is bottlenecks. As soon as you clear a bottleneck at one level, the bottleneck pops up at the next lower level.
Processors are getting faster with increasing numbers of cores, simultaneous multithreading, and multiple levels of cache. Hypervisors are filling up those fast processors with multiple virtual machines. Storage area networks are doubling their bandwidth from 8 Gbps to 16 Gbps Fibre Channel, with 10/40/100 Gbps Fibre Channel over Ethernet (FCoE) on the horizon. So where is the next bottleneck?
Where are the bottlenecks?
You might think it is storage, but there is another step before you get to storage, and that is the file system. If the file system simply passes the I/O workload to the block storage system, there is very little impact on performance or scalability. However, the more work the file system does on the I/O stream, the more processor cycles and processor cache it consumes, and the more it becomes the next bottleneck.
Disk file systems come in several flavors – the most common are either conventional or shared (also known as clustered file systems). A clustered file system provides block access directly from multiple computers and adds a mechanism for concurrency control, which conventional file systems intended for local storage do not have. In a conventional file system, a server node manages disk storage directly, and other nodes access data through that dedicated server across a communications network.
The trade-off between conventional file systems and shared block devices usually comes down to the cost of CPU cycles and memory versus storage cycles and, more importantly, the ability to scale. However, if the shared file system and the block storage device can work together, it is possible to retain the benefits of a clustered file system while offloading the I/O workload to the storage. A case in point is VMware's vStorage APIs for Array Integration (also known as VAAI).
VAAI offloads some of the work that VMFS has to do for the virtual disks. One primitive is a write same command that can be used to format the VMDKs for new virtual machines (VMs). Instead of the ESX host issuing redundant, repetitive commands to format the disks, consuming CPU cycles and memory, VMware offloads this task to the storage array, saving about 85% of its I/O load. Not only does this increase performance and speed, this feature also takes advantage of Hitachi Dynamic Provisioning (HDP) by allowing the VMDKs to become thin friendly. Another task that VAAI offloads is the cloning of VMDKs, which the storage array performs instead of a software clone on the host. The biggest improvement can come from the use of an Atomic Test and Set (ATS) command, which changes how VMFS locks VMs, eliminating the contention that most, if not all, storage admins face when sizing VMFS storage for multiple ESX hosts. The storage array uses this new locking sequence to lock only the extent of the one VM, allowing all the others to continue their updates on the same VMFS volume. Removing this bottleneck can increase the number of virtual machines per cluster by 25% to 35% or more!
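To make the locking difference concrete, here is a toy sketch (not VMware's or Hitachi's actual implementation; all names are illustrative) contrasting per-extent ATS locking with a whole-volume SCSI reserve. The key point is that the atomic compare-and-set happens on a single extent's lock field, so two hosts only contend when they touch the same VM's extent:

```python
import threading

class VmfsVolume:
    """Toy model of per-extent Atomic Test and Set (ATS) locking.
    With a whole-volume SCSI reserve, any metadata update would lock
    the entire volume; with ATS, only one extent's lock field is set."""

    def __init__(self, num_extents):
        self.extents = [0] * num_extents  # 0 = unlocked, otherwise owner id
        self._lock = threading.Lock()     # stands in for the array's atomicity

    def atomic_test_and_set(self, extent, owner):
        """In one atomic step: if the lock field is 0 (free), set it to
        `owner` and succeed; otherwise fail and let the host retry."""
        with self._lock:
            if self.extents[extent] == 0:
                self.extents[extent] = owner
                return True
            return False

    def release(self, extent, owner):
        with self._lock:
            if self.extents[extent] == owner:
                self.extents[extent] = 0

vol = VmfsVolume(num_extents=8)
assert vol.atomic_test_and_set(3, owner=1)      # host 1 locks VM 3's extent
assert vol.atomic_test_and_set(5, owner=2)      # host 2 proceeds on extent 5
assert not vol.atomic_test_and_set(3, owner=2)  # only the contended extent blocks
```

Because a failed ATS affects only the one contended extent, the other hosts keep updating the same VMFS volume, which is why eliminating the whole-volume reserve allows so many more VMs per cluster.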
It all comes down to storage
Finally we come to the storage. This new type of integration imposes some tremendous demands on storage systems, because storage is at the bottom of the hill. When we ratchet up the processors, the hypervisors, the network bandwidth, and the file systems, it all comes down to storage. That requires a purpose-built storage system. It is not just about adding capacity any more. It is about adding the architectures and software to enable massive scalability, performance, availability, and integration. VAAI is an open interface that can be used by any storage array. However, it doesn't do any good if the storage array cannot scale to meet the increasing I/O workload. It requires storage architectures like the Hitachi Virtual Storage Platform (VSP), which scales up, out, and deep through storage virtualization, and intelligent software like the Hitachi Command Suite to simplify, automate, and integrate.
Storage virtualization is a great enabler for scalability, but that is not enough. Storage systems need intelligence to work with applications and file systems to optimize the I/O workload as well as the capacity. While APIs like VAAI can offload workload to make VMware more scalable, other file systems, such as Symantec's, can help storage be more efficient by notifying the storage when a file is deleted and identifying the extents that can be recovered as free space. Symantec goes as far as to query the attached storage and provide the extents in a format that can be easily reclaimed by that storage system's thin provisioning method. In the case of the Hitachi USP V and VSP, that is 42 MB pages.
So when “it” begins to roll downhill, storage can no longer be content to be a cheap storage container, it must step up to be a storage computer that can scale in multiple dimensions.
Comments (3)
[...] you really want to appreciate the value of VAAI, be sure to read Hu’s blog post: “It Rolls Downhill”. It does a great job explaining how Atomic Test and Set command eliminated all the contention that [...]
It looks like a good idea that a lot of the work originally on the file system's list is passed down to the storage through VAAI, which frees the file system to handle more requests.
However, we can't simply say it will improve overall performance, because that work (cloning, etc.) still has to be done. If the storage is designed with a poor architecture and can't process it at higher performance, the bottleneck is still not removed.
So, I don't think VAAI is a cure-all for performance. It's just a specific way to keep VMware's market share, and HDS leverages it as a selling point.
That said, VMware is the undoubted leader in the server virtualization area. I agree that it's important for each storage company to support VAAI.
Joey, this is an excellent point.
If the bottleneck is moved to the storage, the storage architecture must be able to handle the additional workload. We have designed the Virtual Storage Platform with a global pool of processors to handle the extra work that is required to support the VAAI primitives. The VSP's internal switch architecture is also designed to scale up to handle the increased number of virtual machines that can be supported once the SCSI reserve is eliminated. While other storage vendors may support the VAAI primitives, they may not be able to scale to meet the increased demand.