Networking Storage to Storage
by Hu Yoshida on Apr 23, 2007
Fibre Channel Storage Area Networks (SANs) were introduced in the late 1990s to eliminate the islands of direct-attached storage. They promised to consolidate storage and centralize management of storage and data resources, on the assumption that consolidation and centralization would improve storage capacity utilization, which was then about 20 to 30%. Since their introduction, SANs have enabled more servers to attach to larger and larger capacity storage arrays, and Storage Area Management tools have been developed to map the resources in a large heterogeneous SAN. Yet when we look at storage utilization today, roughly a decade after the introduction of the SAN, capacity utilization in many cases is still around 20 to 30%. The only difference is scale: where many large accounts once had hundreds of terabytes, we now see petabytes. There are customers with multiple petabytes of storage where less than 30% is being used!
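To make the utilization figure concrete, here is a minimal sketch of how an aggregate utilization in that 20 to 30% range falls out; the array names and capacities are entirely hypothetical:

```python
# Hypothetical capacities, in TB: (raw capacity, capacity actually used).
arrays_tb = {
    "array_a": (500, 120),
    "array_b": (800, 250),
    "array_c": (700, 180),
}

total_raw = sum(raw for raw, _ in arrays_tb.values())
total_used = sum(used for _, used in arrays_tb.values())
utilization = total_used / total_raw

# -> 550 TB used of 2000 TB raw -> 27.5% utilization
print(f"{total_used} TB used of {total_raw} TB raw -> {utilization:.1%} utilization")
```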
Tony Asaro of ESG identifies this as a failure of SANs to network storage in his new white paper, Services Oriented Storage Solutions. He points out that SANs network host servers to storage, but they do not network storage systems to other storage systems…“aside from being on the same network, individual storage systems do not, in any way, interact with each other. They do not work in concert, but rather separately: individually; discretely; and therefore inefficiently.”
Recognizing this problem, Hitachi introduced a storage controller with the ability to network storage resources within a storage system. In 2000 Hitachi announced Application Optimized Storage and delivered the Lightning 9900 storage array, which could dynamically change storage configurations within its global cache through a crossbar switch architecture and the separation of control data from user data. Logical disk volumes within the storage array could work in concert to deliver optimum performance and availability to host applications.

In 2002 the 9900V added virtual storage ports, which increased the connectivity of each Fibre Channel storage port to 128 virtual ports, each with its own dedicated address space, so that applications sharing the same storage port were protected from data leakage or data corruption by one another. Multiple applications, residing on different platforms such as mainframe, Unix, Linux, Windows, and Novell, could share the same storage array while using different RAID protection, different performance disks, and different priorities on port access. Not only could the Lightning 9900 optimize storage for multiple applications simultaneously, it could also dynamically change the storage configuration as the requirements of an application changed. This enabled customers to massively consolidate and optimize their infrastructure while greatly reducing cost, complexity, and management overhead. In effect, it provided networking of storage resources within one storage frame.
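The virtual-port idea can be sketched as a simple data structure. This is a hypothetical model for illustration, not Hitachi's implementation: one physical Fibre Channel port fans out into 128 virtual ports, each with a private LUN address space, so two hosts on the same physical port cannot see each other's volumes:

```python
# Hypothetical model of virtual storage ports (illustration only).
class PhysicalPort:
    VIRTUAL_PORTS_PER_PORT = 128  # the 9900V figure cited in the post

    def __init__(self, wwpn):
        self.wwpn = wwpn
        # Each virtual port index gets its own, independent LUN table.
        self.virtual_ports = {i: {} for i in range(self.VIRTUAL_PORTS_PER_PORT)}

    def map_lun(self, vport, host_lun, internal_volume):
        """Expose an internal volume on one virtual port only."""
        self.virtual_ports[vport][host_lun] = internal_volume

port = PhysicalPort("50:06:0e:80:00:c3:9a:01")     # made-up WWPN
port.map_lun(0, 0, "LDEV:00:10")   # one application's volume on virtual port 0
port.map_lun(1, 0, "LDEV:00:2A")   # another application's volume on virtual port 1

# Both hosts address "LUN 0", yet each reaches a different internal volume:
assert port.virtual_ports[0][0] != port.virtual_ports[1][0]
```

The isolation comes from the per-virtual-port LUN table: a host's I/O is resolved only within the address space of the virtual port it logs into.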
While Application Optimized Storage was able to network storage within Hitachi storage arrays, it was apparent that networking of storage resources had to extend beyond individual storage systems. In 2004, Hitachi introduced the Universal Storage Platform (USP), providing the capability to connect, or network, external heterogeneous storage systems, including Hitachi and non-Hitachi arrays, through standard Fibre Channel port connections, each of which was now virtualized to 1,024 virtual ports. This not only made it possible to connect tens of thousands of heterogeneous host applications; it also made it possible to network thousands of heterogeneous storage systems together into a common platform of logical, virtual volumes.
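The essence of that external virtualization is an indirection table. The sketch below is a hypothetical simplification (the array and LUN names are invented): LUNs discovered on heterogeneous back-end arrays are surfaced to hosts as a single pool of virtual volume IDs, so a volume can later be migrated between back ends without changing what the host addresses:

```python
# Hypothetical sketch of external-storage virtualization (names invented).
external_luns = [
    ("hitachi_9900", "lun-07"),
    ("vendor_x_array", "lun-12"),   # non-Hitachi array, treated the same way
    ("vendor_y_array", "lun-03"),
]

# The controller assigns each external LUN a virtual volume ID; hosts see
# only the virtual IDs, never the back-end array or LUN directly.
virtual_volumes = {
    f"vvol-{i:04d}": backend for i, backend in enumerate(external_luns)
}

for vvol, (array, lun) in virtual_volumes.items():
    print(f"{vvol} -> {array}/{lun}")
```

Because hosts bind to `vvol-…` identifiers rather than physical paths, remapping an entry in the table is all that is needed to move its data to a different array.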
While much of the early adoption of the Universal Storage Platform focused on virtualization as a tool to migrate volumes from old arrays to new, organizations have come to realize the cost, performance, protection, and management benefits of deploying tiered storage infrastructures that leverage heterogeneous storage assets and common storage services, offered through virtual ports, logical partitioning, provisioning, mirroring, replication, and volume migration.
This networking of storage to storage can only be done in a storage control unit with a large global cache, where data can be moved between storage units outside of the production SAN. It cannot be done in clustered appliances with limited or no cache sitting in the middle of a production SAN. SAN-based storage virtualization appliances were designed to provide volume pooling, not data mobility or the ability to network storage to storage. Without this data mobility, we will be revisiting SAN-based virtualization in another five years and wondering why we still see 20 to 30% utilization.
[...] The NetApp V-Series is a cluster of two processors with separate caches and 16 FC ports that are not virtualized. It is designed for volume pooling, which does not solve the problem of networking storage to storage, as I described in my previous blog post. If you apply the recommendation of “n” times the performance and availability of the aggregate storage arrays behind it, that doesn’t leave much room for virtualization. [...]
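The “n times” point can be illustrated with back-of-envelope arithmetic. All figures below are hypothetical placeholders, not measurements or specifications of any product:

```python
# Back-of-envelope reading of the "n times the aggregate" sizing argument.
# Every number here is a hypothetical placeholder.
arrays_behind = 4
throughput_per_array_mb_s = 1500      # assumed per-array throughput
headroom_factor = 1.5                 # the "n" multiplier from the sizing rule

required = arrays_behind * throughput_per_array_mb_s * headroom_factor
appliance_capability_mb_s = 6400      # assumed appliance throughput ceiling

print(f"required: {required:.0f} MB/s, available: {appliance_capability_mb_s} MB/s")
print(f"left over for virtualization work: {appliance_capability_mb_s - required:.0f} MB/s")
```

With these made-up numbers, the sizing rule already demands more throughput than the appliance can deliver, which is the sense in which applying “n” times the aggregate leaves little room for virtualization itself.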