What is required to network storage to storage?
by Hu Yoshida on Apr 6, 2007
I spoke at a SUN storage event in Burlington last week and had a chance to compare notes with Randy Chalfant, a CTO at SUN. Randy spends a lot of his time with customers analyzing storage requirements. He confirmed that utilization of storage for open systems today is only about 30% as I posted in my blog last week.
I remember having similar conversations about 10 years ago, when capacity utilization was at the same low levels. Back then, most open systems storage was direct-attached to servers, creating islands of storage with average capacities in the 1 to 2 TB range. The game-changing technology of that era was the Fibre Channel SAN fabric. In addition to faster FC transfer speeds and extended addressing capability, the SAN enabled multiple host servers to connect to multiple storage systems. IT shops set about acquiring host bus adapters, switches, and FC storage, rewiring their data centers, and installing HBA drivers and SAN management software to realize the benefits that SAN promised.
SANs delivered on the promise to network host servers to common storage systems, and for a while we saw the consolidation of storage frames from six or eight 1 to 2 TB frames into one 18 to 20 TB frame. In 2001 and 2002 we saw the rate of capacity growth slow dramatically, to between 25% and 35%. Consolidation of storage appeared to be working, and some shops claimed that they were at 80% utilization.
Now, 10 years after the adoption of SAN, we see capacity utilization of about 30% and compounded capacity growth rates of 60% to 100%. What happened?
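A back-of-the-envelope calculation (my own illustration, using the figures in this post, not numbers from any specific customer) shows why a one-time consolidation win gets swamped at these growth rates: the headroom freed by raising utilization from 30% to 80% is consumed in a year or two when demand compounds at 60% to 100% annually.

```python
import math

def years_to_consume_headroom(util_before, util_after, annual_growth):
    """Years until demand compounding at `annual_growth` uses up the
    capacity freed by raising utilization from util_before to util_after."""
    # The freed headroom, expressed as a demand multiplier: the same
    # hardware can serve util_after/util_before times as much data.
    multiplier = util_after / util_before
    return math.log(multiplier) / math.log(1 + annual_growth)

# Consolidation from 30% to 80% utilization, at the growth rates above:
print(round(years_to_consume_headroom(0.30, 0.80, 0.60), 1))  # ~2.1 years at 60% growth
print(round(years_to_consume_headroom(0.30, 0.80, 1.00), 1))  # ~1.4 years at 100% growth
```

In other words, even a very successful consolidation project only buys a couple of years unless the underlying efficiency problem is also addressed.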
The analysts at ESG, Tony Asaro and Steve Duplessie, point out that while SANs network the servers to storage, the storage is not networked to storage: “Aside from being on the same network, individual storage systems do not, in any way, interact with each other. They do not work in concert, but rather separately: individually; discretely; and therefore inefficiently.”
Virtualization of storage is intended to solve this problem, but not in the way that many vendors are approaching it.
Many vendors have taken the volume pooling approach: they insert a cluster of servers in the middle of the SAN, run volume management software on it to aggregate extents from external storage systems, and remap those extents into virtual volumes that are presented to the host servers. The problem with this approach is that it inserts a layer of unnecessary complexity between the host and the storage, and it lacks the data mobility needed for the storage systems to interact with each other. It is a static mapping across multiple storage systems, not an interactive network of storage to storage, and it will not address the problem of 20% to 30% utilization.
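To make the objection concrete, here is a toy sketch (my own, not any vendor's actual implementation) of the static extent remapping that volume pooling performs. Extents from several arrays are concatenated into one virtual volume, and reads and writes are translated through a fixed table. Note that the map is built once and never rebalanced; nothing in this layer lets the arrays cooperate or move data between themselves.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    array: str      # which physical storage system holds the extent
    start_lba: int  # starting logical block address on that array
    blocks: int     # length of the extent in blocks

class VirtualVolume:
    """A virtual volume stitched together from extents on several arrays."""

    def __init__(self, extents):
        self.extents = extents  # static table, fixed at creation time

    def resolve(self, lba):
        """Translate a virtual LBA to (array, physical LBA)."""
        offset = lba
        for ext in self.extents:
            if offset < ext.blocks:
                return ext.array, ext.start_lba + offset
            offset -= ext.blocks
        raise ValueError("LBA beyond end of virtual volume")

# Two extents on two different arrays, presented as one volume:
vol = VirtualVolume([Extent("array_a", 0, 1000), Extent("array_b", 500, 2000)])
print(vol.resolve(250))   # -> ('array_a', 250)
print(vol.resolve(1500))  # -> ('array_b', 1000)
```

The lookup works, but the two arrays remain strangers to each other, which is exactly the point made above.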
The reason I call this unnecessary complexity is that volume mapping is already being done in the control unit of the storage system. Compared to a cluster of two servers, each with 4 GB of cache, a Hitachi USP control unit has 128 processors that access storage through a 256 GB global cache. With these resources, the USP provides a large menu of storage services to support mirroring, replication, migration, and copy-on-write without disruption to the host applications. The USP provides data mobility, provisioning, and security services that enable the efficient networking of storage to storage. Simply by attaching other storage systems through its Fibre Channel ports (there are 192 physical ports, each of which can be virtualized into 1,024 virtual ports), the storage services of the USP are extended to externally attached storage. This virtualization approach effectively networks the externally attached storage together.
In order to solve the storage utilization problem, the industry must implement storage virtualization. However, the storage virtualization solution must provide the data mobility that is required to network storage to storage.
I would like to suggest another reason for low utilization: too many containers. For example, in UNIX a typical server has many volume groups, and each of those volume groups has many, many file systems.
Typically, each file system will have room for 30% to 50% growth. Even though each file system may be very small, they all add up to a big number. I think what we need are some good white papers to share with our DBA folks on how best to design these types of containers for maximum efficiency.
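The commenter's point can be illustrated with some quick arithmetic (the numbers below are hypothetical, chosen only to show the effect): per-file-system headroom that looks modest in isolation adds up to a large stranded total across a server.

```python
# (size_gb, used_fraction) for a server with many small file systems,
# each provisioned with roughly 30-50% growth room as described above.
# These figures are invented for illustration.
filesystems = [(20, 0.60), (10, 0.50), (40, 0.65), (15, 0.55)] * 10  # 40 file systems

total = sum(size for size, _ in filesystems)
used = sum(size * frac for size, frac in filesystems)
print(f"provisioned: {total} GB, used: {used:.1f} GB, "
      f"utilization: {used / total:.0%}")
```

Even with every individual file system looking "reasonably full," the server as a whole strands hundreds of gigabytes, and the effect multiplies across every server in the shop.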
If anyone can post a link to papers on this subject, or anything related to it, I would be much obliged!