Who took my Storage?
by Hu Yoshida on Jul 2, 2007
No matter how much storage capacity you buy, it never seems to be enough. Jon Toigo gave a presentation several years ago in which he described storage as being oversubscribed and underutilized. He developed a chart of storage utilization, which I have taken and modified slightly, but it is essentially the same as he presented.
We start with a base capacity which holds the actual data. Some people focus on the data portion as the problem, in terms of what should be stored for business purposes. I am not going to address that at this time and shall leave that discussion for later. The choice of storage for this data, whether it is expensive tier 1 or lower cost tier 2 or 3, will depend on the availability, performance, and replication requirements of the data.
Next is allocated but unused capacity. When application owners or database administrators ask for an allocation of storage capacity, they ask for more than they need to avoid running out of capacity in the middle of a production run. If their expected growth is 25% per year, they may ask for 50% more just in case growth exceeds expectations. Some may even double their allocation or request enough to carry them for 2 or 3 years, since storage is cheap. Unfortunately, when they buy all this allocation at today’s cheap price, they increase their TCO over time, because they pay today’s price for capacity that will sit idle until it is needed. While storage capacity is cheap today, it will be cheaper tomorrow. That is one of the laws of storage that has been true for 50-plus years.
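To make the TCO point concrete, here is a minimal sketch that compares buying several years of capacity up front with buying each year's growth at that year's price. The 25% growth rate comes from the example above; the starting capacity, the price per TB, and the annual price decline are purely illustrative assumptions, and the function names are hypothetical.

```python
def upfront_cost(start_tb, growth, years, price_per_tb):
    """Cost of buying today all the capacity needed by the end of `years`."""
    final_tb = start_tb * (1 + growth) ** years
    return final_tb * price_per_tb


def incremental_cost(start_tb, growth, years, price_per_tb, price_decline):
    """Cost of buying, at the start of each year, only the capacity needed
    by the end of that year, at that year's (assumed declining) price."""
    total = 0.0
    owned_tb = 0.0
    for year in range(years):
        needed_tb = start_tb * (1 + growth) ** (year + 1)
        price = price_per_tb * (1 - price_decline) ** year
        total += (needed_tb - owned_tb) * price
        owned_tb = needed_tb
    return total


if __name__ == "__main__":
    # Assumed figures: 100 TB today, $1000 per TB, prices falling 30% per year.
    print(upfront_cost(100, 0.25, 3, 1000))
    print(incremental_cost(100, 0.25, 3, 1000, 0.30))
```

Both strategies end up with the same capacity after three years. Under these assumptions the incremental buyer pays less, and the gap widens as the price decline or the planning horizon grows.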
The next layer is the copies of data. Since the introduction of point-in-time copies and storage replication technology in the storage control unit, storage administrators have been making more and more copies for backup, data mining, data distribution, development test, business continuance, and other parallel processing purposes. Jon had originally estimated this at 4 to 5 copies. My experience talking to customers today suggests that there may be 10 to 20 copies or more if the copies are not diligently recycled. Unless copy-on-write technology is applied, these copies replicate all the allocated but unused capacity as well.
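The multiplying effect of copies can be sketched with a little arithmetic. The functions below are a simplified model, not a description of any particular product: full copies are assumed to duplicate the entire allocation, and each copy-on-write snapshot is assumed to hold only a fixed fraction of the used data that has changed. The function names and the `changed_fraction` parameter are hypothetical.

```python
def full_copy_footprint(allocated_tb, copies):
    """Primary volume plus `copies` full copies, each of which replicates
    the whole allocation, including the unused part."""
    return allocated_tb * (1 + copies)


def cow_footprint(used_tb, allocated_tb, copies, changed_fraction):
    """Primary volume plus `copies` copy-on-write snapshots, each assumed
    to store only the changed fraction of the data actually in use."""
    return allocated_tb + copies * used_tb * changed_fraction


if __name__ == "__main__":
    # Assumed figures: 100 TB of data in a 150 TB allocation, 10 copies,
    # 10% of the data changed per snapshot.
    print(full_copy_footprint(150, 10))
    print(cow_footprint(100, 150, 10, 0.10))
```

With these assumed figures, ten full copies turn a 150 TB allocation into well over a petabyte, while the copy-on-write case stays within a few hundred terabytes.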
The next layer is stranded storage. This occurs when there is unallocated capacity on a storage system which is available but new application owners prefer to spend their budget on new storage. They may do this to get the latest new control unit feature or because they do not want to compromise their data security or performance by sharing their storage resources with other applications. That capacity on the old system is stranded and no one uses it.
The next layer is what I call lead time buffer. Just as application owners must plan for enough capacity to support their applications, storage administrators must plan for enough storage to support their storage users. Adding storage capacity can be a lengthy process. The procurement process may take weeks or even months, especially if an RFQ bid is required. Then ordering, delivery, installation, test, formatting, and provisioning add to this time. As a result, operations people overbuy storage as a lead time buffer for the procurement and provisioning process. Some buy once a year. Some buy every quarter. Some lock themselves into a vendor at a set price for 3 to 5 years so that they can get a fixed amount of storage every quarter. Unfortunately, this also locks them into that vendor’s technology for that period and can result in higher costs if their growth exceeds the agreed-to capacity.
On top of all this, you have to use RAID protection for your storage. If it is RAID 1, you double all your storage capacity. If you use RAID 5, you can use one parity drive over a RAID group of multiple drives and reduce the overhead from 1:1 to 1:3 or 1:7 (one parity drive for every three or seven data drives), depending on the configuration that meets your needs. While RAID 1 and RAID 5 protect against single drive failures, large capacity drives may need RAID 6, which has two parity drives per RAID group, in order to protect against a second drive failure during the rebuild of a single drive failure.
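The RAID overhead described above reduces to simple fractions. The sketch below uses hypothetical helper names and covers only the three RAID levels mentioned here: mirroring for RAID 1, one drive's worth of parity per group for RAID 5, and two for RAID 6.

```python
def usable_fraction(raid_level, drives_per_group):
    """Fraction of raw capacity left for data in one RAID group."""
    if raid_level == 1:
        return 0.5  # every drive is mirrored
    if raid_level == 5:
        parity_drives = 1
    elif raid_level == 6:
        parity_drives = 2
    else:
        raise ValueError("only RAID 1, 5 and 6 are modelled in this sketch")
    if drives_per_group <= parity_drives:
        raise ValueError("RAID group is too small for this RAID level")
    return (drives_per_group - parity_drives) / drives_per_group


def raw_needed(usable_tb, raid_level, drives_per_group):
    """Raw capacity required to deliver `usable_tb` of protected capacity."""
    return usable_tb / usable_fraction(raid_level, drives_per_group)


if __name__ == "__main__":
    print(usable_fraction(1, 2))  # mirrored pair
    print(usable_fraction(5, 4))  # 3 data + 1 parity
    print(usable_fraction(5, 8))  # 7 data + 1 parity
    print(usable_fraction(6, 8))  # 6 data + 2 parity
```

A 7+1 RAID 5 group gives up one eighth of its raw capacity to parity, while an eight-drive RAID 6 group gives up a quarter in exchange for surviving a second drive failure.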
So who is taking your storage? Your application users, database administrators, and storage administrators are all taking your storage for legitimate business reasons, based on the technology that they are using. However, with the right technology and storage architecture, you can take back your storage by employing different cost tiers, reducing the amount of allocated but unused capacity, reducing the number of copies, eliminating stranded storage, reducing the lead time buffer, and reducing the RAID overhead.
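To see how the layers compound, the following rough sketch stacks them in the order discussed in this post and reports how much raw capacity ends up supporting a given amount of actual data. The model is a simplification, and every number in the example is an illustrative assumption rather than a measurement; the function and parameter names are hypothetical.

```python
def raw_capacity_needed(data_tb, over_allocation, copies,
                        stranded_fraction, buffer_fraction,
                        raid_usable_fraction):
    """Raw capacity implied by stacking each layer on top of the data."""
    # Allocated but unused capacity on top of the actual data.
    allocated_tb = data_tb * (1 + over_allocation)
    # Full copies replicate the whole allocation.
    with_copies_tb = allocated_tb * (1 + copies)
    # Stranded storage and lead time buffer are modelled as fractions of
    # installed usable capacity that hold nothing.
    usable_tb = with_copies_tb / (1 - stranded_fraction - buffer_fraction)
    # RAID protection consumes part of the raw capacity.
    return usable_tb / raid_usable_fraction


def utilization(data_tb, raw_tb):
    """Share of raw capacity that holds the actual data."""
    return data_tb / raw_tb


if __name__ == "__main__":
    # Assumed: 10 TB of data, 50% over-allocation, 4 full copies,
    # 10% stranded, 15% lead time buffer, RAID 5 in 3+1 groups.
    raw = raw_capacity_needed(10, 0.5, 4, 0.10, 0.15, 0.75)
    print(raw, utilization(10, raw))
```

Under these assumptions, well over 100 TB of raw capacity stands behind 10 TB of data, which puts utilization in the single digits.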
In my next post I will address how new storage architectures can address each of these elements of capacity utilization.
Comments (2)
[...] July 9th, 2007 In my last post I talked about where all our storage is going and developed the following chart. [...]
I often steal storage in another area. I will often keep 2 LDEVs per Array Group unallocated in case we need to migrate LDEVs to them using Cruise Control/Volume Migrator in the future.
I’ve been caught out in the past with hot Array Groups, wanting to migrate some LDEVs to cooler AGs only to find that there are no free LDEVs to migrate onto.
I guess this “could” come under the storage admin’s lead time buffer that you mention. However, I will always try to keep these 2 or 3 LDEVs free if possible.
Oh, and if AGs are generally hot I may keep more free to keep the IOPS demand from overloading the underlying disks.
And finally, I recommend, and even have customers recommend to me, that certain performance-sensitive applications get their own AGs. E.g. I’ve done some work for a customer who has AGs dedicated to Exchange data and others to Exchange logs. They have tons of free space on them that will only be used as the Exchange environment grows, and even then only up to a certain performance limit within the AG.
It seems there are endless pits or “black holes” into which we throw our storage.