David Merrill's Blog - The Storage Economist

Comparing all the costs in a 1-cent-a-GB cloud offering

by David Merrill on Sep 20, 2012

Last week I wrote about Amazon’s new Glacier offering, with cloud storage set at $0.01/GB/month. Let’s take a closer look behind the headlines to compare and contrast the total cost, and not get hung up on this very attractive price point. Before continuing, I had some colleagues run a couple of comparative configurations using average street pricing for HDS (HCP) and competitive solutions for a small (50 TB) and a large (500 TB) environment. With an average street price (ASP) and a total cost calculation, we can better see how owning the storage asset yourself compares with the new price offering from Amazon.

A quick summary of the offers can be found on the Glacier pricing web site or below:

                            Amazon Glacier      Owning a Small    Owning a Large
Usable Capacity             Both 50 & 500 TB    50 TB             500 TB
Price* per GB/month         $0.011              $0.055            $0.0255
OPEX** & CAPEX per GB/month $0.011              $0.070            $0.0303
On-boarding Cost            $0.055/GB           None              None
Retrieval Cost              $0.10/GB            None              None

To create a more balanced view of all costs, you would need to add the on-boarding cost (up-front, one-time) and the retrieval costs (when requested) to Amazon’s run rate of $0.01 per GB/month. There is also a cost to the business of waiting for data retrieval (in these examples I have set a low rate of $200/hour).
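
To make that arithmetic concrete, below is a minimal Python sketch of the total cost model. The rates come from the table above; splitting the owning cost into an up-front purchase (the per-GB/month price amortized over the four-year depreciation mentioned in the notes) plus a flat monthly OPEX is my own reconstruction, not necessarily the model behind the charts.

    # Minimal sketch of the total-cost comparison. Rates come from the
    # table in this post; the owning-cost split (up-front purchase plus
    # monthly OPEX) is an assumption for illustration.

    def glacier_cost(months, capacity_gb,
                     run_rate=0.011,        # $/GB/month at rest
                     onboard=0.055,         # $/GB, one-time on-boarding
                     retrieval_price=0.10,  # $/GB retrieved
                     retrieval_frac=0.0,    # share of capacity retrieved per month
                     wait_hours=0.0,        # monthly business wait time
                     wait_rate=200.0):      # $/hour cost of waiting
        """Cumulative Glacier cost after `months`, on-boarding included."""
        monthly = (capacity_gb * run_rate
                   + capacity_gb * retrieval_frac * retrieval_price
                   + wait_hours * wait_rate)
        return capacity_gb * onboard + months * monthly

    def owning_cost(months, capacity_gb, price_rate, total_rate,
                    depreciation_months=48):
        """Cumulative owning cost: purchase up front, then flat OPEX."""
        upfront = capacity_gb * price_rate * depreciation_months
        monthly_opex = capacity_gb * (total_rate - price_rate)
        return upfront + months * monthly_opex

With the default retrieval fraction of zero, glacier_cost models exactly the data-at-rest case described next.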

In this approach, the on-boarding cost is a tariff that has to be factored in at the start. In other words, if the retrieval rate is zero, meaning you put the data there and never access it, the total costs for one year would certainly favor Glacier. Data at rest that is never retrieved would indeed have a very attractive total cost curve.

Predicting a zero access rate and zero business impact from data retrieval is, however, not very likely.

Now if I change the retrieval rate to just 5% of the capacity per month, with a total monthly wait time of 20 hours, we start to see some total cost convergence, or parity, in a one-year timeframe. The slope of the Glacier curve is steeper due to Amazon’s charges to retrieve data and the business impact of waiting for that data to be ready and presented. In the model below, these two factors change the slope of the Glacier total cost curves:
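
Plugging the 5% retrieval scenario into the sketch above locates the cross-over points. These months come from my reconstruction, so they will not match the original charts exactly, but the mechanics are the same:

    # 5% of capacity retrieved per month, 20 hours of monthly wait time.
    # Capacities in decimal GB for simplicity.
    for capacity, price_rate, total_rate in [(50_000, 0.055, 0.070),
                                             (500_000, 0.0255, 0.0303)]:
        for month in range(1, 121):
            g = glacier_cost(month, capacity,
                             retrieval_frac=0.05, wait_hours=20)
            o = owning_cost(month, capacity, price_rate, total_rate)
            if g >= o:
                print(f"{capacity // 1_000} TB: Glacier cumulative cost "
                      f"overtakes owning at month {month}")
                break

Under these assumptions the 50 TB environment crosses over well before the 500 TB one, consistent with the faster cross-over for the smaller system noted in the conclusions below.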

The conclusions are fairly simple with this modeling approach:

  • If data is not accessed frequently, and can tolerate a relaxed RTO (recovery time objective), then cloud offerings such as Glacier will have a good total cost story
  • We have to look beyond a steady-state price to really compare and contrast other offerings
  • Your mileage may vary with regard to power, labor, retrieval rates, network costs and data protection (which I did not include in these cost models), so you have to understand the operational costs and data access requirements before being too impressed with an initial price offering
  • Legal and compliance requirements can turn these kinds of calculations upside-down; make sure your inactive data really can be off-site
  • Read the fine print; make sure you understand all life-cycle costs
  • There can be economic surprises over time, like migration costs or maintenance bubbles. Make sure that your time horizon is appropriate.
  • There will be real performance and availability differences with some of these kinds of services. Make sure the data classification and catalog definitions are commensurate with the service you are purchasing
  • Total cost curves will behave very differently for small, medium and large workloads. These two samples of 50 and 500 TB may not present the best overall cost performance in these examples. I did find it interesting that the smaller system had a faster cross-over point; this, again, would be due to the burden of the retrieval costs being large compared to the monthly run rate

I have been asked some questions about the economics of these cloud offerings (based on my initial blog post):

What do you see this type of offering used for?

  • Low-cost, inactive data, possibly for long-term storage such as compliance

Is this a play for archive data or backup?

  • Archive data usually requires an index or lookup feature. I do not think that 4-5 hours to restore data would meet many current or active archive requirements
  • Backup could also be a target, as well as tape replacement
  • See the Amazon use cases put forward on their web site

I don’t want this price analysis exercise to imply that I am anti-cloud. I am a firm believer in cloud architectures and options. We cannot, though, be seduced by low prices offered by a cloud vendor or storage vendor. Do some basic math and understand all the CAPEX and OPEX costs that have to be factored into a total cost comparison. That way you are better positioned to deliver options that reduce both total cost and CAPEX price.

 

Notes:

* For the unit price we took several kits from HDS and competitive modular offerings. This price includes the tax impact of a four-year depreciation at a 32% marginal tax rate.

** For the OPEX calculation I used $0.16 per kilowatt-hour for electricity, $65 per month for rack/raised-floor space, a $100K fully burdened labor cost, and 200 labor hours per year of minimal administration for the large capacity (100 hours for the small). The growth rate, once in the archive, is zero percent for this total cost comparison. The on-boarding cost is not a recurring cost, so the slope (or vector) of the total cost curve would be roughly the same, given the modeling conditions, for each of the next few years.
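
As a back-of-envelope check on how these inputs become a per-GB/month OPEX figure, here is the arithmetic for the small system. The 5 kW power draw and single rack are placeholder assumptions, so the result only illustrates the mechanics; the table’s $0.070 all-in figure implies roughly $0.015/GB/month of OPEX on top of the $0.055 price, so the placeholders land only in the right neighborhood.

    # Back-of-envelope OPEX for the small (50 TB) case. Rates come from
    # the notes above; the 5 kW draw and single rack are placeholders.
    capacity_gb = 50_000
    electricity = 5 * 24 * 30 * 0.16             # 5 kW at $0.16/kWh
    rack        = 65                             # $/month raised floor
    labor       = (100 / 12) * (100_000 / 2080)  # 100 hrs/yr, ~$48/hr
    opex_per_gb = (electricity + rack + labor) / capacity_gb
    print(f"OPEX ≈ ${opex_per_gb:.4f} per GB/month")  # ≈ $0.0208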
