Big Data Storage Economics – Case Study #1
by David Merrill on Mar 2, 2012
Last week I posted an introductory blog about big data and some work I had done a few years back in this space (before it was called big data). I have several of these large TCO assessments in my library, but will share just 2 or 3 that tell the easiest story and make the point about price versus cost in big data storage infrastructures.
This first case study (circa 2009) is a large retailer with most of its revenue coming from web transactions. They used the Azure cloud platform, and had (at the time) 1,500 hosts and about 4PB of JBOD and rack mount storage. They were convinced that the seemingly low priced disk architecture connected to the server/node architecture was meeting their price and cost objectives. In fact, the sprawl of the cloud and big data systems (analytics) was on a rapid pace to overrun their data center, and they were on track to triple the data center size to keep up with the growing storage/rack space sprawl.
As you may know, normal capacity efficiencies (written-TB per disk) and overheads (most run bare-metal, no RAID) did not apply, so a new set of metrics had to be developed to not only show the unit cost of a usable TB, but also the unit cost of a written-to TB. Before we could make quantitative recommendations about reducing the cost of this big data cloud, we had to pause and measure unit costs of their environment. A blog post from 2009 outlines these simple concepts.
Our challenge was to show (let alone prove) that enterprise disk on a SAN was more cost effective than the JBOD rack approach. With shared volumes and virtual LUNs, we were able to technically show a solution that would work, but the price was much higher than that of the current disk solutions. We ended up with 9 cost categories for the TCO, and as you can guess, the TCO per usable TB was certainly in favor of the JBOD/rack disk (labeled here as direct attached or DAS).
As mentioned, the problem was disk sprawl, and our solutions (FC SAN or iSCSI) had a significant impact on the drive count. Note that the SAN was configured with 400GB drives, and the rackable DAS was 1TB drives.
In developing an economic story, some new metrics had to be applied. Looking at price per usable TB was incomplete, since the disk sprawl was driving up environmental and management costs. We adopted a unit cost per written-TB, and the resulting comparison turned upside-down.
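To make the difference between the two metrics concrete, here is a minimal sketch in Python. Every figure in it is a hypothetical placeholder chosen for illustration, not data from this customer, and the function names and the `utilization` parameter (the fraction of usable capacity that actually holds written data) are my own assumptions about how one might model this.

```python
# Illustrative only: all dollar figures, capacities, and utilization rates
# below are hypothetical placeholders, not numbers from the case study.

def cost_per_usable_tb(tco: float, usable_tb: float) -> float:
    """Total cost of ownership divided by usable capacity."""
    return tco / usable_tb

def cost_per_written_tb(tco: float, usable_tb: float, utilization: float) -> float:
    """Total cost of ownership divided by capacity actually written to.

    utilization is the fraction (0..1] of usable capacity holding data.
    """
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return tco / (usable_tb * utilization)

# Hypothetical DAS option: lots of cheap capacity, poorly utilized.
das = {"tco": 4_000_000.0, "usable_tb": 4000.0, "utilization": 0.20}
# Hypothetical SAN option: less capacity at a higher price, better utilized.
san = {"tco": 3_000_000.0, "usable_tb": 1500.0, "utilization": 0.70}

das_usable = cost_per_usable_tb(das["tco"], das["usable_tb"])
san_usable = cost_per_usable_tb(san["tco"], san["usable_tb"])
das_written = cost_per_written_tb(**das)
san_written = cost_per_written_tb(**san)

print(f"DAS: ${das_usable:,.0f} per usable TB, ${das_written:,.0f} per written TB")
print(f"SAN: ${san_usable:,.0f} per usable TB, ${san_written:,.0f} per written TB")
```

With placeholder inputs like these, the option that looks cheaper per usable TB becomes the more expensive one per written TB, which is the kind of inversion described above. The real result depends entirely on measured TCO and measured utilization.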
When measuring written-TB, we were able to get closer to metrics around total transaction cost, or the analytics query cost within this IT environment. Backup, network traffic, and migration on and off this big data environment all had quantifiable costs to the customer. At the end, we took a non-economic view of the problem, and captured a metric around carbon emissions for these 3 options. You can see these results here too.
Upon further analysis, we found that big data processing required the local CPU to do many mundane storage tasks, and extra processors had to be employed. We forget how much work RAID, controllers and intelligent array-based software do to offload the host. These extra server costs were added to the unit cost model to serve up a TB of capacity.
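As a rough illustration of folding those extra server costs into the unit cost, here is a small Python sketch. The function name, the idea of a flat annual cost per extra host, and all of the numbers are hypothetical assumptions for the example; the actual model used in the assessment is not reproduced here.

```python
# Illustrative only: host counts, per-host costs, and storage TCO are
# hypothetical placeholders, not figures from the case study.

def unit_cost_with_host_overhead(
    storage_tco: float,
    extra_hosts: int,
    annual_cost_per_host: float,
    years: int,
    written_tb: float,
) -> float:
    """Cost per written TB once extra host CPU capacity is charged to storage.

    extra_hosts is the number of additional servers needed because the hosts
    perform storage tasks that an array controller would otherwise offload.
    """
    if written_tb <= 0:
        raise ValueError("written_tb must be positive")
    host_overhead = extra_hosts * annual_cost_per_host * years
    return (storage_tco + host_overhead) / written_tb

# Same hypothetical storage TCO, with and without the extra hosts counted.
without_hosts = unit_cost_with_host_overhead(4_000_000.0, 0, 5_000.0, 4, 800.0)
with_hosts = unit_cost_with_host_overhead(4_000_000.0, 100, 5_000.0, 4, 800.0)

print(f"Per written TB, storage only:     ${without_hosts:,.0f}")
print(f"Per written TB, with extra hosts: ${with_hosts:,.0f}")
```

The point of the sketch is only that server costs attributable to storage work belong in the numerator of the unit cost, so leaving them out understates what it costs to serve up a TB.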
I cannot go into the final actions and results of this case study, except to say that we changed a lot of minds about which metrics to measure and use when building big data cloud infrastructures. Don’t confuse price and cost, and look at a longer time horizon when planning and building big data storage infrastructures.
Some related readings on big data and cloud cost/economic concepts:
For other posts on maximizing storage and capacity efficiencies, check these out: http://blogs.hds.com/capacity-efficiency.php