Big Data Is About Turning Content Into Appreciating Assets
by Michael Hay on Jun 7, 2011
I was inspired by Doug Henschen’s article in InformationWeek on Big Data in which he hypothesizes that Big Data is bigger than data warehousing. More specifically, he explores whether the data warehousing concept of ETL is also an important facet of Big Data.
I agree with his hypothesis and would like to take it further. To do so, we first need to understand why Big Data is emerging now. For me, the big “aha” moment came from the idea that Big Data is really about turning data/content from a depreciating asset or liability into an appreciating asset. My colleague Bill Burns describes this for healthcare content in the following way.
Today data/content in healthcare is the only form that actually becomes more valuable over time. The fact is the interactions you get today when combined with previous interactions and future interactions compounds to establish both qualitative and quantitative trends. It is this compounding effect of data that makes healthcare data unique and more than the sum of its parts. (Bill Burns, 2011)
This got me wondering: if this is possible for healthcare data, then why not for other classes of data? Perhaps Big Data is really about applying this concept to other disciplines. In hindsight this, like many other things, is obvious, but I would argue it is not “well understood” in the industry. After all, we have long attacked unstructured content as a kind of enemy, with technologies like data shredding, capacity reduction, cold media, HSM, etc. I’ll be the first to say that we need these tools to make better and more efficient use of resources, but the tone in which these technologies are applied treats unstructured content as a liability rather than an asset.
Big Data and Retirement Portfolios
I want to be cognizant of discussions asserting that the Big Data-verse means nothing. In one sense I agree, as projects like SAMBA, Perl, C/C++ and *NIX OSes have long deployed primitive “Big Data” technologies like key-value stores. However, setting aside the striking similarities in the technologies, there is a difference in the market: content savings have grown to the point where there are terabytes in the home and soon exabytes in the enterprise.
In fact, I have blogged in the past about how my own content savings left me lusting for easy-to-use storage resource management and disaster tools for the home. To answer my question of why Big Data is emerging now, I’m going to relate Big Data and content accumulation to a retirement portfolio.
When you start saving for retirement, your portfolio is small and you don’t think much about what will happen to it in the future, or about how quickly it may grow. At some point, though, you acquire enough wealth that you realize you need to seek advice, tools, or a mix of the two to accelerate the appreciation of your retirement portfolio, as well as to make critical decisions about how to use your retirement savings.
Now, if we think about this concept of savings for content instead of money, I believe we can begin to understand why Big Data is emerging now. Namely, as an industry we are transitioning from the petabyte to the exabyte, and in parallel we’ve accumulated enough time and performed sufficient introspection to desire more from our content. We want tools and expertise that unleash the hidden potential in our content savings to assist us in making critical business decisions, such as the derivation of new lines of business or, in the case of healthcare, improved well-being.
While this is extraordinarily obvious (as are patents after you invent them), what I think is not obvious is that getting more out of content is not a localized phenomenon. Instead, enough companies, groups and individuals are now contemplating this issue that it has moved from being merely a whisper to being a real, sustainable trend.
Of course, we still have the usual cycle to work through as an industry, and this will result in a movement from the vague to the specific. However, this somewhat simple metaphor at least helps me put things in perspective. Now that I have proposed my definition of Big Data (transforming content into an appreciating asset), in future posts I’ll begin explaining why I think that Big Data is more than just data warehousing.