Storage, Data, Information
by Hu Yoshida on Mar 29, 2007
When you try to explain a concept to another person who is not a native speaker of your language, in my case American English, it often helps to clarify your own thinking. Last week I was interviewed by a reporter from Taiwan who wanted to know why HDS has not endorsed ILM (Information Life Cycle Management) and why we prefer to use DLM (Data Life Cycle Management). It appeared to her that ILM is an evolution of DLM and that all the major storage vendors have gone to ILM.
First I tried to explain the definitions of storage, data, and information. Storage is the container. Data is the content that is stored, and information is what one infers from deciphering the data. Storage and data are very tangible, while information is very much like beauty: it is in the eye of the beholder.
To illustrate this I used the example of a receipt that I had from the breakfast I enjoyed that morning in the hotel. The receipt was the data and it was delivered to me in a small leather folder which served as the container or storage. The information that I received from this receipt was the amount that I would have to pay. When the server retrieved the receipt, the receipt informed her that I was a guest at the hotel and that I appreciated her service since I left a gratuity. The restaurant owner received information as to which room account to charge the breakfast. Others like the hotel accounting system, my credit card provider, and eventually the HDS expense account system will extract their own information from the same piece of data.
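For readers who think in code, the receipt example can be sketched roughly as follows. This is only an illustration under stated assumptions: the field names, the room number, and the amounts are all made up, and the three "view" functions are hypothetical stand-ins for the guest, the server, and the restaurant owner. The point is that the folder (storage) and the receipt (data) stay the same, while each reader derives different information from them.

```python
# Storage: the container. Data: the content. Information: what each reader infers.
# All field names and values below are invented for illustration.

receipt = {                      # the data
    "room": "1204",              # hypothetical room number
    "items": {"breakfast": 18.00},
    "gratuity": 4.00,
    "signed": True,
}

leather_folder = {"contents": receipt}   # the storage that holds the data


def guest_view(data):
    """The guest reads the amount to pay for the meal."""
    return {"amount_due": sum(data["items"].values())}


def server_view(data):
    """The server learns the diner is a hotel guest and left a gratuity."""
    return {
        "is_hotel_guest": data["room"] is not None,
        "left_gratuity": data["gratuity"] > 0,
    }


def owner_view(data):
    """The restaurant owner learns which room account to charge."""
    return {"charge_to_room": data["room"]}


# Same storage, same data, three different pieces of information.
data = leather_folder["contents"]
for view in (guest_view, server_view, owner_view):
    print(view.__name__, view(data))
```

Nothing in the folder or the receipt changes between the three calls; only the question being asked of the data changes.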
Unlike storage and data, information is temporal and transactional. Storage and data are tangible and have a definite life cycle, which involves birth or creation, different stages of activity, and eventual death, meaning decommissioning and disposal. At HDS we know what a data life cycle is, and we can build storage products that are optimized for the life cycle of data and the applications that access that data during its life.
I have no clue what information life cycle means. All the definitions I have heard so far sound like data life cycle management. HDS is a storage company and our core competence is storage and data services. Unlike all the other storage vendors, we are still stuck on DLM.
In China, the government assigns official kanji for storage, data, and information.
The kanji for storage contains two characters, which represent “keep” and “store.”
The kanji characters for data represent “number.”
The kanji characters for information represent “letter or correspondence” and “news.”
Their definitions are pretty close to mine.
Comments (3)
perhaps another analogy — information is like beauty… “in the eye of the beholder”
Nicely put! But the marketing guys don’t care. Executives – those with big check books – think that Information is more valuable than Data (it is), and thus they will pay more for products around information than data. Unfortunately, the products and solutions being sold as ILM products are really, as you state, Data oriented.
On this note, whatever happened to the “Data Processing Department”? It became the IT, or IS, department. Why? Because they realized that the perception of value was higher with the new name, even though the reality of the product/service they delivered was no different.
Fixed Content Fixations: http://www.storageswitch.com/blog
Perhaps an information lifecycle can be thought of as consisting of the following stages:
1. Creation – When a set of data first becomes useful to at least one person. The receipt itself is just data, but when presented to you it becomes information. The fact that you could decipher the amount you owed meant that some information had been created in the context of your having breakfast at the hotel.
2. Assimilation – When more data is added to extend the usefulness of the information created, for instance adding a tip and signing the receipt before handing it back to the server. One may argue that the added signature is assimilation of data and not information. But suppose you signed your name ‘Hu Yoshida’: if only data were being assimilated, it would be interpreted as ‘h’ ‘u’ ‘y’ ‘o’… The fact that the server knew you had left her a gratuity, and hence inferred that you appreciated her service, seems to indicate that information was being assimilated here.
3. Dissemination – To distribute the modified information to those who need access to it. In your example, these are the credit card providers and the HDS expense account personnel.
4. Protection/Encryption – To ensure that the private data on the receipt (credit card details, perhaps) does not become information for unauthorized people to exploit. Nobody cares about protecting data per se if one can guarantee that the potential information in the data is protected.
5. Archival – Company policy often dictates that expense records be kept for a year. Clearly archival of data is done so that people can retrieve information, not retrieve data. In the case of your example, someone may want to retrieve the date and amount of the transaction.
6. Demise – When all transactions are complete and the information is of no consequence to any entity, living or non-living. Note that the hotel ‘receipt’ could still be physically lying around, crushed and thrown into a dustbin by expense account personnel, but for all practical purposes the information that had been created at its inception is dead.
To put it in a nutshell, IMHO, though the terms data and information are often used interchangeably, if you can give someone information without data, they will take it, but if you give the same person data without information, they will probably refuse. Thus the onus is on information, which is probably why most companies use the term ‘Information’ Life Cycle Management rather than ‘Data’ Life Cycle Management.