Compression, De-duplication, Encryption: Which Comes First?
March 10th, 2008
The rapid growth of data is driving the need to reduce what we store, not only to lower the capital cost of storage capacity but also to lower the operational cost of managing an ever-increasing amount of data.
In response, the storage industry has developed technologies such as compression, single instance store, and de-duplication, which are rapidly being deployed this year. You can find solutions that are deployed in the application server, the SAN, the storage array, or the backup/archive appliance. Careful consideration must be given to where you deploy these tools, since the location will affect performance and effectiveness, as I pointed out in my previous post.
Another technology that we will see more businesses deploying this year is encryption. The primary driver is the need to protect personal data, which was highlighted by too many well-publicized incidents over the past few years. Here too there will be many choices for deployment. Vendors will offer solutions for data in flight and data at rest. Application vendors, HBA vendors, switch vendors, storage vendors, backup vendors, archive vendors, and even disk drive vendors will all provide options. As with data reduction, where you deploy this technology will make a difference in performance and effectiveness.
You will also need to consider where you do encryption in relation to where you are implementing data reduction tools like compression, single instance store, and de-duplication.
The objective of data reduction tools is to find similar bit strings or instances of data and eliminate them from the data stream while preserving the ability to retrieve the content. Encryption, on the other hand, transforms the bits so that the bit stream looks random, and it typically produces different ciphertext for the same plaintext (for example, under a different key or initialization vector). So if you encrypt before you do data reduction, you will most likely negate the effects of data reduction. Data that is being backed up or archived for compliance will most likely require encryption. This type of data also contains a lot of redundancy, so it stands to benefit the most from data reduction.
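The effect of ordering can be seen in a small experiment. The following is a minimal sketch in Python, not a description of any vendor's implementation. The function `toy_encrypt` is a hypothetical stand-in built from SHA-256 purely so the example runs with the standard library; it is not a vetted cipher, and the sample data is made up for illustration.

```python
import hashlib
import os
import zlib


def toy_encrypt(data: bytes, key: bytes, nonce: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256 counter-mode keystream.

    Illustration only; NOT a vetted cipher. Use a real library in practice.
    Applying it twice with the same key and nonce returns the original data.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


# Highly redundant sample data, like repeated records in a backup stream.
plaintext = b"customer record: name, address, account number\n" * 2000
key = os.urandom(32)

# Order 1: compress, then encrypt. The size is set by the compression step.
compressed = zlib.compress(plaintext)
reduce_then_encrypt = toy_encrypt(compressed, key, os.urandom(16))

# Order 2: encrypt, then compress. The ciphertext looks random to zlib.
ciphertext = toy_encrypt(plaintext, key, os.urandom(16))
encrypt_then_reduce = zlib.compress(ciphertext)

# Two encryptions of the same plaintext with different nonces differ, so a
# hash-based de-duplication step would not recognize them as duplicates.
second_ciphertext = toy_encrypt(plaintext, key, os.urandom(16))
same_fingerprint = (hashlib.sha256(ciphertext).digest()
                    == hashlib.sha256(second_ciphertext).digest())

print("original bytes:             ", len(plaintext))
print("compress then encrypt:      ", len(reduce_then_encrypt))
print("encrypt then compress:      ", len(encrypt_then_reduce))
print("duplicate ciphertexts match:", same_fingerprint)
```

On redundant input like this, the compress-then-encrypt result should be a small fraction of the original size, while the encrypt-then-compress result should be about as large as the original, and the two ciphertexts of identical data should not match.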
If you are considering the use of data reduction tools today and see the need to use encryption in the future, or vice versa, you should plan to do the data reduction before you do the encryption. The Hitachi HCAP platform for active archive will incorporate single instance store with encryption under the covers, so you won’t have to worry about it.
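As a rough illustration of the "reduce first, encrypt last" ordering, here is a minimal Python sketch of a single-instance store over fixed-size blocks. It is an assumption-laden example, not how HCAP or any other product works: the names `reduce_then_encrypt`, `restore`, and `placeholder_cipher` are hypothetical, the 4096-byte block size is arbitrary, and the XOR placeholder stands in for a real cipher only so the example is self-contained.

```python
import hashlib
import zlib
from typing import Callable, Dict, List, Tuple


def reduce_then_encrypt(
    data: bytes,
    encrypt: Callable[[bytes], bytes],
    block_size: int = 4096,
) -> Tuple[List[str], Dict[str, bytes]]:
    """De-duplicate fixed-size blocks, compress each unique block, encrypt last.

    Returns (recipe, store): recipe lists block fingerprints in order, and
    store maps each fingerprint to its compressed-then-encrypted block.
    """
    recipe: List[str] = []
    store: Dict[str, bytes] = {}
    for start in range(0, len(data), block_size):
        block = data[start:start + block_size]
        # Fingerprint the plaintext block so identical blocks are detected.
        fingerprint = hashlib.sha256(block).hexdigest()
        recipe.append(fingerprint)
        if fingerprint not in store:
            # Compress while the redundancy is still visible, then encrypt.
            store[fingerprint] = encrypt(zlib.compress(block))
    return recipe, store


def restore(
    recipe: List[str],
    store: Dict[str, bytes],
    decrypt: Callable[[bytes], bytes],
) -> bytes:
    """Rebuild the original data by decrypting and decompressing each block."""
    return b"".join(zlib.decompress(decrypt(store[fp])) for fp in recipe)


def placeholder_cipher(blob: bytes) -> bytes:
    """Hypothetical stand-in for a real cipher; self-inverse XOR, NOT secure."""
    return bytes(b ^ 0x5A for b in blob)


# 100 blocks in total, but only two distinct block contents.
sample = (b"A" * 4096) * 50 + (b"B" * 4096) * 50
recipe, store = reduce_then_encrypt(sample, placeholder_cipher)
stored_bytes = sum(len(blob) for blob in store.values())

print(len(recipe), "blocks referenced,", len(store), "unique blocks stored")
print("bytes stored:", stored_bytes, "of", len(sample))
print("restored OK:", restore(recipe, store, placeholder_cipher) == sample)
```

One design point to keep in mind with a scheme like this: fingerprints computed on plaintext reveal which blocks are identical, so the fingerprint index itself needs to be protected.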


Hu,
What you describe is true for mature storage technologies, and, short term, customers have little choice in the matter.
However, long-term we must rethink storage from the ground up. I have reviewed at least two available technologies that render traditional encryption and backups less valuable (perhaps unnecessary) while providing most of the benefits of de-duplication. Both prove obfuscation can be accomplished without reshuffling bits, and backups can be accomplished without multiple copies.
I’d like to see HDS incorporate these innovations into its product portfolio.