Zen and the Art of Storage Maintenance
by David Merrill on Sep 19, 2011
Most of us have a car, or once had one, and know that an oft-overlooked chore is the maintenance the manufacturer recommends every 5,000 miles (8,000 km). When it is skipped, the automobile deteriorates, much as the 2nd law of thermodynamics (the law of entropy) predicts for any system left to itself. The result is accelerated engine wear, poor performance and worsening gas mileage.
I am working this week in Hong Kong, and met with a banking customer here. Part of our discussion turned to the time and effort they expend to monitor, optimize and sustain the storage system. The CIO quickly stated that they do not do this. The few engineers and architects they have work on larger projects, and have no time for this ‘maintenance’ work.
That phrase caught my attention, as did the correlation to automobile maintenance (not to be confused with Zen and the Art of Motorcycle Maintenance). The systems seem to work okay, but the company cannot or will not pay for their maintenance, seeing it as an added cost to IT operations (and an unjustifiable one most of the time).
I often find that IT organizations do not have the luxury of regularly scheduled or continuous optimization and maintenance of the storage infrastructure. So what is the impact, i.e. what is the wear-and-tear or lower performance from ignoring this task?
- Higher degrees of waste (storage capacity resources)
- The purchase of more capacity than what is needed
- Imbalance of performance, cost and availability
- Over-engineering to provide a common view of capability, regardless of cost
- Performance hot spots that come and go, and are difficult to trace
- Problematic troubleshooting when problems do arise
- External resources have to be brought in to identify and remediate problems
And the list could go on.
The Need for Remote Operations Service
A popular option now available from many vendors is a remote operations contract or service. This is not outsourcing or management per se, but rather a service that tracks and monitors operations, OLAs, SLAs, bottlenecks, SNMP alarms, etc. This type of service can graduate into an MSU offering, but it does not have to.
Since the labor is remote (where labor costs are much lower than those of your own system engineers) and the monitoring can be ongoing, the cost of the service is usually outweighed by the downside of a poorly performing system. You would have to do the calculations (I will help you) to see if the improvement in performance and utilization justifies this effort, but for most IT operations (somewhere around 200TB or larger) this does make very good business sense. The service also relieves your on-site engineers and architects of some of the nagging tasks, freeing them for the more pressing design and IT operations work that does require local presence and expertise.
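To give a feel for the kind of calculation involved, here is a minimal back-of-the-envelope sketch. It considers only the capacity reclaimed through better utilization, which is one piece of the picture; performance gains and freed-up engineering time are left out. The function names and every figure in it (cost per TB, utilization before and after, service fee) are hypothetical placeholders for illustration, not HDS pricing or measured results, so substitute your own numbers.

```python
def annual_capacity_savings(capacity_tb, cost_per_tb_year,
                            utilization_before, utilization_after):
    """Estimate the yearly value of capacity freed by better utilization.

    All inputs are assumptions supplied by the reader; utilization values
    are fractions between 0 and 1.
    """
    # Data actually stored today.
    used_tb = capacity_tb * utilization_before
    # Raw capacity needed to hold the same data at the improved utilization.
    needed_tb = used_tb / utilization_after
    # Capacity that no longer has to be owned (or that defers a purchase).
    reclaimed_tb = capacity_tb - needed_tb
    return reclaimed_tb * cost_per_tb_year


def service_pays_off(savings_per_year, service_cost_per_year):
    """True when the estimated savings exceed the yearly service fee."""
    return savings_per_year > service_cost_per_year


# Hypothetical example: 200 TB estate, 1,000 per TB per year,
# utilization improving from 40% to 50%, service fee of 25,000 per year.
savings = annual_capacity_savings(200, 1000.0, 0.40, 0.50)
worthwhile = service_pays_off(savings, 25000.0)
print(f"Estimated savings: {savings:,.0f} per year; worthwhile: {worthwhile}")
```

A fuller version would add terms for avoided outages, deferred purchases and the value of on-site engineers' time, but even this simple form shows whether the utilization gain alone covers the fee.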
HDS has such a remote operations service. Most storage vendors and outsourcers also provide this service. Be sure to differentiate this remote monitoring service from a remote management service, as there is a big gap in price and functionality between the two.
How do you maintain your storage system?