BlueArc: The Jumbo Carrot
by Ken Wood on Sep 7, 2011
“Project Carrot” is officially complete! As Michael Hay recounted, our relationship with BlueArc has been an exciting journey over the past 5 years, a journey we internally dubbed “Project Carrot”. As you know by now, Project Carrot set out to evaluate and choose the NAS technology that would become HNAS. A small team of HDSers conducted the technical and performance evaluation for our “carrot patch”. After dozens of paper evaluations and interviews, it eventually came down to 3 finalists: Littlefinger, Resistafly, and Jumbo, aka BlueArc. (If you’re not a vegetable aficionado, all of these names are different types of carrots.) Without doubt, with today’s acquisition of BlueArc’s talent and technology, HDS has enhanced our file and content services strategy and vision. It’s been a great journey thus far working with the BlueArc teams, and we are all looking forward to a great future together.
Let me back up and add some insight and historical background to how we got here…
As part of the HDS technical evaluation that started several years back, we each had an opportunity to submit the vendor or technology we thought would be a great complement to our existing technology. My carrot in the patch was Resistafly. I brought them into the carrot patch to blow away the other carrots and to show off my “vision” for the company’s future. I wanted us to embrace a software-based solution, go for the scale-out NAS approach, and select a platform we could “easily” develop on to support our strategy. From my perspective at that time, Resistafly was a shared-everything, clustered file system that ran on Windows and/or Linux, could scale out to 64 nodes (for the right use case and workload, which at the time was fairly large), was not as complex as an asymmetrical clustered file system such as Lustre or IBRIX, and was software-based, so it required no special hardware. We could integrate our search technology and other Hitachi IP into this platform. At that time, I truly thought the possibilities were endless.
However, after the evaluation, I found Jumbo was the only real option. Despite my blind allegiance to Resistafly, that technology only outperformed Jumbo in a couple of corner cases, and only with 8 nodes versus Jumbo’s 2-node configuration. Just imagine if Jumbo had 8 nodes! Throughout the evaluation process I designed over 150 different tests (truth be known, my performance tests were designed using a high performance computing cluster and leaned towards giving Resistafly an edge by using scale-out workloads that “ramped up” over a period of time). In my mind, there was no way the other vendors could keep up with this type of loading for very long, and my preferred clustered file system would leave them all in the dust, or so I thought. However, in most cases Resistafly only had an edge in some of the streaming reads with low overhead and metadata processing requirements. Any multi-read, non-shared solution could have performed like this in that environment, including the technology in Littlefinger (most of my detailed results from that evaluation are still held in confidence). It was the “mainstream” workloads where the results were undeniable. Random access and heavy metadata manipulation is the sweet spot where Jumbo shined, and it held its own everywhere else. This is also where the target market and primary workload environment we were focusing on was going to be: in the enterprise. Jumbo delivered on management and performance, both of which rank high on a system administrator’s task load, and neither would be a problem for enterprise customers.
As I attempted to polish Resistafly, Jumbo outshone it across all workloads (in some cases it blew everyone away) and was easy to manage (which, as we all know, is extremely important!). Setup time was 6 minutes or less to get a share or export online and usable, and everything fit and worked together seamlessly from a management point of view. Plus, it ran on special hardware (FPGAs) that accelerated specific tasks. This “fit” the HDS mode of operation and culture at the time of “hardware, hardware, hardware”. I was trying to break that mold by thinking about expandability and multiple purposes. “There’s no way we can expand Jumbo’s capability; we’ll have a NAS device and that’s all we’ll have”, I would say. In the end, Resistafly felt like a cluster. I had to manage it on several occasions as individual nodes instead of as a single system. I wanted this technology to break away from the old mindset that a scale-out clustered file system couldn’t be used as a general purpose NAS. But alas, it fell into the typical role. There was no comparison between the two; Jumbo unequivocally demonstrated its superiority as the king carrot.
After choosing the path of BlueArc, I was assigned to manage a small group of HDS employees tasked with collaborating with BlueArc developers in the UK. Their job was to develop features into HNAS that incorporated Hitachi IP and strategy, and to integrate Hitachi products into this “hardware NAS device”. In 3 years, working in full collaboration with the Jumbo developers, they delivered:
- Change file notification for accelerated content indexing for Hitachi Data Discovery Suite (HDDS) search
- Data Migrator: CVL for internal and XVL for external migration
- Internal file migration
- External NFSv3-based off-platform data migration to any NFS-based storage target
- External HTTP/REST-based off-platform data migration to Hitachi Content Platform (HCP) storage targets, ingesting files as objects (data plus metadata)
- Migration policies for Data Migrator that migrate files based on simple file primitives like date, name, size, etc. (see the sketch after this list)
- HDDS integrated with Hitachi NAS Platform (HNAS) for policy management of more complex migration decisions
- CIFS audit logging
- Real-time file blocking and filtering
- VSS support for Windows systems
- … and more!!
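For a sense of what one of those simple Data Migrator policies might look like, here is a purely hypothetical sketch in Python; the rule, field names and thresholds are illustrative only and are not the actual product syntax:

```python
from datetime import datetime, timedelta

# Hypothetical migration rule built from simple file primitives (date, name, size):
# move files that have not been touched in 90 days, or large files that match an
# archive-style name pattern, off the primary file system.
def should_migrate(name, size_bytes, last_access):
    stale = datetime.now() - last_access > timedelta(days=90)
    large = size_bytes > 100 * 1024 * 1024          # larger than 100 MB
    archive_name = name.endswith((".log", ".bak"))
    return stale or (large and archive_name)

# Example: a 250 MB log file last touched in January would be migrated.
print(should_migrate("build.log", 250 * 1024 * 1024, datetime(2011, 1, 15)))  # True
```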
All of these advanced features for content and content management were developed on a hardware platform. I ate it up. I was completely converted. The possibilities were endless, the hardware accelerated them, AND it was still easy to manage.
Over the years I continued to monitor Resistafly’s progress in the industry. They were acquired in 2007, and from what I can gather the technology has been limited to scale-out NAS solutions; apparently even that wasn’t enough, as their parent company acquired yet another company in 2009, presumably for more scale-out capabilities. The updates have been the same software capabilities on new hardware servers or storage systems. There have been no significant software features announced in the area of content or content management, search, or tight integration with other software solutions. It was just medium scale-out NAS. Siloed. Sad.
About 18 months ago, I finally admitted to Michael that I needed to eat crow. He was right and I was wrong. The “hardware” platform that won the bake-off in our carrot patch 5 years ago was no fluke, and all of the advanced software features that I wanted to integrate into a software platform indeed run better on HNAS. I expected these features to be in the other carrots by now, but apparently others treat software as a hardware enabler, while we (HDS) treated hardware as a feature enabler. I’d mentioned this story a few times internally during meetings, and now I’m blogging about it for all of you: barbequed crow, when washed down with jumbo carrot juice, definitely tastes triumphant and successful.
The most exciting part about this journey is that all of this development was done through a great partnership between HDS and BlueArc. Just imagine what we’ll be able to accomplish now that BlueArc’s talent and technology are part of the Hitachi family! I’d like to take this opportunity to welcome the entire BlueArc team to Hitachi. I’ve thoroughly enjoyed working with you in the past on several projects, and I look forward to working together and furthering our innovation, strategy and vision in the future.
Comments (5)
Congratulations on acquiring BlueArc; a long and winding road has ended with success. But I think there is much room for further improvement. In particular, the BlueArc disk interface firmware has an “old LSI Engenio” orientation with preferred path / alternate path technology. This does not reflect the MPIO capabilities (active/active controller architecture with inherent workload balancing) of modern Hitachi storage systems like the AMS2000 and the USP / USP V / VSP, and it represents a significant throughput brake. Another problem is the BlueArc “Superflush” option, which allows full-stripe writes with new parity generation without reading parity first. The volume buffer is limited to 420 KByte, which means a maximum of 6 data volumes x 64 KByte for a RAID group (RAID 5: 6D + 1P, or RAID 6: 6D + 2P). But you must then decrease the AMS2000 stripe unit size from 256 to 64 KByte. For the enterprise systems USP, USP V and VSP, the stripe unit size per volume is fixed at 2x 256 KByte, which makes Superflush de facto impossible.
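To make the full-stripe arithmetic concrete, here is a minimal sketch using the figures above (the buffer and stripe sizes come from this comment; the helper itself is only illustrative):

```python
# A Superflush (full-stripe) write is only possible when the data chunks of one
# stripe fit into the volume buffer.
def full_stripe_fits(buffer_kb, stripe_unit_kb, data_disks):
    return data_disks * stripe_unit_kb <= buffer_kb

# With the 420 KByte buffer and a 6D+1P (or 6D+2P) RAID group:
print(full_stripe_fits(420, 256, 6))  # False: 6 x 256 KB = 1536 KB does not fit
print(full_stripe_fits(420, 64, 6))   # True:  6 x 64 KB  =  384 KB fits
```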
So I hope that HDS now has the right access to the BlueArc developers to force improved I/O tuning at the FC interface of the Mercury and Titan boxes.
Thanks for your comment and question, Wolfgang. The quick answer to your question is “yes”. The longer answer is that, with all of the other strategic development we were working on together, this one was a lower priority to fix. However, given the new relationship, we are committed to developing it properly, which will include correcting lower-priority items.
Wolfgang, let me address your comments:
(Also, note that soon my email address will change to firstname.lastname@example.org – and I am very glad of this change).
BlueArc’s developers are now part of HDS (and enthused about the possibilities this opens!), so there will be no need to “force” improved I/O tuning.
In the very near future you and the industry at large will be introduced to many more improvements and innovations.
Indeed there is always room for improvements. We strive for excellence and will continue to add new features to our products.
Specifically, your comment raises two points.
Firstly, will the NAS system support an Active/Active FC/SCSI interface as opposed to the current Active/Passive I/O interface?
Now that we are part of the HDS team, we plan to enhance the current load balancer architecture. This modification of the load-balancing scheme should address your concern.
Please note that while we have always understood the requirement to support such small LUNs, we will still want at least two SDs per stripeset (and preferably more) for redundancy reasons, so that we can still achieve good load balancing today without having to share a single SD’s I/O between ports.
Secondly, the size of the Superflush buffer is limited to 448KB and thus can limit throughput with the enterprise arrays (e.g. USP, USP V) that have a fixed stripe unit of, e.g., 256KB.
Indeed, allowing just 448KB means that Superflush can only be used on LUNs that have a 64KB stripe unit with no more than 6 data drives (plus 1 or 2 parity drives, depending on RAID 5 or RAID 6).
The Superflush limit has been raised to 4064KB and will be available soon.
True, the arrays you referred to support only a 64K stripe size; this limits the max RAID group size to 7 data disks, which means those configurations cannot take advantage of larger RAID 5 sets without the RAID 5 read/modify/write performance penalty (following array best practice for optimized RAID 5 configurations). (Note that new arrays are now optimized for a 256K stripe size.)
In the following release, Superflush will be increased to 4064KB to take advantage of the HDS optimizations for RAID 5 8+1. (Note that 512K would suffice, but we looked ahead…) Thus, Superflush should be able to support a 256K stripe size with up to 14 data disks in a RAID 5 or RAID 6 RAID group (volume group).
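As a rough illustration of the old and new limits described above (this sketch counts data plus parity chunks per stripe; the helper is illustrative, not product code):

```python
# How many stripe-unit-sized chunks of one stripe fit into the Superflush buffer.
def max_stripe_chunks(superflush_kb, stripe_unit_kb):
    return superflush_kb // stripe_unit_kb

print(max_stripe_chunks(448, 64))    # 7  -> e.g. 6 data + 1 parity at a 64 KB stripe unit
print(max_stripe_chunks(4064, 256))  # 15 -> e.g. 14 data + 1 parity at a 256 KB stripe unit
```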
Some architecture background, in addition to my previous comment:
About active/active RAID controllers: a BlueArc span normally contains many system drives (SDs), which are balanced between the two controllers. Our hardware-accelerated cache controller, SI, maintains a roughly equal queue of commands at all these SDs (depending, of course, on client I/O patterns). So, although we use only one controller for each SD, we nevertheless balance I/O load evenly between the two controllers. In fact, we go further by letting you span across multiple storage subsystems, meaning that a single file system can balance load across four, six, eight or more RAID controllers.
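A rough sketch of the kind of even SD-to-controller distribution described above; the names and the round-robin assignment are illustrative only, not the actual SI implementation:

```python
# Alternate system drives (SDs) between the two RAID controllers so that the
# per-controller command queues stay roughly balanced.
def assign_sds(sd_ids, controllers=("controller-0", "controller-1")):
    return {sd: controllers[i % len(controllers)] for i, sd in enumerate(sd_ids)}

span = [f"SD{n}" for n in range(8)]  # a span made up of eight SDs
print(assign_sds(span))              # four SDs owned by each controller
```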
Vis-à-vis increasing the Superflush size: it’s perhaps less useful than it sounds. On a mature (i.e. fragmented) file system, it’s most unlikely that the server will ever find 4MiB of contiguous free space to write to.
Interesting conversation at the bits-and-bytes level, which reflects the HDS knowledge of 5 years ago.
Our TechOps File Services Competency Center has released over 50 performance whitepapers in 4 years, covering performance in EVERY aspect you can imagine, and those papers are available under NDA to our customers and partners.
Those whitepapers demonstrate our superior overall performance for various use cases, as every customer environment is different.
In addition, we should highlight that our published SPECsfs numbers for HNAS with HDS storage are pretty much equal to Mercury/Titan with other third-party disk, which is another proof point justifying the name “High Performance NAS Platform”.