Techno Musings

Federated Clustering, Yeah We Did That

Recently the storage industry has been buzzing about federation, remote caching and clustering. I would say that this is largely due to EMC’s webcast regarding virtual storage, whose technology underpinnings appear to come from some YottaYotta source IP. There are only two things that I want to borrow from EMC’s discussion on this topic: 1) the varying forms of federation, including a local form and a remote form, and 2) the reference to Atmos(t) global federation.
I want to start by defining federation, using the Google search string “define: federation” to see what the web considers federation to be. The results of this query largely portray federation as a facet of governmental organizational strategies, specifically federalism. A key tenet of these definitions is a series of semi-autonomous organizations acting as a whole. These organizations continue to operate their own internal affairs, but seek to gain new and novel behaviors by working together as a single virtual unit. If we look at the natural world, we can also find examples of this kind of model in the form of a colonial organism. I’ll pull the important bits directly from Wikipedia below.

The difference between a multicellular organism and a colonial organism is that individual organisms from a colony can, if separated, survive on their own, while cells from a multicellular lifeform (e.g., cells from a brain) cannot.

So by taking these two approaches I’m going to state that federation of IT architectures comes from having several autonomous or semi-autonomous systems interconnected. If any of these systems is removed from the “colony” then it can continue to operate on its own, but complex capabilities which require multiple systems to work together are no longer possible. In this way federation exists both with and without geographic boundaries. So in theory we can create federated IT infrastructures across any distance, and I personally find this congruent with the definition that EMC proposes for federation. I do think that there is room for one more form, which is a subset of the “remote federation” model. This has to do with using the power of metropolitan distances to have richer functionality and capabilities. That is because today the primary technology used for storage is Fibre Channel, which has an effective distance limitation of about 66km. This limit largely defines the distance over which a synchronous operation can occur without the associated latencies causing application performance problems. With this in mind I assert that Hitachi already has a suite of clustering services that allows for both closely coupled and loosely coupled federation (I’ll get to more of that later).
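
To put rough numbers on why distance matters for synchronous operations, here is a back-of-the-envelope sketch. It assumes a signal travels through optical fibre at roughly 200,000 km/s (about 5 microseconds per kilometre one way) and that each synchronous write costs a configurable number of round trips. The function names and the `round_trips` parameter are illustrative assumptions, not figures measured on any Hitachi system, and the sketch ignores switch, protocol and array processing time.

```python
# Back-of-the-envelope latency added by distance for a synchronous write.
# Assumption: signal speed in optical fibre is roughly 200,000 km/s.
FIBRE_KM_PER_SECOND = 200_000.0


def round_trip_ms(distance_km: float) -> float:
    """Propagation delay for one round trip over the link, in milliseconds."""
    one_way_seconds = distance_km / FIBRE_KM_PER_SECOND
    return 2 * one_way_seconds * 1000.0


def sync_write_penalty_ms(distance_km: float, round_trips: int = 1) -> float:
    """Extra latency per synchronous write; round_trips is a hypothetical knob."""
    return round_trips * round_trip_ms(distance_km)


if __name__ == "__main__":
    for km in (10, 66, 1000):
        print(f"{km:>5} km: {sync_write_penalty_ms(km):.2f} ms per round trip")
```

Under these assumptions the propagation cost at 66km is well under a millisecond per round trip, while at 1,000km it is around 10 ms on every write, which is the kind of penalty that starts to hurt application performance.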

Now I want to talk about Atmos(t) for a bit. A little-known secret is that their federation of content comes from the usage of synchronous and asynchronous replication of objects. (Note I’ll be the first to tell you that I don’t know if it uses block level differencing or not, but either way they are replicating.) Personally, I think that this is a fine definition of a federated cluster: for relatively close distances (remember that 66km from above) you can use synchronous approaches, but when you get to longer distances you must use an asynchronous approach. Now mind you I’m borrowing this for a reason, because what is good for the goose is good for the gander. By this I mean that Mr. Hollis at EMC is intentionally drawing a parallel between the Atmos(t) solution and their new vStorage initiative. Great, if that is true then I can use the same model and conclude that Hitachi with HAM and UVM already has capabilities like vStorage in the market today, not at some point in the future. UVM allows our array-based software functions (e.g. Volume Migrator and Shadow Image) to act on any kind of storage, internal or external, and facilitates interactions between both types of storage. Here are some examples:

  • Make a point in time Shadow Image of an internal device to an external device
  • Make a point in time Shadow Image of an external device to an internal device
  • Migrate from an internal device to an external device, using Volume Migrator
  • Migrate from an external device to an internal device, using Volume Migrator

Well, there is more to the story with UVM. Specifically, the interconnection can exist between multiple systems and be bidirectional between these systems. That implies that I can have two, three, or more USP-V(M)s interconnected via UVM. Further, the UVM connections can span up to the distance limitation of Fibre Channel, which as I mentioned is about 66km. So as a user I can attach many USP-V(M)s together and share storage between all of them. Additionally, it is possible to perform the basic use cases mentioned above and then some. For instance, if a new application is needed at Site A but there is not enough capacity there, and there happens to be spare capacity at Site B, then you can use the power of UVM to interconnect the systems in both locations and address Site B’s spare capacity from Site A. If at some point Site A gets more spare capacity, then you can use Volume Migrator with UVM to relocate the LUN from Site B to Site A and, if required, tear down the remote UVM connection to free the bandwidth for other applications. In this way UVM acts as a dynamic federated remote clustering solution over distance. Further, if you have a requirement to run a data warehouse at Site B when the primary application resides at Site A, again you can connect the sites using UVM and create a Shadow Image point in time copy of the relevant LUNs, and voila. I could go on and on with these examples, but let’s move on to HAM.
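
The Site A and Site B scenario can be sketched as a toy model. This is purely illustrative: `Array`, `provision` and `migrate` are hypothetical names made up for this sketch, not a Hitachi API, and the model tracks only capacity bookkeeping, not data movement or the UVM connection itself.

```python
from dataclasses import dataclass, field


@dataclass
class Array:
    """Toy stand-in for one storage controller at a site (not a Hitachi API)."""
    site: str
    capacity_gb: int
    luns: dict = field(default_factory=dict)  # LUN name -> size in GB

    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.luns.values())


def provision(name: str, size_gb: int, local: Array, remote: Array) -> Array:
    """Place a LUN locally if it fits, else on the remote array's spare capacity."""
    for array in (local, remote):
        if array.free_gb() >= size_gb:
            array.luns[name] = size_gb
            return array
    raise RuntimeError("no array has enough spare capacity")


def migrate(name: str, source: Array, target: Array) -> None:
    """Relocate a LUN, as in the Site B to Site A move described above."""
    size_gb = source.luns[name]
    if target.free_gb() < size_gb:
        raise RuntimeError("target lacks capacity")
    target.luns[name] = size_gb
    del source.luns[name]


if __name__ == "__main__":
    site_a = Array("A", capacity_gb=100, luns={"existing": 90})
    site_b = Array("B", capacity_gb=500)
    home = provision("new_app", 50, local=site_a, remote=site_b)
    print(home.site)            # the LUN lands on Site B's spare capacity
    site_a.capacity_gb += 200   # Site A gains capacity later
    migrate("new_app", site_b, site_a)
    print(sorted(site_a.luns))
```

The point of the sketch is the order of events: the application is served from remote capacity first, and the relocation back to the local site happens later, once capacity there allows it.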

[Image: horizontal-uvm1]

Hitachi’s High Availability Manager (HAM) allows two USP-V(M)s to be interconnected with a quorum disk in between, and the host is connected to both storage controllers at the same time. (Note, at the core, the system uses True Copy to keep two LUNs in sync between the systems.) In this configuration the two interconnected controllers look to the host like a single active-passive storage system. If one controller fails, host I/O is redirected to the second controller. Because the system uses Fibre Channel, again we can create a federated system over the same 66km distance as referenced in the UVM case. Because the host recognizes the two systems as if they were a single controller, and we can span that controller across multiple sites, it is possible to fail over an application between sites without interruption (note that this also implies a form of inter-controller migration). To add to all of this, it is possible to create copies on the remote controller as well, so that they can, say, be replicated over extreme distances for backup purposes in the event of a regional disaster, like the one Katrina caused across much of the Gulf of Mexico region of the southeastern United States. And finally, just like UVM, the connections can be dynamically set up or torn down on demand, albeit with a little more configuration overhead than with UVM.
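
The failover behavior described above can be sketched as a small toy model. It is a simplification under stated assumptions: `Controller`, `MirroredPair` and the `holds_quorum` flag are hypothetical names, and the quorum check here is a stand-in for whatever arbitration HAM actually performs, which this post does not describe.

```python
from dataclasses import dataclass


@dataclass
class Controller:
    """Toy controller; 'healthy' and 'holds_quorum' are illustrative flags."""
    name: str
    healthy: bool = True
    holds_quorum: bool = True  # can this controller still reach the quorum disk?


class MirroredPair:
    """Two controllers presented to the host as one active-passive device."""

    def __init__(self, primary: Controller, secondary: Controller):
        self.active = primary
        self.standby = secondary

    def route_io(self) -> str:
        """Return the name of the controller that should serve host I/O."""
        if not (self.active.healthy and self.active.holds_quorum):
            # Fail over only if the standby is alive and passes the quorum check,
            # which guards against both sides claiming ownership.
            if self.standby.healthy and self.standby.holds_quorum:
                self.active, self.standby = self.standby, self.active
            else:
                raise RuntimeError("no controller can safely serve I/O")
        return self.active.name


if __name__ == "__main__":
    pair = MirroredPair(Controller("site_a"), Controller("site_b"))
    print(pair.route_io())        # served by the primary
    pair.active.healthy = False   # simulate a controller failure
    print(pair.route_io())        # host I/O is redirected to the other site
```

From the host's point of view nothing changes except which controller answers, which is what makes the pair look like a single storage system.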

With all of these configurations you need a reasonable pipe to put the traffic over. Take the fact that HAM makes use of True Copy technology as an example. Hitachi has already developed expertise and best practices that help us determine the required bandwidth for a given customer environment, and since HAM is based upon True Copy our expertise is immediately applicable. So we are doing what Newton said: standing on the shoulders of giants. This is important because I agree with EMC that getting federated storage right, with things like caching algorithms and the best practices needed to make deployments work, is not something to take lightly, and by building upon our existing retained knowledge we are able to do more now.

To prove my point about simple-to-conceive things being difficult to realize, I want to draw on an example from the file system space. Specifically, an n-way high performance clustered file system that works well for all I/O profiles has long been the holy grail of file system development. This geis has long plagued file system developers, and the primary reason is the challenge around locking mechanisms. The reality is that distributed locking mechanisms only allow clustered file systems to scale up to, say, 16 nodes (note that my colleague Ken and I have some testing evidence in the lab to back this claim up). However, when some of the requirements of the clustered file system are constrained, such as a file system containing large video files only, it is possible to achieve a much higher scale. (An example of this would be SGI’s CXFS.) Another compromise is to assume that a large file system global namespace is really about increasing accessibility. To achieve this goal some architectures eventually serialize I/O to a single file system owner, constraining lock management to a single file system instance. In this design center, with accessibility being the key facet, many file systems are joined together through a network protocol like Microsoft DFS or in a custom way as in the HNAS Clustered Namespace. The point that I’m trying to make is that while it is simple to imagine that something can be done, when you get down to actual implementation sometimes the idealized need cannot easily be realized. Instead, if you constrain the requirements and focus on the high value use cases, you can realize a reasonable approach.

Okay, so now that I have digressed for a minute to talk about file systems, I want to come back to the topic at hand: wide area storage clustering. Our competitor spins a tale that smacks of claiming they have solved world hunger and built an autopilot for the space shuttle at the same time. This is not unlike claiming they have developed a clustered file system that can work equally well with all I/O workloads and without compromise, or even like turning lead to gold. Further, it seems that the Bards and Alchemists of EMC want you to believe that they have created something original. Well, they have not. Federated clustering is something that Hitachi can already do, and as a result of applying our existing know-how, you as a customer are more assured of extracting value from a successful implementation. However, I’ll be the first to say that I don’t know precisely what EMC has done. We first have to see what tale is spun from EMC World, and then later we will need to extract the facts from the legend known as vStorage.

Stay tuned…
