Re: Compaq launches Open SSI Cluster Projects
The good news is that this code contains very important intellectual
property - and I am amazed that Compaq has released it under GPL2.
(I've been working with Tandem/Compaq for several years on highly
scalable, highly reliable SSI and CI architectures.)
I haven't looked at all the code (some is yet to be released...), but if
it is the code I'm familiar with, it allows for process migration,
process pairs, etc. for "serious" HA for business applications and
databases.
David Brower wrote:
>
> They -have- developed in a linux vacuum; this is the Tandem/NonStop/SCO
> Unix cluster stuff, pretty much as deployed on that platform.
Yup.
>
> Personally, I am more interested in the CI part of the project, which
> along with the IBM DLM would provide a reasonable GFS platform. It seems
> it would get together faster than GFS-on-DLM-on-heartbeat would.
>
> I'm less into the very-large scope SSI work, maybe because I don't
> understand the implications of the failure domains. It has appeared to me
> before that failure of a node is likely to have deeper ripples in the SSI
> scheme than it does in clusters with less tightly coupled nodes.
Not so. With process pairs and process migration, one can have apps
that are almost impervious to hardware failure. (Take a look at Jim
Gray's book - Transaction Processing: Concepts and Techniques, section
3.7 - Fault Model and Software Fault Masking).
> And it
> is all very intrusive in ways one wonders if Linus would ever accept. It's
> a noble attempt, but it's gonna be a lot harder to accept.
By definition, SSI clusters ARE intrusive - at least from the kernel
standpoint. It's a question of "degree". If "hooks" are used, then it
would seem that those "hooks" may well prove beneficial to all CI
work(?).
At the application level, intrusiveness can be somewhat "hidden". For
instance, if a JVM is made to be "fault tolerant", applications running
on that JVM "inherit" SOME of the qualities of that fault tolerance.
For an application (say database) to be fully fault tolerant (hardware
and software), then that app must be architected for fault tolerance
(process pairs, etc.).
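To make "process pairs" a bit more concrete, here is a minimal
single-process sketch of the idea: the primary checkpoints its state to
the backup before acknowledging each update, and the backup takes over
from the last checkpoint when the primary is lost. This is NOT taken
from the Compaq code; the names (PairMember, ProcessPair, apply,
takeover) are hypothetical, and a real pair would run on separate nodes
and checkpoint over the cluster interconnect.

```python
import copy


class PairMember:
    """One half of a hypothetical process pair (illustrative only)."""

    def __init__(self, role):
        self.role = role
        self.state = {}
        self.alive = True

    def receive_checkpoint(self, snapshot):
        # The backup keeps its own private copy of the primary's state.
        self.state = copy.deepcopy(snapshot)


class ProcessPair:
    """Primary/backup pair: checkpoint first, acknowledge second."""

    def __init__(self):
        self.primary = PairMember("primary")
        self.backup = PairMember("backup")

    def apply(self, key, value):
        if not self.primary.alive:
            self.takeover()
        self.primary.state[key] = value
        # Checkpoint before acknowledging, so an acknowledged update
        # survives the loss of the primary.
        self.backup.receive_checkpoint(self.primary.state)
        return "ack"

    def takeover(self):
        # The backup resumes from the last checkpoint as the new primary,
        # and a fresh backup is started and brought up to date.
        self.primary = self.backup
        self.primary.role = "primary"
        self.backup = PairMember("backup")
        self.backup.receive_checkpoint(self.primary.state)


pair = ProcessPair()
pair.apply("balance", 100)
pair.primary.alive = False      # simulate a hardware failure
pair.apply("balance", 150)      # served by the former backup
```

The point of the sketch is only the ordering: because the checkpoint
happens before the "ack", the caller never sees an update that the
surviving member does not know about.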
> In a perfect world, many of the components would plug together; I don't know
> how the CI stuff maps into the heartbeat model.
Again, I'll have to look at the code - but "typically" all clustering
dealing with HA fault domains must have "heartbeat". As I recollect
from some of my earlier IP work, HP and Tandem hold the original patents
on "heartbeat".
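For readers unfamiliar with the term, a rough sketch of what
"heartbeat" means in practice: each node periodically reports in, and a
node that stays silent longer than a timeout is suspected to have
failed. The class and method names below (HeartbeatMonitor, beat,
failed_nodes) are my own assumptions for illustration, not the API of
the linux-ha heartbeat code or of the CI code. Timestamps are passed in
explicitly to keep the example deterministic; a real monitor would read
a monotonic clock and receive beats over the network.

```python
class HeartbeatMonitor:
    """Timeout-based failure detector (illustrative sketch only)."""

    def __init__(self, timeout):
        self.timeout = timeout      # seconds of silence tolerated
        self.last_seen = {}         # node name -> time of last beat

    def beat(self, node, now):
        # Record that `node` reported in at time `now`.
        self.last_seen[node] = now

    def failed_nodes(self, now):
        # A node is suspected once it has been silent longer than timeout.
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.beat("node-a", now=0.0)
monitor.beat("node-b", now=0.0)
monitor.beat("node-a", now=2.0)     # node-b has gone quiet
suspects = monitor.failed_nodes(now=4.0)
```

Note that a timeout can only "suspect" a node; a slow or partitioned
node looks the same as a dead one, which is why the failure-domain
question raised above matters.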
---- snip ---- snip ----
Cheers,
Lyle
> ------------------------------------------------------------------------------
> Linux HA Web Site:
> http://linux-ha.org/
> Linux HA HOWTO:
> http://metalab.unc.edu/pub/Linux/ALPHA/linux-ha/High-Availability-HOWTO.html
> ------------------------------------------------------------------------------
--
Lyle Bickley | Bickley Consulting West Inc.
lbickley@acm.org |
lbickley@bickleywest.com | V 650-428-0621
http://bickleywest.com/ | F 650-428-0599
"Black holes exist where GOD is dividing by zero"
Linux-cluster: generic cluster infrastructure for Linux
Archive: http://mail.nl.linux.org/linux-cluster/