Re: High Availability versus Automatic Process Migration



Greg Lindahl wrote:
> 
> On Wed, Feb 28, 2001 at 09:24:33PM -0700, Alan Robertson wrote:
> 
> > I don't know enough about HPC customers, but I don't associate these
> > characteristics with the HPC arena.
> 
> You'd be surprised. But even the national weather service doesn't mind
> if their forecast takes twice as long to run on (rare) occasion; they
> just want the answer almost all of the time in the allowed window, and
> they throw extra cpus at the problem to get the time down to a small
> fraction of the window. Then a rerun due to failure doesn't violate
> the deadline.
> 
> So, the timescale for getting the right answer is different. For most
> commercial HA clusters, you want to transfer in fractions of seconds
> or seconds. For an HPC system, well, if I have to occasionally restart
> that 100 node job from the beginning, it's not the end of the
> world... I just don't want the user to get back a failure because a
> node died.

I do understand.  It's just that HA by its very nature systematically
attracts customers who are unusually paranoid ;-)   (Although several
examples of HA/HPC systems come to mind).
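
The rerun argument quoted above comes down to simple arithmetic, which can be
sketched as follows. This is only an illustration: the function names and the
hour figures are made up, and it assumes the worst case, where a node dies just
before the job finishes and the whole run starts over.

```python
def worst_case_runtime(single_run: float, reruns: int = 1) -> float:
    """Total time if the job fails at the very end and is restarted
    from the beginning `reruns` times (pessimistic assumption)."""
    return single_run * (1 + reruns)


def meets_deadline(single_run: float, window: float, reruns: int = 1) -> bool:
    """True if the job still fits in the window after `reruns` full restarts."""
    return worst_case_runtime(single_run, reruns) <= window


# Hypothetical numbers: a 6-hour window.
print(meets_deadline(single_run=2.0, window=6.0))  # run is 1/3 of the window
print(meets_deadline(single_run=4.0, window=6.0))  # run is 2/3 of the window
```

With the run tuned down to a third of the window, one full rerun still fits;
at two thirds of the window it does not, which is why the extra cpus are worth
buying.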

> Not surprisingly, this affects cost. The HPC version of weak HA
> requires little extra equipment. But even on my HPC system, I'd like
> to have my admin node with the queue system and so on be a 2-node HA
> system. Not to mention the controller for my parallel filesystem and
> mass-store... as long as it's cheap enough.

But of course, having a single node as a parallel filesystem controller or
mass store is probably not a very scalable design (when compared to GFS, for
example) ;-)

> > A few examples come readily to mind:
> >       Cluster membership and corresponding event APIs
> >       single-image boot
> >       cluster filesystems
> >       system monitoring
> >       node reset mechanisms (i.e., Stonith)
> 
> These are related, yes. Gee, I never realized Stonith needed a name...
> I use APC masterswitches so I can remotely power cycle nodes, just for
> system admin convenience.

STONITH == Shoot The Other Node In The Head - a memorable acronym.  In the
HA case, you probably want something better than the APC switches: they
take only one power input, so the switch's own power feed becomes an SPOF.
[i.e., the APC switches aren't sufficiently paranoid ;-)]

Stonith basically means that one node can reset another node under program
control.  This is a little different from wanting to do it manually.  We
actually have a library and an API for doing this with several kinds of
mechanisms.
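
To illustrate what "reset under program control, for several kinds of
mechanisms" might look like, here is a minimal sketch. It is not the actual
library or its API: every name in it (StonithDevice, SimulatedPowerSwitch,
fence) is hypothetical, and the power switch is simulated rather than talking
to real hardware.

```python
from __future__ import annotations

from abc import ABC, abstractmethod


class StonithDevice(ABC):
    """Hypothetical interface: anything that can reset a node on request."""

    @abstractmethod
    def reset(self, node: str) -> bool:
        """Try to reset `node`; return True only if the reset went through."""


class SimulatedPowerSwitch(StonithDevice):
    """Stand-in for a remote power switch; maps node names to outlets."""

    def __init__(self, outlets: dict[str, int], working: bool = True):
        self.outlets = outlets
        self.working = working
        self.log: list[str] = []

    def reset(self, node: str) -> bool:
        # Refuse if the switch itself is down or does not control this node.
        if not self.working or node not in self.outlets:
            return False
        self.log.append(f"power-cycle outlet {self.outlets[node]} ({node})")
        return True


def fence(node: str, devices: list[StonithDevice]) -> bool:
    """Try each device in turn and stop at the first successful reset."""
    return any(device.reset(node) for device in devices)


# A surviving node fences its peer, falling back to a second mechanism.
broken = SimulatedPowerSwitch({"node2": 3}, working=False)
good = SimulatedPowerSwitch({"node2": 7})
print(fence("node2", [broken, good]), good.log)
```

The point of the common interface is that the cluster software calling
fence() does not need to know which kind of reset mechanism sits behind it,
and a second device gives the paranoid a fallback when the first one fails.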

	-- Alan Robertson
	   alanr@unix.sh

Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/