Re: High Availability versus Automatic Process Migration
Greg Lindahl wrote:
>
> On Wed, Feb 28, 2001 at 09:24:33PM -0700, Alan Robertson wrote:
>
> > I don't know enough about HPC customers, but I don't associate these
> > characteristics with the HPC arena.
>
> You'd be surprised. But even the national weather service doesn't mind
> if their forecast takes twice as long to run on (rare) occasion; they
> just want the answer almost all of the time in the allowed window, and
> they throw extra cpus at the problem to get the time down to a small
> fraction of the window. Then a rerun due to failure doesn't violate
> the deadline.
>
> So, the timescale for getting the right answer is different. For most
> commercial HA clusters, you want to transfer in fractions of seconds
> or seconds. For a HPC system, well, if I have to occasionally restart
> that 100 node job from the beginning, it's not the end of the
> world... I just don't want the user to get back a failure because a
> node died.
I do understand. It's just that HA by its very nature systematically
attracts customers who are unusually paranoid ;-) (Although several
examples of HA/HPC systems come to mind).
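To put rough numbers on the rerun argument quoted above, here is a minimal
sketch. The window and run times are made-up figures chosen for illustration,
not anything the weather service has published, and fits_in_window is a
hypothetical helper name.

```python
def fits_in_window(run_time, window, reruns=1):
    """Return True if the job plus `reruns` full restarts still meets the window.

    Worst case assumed here: every failure happens at the very end of a run,
    so each rerun costs a full run_time.
    """
    return run_time * (1 + reruns) <= window


# Hypothetical numbers: a 6-hour delivery window.
window_hours = 6.0

# Extra CPUs bring the run down to a third of the window: one rerun still fits.
fast_run_ok = fits_in_window(run_time=2.0, window=window_hours)

# Without the extra CPUs the run takes most of the window: a rerun misses it.
slow_run_ok = fits_in_window(run_time=4.0, window=window_hours)

print(fast_run_ok, slow_run_ok)
```

The point is only that the run time has to be at most half the window for a
single restart from the beginning to stay inside the deadline.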
> Not surprisingly, this affects cost. The HPC version of weak HA
> requires little extra equipment. But even on my HPC system, I'd like
> to have my admin node with the queue system, etc., be a 2-node HA
> system. Not to mention the controller for my parallel filesystem and
> mass-store... as long as it's cheap enough.
But of course, having a single node as a parallel filesystem controller or
mass store is probably not a very scalable design (when compared to GFS for
example) ;-)
> > A few examples come readily to mind:
> > Cluster membership and corresponding event APIs
> > single-image boot
> > cluster filesystems
> > system monitoring
> > node reset mechanisms (i.e., Stonith)
>
> These are related, yes. Gee, I never realized Stonith needed a name...
> I use APC masterswitches so I can remotely power cycle nodes, just for
> system admin convenience.
STONITH == Shoot The Other Node In The Head - a memorable acronym. In the
HA case, you probably want something better than the APC switches: they
take only one power input, which makes power a single point of failure (SPOF).
[i.e., the APC switches aren't sufficiently paranoid ;-)]
Stonith basically means that one node can reset another node under program
control. This is a little different from wanting to do it manually. We
actually have a library and an API for doing this for several kinds of
mechanisms.
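As a rough illustration of "one node resets another under program control",
here is a minimal sketch. It is not the actual library or its API: the names
StonithDevice, FakePowerSwitch and fence_node are all hypothetical, and the
switch is an in-memory stand-in rather than real hardware.

```python
from abc import ABC, abstractmethod


class StonithDevice(ABC):
    """Hypothetical interface for one reset mechanism (e.g. a power switch)."""

    @abstractmethod
    def reset(self, node: str) -> bool:
        """Try to reset `node`; return True only if the reset is confirmed."""


class FakePowerSwitch(StonithDevice):
    """In-memory stand-in for a real power switch, for illustration only."""

    def __init__(self, outlets, working=True):
        self.outlets = set(outlets)  # node names wired to this switch
        self.working = working       # False simulates a dead or unreachable switch
        self.resets = []             # record of nodes this switch power cycled

    def reset(self, node):
        if not self.working or node not in self.outlets:
            return False
        self.resets.append(node)
        return True


def fence_node(node, devices):
    """Reset `node` using the first device that confirms success."""
    for device in devices:
        if device.reset(node):
            return True
    # No device confirmed the reset, so the caller should not assume
    # the other node is really down.
    return False


# Two switches covering the same nodes, so one failed switch is not an SPOF.
primary = FakePowerSwitch({"node1", "node2"}, working=False)
backup = FakePowerSwitch({"node1", "node2"})
fenced = fence_node("node2", [primary, backup])
print(fenced)
```

Putting each mechanism behind one interface is what lets the calling code stay
the same across several kinds of reset hardware; the second switch is there
only to show how a redundant device would be tried when the first one fails.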
-- Alan Robertson
alanr@unix.sh
Linux-cluster: generic cluster infrastructure for Linux
Archive: http://mail.nl.linux.org/linux-cluster/