Re: A proposal for a General Clustering Framework




----- Original Message -----
From: "Alan Robertson" <alanr@unix.sh>
To: "Peter Badovinatz" <tabmowzo@yahoo.com>
Cc: "linux-cluster" <linux-cluster@nl.linux.org>
Sent: Wednesday, June 06, 2001 12:14 AM
Subject: Re: A proposal for a General Clustering Framework


> Peter Badovinatz wrote:
> >
>
> [snip]
>
> > Depends...  We could push customers who really cared about strict HA to avoid
> > version heterogeneity except during actual node-by-node upgrade.  It was
> > customers who were more oriented to cluster file system - where we still needed
> > (a bit looser form of) HA and failover but they could be up to 500 node
> > clusters - that we had no choice but to be very flexible as to supporting
> > multiple versions across nodes because upgrading 500 nodes a few at a time
> > takes a long time, and unlike more controlled HA clusters the workload was
> > quite varied so many more things to upgrade/test/migrate.
> >
> > The better choice is to be sure the framework lets you handle multiple active
> > version levels from the beginning, and you work to minimise their use by decree
> > (as above) or by only supporting some limited number of levels (N and N+1 or
> > N+2.)

Since Alan hasn't responded to my request for a reassessment (I understand
that he's been busy today), and since the comments above may reflect a
similar lack of understanding, I'll reiterate:

Supporting at least version N+1 of the *software* along with version N is
clearly required to achieve on-line rolling upgrades.  What is not required
is to support (concurrently) multiple versions of the communication
protocols:  you can instead support only the old protocol set until all
nodes have been upgraded to version N+1, and then perform a cluster-wide
synchronous cut-over to the new protocol version.
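
To make the distinction concrete, here is a minimal sketch of the scheme
described above, assuming a single-process model of the cluster.  The names
(Node, upgrade_node, try_cutover) are hypothetical and not taken from any
existing cluster framework, and a real cut-over would need an agreement step
across the membership layer rather than a simple loop.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    software_version: int  # highest protocol version this node's software can speak
    active_protocol: int   # protocol version the node is currently using on the wire


def upgrade_node(node: Node, new_version: int) -> None:
    """Rolling-upgrade step: install new software, keep speaking the old protocol."""
    node.software_version = new_version


def try_cutover(cluster: list[Node], new_protocol: int) -> bool:
    """Switch all nodes to new_protocol, but only once every node can speak it.

    Assumption: the switch is modelled as atomic; in practice it would be a
    cluster-wide synchronous operation agreed on by all members.
    """
    if all(n.software_version >= new_protocol for n in cluster):
        for n in cluster:
            n.active_protocol = new_protocol
        return True
    return False


if __name__ == "__main__":
    cluster = [Node("a", 1, 1), Node("b", 1, 1), Node("c", 1, 1)]
    upgrade_node(cluster[0], 2)
    print(try_cutover(cluster, 2))  # one node upgraded: cut-over is refused
    for n in cluster[1:]:
        upgrade_node(n, 2)
    print(try_cutover(cluster, 2))  # all nodes upgraded: cut-over proceeds
```

At every point in this sketch only one protocol version is in use on the
wire, even though two software versions coexist while the upgrade is in
progress.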

If the cluster can be assumed to be under the control of a single
organization (in contrast to the nodes on the Internet, for example), then
requiring that all nodes be upgraded before the communication protocols
change is not prima facie unreasonable:  indeed, it is difficult to create
examples where a protocol-upgrade is urgently needed but only by some subset
of the cluster's members, since a cluster is a considerably more
tightly-bound entity than the nodes on the Internet.

- bill

>
> The conclusions I would draw from this are:
>
> The real world is ugly.
>
> Technology had better be equipped to deal with it.
>
> Sensible policies help you keep out of trouble.
>
> Policies are not guarantees.
>
> I view these as all perfectly consistent with what I said before.
>
> -- Alan Robertson
>    alanr@unix.sh
>
> Linux-cluster: generic cluster infrastructure for Linux
> Archive:       http://mail.nl.linux.org/linux-cluster/
>

