Re: A proposal for a General Clustering Framework
Lars Marowsky-Bree wrote:
>
> Hi guys,
>
> Let me offer a different perspective here.
>
> Do the nodes running two different versions of the protocol really have to be
> able to talk to each other? The discussion appears to assume that the answer
> to this question is "Yes".
The question was more of this form:
Should the framework provide the capability which would enable two different
versions of software potentially running different versions of protocols to
communicate sensibly with each other without a lot of work?
It was not a blanket statement regarding whether any two different versions
of any particular protocol be able to talk together at any particular point
in time.
> Now, I am going to explain to you why I think the answer should in fact be
> "No".
>
> If you are doing an upgrade which changes APIs (i.e., protocol version change,
> new attributes), you are asking for the software to bridge a potentially huge
> communication gap.
>
> Sure, a new attribute might be filled in automatically by defaulting it to a
> sensible value - this is "easy enough" if it is static and independent, but
> you have to embed complex logic if it is in fact related to other attributes.
> It becomes a nightmare if you aren't adding or deleting an attribute, but
> _changing_ the meaning of an attribute, potentially because the old code had a
> bug which treated it incorrectly. This different understanding of the
> parameters can in fact lead to a quite-non-HA cluster.
If you're designing an HA protocol and you want this to work in your
favor, it is incumbent upon you to make sensible protocol changes that are
easy to bridge across versions. If you make stupid decisions, you can make
every single version change hard. Sometimes the result is just hard, and
that's how it is, and you have to shut the cluster down.
For example, in heartbeat, I changed the message format entirely early on.
The new nodes could never communicate with the old ones. Since that time, it
hasn't happened again. It will likely (but not certainly) happen again when
we change to the XML-RPC view of the world.
The intent of this proposal is to make the easy things easy... The hard
things are still hard. Fortunately, with some thought, the easy cases
vastly outnumber the hard ones.
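
As a rough illustration of what an "easy" case can look like, here is a
minimal sketch in Python. It is not heartbeat's code or any existing
framework's API: the function names, attribute names, and version numbers
are all hypothetical. Two nodes pick the highest protocol version they both
support, and a message from an older node has newer attributes filled in
with static defaults. It only covers the static, independent attributes;
changed semantics remain a hard case.

```python
from typing import Optional

# Hypothetical static defaults for attributes added in later protocol
# versions. Names and version numbers are illustrative assumptions only.
DEFAULTS_BY_VERSION = {
    2: {"priority": 0},
    3: {"deadtime_ms": 10000},
}


def negotiate(local: range, remote: range) -> Optional[int]:
    """Return the highest protocol version both nodes support, or None."""
    common = set(local) & set(remote)
    return max(common) if common else None


def upgrade_message(msg: dict, from_version: int, to_version: int) -> dict:
    """Fill in attributes introduced after from_version with their defaults.

    Attributes already present in the message are left untouched.
    """
    out = dict(msg)
    for version in range(from_version + 1, to_version + 1):
        for key, value in DEFAULTS_BY_VERSION.get(version, {}).items():
            out.setdefault(key, value)
    return out
```

When negotiate() returns None, you are in the hard case: the nodes have no
common version, and shutting the cluster down (or running a separate one)
is what is left.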
> So in fact, it is desirable that the answer is "No" to reduce complexity.
>
> So, _can_ the answer be "No"? Yes.
>
> What upgrading a node to a new software release effectively does is manually
> partition the cluster - into the nodes which already have the new software
> and those which do not.
Why would an administrator (customer) *want* to do this? That's not what I
want when I install *my* new software ;-). If it's a 2-node system (the
most common kind), you no longer have quorum and the cluster stops. Worse
yet, you could wind up with two clusters, *each* of which has quorum...
For example, Paddy understands FailSafe's inability to do this as
being a problem. IBM's HA-CMP has this feature. Heartbeat has this
feature.
Wombat tells me that marketing comes in from time to time and has some
particular sale they're going to lose if they can't upgrade seamlessly from
version X to version Y. Then someone makes the decision and developers
scramble so that the customer gets what they want. This seems to be
reality. This proposal makes reality easier to stomach.
If you have a 101-node cluster, you have to downgrade it to a cluster of
significantly less capacity (>= 51 nodes). This seems like a lot of loss
of capacity... If the new cluster doesn't work, you won't find out until the
new cluster achieves quorum, and then if you have problems, you have 51
nodes to back out... This is equivalent to a 50/51-node flash cut at the
moment the "old" cluster loses quorum. You're very vulnerable at that
moment...
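
The arithmetic behind those numbers, as a small sketch. It assumes a plain
majority quorum with no tie-breaker, which may not match what any particular
cluster manager does; the function names are made up for illustration.

```python
def majority(total_nodes: int) -> int:
    """Smallest member count that is a strict majority of total_nodes."""
    return total_nodes // 2 + 1


def has_quorum(members: int, total_nodes: int) -> bool:
    """True if a partition with this many members holds a strict majority."""
    return members >= majority(total_nodes)


def upgrade_partitions(total_nodes: int, upgraded: int) -> tuple:
    """Quorum status of the (old, new) partitions after `upgraded` nodes
    have moved to the new software and can no longer talk to the old ones."""
    old = total_nodes - upgraded
    return has_quorum(old, total_nodes), has_quorum(upgraded, total_nodes)
```

With 101 nodes the old partition keeps quorum until the 51st node is
upgraded, at which point quorum flips to the new partition in one step.
With 2 nodes, upgrading one of them leaves neither side with a majority.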
If you *want* to take this approach, you can define the cluster to talk over
a different multicast group, or a different port or whatever and leave
resources offline until you're ready to switch over.
My experience in database upgrades indicates that about 9/10 of the upgrades
are easy ones. For the rest, you can always fall back on the separate-cluster
approach for dealing with incompatible versions.
It is probably the case that group services and/or the RPC system should
deal with this - and allow the application some flexibility in solving it.
-- Alan Robertson
alanr@unix.sh
Linux-cluster: generic cluster infrastructure for Linux
Archive: http://mail.nl.linux.org/linux-cluster/