Re: A proposal for a General Clustering Framework
From: "David Brower" <David.Brower@oracle.com>
> I will confess to not understanding what Mr. Darcy has written
> here. I do not see how a point to point RPC from one node to another
> is any different than a single hop message in a ring.
They're different because RPC, by definition, involves a reply to indicate
completion (and possibly a result). Bear in mind that RPC is a general and
often-abused term; there are many kinds of RPC, and some RPC packages
contain non-RPC functionality. Whatever Sun RPC or ONC RPC or XML-RPC or
any other RPC actually looks like is not the issue; the generic category
"RPC" still represents certain concepts and communications models that many
see as non-complementary to clustering needs.
You could send RPCs around a ring, such that A calls B and B calls C and
when C calls A it completes the cycle so the replies all propagate along the
reverse path. There's no point, though. The hard part is setting up and
maintaining the logical topology anyway; once you've done that, using RPC
instead of simple messaging often doesn't buy anything. It just doubles
your message traffic - only linearly, true - and slows things down.
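To make the arithmetic concrete, here is a minimal sketch that just counts
messages for one trip around a three-node ring, first with one-way messages
and then with nested RPCs. The node names and both function names are
illustrative assumptions, not taken from any real RPC package.

```python
def one_way_ring(nodes):
    """Pass a message once around the ring; each hop is one message."""
    log = []
    for i, src in enumerate(nodes):
        dst = nodes[(i + 1) % len(nodes)]
        log.append(("msg", src, dst))
    return log


def rpc_ring(nodes):
    """Nested RPCs around the same ring: every call needs a reply,
    and the replies unwind along the reverse path."""
    log = []

    def call(i):
        src = nodes[i]
        dst = nodes[(i + 1) % len(nodes)]
        log.append(("call", src, dst))
        if i + 1 < len(nodes):
            call(i + 1)  # dst calls its own successor before replying
        log.append(("reply", dst, src))

    call(0)
    return log


nodes = ["A", "B", "C"]
print(len(one_way_ring(nodes)))  # 3
print(len(rpc_ring(nodes)))      # 6
```

The RPC version also holds every call open until the whole cycle has
completed, which is where the extra latency comes from.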
> I don't see where
> timeouts and retransmissions come into play
They come into play in several ways. Every protocol that one might use -
e.g. for membership or consensus - has a very particular set of requirements
regarding message reliability, ordering, etc. If you fail to meet those
requirements, the protocol might fail. What's less obvious is that
*exceeding* the protocol's requirements can be just as dangerous. For
example, a protocol might be able to tolerate a certain amount of message
loss, but not duplication. Add retransmission and you add a potential for
duplication, and you might well break your consensus protocol. That might
seem far-fetched to you, but I've had to spend days tracking down just such
requirements mismatches in production environments. A closely related
problem is when a too-ambitious low-level protocol constrains message
behavior in such a way as to force a higher-level protocol into its least
efficient paths. BTW, don't try to posit XID caches as the solution; they
introduce their own set of rather nasty problems.
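As a rough illustration of the loss-tolerant-but-not-duplicate-tolerant
case, consider a deliberately naive quorum check that counts messages rather
than distinct voters. Everything here (the tally rule, the five-node
cluster, the lost-ack scenario) is a hypothetical sketch, not a description
of any particular consensus protocol.

```python
def has_quorum(delivered, members):
    """Naive check: counts delivered votes, not distinct voters.
    Lost votes are harmless (no quorum yet); duplicates are not."""
    quorum = len(members) // 2 + 1
    return len(delivered) >= quorum


def deliver_with_retransmit(vote, ack_lost):
    """What the receiver sees for one vote. If the ack is lost, the
    sender retransmits and the receiver sees the same vote twice."""
    return [vote, vote] if ack_lost else [vote]


members = ["A", "B", "C", "D", "E"]  # quorum is 3

# Plain lossy transport: only A and B voted, so there is no quorum.
plain = ["A", "B"]
print(has_quorum(plain, members))  # False

# Retransmitting transport: B's ack was lost, so B's vote arrives twice.
retried = deliver_with_retransmit("A", ack_lost=False) \
        + deliver_with_retransmit("B", ack_lost=True)
print(has_quorum(retried, members))  # True, but only two nodes voted
```

The protocol was correct over the weaker transport and wrong over the
"stronger" one, which is the kind of requirements mismatch described above.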
> I don't understand what
> SPOF he thinks is present in an RPC.
The SPOF is not in the RPC itself, but in the topology that RPC -
indirectly - forces upon the system. What I was thinking of, specifically,
is the SPOF represented by the hub in a star topology.
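The hub problem can be sketched with a small reachability check: remove one
node and see whether the survivors can still talk to each other. The
topologies and the helper name below are assumptions chosen for
illustration only.

```python
def connected_after_failure(links, failed):
    """True if all surviving nodes can still reach each other."""
    nodes = {n for link in links for n in link} - {failed}
    if not nodes:
        return True
    # Build adjacency among survivors only.
    adj = {n: set() for n in nodes}
    for a, b in links:
        if a in nodes and b in nodes:
            adj[a].add(b)
            adj[b].add(a)
    # Depth-first search from an arbitrary survivor.
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    while stack:
        n = stack.pop()
        for m in adj[n] - seen:
            seen.add(m)
            stack.append(m)
    return seen == nodes


star = [("H", "A"), ("H", "B"), ("H", "C")]              # H is the hub
ring = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

print(connected_after_failure(star, "H"))  # False: the hub is a SPOF
print(connected_after_failure(star, "A"))  # True
print(all(connected_after_failure(ring, n) for n in "ABCD"))  # True
```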
> I think Mr. Darcy is making conclusions based on knowledge of some
> particular RPC systems.
I really wish you wouldn't make such assumptions. You don't know me or my
background at all. You have *no* reason to make assumptions about what I do
or do not know. I'm trying not to make similar assumptions about you,
except where your own words indicate presence or absence of specific
knowledge. You'd do well to at least consider the possibility that those
who disagree with you do so out of hard-won knowledge rather than ignorance.
> I still don't see it. Elections involve messages between
> voters; those messages could be RPCs. Maybe they want to
> be reliable/acknowledged messages, maybe not, depending on
> the election algorithm.
It's a chicken and egg problem. Before you can elect a leader, you have to
determine membership. If your membership algorithm relies on already having
a leader elected to act as a coordinator or communications hub, you'll never
be able to bootstrap the whole thing into existence. The relationships
between the protocols used in a cluster are numerous and subtle. Messaging
code is affected by membership changes, which should be coordinated with
event handling to allow graceful exit of a node; the event handling depends
on consensus algorithms, which in turn depend on membership, and so on. The
essence of the problem is that the best-known and most reliable algorithms
for these tasks are based on pure messaging. Adding RPC to this volatile
mix doesn't really solve anything, and accommodating RPC's different
"expectations" and behaviors just adds difficulty to an already-difficult
problem set.
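One way to see the bootstrap problem is to write the dependencies down and
try to order them. The sketch below uses Python's standard `graphlib`
(3.9+); the two dependency maps are simplified assumptions for
illustration, not a model of any real cluster stack.

```python
from graphlib import TopologicalSorter, CycleError


def bootstrap_order(depends_on):
    """Return a valid start-up order, or None if the dependencies
    form a cycle and nothing can start first."""
    try:
        return list(TopologicalSorter(depends_on).static_order())
    except CycleError:
        return None


# Membership needs an elected coordinator, but election needs membership.
leader_based = {
    "membership": {"leader_election"},
    "leader_election": {"membership"},
}

# Membership built directly on plain messaging; election sits on top.
message_based = {
    "messaging": set(),
    "membership": {"messaging"},
    "leader_election": {"membership"},
}

print(bootstrap_order(leader_based))   # None
print(bootstrap_order(message_based))
# ['messaging', 'membership', 'leader_election']
```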
> It wasn't intended to be
> anything but a prod to do more research before concluding
> RPC systems are inappropriate.
And I'm hereby prodding you to do more research before concluding RPC *is*
appropriate. Theories about how things "should" work don't cut much ice
when several of the people in the group have been involved in making systems
that *actually* work and those people seem pretty uniformly skeptical about
the applicability of RPC.
Linux-cluster: generic cluster infrastructure for Linux
Archive: http://mail.nl.linux.org/linux-cluster/