
Re: available resource declaration language(s)



"David L. Nicol" wrote:
> 
> Various resources that can be shared; some are architecture-dependent
> (idle CPU) and some are independent (extra memory for network-speed
> swap device, available hard disk for balanced SAN) and some, such as
> PVM-based distribution, are not commoditizable outside of their own terms.
> (PVM requires the worker nodes have functioning C compilers, for compiling
> the pieces that will run on themselves, once.)
> 
> Could the way the clustered machines find out about each other be
> standardized?

This question can be read in several ways.  I read it one way the first
time I saw it, and now, after rereading your article, I read it another way.

In high-availability clusters, the emphasis is on longer-term (more static)
knowledge about the systems and their resources.  This kind of knowledge has
a lifetime of days, months or years.  The term "resource" in an HA cluster
typically has this meaning.  A resource might be an IP address, or a web
server, or a NIC.

For load balancing purposes (which appears to be the purpose of this query),
one needs more dynamic information.
 
> Mosix uses a peer-to-peer architecture in which each node periodically
> queries a peer selected at random from its list of peers; what architectures
> do other projects use?

[I'll offer an answer to this below]

> Has anyone done any serious simulations of the efficiency of various discovery
> methods? For instance, it is easy to imagine a virtual ring architecture in
> which each node shares everything it knows about all other nodes in a larger
> packet which is sent around the ring and a node can only initiate a resource
> request when it has the token, for instance; or broadcast-based architectures
> in which a node advertises its surplus resources with a periodic broadcast packet,
> and nodes wishing to use the resource would begin a negotiation.

I would suggest that one should try to be agnostic as to "how" this
information is collected/propagated in the cluster, and provide programs
that need this kind of information an API which would allow one to obtain
the information from the cluster in a uniform manner regardless of how it is
collected.  This API is much more important than any particular mechanism
which supports it.

Then one would not be wed to a particular architecture, but could in fact
use one of several different methods depending on other factors, the results
of current research, etc., without making life hard on customers.  I'll get
up on my soap box at a later date on this score.
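
As a rough illustration of that idea (this is not code from heartbeat, and
every class and function name below is made up for the example), a consumer
such as a load balancer could be written against a small neutral interface,
with the collection mechanism plugged in behind it:

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Iterable


class ClusterInfoSource(ABC):
    """Hypothetical neutral interface: callers never see how data is gathered."""

    @abstractmethod
    def load_by_node(self) -> Dict[str, float]:
        """Return the latest known 1-minute load average for each node."""


class BroadcastSource(ClusterInfoSource):
    """Stand-in for a backend fed by periodic broadcast/multicast packets."""

    def __init__(self) -> None:
        self._latest: Dict[str, float] = {}

    def on_packet(self, node: str, load1: float) -> None:
        # Called by the (not shown) receive loop whenever a packet arrives.
        self._latest[node] = load1

    def load_by_node(self) -> Dict[str, float]:
        return dict(self._latest)


class PeerQuerySource(ClusterInfoSource):
    """Stand-in for a backend that asks peers directly when queried."""

    def __init__(self, query_peer: Callable[[str], float],
                 peers: Iterable[str]) -> None:
        self._query_peer = query_peer
        self._peers = list(peers)

    def load_by_node(self) -> Dict[str, float]:
        return {peer: self._query_peer(peer) for peer in self._peers}


def least_loaded(source: ClusterInfoSource) -> str:
    """Pick a target node; works unchanged with either backend."""
    loads = source.load_by_node()
    if not loads:
        raise LookupError("no nodes known")
    return min(loads, key=loads.get)
```

Here least_loaded() is the only code a "customer" would write, and it does
not change when one backend is swapped for the other.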

Now, having made an argument for having an implementation-neutral API for
accessing the information...  Here's what my code actually does...

Heartbeat (my low-level cluster membership/communication layer) sends
multicast keep-alive (heartbeat) packets every second or so.  These packets
are ASCII name/value pairs.  One of the values sent in every packet is the
content of /proc/loadavg.

The scheme is quite flexible, and one could add other information to each
packet quite easily.  These heartbeat packets are currently a bit larger
than 150 bytes each, including this information and the digital signature. 
Making them 250 bytes each would not be a significant extra burden - even in
a large cluster.
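
To make the packet idea concrete, here is a minimal sketch of signed ASCII
name/value packets.  The "name=value" line layout, the field names, the
"auth" trailer and the choice of HMAC-SHA1 are all assumptions made for the
example; this is not heartbeat's actual wire format or signing scheme.

```python
import hashlib
import hmac
from typing import Dict


def read_loadavg(path: str = "/proc/loadavg") -> str:
    """Return the raw load-average line, or a placeholder if /proc is absent."""
    try:
        with open(path, "r", encoding="ascii") as f:
            return f.read().strip()
    except OSError:
        return "0.00 0.00 0.00 0/0 0"


def encode_packet(fields: Dict[str, str], key: bytes) -> bytes:
    """Serialize fields as name=value lines and append a signature line.

    Assumes names contain no '=' and values contain no newline.
    """
    body = "".join(f"{name}={value}\n" for name, value in fields.items())
    sig = hmac.new(key, body.encode("ascii"), hashlib.sha1).hexdigest()
    return (body + f"auth={sig}\n").encode("ascii")


def decode_packet(packet: bytes, key: bytes) -> Dict[str, str]:
    """Verify the signature line, then parse the name=value lines."""
    text = packet.decode("ascii")
    body, _, sig = text.rpartition("auth=")
    expected = hmac.new(key, body.encode("ascii"), hashlib.sha1).hexdigest()
    if not hmac.compare_digest(sig.strip(), expected):
        raise ValueError("bad or missing signature")
    return dict(line.split("=", 1) for line in body.splitlines())
```

Adding another piece of information is then just one more entry in the
dictionary passed to encode_packet(), for example
{"src": "node1", "load": read_loadavg()}.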

There is a paper on heartbeat design here:
	http://linux-ha.org/comm/HBdesign.pdf

and a talk on its APIs here:
	http://linux-ha.org/heartbeat/LWCE-NYC-2001/

The heartbeat cluster membership/communications layer is not specific to
high-availability clusters, but can in fact serve other types of clusters
as well.  This is why I send the load information around, even though it
isn't of any particular use to a straight failover cluster.

	-- Alan Robertson
	   alanr@unix.sh

Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/