Re: A proposal for a General Clustering Framework
> Clearly whatever we do will BY DEFINITION be a significant extension over
> the XML-RPC spec.
>
> 1) we won't support HTTP
why not?
> 2) we will allow calls without explicit replies (event notification)
> 3) we need to indicate which domain the message is within. By that I mean
> which application program on the destination end will receive the
> call...
that is just another argument; no extension needed.
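A minimal sketch of that point, assuming Python's standard xmlrpc.client
module. The method name "node.notify", the domain string "membership",
and the helper names are hypothetical and only there for illustration:
the domain simply rides along as the first parameter of an ordinary call.

```python
import xmlrpc.client


def encode_call(domain, method, *args):
    # The destination "domain" is carried as the first ordinary parameter,
    # so the XML-RPC wire format itself needs no extension.
    return xmlrpc.client.dumps((domain,) + args, methodname=method)


def decode_call(payload):
    # The receiver peels the domain off the front and dispatches on it.
    params, method = xmlrpc.client.loads(payload)
    return params[0], method, params[1:]


payload = encode_call("membership", "node.notify", "node7", 42)
domain, method, rest = decode_call(payload)
print(domain, method, rest)
```

The receiving end would use the decoded domain to pick which application
program gets the call.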
> This is kinda cool...
>
> I also thought of a third extension:
>
> 3) Allow calls to multiple nodes in the cluster. There would be another
> parameter for the list of machines in the cluster to be called, and one for
> a timeout value. The caller would be notified once all the replies came in,
> or timeouts had occurred, or (optionally) cluster membership had changed...
this is an api, not an encoding/message issue. the target list for such
an api is a bunch of transport endpoints (host:port) that you expect to
be able to process the messages; the api would send the same encoded
message to all of them.
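Roughly, such an api could look like the sketch below. This is only an
illustration under assumptions: call_all and the send callable are
hypothetical names, and the transport (whatever actually delivers the
payload to a host:port endpoint) is passed in rather than specified.
Cluster-membership-change notification is left out.

```python
import concurrent.futures as cf


def call_all(endpoints, payload, send, timeout):
    """Send the same encoded payload to every endpoint.

    endpoints: list of "host:port" strings
    send:      callable(endpoint, payload) -> reply (the transport; assumed)
    timeout:   seconds to wait for all replies
    Returns {endpoint: ("reply", value) | ("error", exc) | ("timeout", None)}.
    """
    pool = cf.ThreadPoolExecutor(max_workers=max(1, len(endpoints)))
    futures = {pool.submit(send, ep, payload): ep for ep in endpoints}
    done, not_done = cf.wait(futures, timeout=timeout)

    results = {}
    for fut in done:
        ep = futures[fut]
        try:
            results[ep] = ("reply", fut.result())
        except Exception as exc:  # transport failure counts as an outcome too
            results[ep] = ("error", exc)
    for fut in not_done:
        results[futures[fut]] = ("timeout", None)

    # Do not block the caller on stragglers that missed the deadline.
    pool.shutdown(wait=False)
    return results
```

The caller gets one result per endpoint once all replies are in or the
timeout has passed, which is the notification described in the quoted
proposal.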
Someone else complained about the byte size of the encoding. I think
we need to get over this kind of issue. I don't believe the bandwidth
utilization is particularly relevant anymore. The 3X factor in message
size isn't what will kill you in a cluster; it is the algorithms
affecting the aggregate number of messages. And it's often not the
heartbeat that goes kerblooey, it's often the membership determination
steps that go off the deep end. So, I'm saying, don't worry about the
XML encoding size.
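A back-of-the-envelope illustration of why message count matters more than
encoding size. The two schemes compared here are assumptions chosen for the
arithmetic, not anything proposed in this thread: a ring where each node
sends one message per round, and an all-to-all exchange where every node
sends to every other node.

```python
def ring_messages(n):
    # Hypothetical ring scheme: each node sends one message per round.
    return n


def all_to_all_messages(n):
    # Hypothetical all-to-all exchange: every node sends to every other node.
    return n * (n - 1)


for n in (4, 16, 64):
    ratio = all_to_all_messages(n) / ring_messages(n)
    print(f"{n} nodes: ring={ring_messages(n)}, "
          f"all-to-all={all_to_all_messages(n)}, ratio={ratio:.0f}x")
```

The 3X encoding overhead is a constant factor, while the ratio between the
two schemes is n - 1 and keeps growing with cluster size.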
Michael Brown wrote:
>XMLRPC IS NOT appropriate for
> 1) low-level heartbeat: SendHeartbeat()
> 2) cluster manager stuff/membership services:
> 3) actually performing stonith: StomithKillNode()
and Wombat expressed reluctance to use RPC at the low levels.
I disagree with this, but maybe because I'm playing looser with
XML-RPC as a concept than he is. I see no reason not to use the
encoding, and little reason not to allow its actual use for just
those things, even over http. It may not be efficient, but it does
allow configurations that are difficult otherwise -- such as geoclusters
with heartbeats tunneling through firewalls.
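For what it's worth, a heartbeat over XML-RPC over http is only a few lines
with Python's standard library. This is a minimal sketch, not a proposed
implementation: the SendHeartbeat name comes from the quoted list above, but
its (node, seq) signature, the last_seen table, and the loopback address are
all assumptions.

```python
import threading
import time
import xmlrpc.client
from xmlrpc.server import SimpleXMLRPCServer

last_seen = {}  # node name -> (sequence number, receive time); assumed layout


def SendHeartbeat(node, seq):
    # Receiver side: record that the node was alive at this moment.
    last_seen[node] = (seq, time.time())
    return True


# Bind to an ephemeral loopback port so the sketch runs anywhere.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(SendHeartbeat)
threading.Thread(target=server.serve_forever, daemon=True).start()

# Sender side: a plain http POST, which is what can pass through a firewall
# or proxy that already permits web traffic.
host, port = server.server_address
peer = xmlrpc.client.ServerProxy(f"http://{host}:{port}/")
ok = peer.SendHeartbeat("node7", 1)
print(ok, last_seen["node7"][0])

server.shutdown()
server.server_close()
```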
Wombat is right on another point: the ordering needed during
cluster transition (barrier services) isn't well served by an
event system. I think that is an application for RPC.
Now, it is the case that I'm comfortable with RPC systems, having
built a CORBA ORB and gotten used to thinking that way. My
impression is that many cluster people (including those I
work with) aren't adequately comfortable or familiar with
RPC systems, and tend to think in terms of sending raw structures
as messages over a channel to a homogeneous system running the
same code at the same version on the other end. (And usually
over a private network so security issues can be punted).
BTW, here are some links to IBM references, including events.
Here is a good high-level picture:
http://www.research.ibm.com/dss/html/phoenix_ext.html
IBM High Availability Cluster Multi-Processor (HACMP) Software
http://www.rs6000.ibm.com/software/Apps/hacmp/
http://www.rs6000.ibm.com/resource/technology/ha42ov.html
IBM RS6000 Cluster Technology (RSCT)
IBM RS6000 Parallel System Support Program (PSSP)
Boot Info -- System Data Repository (SDR)
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/ssp/cmdsv2/spt1mst07.html
Topology Services
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/ssp/admin/spa1mst35.html
Group Services
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/ssp/admin/spa1mst36.html
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/ssp/grpsvcs/csg1mst.html
Event Management
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/ssp/admin/spa1mst37.html
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/ssp/evmgt/cse1mst02.html
Managing Shared Disks
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/ssp/mngdisks/msd1mst02.html
General Parallel File System (GPFS)
http://www.rs6000.ibm.com/doc_link/en_US/a_doc_lib/sp/gpfs/install_admin/gpfs1mst02.html
-dB
Linux-cluster: generic cluster infrastructure for Linux
Archive: http://mail.nl.linux.org/linux-cluster/