
Re: Clusterwide pids





 Hello, all!

On Tue, 10 Jul 2001, Greg Freemyer wrote:

> Definition:
> CPID - Clusterwide Process ID.  Guaranteed to be unique for each process across the entire cluster.
>
> (please advise if there is a pre-existing Acronym for this.)
>
> Assumptions:
> CPIDs are a useful concept and one that is used by several existing cluster solutions.
>
> Sophisticated features, such as transparent clusterwide IPC, seem to require CPIDs
>
> CPIDs may make process migration easier to implement. (I have no knowledge about this.)

 Not necessarily. In fact, the only fully transparent SSI migration
scheme running in production environments today, Mosix, has no common
PID space. There is also a patch available on the Internet that adds a
common PID space to Beowulf clusters.

 Personally, I find the PVM pid model great, and it would work on Mosix,
although it would break backwards compatibility for some user space
apps, like top. That could be solved by patching the most common system
apps, such as ps or top, but the Mosix people are not too worried about
what happens in user space (or at least that is how it looked to me).


> For various reasons, implementing them seems difficult in a 16-bit pid world.

 As an example, in Mosix we have a limit of 2**16 nodes. Add a 16-bit
pid to that and you need 32-bit pids. Possible, but common applications
like ps could behave strangely.

> Questions:
> Given that CPIDs seem to be difficult to fully implement and involve
> the kernel, but if accomplished can be widely used throughout the
> OpenSource cluster projects, should a common CPID project be initiated?

 In the past I saw a patch for the 2.0 series that provided a common PID,
but I lost the link. Anyway, some people would first ask whether it is a
good feature at all, since not everybody thinks that breaking backward
compatibility is a good idea.


> Bruce Walker relates that the SSI for Linux Clusters project has this
> working, and apparently PVM does as well.  Could either of these
> implementations, or some other existing implementation, be isolated and
> made available to the Linux Cluster community as a component?

 PVM uses the CPID as an internal commodity of the virtual machine, and it
maps between the internal common PID and the real PID of the machine.
Anyway, it runs entirely in userland. I doubt that we could use the PVM
code, but we can borrow the ideas.
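
 To make the mapping idea concrete, here is a minimal userland sketch in
C. It is not PVM code and does not follow PVM's real data structures; the
names cpid_map, cpid_map_add and cpid_map_find, and the fixed table size,
are assumptions made up for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAP_MAX 64  /* arbitrary table size, chosen only for this sketch */

/* One entry: a clusterwide id and the real process behind it. */
struct cpid_entry {
    uint32_t cpid;      /* clusterwide id handed out by the userland layer */
    uint16_t node;      /* node where the process really runs */
    int      real_pid;  /* PID as seen by that node's kernel */
};

struct cpid_map {
    struct cpid_entry entries[MAP_MAX];
    size_t   count;
    uint32_t next_cpid; /* next id to hand out; 0 is reserved for "none" */
};

static inline void cpid_map_init(struct cpid_map *m)
{
    assert(m != NULL);
    m->count = 0;
    m->next_cpid = 1;
}

/* Register a real process; returns its new clusterwide id, or 0 if full. */
static inline uint32_t cpid_map_add(struct cpid_map *m, uint16_t node,
                                    int real_pid)
{
    struct cpid_entry *e;

    assert(m != NULL);
    if (m->count == MAP_MAX)
        return 0;
    e = &m->entries[m->count++];
    e->cpid = m->next_cpid++;
    e->node = node;
    e->real_pid = real_pid;
    return e->cpid;
}

/* Translate a clusterwide id back to (node, real PID); NULL if unknown. */
static inline const struct cpid_entry *cpid_map_find(const struct cpid_map *m,
                                                     uint32_t cpid)
{
    size_t i;

    assert(m != NULL);
    for (i = 0; i < m->count; i++)
        if (m->entries[i].cpid == cpid)
            return &m->entries[i];
    return NULL;
}
```

 Note that two processes with the same real PID on different nodes get
different clusterwide ids, which is the whole point of the mapping.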

 The main idea would be to do a mapping, as PVM does. If you ask for a
16-bit PID, you get the local PID. If you ask for a 32-bit PID, you get
the global PID. This would mean that all the old applications keep
running, but it also means that we will need more kernel calls. As an
example, we keep:

 pid_t getpid(void);
 pid_t getppid(void);


 but we also provide:

 pid_t_32 getpid32(void);
 pid_t_32 getppid32(void);

 This would mean that the old apps will run, and we can develop new HP
applications that would make use of the new syscalls.


 For a non-HP kernel, the 16-bit and 32-bit calls would return the same
value (the two most significant bytes of the CPID set to 0).

 For an HP kernel, the two least significant bytes of the CPID are the
local PID, and the two most significant bytes are the node tag.
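
 As a rough sketch of that layout in C: pid_t_32 is assumed here to be an
unsigned 32-bit integer, and the helper names cpid_make, cpid_node,
cpid_local and cpid_is_local_only are hypothetical, they are not part of
any existing kernel interface.

```c
#include <stdint.h>

/* Assumption: the proposed 32-bit clusterwide PID type. */
typedef uint32_t pid_t_32;

/* Build a CPID: node tag in the two most significant bytes,
 * local PID in the two least significant bytes. */
static inline pid_t_32 cpid_make(uint16_t node, uint16_t local_pid)
{
    return ((pid_t_32)node << 16) | (pid_t_32)local_pid;
}

/* Extract the node tag (upper 16 bits). */
static inline uint16_t cpid_node(pid_t_32 cpid)
{
    return (uint16_t)(cpid >> 16);
}

/* Extract the local PID (lower 16 bits). */
static inline uint16_t cpid_local(pid_t_32 cpid)
{
    return (uint16_t)(cpid & 0xFFFFu);
}

/* On a non-HP kernel the node tag is 0, so the CPID and the
 * local PID have the same numeric value. */
static inline int cpid_is_local_only(pid_t_32 cpid)
{
    return cpid_node(cpid) == 0;
}
```

 With this layout, local PID 42 on node 3 becomes 0x0003002A, and an old
application that only looks at the lower 16 bits still sees 42.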

> Can the CPID module be written in such a way as to allow multiple
> cluster membership algorithms?

 I strongly doubt it, due to the problems with "live" new memberships.
Anyway, in all the cases that I know of in HP computing, there is no
"multiple cluster membership" concept. We used to use something like
"partitioning the cluster". I think it may be better not to allow
multiple memberships, and instead to allow partitions and groups of jobs,
in such a way that a node can be in more than one partition, and a job
from a group of jobs can only migrate inside its partition.
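
 The partition rule can be sketched as a simple check in C. This is only
an illustration under my own assumptions: partitions are numbered 0 to 31
and stored as a bitmask per node, and the names cluster_node, cluster_job
and can_migrate are invented for the example.

```c
#include <stdint.h>

/* A node may belong to several partitions: bit i set means
 * "member of partition i" (assumption: at most 32 partitions). */
struct cluster_node {
    uint16_t tag;
    uint32_t partitions;
};

/* A job (or every job of a group) belongs to exactly one partition. */
struct cluster_job {
    uint32_t cpid;
    unsigned partition;  /* 0..31 */
};

/* A job may migrate to a node only if that node is a member of
 * the job's partition. Returns 1 if allowed, 0 otherwise. */
static inline int can_migrate(const struct cluster_job *job,
                              const struct cluster_node *target)
{
    if (job->partition >= 32)
        return 0;
    return (int)((target->partitions >> job->partition) & 1u);
}
```

 A node with mask 0x3 is in partitions 0 and 1, so it accepts jobs from
both, while a node with mask 0x2 refuses a job from partition 0.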



 Yours:

David


---------------------------
     irbis@orcero.org
http://www.orcero.org/irbis
---------------------------


Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/