Re: re[2]: Clusterwide pids



Greg Freemyer writes:

>>>  PVM uses the CPID as an internal commodity of the virtual machine, and it
>>>  does mapping between the internal common PID and the real PID of the
>>>  machine. Anyway, it runs entirely in userland. I doubt that we could use
>>>  PVM code, but we can get the ideas.
>
> This seems pretty restrictive.  I gather PVM has a limit of 64,000
> processes across the entire cluster.  A 100 node cluster could only
> have 640 processes per node.

This works: 4000 compute nodes with 16 processes each.

Where I work, we pack 320 32-bit processors into a 9U space.
These units may be linked together with fiber, so 4 boxes
get you to 1280 nodes.

> I think any core/common cluster infrastructure modules should
> support at least 100 nodes, and hopefully a lot more than that.

Make that 1000 at least.

> Given the Alpha and the imminent arrival of the Itanium, I would
> prefer to see a 64 bit CPID with the first 32 bits for the node
> and the last 32 bits for the local pid.

Bad, and it won't happen.

BTW the "ps" code is mine.
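
For reference, the 64-bit layout proposed in the quoted text (node in the
upper 32 bits, local pid in the lower 32) can be sketched as below. This is
only an illustration; the names make_cpid, split_cpid, NODE_SHIFT and
LOCAL_MASK are hypothetical and do not come from PVM or the kernel.

```python
# Illustrative sketch of the quoted 64-bit CPID proposal; names are hypothetical.
NODE_SHIFT = 32
LOCAL_MASK = (1 << NODE_SHIFT) - 1  # low 32 bits hold the local pid


def make_cpid(node, local_pid):
    """Pack a node number and a node-local pid into one 64-bit value."""
    if not (0 <= node <= LOCAL_MASK and 0 <= local_pid <= LOCAL_MASK):
        raise ValueError("node and local_pid must each fit in 32 bits")
    return (node << NODE_SHIFT) | local_pid


def split_cpid(cpid):
    """Recover (node, local_pid) from a packed 64-bit value."""
    return cpid >> NODE_SHIFT, cpid & LOCAL_MASK
```

With this packing, pid 1234 on node 3 becomes 12884903122, and the largest
possible value needs 20 decimal digits.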

How about this: when a node boots, pass it a PID range. If a node
runs out of PIDs, it can ask for more. One need not contact the
global pool on every PID allocation; PIDs can be dished out in
blocks and only returned to the pool when they sit idle. PIDs that
belong to other nodes or the global pool can appear allocated to
most of the local code, perhaps with a new process state.
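
A minimal sketch of the block scheme described above, assuming a single
global pool and fixed-size contiguous blocks. The class and method names
(GlobalPidPool, Node, grant, give_back and so on) are hypothetical, and the
block size and the rule of always keeping one block are assumptions, not
part of the proposal.

```python
# Illustrative sketch only; every name and policy detail here is an assumption.
class GlobalPidPool:
    """Cluster-wide pool that hands out PIDs in contiguous blocks."""

    def __init__(self, pid_max, block_size):
        self.pid_max = pid_max
        self.block_size = block_size
        # Each free block is identified by its first PID; PID 0 is never issued.
        self.free_starts = list(range(1, pid_max + 1, block_size))

    def grant(self):
        if not self.free_starts:
            raise RuntimeError("cluster PID space exhausted")
        start = self.free_starts.pop(0)
        return range(start, min(start + self.block_size, self.pid_max + 1))

    def give_back(self, block):
        self.free_starts.append(block.start)
        self.free_starts.sort()


class Node:
    def __init__(self, pool):
        self.pool = pool
        self.blocks = [pool.grant()]  # PID range passed to the node at boot
        self.in_use = set()

    def looks_allocated(self, pid):
        # PIDs owned by other nodes or the pool appear allocated locally.
        owned = any(pid in block for block in self.blocks)
        return (not owned) or pid in self.in_use

    def alloc_pid(self):
        for block in self.blocks:
            for pid in block:
                if pid not in self.in_use:
                    self.in_use.add(pid)
                    return pid
        # Local blocks are full: only now is the global pool contacted.
        block = self.pool.grant()
        self.blocks.append(block)
        self.in_use.add(block.start)
        return block.start

    def free_pid(self, pid):
        self.in_use.discard(pid)

    def return_idle_blocks(self):
        # Give back blocks with no live PIDs, always keeping at least one.
        for block in list(self.blocks):
            idle = not any(pid in self.in_use for pid in block)
            if idle and len(self.blocks) > 1:
                self.blocks.remove(block)
                self.pool.give_back(block)
```

In this sketch a node talks to the pool once at boot, once each time its
local blocks fill up, and once when it hands an idle block back; ordinary
allocations stay local.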

Let's have an adjustable PID limit too. Right now it is 0xfffe if
I remember right. We could set the default to 9999, and let those
with large clusters change it to 999999 or more. This way you can
have all the PIDs you need without forcing small system users to
be annoyed with huge numbers.
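
The adjustable limit could look roughly like this. It is a sketch under the
assumption that PIDs are handed out by a wrapping counter; PidCounter,
DEFAULT_PID_MAX and the method names are hypothetical.

```python
# Illustrative sketch only; names and the wrap-around policy are assumptions.
DEFAULT_PID_MAX = 9999  # small default, so PIDs stay at four digits


class PidCounter:
    def __init__(self, pid_max=DEFAULT_PID_MAX):
        self.pid_max = pid_max  # large clusters could raise this, e.g. to 999999
        self.last = 0
        self.in_use = set()

    def next_pid(self):
        # Try each PID at most once, wrapping from pid_max back to 1.
        for _ in range(self.pid_max):
            self.last = self.last + 1 if self.last < self.pid_max else 1
            if self.last not in self.in_use:
                self.in_use.add(self.last)
                return self.last
        raise RuntimeError("out of PIDs; raise pid_max")

    def release(self, pid):
        self.in_use.discard(pid)
```

A small system keeps the default, while a large cluster would construct the
counter with a higher limit such as PidCounter(pid_max=999999).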


Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/