
Re: cluster list




On 02.28 Rob Cermak wrote:
> 
> The question posed by the kernel group:
>   Is there a way to add to the existing monolithic kernel to
>   satisfy the needs of these groups?  Common API to handle
>   process, memory, network sharing in cluster arrangements.
> 
> It would be nice if there was a combination of kernel modules
> and user-space tools not requiring a whole hip replacement. 
> 

First of all, I have to say that I do not know much about kernel
internals. I work in realistic image synthesis; I have written threaded
programs for SMP shared-memory boxes, worked with message-passing packages,
and worked a little with things like POE on the SP2. My main interest is in
getting a cluster built from low-end boxes (low-end relative to big
multiprocessor machines, e.g. some 2-way PC boards) linked by 100 Mb Ethernet
with its own switch. University budgets do not leave much room to dream of
64-way SGI or Sun nodes.

As everybody says, all that can be done in user space should be done that
way.

But many things that all the packages do would be faster if done in
kernel space, and some have to be done in the kernel if you want
certain types of clustering.

For example, PVM and MPI configure clusters at user level, but if you want
to use DSM or NUMA (with one level being another node), the kernel has to
move processes or data, so the kernel needs to know about the cluster.

I think the first thing that should be analyzed (as someone posted previously)
is how each package defines the node groups that make up a cluster, and then
provide a common interface available to all of them. Each package has its own
/etc/nodes.cfg or similar.
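
As a rough illustration of what a common interface could mean from user
space: the sketch below reads a host list in a made-up 'host[:cpus]' line
format and returns uniform records. The format and the names Node and
parse_hostfile are assumptions for the example only, not what PVM or MPI
actually use.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One cluster member, independent of which package listed it."""
    host: str
    cpus: int = 1


def parse_hostfile(text):
    """Parse lines of the hypothetical form 'host[:cpus]'.

    Blank lines and anything after '#' are ignored.
    """
    nodes = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        host, _, cpus = line.partition(":")
        nodes.append(Node(host.strip(), int(cpus) if cpus.strip() else 1))
    return nodes
```

Each package would then need only a small parser of its own that produces
the same Node records.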

It would be nice to have something like
/cluster/node/0/ip
                mem
                bogomips
/cluster/node/1/ip
..
/cluster/node/self -> 1
..
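
A tool could then read that tree with nothing more than ordinary file
calls. This is only a sketch, assuming the layout above exists as plain
files and that 'self' is a symlink; no such /cluster tree exists today,
and the function names read_cluster and local_node are invented for the
example.

```python
import os


def read_cluster(root="/cluster"):
    """Return {node_id: {attribute: value}} from <root>/node/<id>/<attr>."""
    node_dir = os.path.join(root, "node")
    nodes = {}
    for entry in sorted(os.listdir(node_dir)):
        path = os.path.join(node_dir, entry)
        # 'self' is a symlink to the local node's entry, so skip it here.
        if os.path.islink(path) or not os.path.isdir(path):
            continue
        attrs = {}
        for name in sorted(os.listdir(path)):
            with open(os.path.join(path, name)) as f:
                attrs[name] = f.read().strip()
        nodes[entry] = attrs
    return nodes


def local_node(root="/cluster"):
    """Return the node id that <root>/node/self points to."""
    return os.path.basename(os.readlink(os.path.join(root, "node", "self")))
```

Every package could then get its node list from the same place instead of
parsing its own config file.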

Also consider that the nodes in a cluster may even be diskless. My ideal
cluster would be a root NFS server with nodes booting over Ethernet, and two
internal networks: one for 'housekeeping' (NFS, etc.) and another for mp, say
message passing, process migration or page requests. Then read/write data
access and control traffic can overlap.

-- 
J.A. Magallon                                                      $> cd pub
mailto:jamagallon@able.es                                          $> more beer

Linux werewolf 2.4.2-ac6 #1 SMP Wed Feb 28 01:53:51 CET 2001 i686


Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/