MOSIX objectives



Let me present MOSIX's view of clustering objectives:

The early versions of MOSIX, many years ago, consisted of a true
Single-System Image: although each process had its initial root,
there was one file-system, connecting the roots via the "/.../" directory,
where each node's file-systems could be accessed via "/.../m{node}/".
The "stat"/"fstat" system-calls were extended to provide the node-number
(as well as the device and inode numbers), there was no home-node, and by
using "chroot" one could completely disassociate from any particular
node.  The process-IDs were 31 bits, to allow for 15-bit node-numbers.
The "sync" system-call presented some problems as well.
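
To make the arithmetic behind the 31-bit process-IDs concrete, here is a
minimal sketch. Only the two widths (31 and 15 bits) come from the
description above; placing the node-number in the high bits, using the
remaining 16 bits for a per-node process-ID, and all of the names are
assumptions made purely for illustration.

```python
NODE_BITS = 15                      # from the text: 15-bit node-numbers
PID_BITS = 31                       # from the text: 31-bit process-IDs
LOCAL_BITS = PID_BITS - NODE_BITS   # 16 bits left over (assumed per-node pid)


def pack_pid(node, local_pid):
    """Combine a node-number and a per-node pid (hypothetical layout)."""
    if not 0 <= node < (1 << NODE_BITS):
        raise ValueError("node-number does not fit in 15 bits")
    if not 0 <= local_pid < (1 << LOCAL_BITS):
        raise ValueError("local pid does not fit in 16 bits")
    # Assumed layout: node-number in the high bits, local pid in the low bits.
    return (node << LOCAL_BITS) | local_pid


def unpack_pid(pid):
    """Split a packed pid back into (node, local_pid)."""
    return pid >> LOCAL_BITS, pid & ((1 << LOCAL_BITS) - 1)
```

With this layout the largest possible value is 2^31 - 1, so a process-ID
still fits in a signed 32-bit integer.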

The above design required massive changes to the Unix kernel (about 60%
of the kernel code was modified) as well as some changes to user-mode
code, after which the whole user-mode source-tree had to be recompiled.

This was still possible in the days of Sys-V.2, where all the utilities
took about 10MB. Today nobody would imagine doing this again for all the
Terabytes of Linux user-mode code.

Yes - it is possible to design a nice Unix-like SSI operating-system,
but it wouldn't be Linux, and someone would have to review and possibly
modify all user-land applications. The effect on the kernel would also
be far more massive and, without main-stream drivers to rely on,
someone would have to keep up with the hardware-drivers for about 1000
different devices, buses, chipsets, memory, APM, IRQs, etc.
Unless one is happy to run their applications on Sys V.2, only Linux
can carry this weight.

In 1992, we "relaxed" the idea of SSI in favor of remaining 100% compatible
(source and binary) with the underlying operating-system. The "new MOSIX"
is based on the "home-model" in which all the user's processes are connected
to and are seen as if they run on the home node. The result is a new MOSIX
kernel architecture which requires modifications of no more than 5% of the
kernel. It attempts to provide SMP functionalities in a scalable cluster.
In the first stage we developed a set of algorithms for efficient management
of the cluster-wide resources by process migration. Other projects that we
intend to develop include DSM and migratable sockets.

As for High-Availability, we think that it is a good idea - but not the
responsibility of the kernel.  It is best done in user-mode, but if
someone comes up with a good user-mode scheme that only requires a bit
of kernel assistance, we will happily try to help provide that support.

Similarly, we look at PVM, MPI and Beowulf as good tools for those
who care to invest more in programming.  They may provide improved
I/O (although MOSIX is closing the gap with DFSA) and initial-assignment
of very short-lived processes.  This is not in contradiction with
dynamic process-migration taking care of further adjustments.  In
fact, initial-assignment can be made to a set of fully-fledged nodes,
while the load can then be distributed to include a larger number of
diskless nodes.
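
The two-stage idea can be sketched as a toy model: processes are first
placed on the fully-fledged nodes, then a greedy loop moves one process
at a time from the most loaded node to the least loaded one, diskless
nodes included. This is not the MOSIX algorithm; the round-robin
placement, the greedy rule, the node names and the load figures are all
invented for illustration.

```python
def initial_assignment(loads, full_nodes):
    """Round-robin placement on the fully-fledged nodes (assumed policy)."""
    placement = {node: [] for node in full_nodes}
    for i, load in enumerate(loads):
        placement[full_nodes[i % len(full_nodes)]].append(load)
    return placement


def rebalance(placement, extra_nodes):
    """Greedily migrate processes until no single move reduces the imbalance."""
    nodes = {name: list(procs) for name, procs in placement.items()}
    for name in extra_nodes:
        nodes.setdefault(name, [])      # diskless nodes start out empty
    while True:
        busiest = max(nodes, key=lambda n: sum(nodes[n]))
        idlest = min(nodes, key=lambda n: sum(nodes[n]))
        gap = sum(nodes[busiest]) - sum(nodes[idlest])
        # A move only helps if the process is smaller than the gap;
        # zero-load processes are skipped so the loop always terminates.
        movable = [p for p in nodes[busiest] if 0 < p < gap]
        if not movable:
            return nodes
        proc = max(movable)
        nodes[busiest].remove(proc)
        nodes[idlest].append(proc)


# Hypothetical cluster: two fully-fledged nodes, two diskless nodes.
start = initial_assignment([3, 3, 2, 2, 1, 1], ["a", "b"])
final = rebalance(start, ["c", "d"])
```

The initial placement knows nothing about the diskless nodes; only the
later migration step spreads the load onto them.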

There are many ways to use MOSIX: the kernel provides many flexible
alternatives on which higher-level schemes can be built, with automatic,
semi-automatic or manual process-migration.

Amnon Shiloh -- the HUJI MOSIX group.


Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/