From owner-linux-cluster@nl.linux.org Tue Feb 27 23:10:37 2001
Received: by humbolt.nl.linux.org id <S92269AbRB0WK0>;
	Tue, 27 Feb 2001 23:10:26 +0100
Received: from dsl081-067-005-sfo1.dsl-isp.net ([64.81.67.5]:47374 "EHLO
        renegade") by humbolt.nl.linux.org with ESMTP id <S92303AbRB0WJv>;
	Tue, 27 Feb 2001 23:09:51 +0100
Received: from zbrown (helo=localhost)
	by renegade with local-smtp (Exim 3.12 #1 (Debian))
	id 14Xs6b-00030x-00; Tue, 27 Feb 2001 13:56:25 -0800
Date:   Tue, 27 Feb 2001 13:56:25 -0800 (PST)
From:   Zack Brown <zbrown@tumblerings.org>
X-Sender: zbrown@renegade
Reply-To: Zack Brown <zbrown@tumblerings.org>
To:     "David L. Nicol" <david@kasey.umkc.edu>
cc:     linux-cluster@nl.linux.org, riel@conectiva.com.br,
        viro@math.psu.edu,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Will Mosix go into the standard kernel?
In-Reply-To: <3A9C1A3A.8BC1BCF2@kasey.umkc.edu>
Message-ID: <Pine.LNX.3.96.1010227134555.780R-100000@renegade>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

Do the Mosix folks have anything to add about possible integration into the
kernel? (should have cced them earlier, but it slipped my mind)

 On Tue, 27 Feb 2001, David L. Nicol wrote:

> Zack Brown wrote:
> > 
> > Just curious, are there any plans to put Mosix into the standard kernel,
> > maybe in 2.5, so folks could just configure it and go? it seems that the
> > number of people with more than one computer might make this a feature many
> > would at least want to try, especially if it was available as an option by
> > default. Is there anything in the Mosix folks' implementation that would
> > prevent this?
> 
> I'm not a knowledgeable person, but I've been following Mosix/beowulf/? for
> a few years and trying to keep up.
> 
> I've thought that it would be good to break up the different clustering
> frills -- node identification, process migration, process hosting, distributed
> memory, yadda yadda blah, into separate bite-sized portions.  
> 
> Centralization would be good for standardizing on what /proc/?/?/? you read to
> find out what clusters you are in, and what is your node number there.  There
> is a lot of theoretical work to be done.
> 
> Until then, I don't expect to see the Complete Mosix Patch Set available
> from ftp.kernel.org in its current form, as a monolithic set that does many things,
> including its Very Own Distributed File System Architecture.
> 
> If any of the work from Mosix will make it Into The Standard Kernel it will be
> by backporting and standardization.
> 
> 
> Is there a good list to discuss this on?  Is this the list?  Which pieces of
> clustering-scheme patches would be good to have? 
> 
> I think a good place to start would be node numbering.
> 
> The standard node numbering would need to be flexible enough to have one machine
> participating in multiple clusters at the same time.
> 
> /proc/cluster/....	this would be standard root point for clustering stuff
> 
> /proc/mosix would go away, become /proc/cluster/mosix
> 
> and the same with whatever bproc puts into /proc; that stuff would move to
> /proc/cluster/bproc
> 
> 
> Or, the status quo will endure, with cluster hackers playing catch-up.

On Tue, 27 Feb 2001, Alexander Viro wrote:

|
|#include <std_rants/Thou_Shalt_Not_Shite_Into_Procfs>
|
|Guys, if you want a large subtree in /proc - whack yourself over the head
|until you realize that you want an fs of your own. I'll be more than
|happy to help with both parts.

Rik van Riel said:

> I know each of the cluster projects have mailing lists, but
> I've never heard of a list where the different projects come
> together to eg. find out which parts of the infrastructure
> they could share, or ...
> 
> Since I agree with you that we need such a place, I've just
> created a mailing list:
> 
>         linux-cluster@nl.linux.org
> 
> To subscribe to the list, send an email with the text
> "subscribe linux-cluster" to:
> 
>         majordomo@nl.linux.org
> 
> 
> I hope that we'll be able to split out some infrastructure
> stuff from the different cluster projects and we'll be able
> to put cluster support into the kernel in such a way that
> we won't have to make the choice which of the N+1 cluster
> projects should make it into the kernel...



-- 
Zack Brown








Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/

From owner-linux-cluster@nl.linux.org Tue Feb 27 23:30:41 2001
Received: by humbolt.nl.linux.org id <S92331AbRB0Wae>;
	Tue, 27 Feb 2001 23:30:34 +0100
Received: from dsl081-067-005-sfo1.dsl-isp.net ([64.81.67.5]:20751 "EHLO
        renegade") by humbolt.nl.linux.org with ESMTP id <S92338AbRB0W33>;
	Tue, 27 Feb 2001 23:29:29 +0100
Received: from zbrown by renegade with local (Exim 3.12 #1 (Debian))
	id 14XscW-0004DG-00; Tue, 27 Feb 2001 14:29:24 -0800
Date:   Tue, 27 Feb 2001 14:29:23 -0800
From:   zbrown@tumblerings.org
To:     linux-cluster@nl.linux.org
Cc:     "David L. Nicol" <david@kasey.umkc.edu>, riel@conectiva.com.br,
        viro@math.psu.edu,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        mosix-list@cs.huji.ac.il
Subject: Re: Will Mosix go into the standard kernel?
Message-ID: <20010227142923.A17115@renegade>
References: <3A9C1A3A.8BC1BCF2@kasey.umkc.edu> <Pine.LNX.3.96.1010227134555.780R-100000@renegade>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.LNX.3.96.1010227134555.780R-100000@renegade>; from zbrown@tumblerings.org on Tue, Feb 27, 2001 at 01:56:25PM -0800
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

argh. OK

-- 
Zack Brown


From owner-linux-cluster@nl.linux.org Tue Feb 27 23:35:17 2001
Received: by humbolt.nl.linux.org id <S92269AbRB0WfI>;
	Tue, 27 Feb 2001 23:35:08 +0100
Received: from jalon.able.es ([212.97.163.2]:33984 "EHLO jalon.able.es")
	by humbolt.nl.linux.org with ESMTP id <S92338AbRB0Weo>;
	Tue, 27 Feb 2001 23:34:44 +0100
Received: from correo.able.es ([212.97.169.185]) by
          jalon.able.es (Netscape Messaging Server 4.15) with SMTP id
          G9FTDX00.D2G; Tue, 27 Feb 2001 23:34:45 +0100 
Date:   Tue, 27 Feb 2001 23:33:56 +0100
From:   "J . A . Magallon" <jamagallon@able.es>
To:     Zack Brown <zbrown@tumblerings.org>
Cc:     "David L . Nicol" <david@kasey.umkc.edu>,
        linux-cluster@nl.linux.org, riel@conectiva.com.br,
        viro@math.psu.edu,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: Will Mosix go into the standard kernel?
Message-ID: <20010227233356.A1400@werewolf.able.es>
References: <3A9C1A3A.8BC1BCF2@kasey.umkc.edu> <Pine.LNX.3.96.1010227134555.780R-100000@renegade>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
In-Reply-To: <Pine.LNX.3.96.1010227134555.780R-100000@renegade>; from zbrown@tumblerings.org on Tue, Feb 27, 2001 at 22:56:25 +0100
X-Mailer: Balsa 1.1.1
Content-Length: 451
Lines:  13
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list


On 02.27 Zack Brown wrote:
> Do the Mosix folks have anything to add about possible integration into the
> kernel? (should have cced them earlier, but it slipped my mind)
> 

And also beowulf people, beowulf@beowulf.org.

-- 
J.A. Magallon                                                      $> cd pub
mailto:jamagallon@able.es                                          $> more beer

Linux werewolf 2.4.2-ac5 #1 SMP Tue Feb 27 01:09:47 CET 2001 i686



From owner-linux-cluster@nl.linux.org Wed Feb 28 00:34:45 2001
Received: by humbolt.nl.linux.org id <S92338AbRB0XeZ>;
	Wed, 28 Feb 2001 00:34:25 +0100
Received: from imladris.infradead.org ([194.205.184.45]:43025 "EHLO
        infradead.org") by humbolt.nl.linux.org with ESMTP
	id <S92313AbRB0XeC>; Wed, 28 Feb 2001 00:34:02 +0100
Received: from jalon.able.es ([212.97.163.2])
	by infradead.org with esmtp (Exim 3.20 #2)
	id 14Xtd0-0005PJ-00
	for linux-cluster@nl.linux.org; Tue, 27 Feb 2001 23:33:59 +0000
Received: from correo.able.es ([212.97.169.185]) by
          jalon.able.es (Netscape Messaging Server 4.15) with SMTP id
          G9FW5200.F0L for <linux-cluster@nl.linux.org>; Wed, 28 Feb 2001
          00:34:14 +0100 
Date:   Wed, 28 Feb 2001 00:33:25 +0100
From:   "J . A . Magallon" <jamagallon@able.es>
To:     Linux Cluster <linux-cluster@nl.linux.org>
Subject: linux cluster (was: Will Mosix go into the standard kernel? )
Message-ID: <20010228003325.E1400@werewolf.able.es>
References: <3A9C1A3A.8BC1BCF2@kasey.umkc.edu> <Pine.GSO.4.21.0102271630300.4105-100000@weyl.math.psu.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
In-Reply-To: <Pine.GSO.4.21.0102271630300.4105-100000@weyl.math.psu.edu>; from viro@math.psu.edu on Tue, Feb 27, 2001 at 22:37:06 +0100
X-Mailer: Balsa 1.1.1
Content-Length: 383
Lines:  10
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

Perhaps it is time to move the thread to linux-cluster ?

I want to talk about node naming...
I would vote for a new /cluster fs, just cloned from /proc.

-- 
J.A. Magallon                                                      $> cd pub
mailto:jamagallon@able.es                                          $> more beer

Linux werewolf 2.4.2-ac5 #1 SMP Tue Feb 27 01:09:47 CET 2001 i686



From owner-linux-cluster@nl.linux.org Wed Feb 28 16:40:46 2001
Received: by humbolt.nl.linux.org id <S92364AbRB1PkU>;
	Wed, 28 Feb 2001 16:40:20 +0100
Received: from brutus.conectiva.com.br ([200.250.58.146]:1523 "EHLO
        brutus.conectiva.com.br") by humbolt.nl.linux.org with ESMTP
	id <S92359AbRB1Pjz>; Wed, 28 Feb 2001 16:39:55 +0100
Received: from localhost (riel@localhost)
	by brutus.conectiva.com.br (8.11.2/8.11.2) with ESMTP id f1SFelP22617;
	Wed, 28 Feb 2001 12:40:47 -0300
X-Authentication-Warning: duckman.distro.conectiva: riel owned process doing -bs
Date:   Wed, 28 Feb 2001 12:40:44 -0300 (BRST)
From:   Rik van Riel <riel@conectiva.com.br>
X-X-Sender:  <riel@duckman.distro.conectiva>
To:     <linux-cluster@nl.linux.org>
cc:     <linux-kernel@vger.kernel.org>, <lwn@lwn.net>
Subject: [ANNOUNCE] linux-cluster list
Message-ID: <Pine.LNX.4.33.0102281238300.5502-100000@duckman.distro.conectiva>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

On special request, this message is re-sent with [ANNOUNCE] in
the subject and the non-announce parts removed.  ;)

Feel free to pass this on to whomever you think might be interested.
----
	[on general clustering stuff]
On Tue, 27 Feb 2001, David L. Nicol wrote:
> Is there a good list to discuss this on?  Is this the list?
> Which pieces of clustering-scheme patches would be good to have?

I know each of the cluster projects have mailing lists, but
I've never heard of a list where the different projects come
together to eg. find out which parts of the infrastructure
they could share, or ...

Since I agree with you that we need such a place, I've just
created a mailing list:

	linux-cluster@nl.linux.org

To subscribe to the list, send an email with the text
"subscribe linux-cluster" to:

	majordomo@nl.linux.org


I hope that we'll be able to split out some infrastructure
stuff from the different cluster projects and we'll be able
to put cluster support into the kernel in such a way that
we won't have to make the choice which of the N+1 cluster
projects should make it into the kernel...

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/




From owner-linux-cluster@nl.linux.org Wed Feb 28 18:18:40 2001
Received: by humbolt.nl.linux.org id <S92359AbRB1RSN>;
	Wed, 28 Feb 2001 18:18:13 +0100
Received: from saturn.cs.uml.edu ([129.63.8.2]:22277 "EHLO saturn.cs.uml.edu")
	by humbolt.nl.linux.org with ESMTP id <S92352AbRB1RRt>;
	Wed, 28 Feb 2001 18:17:49 +0100
Received: (from acahalan@localhost)
	by saturn.cs.uml.edu (8.11.0/8.11.2) id f1SHHYe54669;
	Wed, 28 Feb 2001 12:17:34 -0500 (EST)
Date:   Wed, 28 Feb 2001 12:17:34 -0500 (EST)
Message-Id: <200102281717.f1SHHYe54669@saturn.cs.uml.edu>
From:   "Albert D. Cahalan" <acahalan@cs.uml.edu>
To:     linux-cluster@nl.linux.org
Subject: cluster list
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list


OK, hello, anybody alive?

What type of clusters do people care to discuss, if any?

Ethernet-based
huge SMP system treated as a cluster
NUMA system, with multiple kernels and single-system image
systems that DMA between nodes (dest. addr. specified by which side?)
IP, raw, or other transport?

Might one abuse the huge-memory and IO MMU support for
a global memory space?
#define MAKE_PAGE_NUM(node,addr) ( ((node)<<20) | ((addr)>>12) )
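
A minimal sketch of how the MAKE_PAGE_NUM idea could round-trip, assuming
4 KiB pages (the >>12) and 32-bit local addresses, so that a local page
frame number fits in the low 20 bits and the node number sits above it.
The helper names and the 64-bit types are illustrative assumptions and
not part of the macro above:

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions (not stated in the post): 4 KiB pages and 32-bit local
 * addresses, so a local page frame number fits in 20 bits. */
#define PAGE_SHIFT      12
#define LOCAL_PFN_BITS  20
#define LOCAL_PFN_MASK  ((UINT64_C(1) << LOCAL_PFN_BITS) - 1)

/* Same bit layout as MAKE_PAGE_NUM, with explicit 64-bit arithmetic. */
static inline uint64_t make_page_num(uint64_t node, uint64_t addr)
{
    uint64_t pfn = addr >> PAGE_SHIFT;

    /* A wider address would spill into the node bits. */
    assert(pfn <= LOCAL_PFN_MASK);
    return (node << LOCAL_PFN_BITS) | pfn;
}

/* Recover the node number from a global page number. */
static inline uint64_t page_num_to_node(uint64_t page_num)
{
    return page_num >> LOCAL_PFN_BITS;
}

/* Recover the page-aligned local address from a global page number. */
static inline uint64_t page_num_to_local_addr(uint64_t page_num)
{
    return (page_num & LOCAL_PFN_MASK) << PAGE_SHIFT;
}
```

With this layout, make_page_num(3, 0x12345678) yields (3 << 20) | 0x12345,
and decoding it gives node 3 and local address 0x12345000.  The offset
within the page is dropped by design.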



From owner-linux-cluster@nl.linux.org Wed Feb 28 19:05:09 2001
Received: by humbolt.nl.linux.org id <S92360AbRB1SE7>;
	Wed, 28 Feb 2001 19:04:59 +0100
Received: from emcmail.lss.emc.com ([168.159.48.78]:38063 "EHLO emc.com")
	by humbolt.nl.linux.org with ESMTP id <S92356AbRB1SEi>;
	Wed, 28 Feb 2001 19:04:38 +0100
Received: from emc.com (lub1012.lss.emc.com [168.159.39.12])
	by emc.com (8.10.1/8.10.1) with ESMTP id f1SHxdF05380;
	Wed, 28 Feb 2001 12:59:39 -0500 (EST)
Message-ID: <3A9D3C9E.6090807@emc.com>
Date:   Wed, 28 Feb 2001 12:59:58 -0500
From:   Ric Wheeler <ric@emc.com>
Reply-To: ric@emc.com
User-Agent: Mozilla/5.0 (X11; U; Linux 2.2.10 i686; en-US; m18) Gecko/20010131 Netscape6/6.01
X-Accept-Language: en
MIME-Version: 1.0
To:     "David L. Nicol" <david@kasey.umkc.edu>
CC:     Zack Brown <zbrown@tumblerings.org>,
        Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
        linux-cluster@nl.linux.org
Subject: Re: Will Mosix go into the standard kernel?
References: <Pine.LNX.3.96.1010227091255.780M-100000@renegade> <3A9C1A3A.8BC1BCF2@kasey.umkc.edu>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

There are two parts of MOSIX that deal with file systems.
In MOSIX, every migrated process leaves a proxy at its creation (home)
node that services all system call requests, including IO calls.

What newer versions of MOSIX did is to add the "DFSA" (direct file
system access) layer that allows MOSIX to support executing file
system calls locally for migrated process when they are against a
cache coherent, cluster file system (think GFS).  When this was put
in MOSIX, they also did a write through, non-caching file system to
test their DFSA code called MFS.
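
The routing decision described above might be sketched roughly as follows.
This is only an illustration of the idea (home-node proxy by default,
local execution when the file lives on a cache coherent cluster file
system); every identifier here is hypothetical and none of them is taken
from the MOSIX sources:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration; not MOSIX identifiers. */
enum exec_site {
    EXEC_LOCAL,       /* run the file system call on the current node */
    EXEC_HOME_PROXY   /* forward the call to the proxy on the home node */
};

struct proc_state {
    bool migrated;    /* true if the process runs away from its home node */
};

struct file_state {
    bool on_coherent_cluster_fs;  /* e.g. a GFS-like shared file system */
};

/* Decide where a file system call for process p on file f should run. */
static enum exec_site route_file_syscall(const struct proc_state *p,
                                         const struct file_state *f)
{
    assert(p != NULL && f != NULL);

    if (!p->migrated)
        return EXEC_LOCAL;        /* still at home: ordinary local call */
    if (f->on_coherent_cluster_fs)
        return EXEC_LOCAL;        /* DFSA-style direct access */
    return EXEC_HOME_PROXY;       /* fall back to the home-node proxy */
}
```

The point of the sketch is that only the last case pays for a network
round trip to the home node on every IO call.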

Both the MOSIX team and the global file system group have been involved
in getting their stuff to play nicely together.

ric






From owner-linux-cluster@nl.linux.org Wed Feb 28 20:00:36 2001
Received: by humbolt.nl.linux.org id <S92347AbRB1TAN>;
	Wed, 28 Feb 2001 20:00:13 +0100
Received: from hcs.ufl.edu ([128.227.92.136]:19863 "EHLO eagle.hcs.ufl.edu")
	by humbolt.nl.linux.org with ESMTP id <S92348AbRB1S7i>;
	Wed, 28 Feb 2001 19:59:38 +0100
Received: (from chideste@localhost)
	by eagle.hcs.ufl.edu (8.10.0/8.9.3) id f1SIxYJ25884
	for linux-cluster@nl.linux.org; Wed, 28 Feb 2001 13:59:34 -0500 (EST)
From:   Matthew Chidester <chideste@hcs.ufl.edu>
Message-Id: <200102281859.f1SIxYJ25884@eagle.hcs.ufl.edu>
Subject: Re: cluster list
To:     linux-cluster@nl.linux.org
Date:   Wed, 28 Feb 2001 13:59:34 -0500 (EST)
X-Mailer: ELM [version 2.5 PL0]
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

> What type of clusters do people care to discuss, if any?

I, for one, am interested mainly in NUMA and/or message-passing systems
with full ethernet connectivity for administration and a HPN for application
data (i.e. Myrinet, cLAN, SCI, etc).

The cluster I operate is described here:

http://www.hcs.ufl.edu/carrier/index.html

Note some of our "interesting" problems:

1) support for just about every HPN type out there (though not simultaneous)
2) heterogeneity (some SMP nodes, some uniprocessor; different clock speeds)
3) mixture of OS (ugh!)

Problem 1) puts me in a position to be able to test all kinds of fun stuff,
but often means I'm maintaining three or four different kernel versions
across the cluster since SCI only works with such-and-such a kernel, Myrinet
with a different version, etc.  

Problem 2) is probably the most interesting one from a load-balancing
standpoint.  Most clusters (esp. shoestring-budget clusters) probably will
"grow" over time, meaning not all processors will be of the same capability. 

...matt





From owner-linux-cluster@nl.linux.org Wed Feb 28 21:12:52 2001
Received: by humbolt.nl.linux.org id <S92342AbRB1UMe>;
	Wed, 28 Feb 2001 21:12:34 +0100
Received: from imcs.rutgers.edu ([165.230.57.130]:4317 "EHLO imcs.Rutgers.EDU")
	by humbolt.nl.linux.org with ESMTP id <S92345AbRB1UMG>;
	Wed, 28 Feb 2001 21:12:06 +0100
Received: from localhost (cermak@localhost)
	by imcs.Rutgers.EDU (8.9.3/8.9.3) with ESMTP id PAA09047
	for <linux-cluster@nl.linux.org>; Wed, 28 Feb 2001 15:01:18 -0500 (EST)
Date:   Wed, 28 Feb 2001 15:01:17 -0500 (EST)
From:   Rob Cermak <cermak@IMCS.rutgers.edu>
To:     linux-cluster@nl.linux.org
Subject: Re: cluster list
In-Reply-To: <200102281717.f1SHHYe54669@saturn.cs.uml.edu>
Message-ID: <Pine.SOL.4.21.0102281332180.28777-100000@imcs.rutgers.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

> OK, hello, anybody alive?

> What type of clusters do people care to discuss, if any?
 
> Ethernet-based
> huge SMP system treated as a cluster
> NUMA system, with multiple kernels and single-system image
> systems that DMA between nodes (dest. addr. specified by which side?)
> IP, raw, or other transport?
> 
> Might one abuse the huge-memory and IO MMU support for
> a global memory space?
> #define MAKE_PAGE_NUM(node,addr) ( ((node)<<20) | ((addr)>>12) )

There are a few kernel side: Mosix(GFS), Beowulf(Scyld)
There are a few user   side: Condor, Scali, GNQS

There is the http://www.beowulf.org/ Beowulf Project...

Others?

The question posed by the kernel group:
  Is there a way to add to the existing monolithic kernel to
  satisfy the needs of these groups?  Common API to handle
  process, memory, network sharing in cluster arrangements.

It would be nice if there was a combination of kernel modules
and user-space tools not requiring a whole hip replacement. 

Looks like Mosix can be a kernel patch and recompile with a separate
CONFIG option.

That last poster had a good point, what about clusters that have different
CPU speeds?  Handled by the job submission process, if you need X CPU @
300 MHz, then queue/wait for them to free up.   The scheduler will have to
be smart enough to keep other processors from using the CPUs to keep them
free ready for this job.
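
As a rough sketch of the "X CPUs at 300 MHz, otherwise queue" idea, a
job submission step could reserve nodes all-or-nothing.  The struct
layout and function name are assumptions made up for this example and do
not come from any of the queue systems mentioned on the list:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node description for this example only. */
struct node {
    int  id;
    int  cpu_mhz;
    bool busy;
};

/* Try to reserve `want` idle nodes running at `min_mhz` or faster.
 * On success the chosen nodes are marked busy, their indices are stored
 * in picked[0..want-1], and true is returned.  On failure nothing is
 * changed and false is returned, so the caller can queue the job. */
static bool reserve_nodes(struct node *nodes, size_t n, int min_mhz,
                          size_t want, size_t *picked)
{
    size_t found = 0;

    assert(nodes != NULL && picked != NULL);

    /* First pass: collect candidates without modifying any state. */
    for (size_t i = 0; i < n && found < want; i++) {
        if (!nodes[i].busy && nodes[i].cpu_mhz >= min_mhz)
            picked[found++] = i;
    }
    if (found < want)
        return false;

    /* Second pass: commit the reservation. */
    for (size_t k = 0; k < found; k++)
        nodes[picked[k]].busy = true;
    return true;
}
```

Marking the nodes busy only after enough candidates are found keeps a
failed request from holding on to a partial set of CPUs.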

Where to begin?
Rob




From owner-linux-cluster@nl.linux.org Wed Feb 28 22:40:13 2001
Received: by humbolt.nl.linux.org id <S92356AbRB1Vjg>;
	Wed, 28 Feb 2001 22:39:36 +0100
Received: from gw.xkey.com ([206.86.100.52]:53008 "EHLO happy.xkey.com")
	by humbolt.nl.linux.org with ESMTP id <S92365AbRB1VjQ>;
	Wed, 28 Feb 2001 22:39:16 +0100
Received: (from smtp@localhost) by happy.xkey.com
	id NAA29818 for <linux-cluster@nl.linux.org>; Wed, 28 Feb 2001 13:39:13 -0800
Received: from hpti8.fsl.noaa.gov(137.75.132.228) by happy.xkey.com via smtp (V1.3)
	id sma029811; Wed Feb 28 16:39:04 2001
Received: (from lindahl@localhost)
	by localhost.hpti.com (8.11.0/8.11.0) id f1SLdQj01911
	for linux-cluster@nl.linux.org; Wed, 28 Feb 2001 16:39:26 -0500
X-Authentication-Warning: localhost.hpti.com: lindahl set sender to lindahl@conservativecomputer.com using -f
Date:   Wed, 28 Feb 2001 16:39:25 -0500
From:   Greg Lindahl <lindahl@conservativecomputer.com>
To:     linux-cluster@nl.linux.org
Subject: Re: cluster list
Message-ID: <20010228163925.A1908@wumpus>
Mail-Followup-To: linux-cluster@nl.linux.org
References: <200102281717.f1SHHYe54669@saturn.cs.uml.edu> <Pine.SOL.4.21.0102281332180.28777-100000@imcs.rutgers.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.SOL.4.21.0102281332180.28777-100000@imcs.rutgers.edu>; from cermak@IMCS.rutgers.edu on Wed, Feb 28, 2001 at 03:01:17PM -0500
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

> What type of clusters do people care to discuss, if any?

The clusters I build for HPC are totally at user-level, and don't
really require any kernel changes except good fast networking and
maybe page coloring. Clusters are a quite diverse topic.

> That last poster had a good point, what about clusters that have different
> CPU speeds?  Handled by the job submission process, if you need X CPU @
> 300 MHz, then queue/wait for them to free up.   The scheduler will have to
> be smart enough to keep other processors from using the CPUs to keep them
> free ready for this job.

A scheduler for parallel jobs is usually at user level, and common
queue systems such as PBS have the capability to deal with multiple
speeds. But if you're doing something like MOSIX and job migration I
don't know where the scheduler sits.

Maybe it would be good to start with a list of cluster systems which
either patch the kernel or are highly dependent on it. One example not
mentioned yet is Condor.

-- g


From owner-linux-cluster@nl.linux.org Wed Feb 28 22:40:54 2001
Received: by humbolt.nl.linux.org id <S92358AbRB1Vkh>;
	Wed, 28 Feb 2001 22:40:37 +0100
Received: from imladris.infradead.org ([194.205.184.45]:12044 "EHLO
        infradead.org") by humbolt.nl.linux.org with ESMTP
	id <S92355AbRB1Vjz>; Wed, 28 Feb 2001 22:39:55 +0100
Received: from brutus.conectiva.com.br ([200.250.58.146])
	by infradead.org with esmtp (Exim 3.20 #2)
	id 14YEK9-0002xm-00
	for linux-cluster@nl.linux.org; Wed, 28 Feb 2001 21:39:54 +0000
Received: from localhost (riel@localhost)
	by brutus.conectiva.com.br (8.11.2/8.11.2) with ESMTP id f1S3bf502037
	for <linux-cluster@nl.linux.org>; Wed, 28 Feb 2001 00:37:41 -0300
X-Authentication-Warning: duckman.distro.conectiva: riel owned process doing -bs
Date:   Wed, 28 Feb 2001 00:37:41 -0300 (BRST)
From:   Rik van Riel <riel@conectiva.com.br>
X-X-Sender:  <riel@duckman.distro.conectiva>
To:     <linux-cluster@nl.linux.org>
Subject: inventory
Message-ID: <Pine.LNX.4.33.0102280036250.1961-100000@duckman.distro.conectiva>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

Hi,

now that a sizable group of people (around 50?) has gathered on
the list, maybe we could start making an inventory of which
cluster services the various projects have and which they are
lacking...

I'm pretty sure we can remove some items from the TODO lists
after seeing which of each other's already written components
are usable ;)

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/



From owner-linux-cluster@nl.linux.org Wed Feb 28 22:55:53 2001
Received: by humbolt.nl.linux.org id <S92345AbRB1Vz0>;
	Wed, 28 Feb 2001 22:55:26 +0100
Received: from brutus.conectiva.com.br ([200.250.58.146]:20473 "EHLO
        brutus.conectiva.com.br") by humbolt.nl.linux.org with ESMTP
	id <S92360AbRB1VzA>; Wed, 28 Feb 2001 22:55:00 +0100
Received: from localhost (riel@localhost)
	by brutus.conectiva.com.br (8.11.2/8.11.2) with ESMTP id f1S3stU03414;
	Wed, 28 Feb 2001 00:54:55 -0300
X-Authentication-Warning: duckman.distro.conectiva: riel owned process doing -bs
Date:   Wed, 28 Feb 2001 00:54:55 -0300 (BRST)
From:   Rik van Riel <riel@conectiva.com.br>
X-X-Sender:  <riel@duckman.distro.conectiva>
To:     Greg Lindahl <lindahl@conservativecomputer.com>
cc:     <linux-cluster@nl.linux.org>
Subject: Re: cluster list
In-Reply-To: <20010228163925.A1908@wumpus>
Message-ID: <Pine.LNX.4.33.0102280044541.1961-100000@duckman.distro.conectiva>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

On Wed, 28 Feb 2001, Greg Lindahl wrote:

> > What type of clusters do people care to discuss, if any?
>
> The clusters I build for HPC are totally at user-level, and don't
> really require any kernel changes except good fast networking and
> maybe page coloring. Clusters are a quite diverse topic.

I don't intend this list to be limited to kernel level
things, on the contrary...

The more things we can do cleanly in userland, the more
the kernel will stay "small" and maintainable ;)

regards,

Rik
--
Linux MM bugzilla: http://linux-mm.org/bugzilla.shtml

Virtual memory is like a game you can't win;
However, without VM there's truly nothing to lose...

		http://www.surriel.com/
http://www.conectiva.com/	http://distro.conectiva.com/



From owner-linux-cluster@nl.linux.org Wed Feb 28 23:05:40 2001
Received: by humbolt.nl.linux.org id <S92351AbRB1WFP>;
	Wed, 28 Feb 2001 23:05:15 +0100
Received: from hilbert.umkc.edu ([134.193.4.60]:60946 "HELO tesla.umkc.edu")
	by humbolt.nl.linux.org with SMTP id <S92352AbRB1WEs>;
	Wed, 28 Feb 2001 23:04:48 +0100
Received: (qmail 452532 invoked from network); 28 Feb 2001 22:03:55 -0000
Received: from nicol1.umkc.edu (HELO kasey.umkc.edu) (david@134.193.4.62)
  by hilbert.umkc.edu with SMTP; 28 Feb 2001 22:03:55 -0000
Message-ID: <3A9D75CA.275E9F8B@kasey.umkc.edu>
Date:   Wed, 28 Feb 2001 16:03:54 -0600
From:   "David L. Nicol" <david@kasey.umkc.edu>
Organization: University of Missouri - Kansas City   supercomputing infrastructure
X-Mailer: Mozilla 4.76 [en] (X11; U; Linux 2.4.0 i586)
X-Accept-Language: en
MIME-Version: 1.0
To:     Rik van Riel <riel@conectiva.com.br>
CC:     linux-cluster@nl.linux.org
Subject: Re: inventory
References: <Pine.LNX.4.33.0102280036250.1961-100000@duckman.distro.conectiva>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

Rik van Riel wrote:
> 
> Hi,
> 
> now that a sizable group of people (around 50?) has gathered on
> the list, maybe we could start making an inventory of which
> cluster services the various projects have and which they are
> lacking...

Or at least defining what is and is not a "cluster service."
For instance, PVM and rsh, as beautiful and useful as they
are, do not require any blurring of the line between what is running
in which box.

So although I'm sure we'd all like to converge on a standard rsh delegation
shell (rshsh) that wisely chooses a peer for running the next instruction
in its command stream, rshsh would be a user mode tool which would only
require a shared file system and configured rshd.  The internals of rshsh
would not be on topic here, although we would appreciate the announcement
and name under which we can find it on freshmeat.

> I'm pretty sure we can remove some items from the TODO lists
> after seeing which of each other's already written components
> are usable ;)

It's stunning the level of not-made-here-itis that these projects
can accumulate.


Let's focus on one thing that we know all clustering schemes have, and
see if we can standardize it, then go from there.  My nominee for a
suitable case for this treatment remains node numbering.

The consensus appears to be that there should be a standard file that can
be read to determine one's node number, or written to change it.
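
A minimal sketch of that node-number file, assuming it holds a single
decimal integer.  The file name and the helper names read_node_number and
write_node_number are hypothetical; no location has been agreed on, so the
demo uses a scratch file.

```python
import os
import tempfile

def read_node_number(path):
    """Return the node number stored in `path` as an integer."""
    with open(path) as f:
        return int(f.read().strip())

def write_node_number(path, number):
    """Store a new node number in `path` as one decimal line."""
    if number < 0:
        raise ValueError("node number must be non-negative")
    with open(path, "w") as f:
        f.write(f"{number}\n")

# Demonstrate on a scratch file; the real location is still undecided.
demo_path = os.path.join(tempfile.mkdtemp(), "node_number")
write_node_number(demo_path, 3)
print(read_node_number(demo_path))
```

Any cluster package could then share the same two operations instead of
keeping its own private setting.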

Mosix "clutters up" /proc with all of its controls and displays, and
it has been suggested that it is preferable to define a New File System
for a clustering architecture's controls and mount it somewhere rather
than doing this.  I like this approach: not only does it reduce
the amount of patching (new clusterfs instead of altered procfs), it also
trivially allows participation in multiple clusters by mounting multiple
clusterfses at multiple places.

Are all in agreement with the above notes and ideas?

-- 
                      David Nicol 816.235.1187 dnicol@cstp.umkc.edu
                           Damn! Someone stole my book on security!



From owner-linux-cluster@nl.linux.org Wed Feb 28 23:25:32 2001
Received: by humbolt.nl.linux.org id <S92345AbRB1WZR>;
	Wed, 28 Feb 2001 23:25:17 +0100
Received: from web9203.mail.yahoo.com ([216.136.129.26]:61444 "HELO
        web9203.mail.yahoo.com") by humbolt.nl.linux.org with SMTP
	id <S92351AbRB1WYy>; Wed, 28 Feb 2001 23:24:54 +0100
Message-ID: <20010228222441.81799.qmail@web9203.mail.yahoo.com>
Received: from [192.148.11.96] by web9203.mail.yahoo.com; Wed, 28 Feb 2001 14:24:40 PST
Date:   Wed, 28 Feb 2001 14:24:40 -0800 (PST)
From:   Peter Badovinatz <tabmowzo@yahoo.com>
Subject: High Availability/Failover clusters
To:     Linux Cluster <linux-cluster@nl.linux.org>
Cc:     Alan Robertson <alanr@unix.sh>, Ian D Romanick <idr@cs.pdx.edu>,
        Tim Wright <timw@splhi.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

Just found out about this list.  There has been much discussion on the Linux-HA
mailing list (http://linux-ha.org/) about "cluster" issues oriented to support
for "High Availability":
- resource management, e.g., database engines, IP addresses, disks, file
systems
- failover, e.g., of disk control, applications, databases, etc.
- IP takeover
- monitoring, e.g., heartbeating, weak/strong membership services
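
As a rough illustration of heartbeat-based monitoring, here is a sketch
of a timeout-driven membership view.  The class name HeartbeatMonitor, the
node names, and the 5-second timeout are all assumptions made for the
example; this is not how any particular Linux-HA component is implemented.

```python
class HeartbeatMonitor:
    """Track the last heartbeat seen from each node."""

    def __init__(self, timeout):
        self.timeout = timeout   # seconds of silence before a node is dropped
        self.last_seen = {}      # node name -> time of its last heartbeat

    def beat(self, node, now):
        """Record a heartbeat from `node` at time `now`."""
        self.last_seen[node] = now

    def members(self, now):
        """Return the nodes heard from within the timeout, sorted by name."""
        return sorted(n for n, t in self.last_seen.items()
                      if now - t <= self.timeout)

monitor = HeartbeatMonitor(timeout=5.0)
monitor.beat("node1", now=0.0)
monitor.beat("node2", now=0.0)
monitor.beat("node1", now=4.0)
print(monitor.members(now=8.0))   # node2 has been silent for 8 seconds
```

Each node only has a local view here; a strong membership service would
additionally make the nodes agree on that view.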

As a general rule, we've viewed this support as being mostly embodied in
user-space daemons, with limited amounts of in-kernel code, and only rare
actual kernel changes (well, actual kernel support that we desire):
- scsi reserve/release, multi-tailed I/O, etc.
- IP aliasing/MAC address changes
- softdog (timer-based watchdog capability)
- soft-real-time scheduling for HA daemons, since these control everything and
when they need to run, they NEED to run!

I am not saying that various of the HA components can't be in the kernel, they
can be (e.g., our distributed lock manager project
http://oss.software.ibm.com/developerworks/projects/dlm) but if so they are
often loadable modules and don't usually require tight tie-ins to the kernel. 
This is often the direction such work has taken in commercial HA clusters, and
it also keeps it relatively independent of the kernel version.

One effort that has so far happened piecemeal, but will receive more focus in
the Linux-HA community in the near future, is defining 'componentry' that
provides layered and granular services useful to all aspects of what we think of
as an HA cluster, so that you can mix and match components and use
only the level of service you require.  If you look at http://linux-ha.org/
you'll see some of this, thanks largely to Alan Robertson, but also to many other
contributors.  We hope to gain much more momentum on this over the next few
months.

We generally view "HA clusters" as relatively tightly integrated, usually - but
not always - with shared disks, and requiring strictly controlled access to the
resources.  For example, a failover database server, where uncontrolled disk
access means data corruption.  Bad.  We also view them as being relatively
small, with numbers of nodes in a cluster being single digits up to 16 or 32,
not 100s of nodes.

Ah, a GFS+Mosix+Database cluster, requiring IP failover, start/stop/monitor of
the database and other applications, and coordination of all of the above, would be
a valid, and very interesting, cluster.  It has different control aspects
than what we usually see on HA clusters.


=====
These have been the opinions of:
Peter R. Badovinatz -- (503)578-5530 (TL 775)
wombat@us.ibm.com/tabmowzo@yahoo.com
and in no way should be construed as official opinion of 
IBM, Corp.



From owner-linux-cluster@nl.linux.org Wed Feb 28 23:30:47 2001
Received: by humbolt.nl.linux.org id <S92345AbRB1Wai>;
	Wed, 28 Feb 2001 23:30:38 +0100
Received: from gw.xkey.com ([206.86.100.52]:31507 "EHLO happy.xkey.com")
	by humbolt.nl.linux.org with ESMTP id <S92350AbRB1WaU>;
	Wed, 28 Feb 2001 23:30:20 +0100
Received: (from smtp@localhost) by happy.xkey.com
	id OAA32671 for <linux-cluster@nl.linux.org>; Wed, 28 Feb 2001 14:30:17 -0800
Received: from hpti8.fsl.noaa.gov(137.75.132.228) by happy.xkey.com via smtp (V1.3)
	id sma032666; Wed Feb 28 17:30:12 2001
Received: (from lindahl@localhost)
	by localhost.hpti.com (8.11.0/8.11.0) id f1SMUSM02023
	for linux-cluster@nl.linux.org; Wed, 28 Feb 2001 17:30:28 -0500
X-Authentication-Warning: localhost.hpti.com: lindahl set sender to lindahl@conservativecomputer.com using -f
Date:   Wed, 28 Feb 2001 17:30:28 -0500
From:   Greg Lindahl <lindahl@conservativecomputer.com>
To:     linux-cluster@nl.linux.org
Subject: Re: inventory
Message-ID: <20010228173028.B1908@wumpus>
Mail-Followup-To: linux-cluster@nl.linux.org
References: <Pine.LNX.4.33.0102280036250.1961-100000@duckman.distro.conectiva> <3A9D75CA.275E9F8B@kasey.umkc.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <3A9D75CA.275E9F8B@kasey.umkc.edu>; from david@kasey.umkc.edu on Wed, Feb 28, 2001 at 04:03:54PM -0600
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

> So although I'm sure we'd all like to converge on a standard rsh delegation
> shell (rshsh) that wisely chooses a peer for running the next instruction
> in its command stream, rshsh would be a user mode tool which would only
> require a shared file system and configured rshd.  The internals of rshsh
> would not be on topic here, although we would appreciate the announcement
> and name under which we can find it on freshmeat.

It's not that simple. rshsh has to interface to your queue system, if
you have one, and you might want to wisely choose a peer without tying
yourself into a particular transport. So we should define a service or
API for finding out the right peer, and then rshsh and other tools can
use that API.
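
A rough sketch of that separation, assuming the load figures come from
whatever queue system or monitor is in use.  The names choose_peer and
build_command, and the load values, are hypothetical:

```python
def choose_peer(loads):
    """Return the least-loaded node name, or None if no nodes are known.

    `loads` maps node name -> load figure from any source.
    Ties go to the name that sorts first, so the result is deterministic.
    """
    if not loads:
        return None
    return min(sorted(loads), key=lambda node: loads[node])

def build_command(transport, peer, command):
    """Combine a transport (rsh, ssh, ...) with the chosen peer and command."""
    return [transport, peer] + list(command)

loads = {"node1": 2.0, "node2": 0.5, "node3": 0.5}
peer = choose_peer(loads)
print(build_command("rsh", peer, ["uname", "-a"]))
```

The selection function knows nothing about the transport, so the same
answer can feed rsh, ssh or a batch queue.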

Most clustering issues are similarly muddled.

And no, other people in other OSes haven't solved these resource
issues very nicely, but it would be worth looking at what they've
done.

-- greg


From owner-linux-cluster@nl.linux.org Wed Feb 28 23:37:08 2001
Received: by humbolt.nl.linux.org id <S92345AbRB1WhA>;
	Wed, 28 Feb 2001 23:37:00 +0100
Received: from web9210.mail.yahoo.com ([216.136.129.43]:56077 "HELO
        web9210.mail.yahoo.com") by humbolt.nl.linux.org with SMTP
	id <S92350AbRB1Wgq>; Wed, 28 Feb 2001 23:36:46 +0100
Message-ID: <20010228223633.22138.qmail@web9210.mail.yahoo.com>
Received: from [192.148.11.96] by web9210.mail.yahoo.com; Wed, 28 Feb 2001 14:36:33 PST
Date:   Wed, 28 Feb 2001 14:36:33 -0800 (PST)
From:   Peter Badovinatz <tabmowzo@yahoo.com>
Subject: Re: inventory
To:     Linux Cluster <linux-cluster@nl.linux.org>
In-Reply-To: <3A9D75CA.275E9F8B@kasey.umkc.edu>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list


--- "David L. Nicol" <david@kasey.umkc.edu> wrote:
<snip>
> It's stunning the level of not-made-here-itis that these projects
> can accumulate.
> 
> 
> Let's focus on one thing that we know all clustering schemes have, and
> see if we can standardize it, then go from there.  My nominee for a
> suitable case for this treatment remains node numbering.
> 
> The consensus appears to be that there should be a standard file that can
> be read to determine one's node number, or written to change it.

I would love to see 'one way' to set/read the node number.  This would greatly
simplify my life.  In the small, I am working on a distributed lock manager
project (http://oss.software.ibm.com/developerworks/projects/dlm) that needs to
know each node's number, but other cluster pieces, such as heartbeating,
likewise need such info.

Right now, they only "sort of" coordinate this setting, and since we anticipate
the DLM working in many different cluster environments, I am worried about real
ugliness trying to keep this straight in all environments.

> 
> Mosix "clutters up" /proc with all of its controls and displays, and
> it has been suggested that it is preferable to define a New File System
> for a clustering architecture's controls and mount it somewhere rather
> than doing this.  I like this approach: not only does it reduce
> the amount of patching (new clusterfs instead of altered procfs), it also
> trivially allows participation in multiple clusters by mounting multiple
> clusterfses at multiple places.
> 
> Are all in agreement with the above notes and ideas?

Right now we're looking to put various DLM information into /proc, but if we
have a "common" area for cluster components, we'll use it.  We're much smaller
than Mosix, and we're not all that exciting on a single node; we only get
interesting on a cluster.  In that case, if everyone gets used to looking
there for information/controls, that's fine with us!
> 
> -- 
>                       David Nicol 816.235.1187 dnicol@cstp.umkc.edu
>                            Damn! Someone stole my book on security!
> 


=====
These have been the opinions of:
Peter R. Badovinatz -- (503)578-5530 (TL 775)
wombat@us.ibm.com/tabmowzo@yahoo.com
and in no way should be construed as official opinion of 
IBM, Corp.



From owner-linux-cluster@nl.linux.org Wed Feb 28 23:47:47 2001
Received: by humbolt.nl.linux.org id <S92343AbRB1WrU>;
	Wed, 28 Feb 2001 23:47:20 +0100
Received: from jalon.able.es ([212.97.163.2]:19700 "EHLO jalon.able.es")
	by humbolt.nl.linux.org with ESMTP id <S92345AbRB1Wqt>;
	Wed, 28 Feb 2001 23:46:49 +0100
Received: from correo.able.es ([212.97.169.185]) by
          jalon.able.es (Netscape Messaging Server 4.15) with SMTP id
          G9HOMW00.I7V; Wed, 28 Feb 2001 23:47:20 +0100 
Date:   Wed, 28 Feb 2001 23:46:30 +0100
From:   "J . A . Magallon" <jamagallon@able.es>
To:     Rob Cermak <cermak@IMCS.rutgers.edu>
Cc:     linux-cluster@nl.linux.org
Subject: Re: cluster list
Message-ID: <20010228234630.A1256@werewolf.able.es>
References: <200102281717.f1SHHYe54669@saturn.cs.uml.edu> <Pine.SOL.4.21.0102281332180.28777-100000@imcs.rutgers.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
In-Reply-To: <Pine.SOL.4.21.0102281332180.28777-100000@imcs.rutgers.edu>; from cermak@IMCS.rutgers.edu on Wed, Feb 28, 2001 at 21:01:17 +0100
X-Mailer: Balsa 1.1.1
Content-Length: 2334
Lines:  57
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list


On 02.28 Rob Cermak wrote:
> 
> The question posed by the kernel group:
>   Is there a way to add to the existing monolithic kernel to
>   satisfy the needs of these groups?  Common API to handle
>   process, memory, network sharing in cluster arrangements.
> 
> It would be nice if there was a combination of kernel modules
> and user-space tools not requiring a whole hip replacement. 
> 

First of all, I have to say that I do not know much about kernel
internals. I work in realistic image synthesis, and I have written threaded
programs on SMP shared-memory boxes, worked with message passing packages and
worked a little with things like POE on the SP2. My main interest is in
getting a cluster built with low-end boxes (low-end relative to multiprocessing
boxes; some 2-way PC boards) linked with 100Mb Ethernet and its own switch.
University budgets do not leave much room to dream of 64-way SGI or
Sun nodes.

As everybody says, all that can be done in user space should be done that
way.

But there are many things that all packages do that would be faster if
done in kernel space. And some have to be done in the kernel if you
want certain types of clustering.

For example, PVM or MPI configure clusters at user level, but if you want
to use DSM or NUMA (with one level being another node), the kernel has to move
processes or data, so the kernel needs to know about the cluster.

I think the first thing that should be analyzed (as someone posted previously)
is how each package defines node groups to build a cluster, so that a common
interface can be provided for all of them. Each package has its own /etc/nodes.cfg
or similar.

It would be nice to have something like
/cluster/node/0/ip
                mem
                bogomips
/cluster/node/1/ip
..
/cluster/node/self -> 1
..
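
A minimal sketch of how a user-space tool might read such a tree, assuming
each attribute is a small text file and "self" is a symlink to the local
node's directory.  The function name read_cluster and the sample values are
hypothetical, and the demo builds the tree in a scratch directory since no
such file system exists yet:

```python
import os
import tempfile

def read_cluster(root):
    """Return ({node number: {attribute: value}}, own node number)."""
    node_dir = os.path.join(root, "node")
    nodes = {}
    for entry in sorted(os.listdir(node_dir)):
        path = os.path.join(node_dir, entry)
        # Skip the "self" symlink and anything that is not a numbered node.
        if os.path.islink(path) or not entry.isdigit():
            continue
        attrs = {}
        for name in sorted(os.listdir(path)):
            with open(os.path.join(path, name)) as f:
                attrs[name] = f.read().strip()
        nodes[int(entry)] = attrs
    self_id = int(os.readlink(os.path.join(node_dir, "self")))
    return nodes, self_id

# Build a two-node demo tree with made-up values.
root = tempfile.mkdtemp()
for number, ip in enumerate(["10.0.0.1", "10.0.0.2"]):
    d = os.path.join(root, "node", str(number))
    os.makedirs(d)
    with open(os.path.join(d, "ip"), "w") as f:
        f.write(ip + "\n")
os.symlink("1", os.path.join(root, "node", "self"))

nodes, self_id = read_cluster(root)
print(self_id, nodes[self_id]["ip"])
```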

And consider that nodes in the cluster may even be diskless. My ideal cluster would
be a root NFS server and nodes booting over Ethernet, with two internal nets:
one for 'housekeeping' (NFS, etc.) and another for MP, i.e. message passing,
process migration or page requests. Then read/write data access and control
can overlap.

-- 
J.A. Magallon                                                      $> cd pub
mailto:jamagallon@able.es                                          $> more beer

Linux werewolf 2.4.2-ac6 #1 SMP Wed Feb 28 01:53:51 CET 2001 i686



From owner-linux-cluster@nl.linux.org Wed Feb 28 23:56:00 2001
Received: by humbolt.nl.linux.org id <S92355AbRB1Wzm>;
	Wed, 28 Feb 2001 23:55:42 +0100
Received: from c007-h011.c007.snv.cp.net ([209.228.33.217]:6880 "HELO
        c007.snv.cp.net") by humbolt.nl.linux.org with SMTP
	id <S92352AbRB1WzV>; Wed, 28 Feb 2001 23:55:21 +0100
Received: (cpmta 13761 invoked from network); 28 Feb 2001 14:53:49 -0800
Received: from unknown (HELO testbed) (63.89.70.164)
  by smtp.billnorthrup.com (209.228.33.217) with SMTP; 28 Feb 2001 14:53:49 -0800
X-Sent: 28 Feb 2001 22:53:49 GMT
Message-ID: <00ef01c0a1da$2667c050$1f48000a@enshq>
From:   "Bill Northrup" <nodezero@pacbell.net>
To:     <linux-cluster@nl.linux.org>
Subject: auth 7fd8b6b4 subscribe linux-cluster nodezero@pacbell.net
Date:   Wed, 28 Feb 2001 14:59:48 -0800
MIME-Version: 1.0
Content-Type: multipart/alternative;
	boundary="----=_NextPart_000_00EC_01C0A197.17FCC8F0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.50.4133.2400
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4133.2400
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

This is a multi-part message in MIME format.

------=_NextPart_000_00EC_01C0A197.17FCC8F0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

auth 7fd8b6b4 subscribe linux-cluster nodezero@pacbell.net



------=_NextPart_000_00EC_01C0A197.17FCC8F0--



From owner-linux-cluster@nl.linux.org Wed Feb 28 23:58:00 2001
Received: by humbolt.nl.linux.org id <S92343AbRB1W5t>;
	Wed, 28 Feb 2001 23:57:49 +0100
Received: from gw.xkey.com ([206.86.100.52]:30213 "EHLO happy.xkey.com")
	by humbolt.nl.linux.org with ESMTP id <S92345AbRB1W51>;
	Wed, 28 Feb 2001 23:57:27 +0100
Received: (from smtp@localhost) by happy.xkey.com
	id OAA01662 for <linux-cluster@nl.linux.org>; Wed, 28 Feb 2001 14:57:25 -0800
Received: from hpti8.fsl.noaa.gov(137.75.132.228) by happy.xkey.com via smtp (V1.3)
	id sma001657; Wed Feb 28 17:57:20 2001
Received: (from lindahl@localhost)
	by localhost.hpti.com (8.11.0/8.11.0) id f1SMvgC02091
	for linux-cluster@nl.linux.org; Wed, 28 Feb 2001 17:57:42 -0500
X-Authentication-Warning: localhost.hpti.com: lindahl set sender to lindahl@conservativecomputer.com using -f
Date:   Wed, 28 Feb 2001 17:57:42 -0500
From:   Greg Lindahl <lindahl@conservativecomputer.com>
To:     Linux Cluster <linux-cluster@nl.linux.org>
Subject: Re: inventory
Message-ID: <20010228175742.A2077@wumpus>
Mail-Followup-To: Linux Cluster <linux-cluster@nl.linux.org>
References: <3A9D75CA.275E9F8B@kasey.umkc.edu> <20010228223633.22138.qmail@web9210.mail.yahoo.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <20010228223633.22138.qmail@web9210.mail.yahoo.com>; from tabmowzo@yahoo.com on Wed, Feb 28, 2001 at 02:36:33PM -0800
Sender: owner-linux-cluster@nl.linux.org
Precedence: bulk
Return-Path: <owner-linux-cluster@nl.linux.org>
X-Orcpt: rfc822;linux-cluster-list

On Wed, Feb 28, 2001 at 02:36:33PM -0800, Peter Badovinatz wrote:

> I would love to see 'one way' to set/read the node number.

What's a "node number" for? In the clusters I've built, nodes have
unique names, which happen to be the Unix hostname. Is that not
appropriate for your use?

-- g


