
Re: Clustering for Linux 2.3.x?!



In article <7bmn82$aqrfl@fido.engr.sgi.com>, you write:
|> Quoting Albert D. Cahalan (acahalan@cs.uml.edu):
|> > Erik writes:
|> > 
|> > > I consider this to be exactly the /wrong/ way to go about it! Why? It'll
|> > > make applications lazy. And why is that bad? Because redundancy is needed
|> > > in apps too, and we'll end up implementing clustering both in kernel and
|> > > in the apps.
|> > 
|> > Some people need redundancy in apps. Some people don't.
|> 
|> Could you give an example of a large clustered app where redundancy wouldn't
|> be any benefit at all? Failing that (the question is slightly unfair) please
|> describe the least redundancy-requiring app. Note that more nodes increase
|> the rate of failure...
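
The point about node count can be made concrete with a little arithmetic.
The sketch below assumes that nodes fail independently and that each has
the same failure probability over some interval; both are simplifying
assumptions, and the function name and the figure of 1% are made up for
illustration.

```python
def prob_any_failure(p_node: float, nodes: int) -> float:
    """Chance that at least one of `nodes` machines fails in an interval.

    Assumes independent failures with identical per-node probability
    `p_node` (a hypothetical figure, not a measured one).
    """
    if not 0.0 <= p_node <= 1.0:
        raise ValueError("p_node must be a probability")
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    # P(no node fails) = (1 - p)^n, so P(at least one fails) is the rest.
    return 1.0 - (1.0 - p_node) ** nodes


if __name__ == "__main__":
    # Illustrative only: a 1% per-node failure chance at several sizes.
    for n in (1, 8, 64, 512):
        print(n, prob_any_failure(0.01, n))
```

Under these assumptions the chance of seeing some failure only grows as
nodes are added, which is why the redundancy has to live somewhere.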

Any application that runs on a system which provides
application-transparent fault tolerance, of course :-)

That was a goal of the Unisys OPUS system (R.I.P.). Using a little
bit of dedicated memory, and a feature Unisys asked Intel for on
the P6, one could efficiently audit interactions between a process
and the operating system, checkpoint the process periodically, and,
after a failure, restart from the checkpoint and replay the process
up to the point of failure.
(There are a lot of complicated details I'm glossing over here :-)
(We could even deliver signals at the same point in space/time in the
restarted copy of the checkpointed app.)
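
The general audit/checkpoint/replay idea can be sketched in a few lines.
This is a toy illustration only, not the OPUS mechanism (which relied on
dedicated memory and hardware support). It assumes that the process's
only nondeterminism arrives through one logged call, and every name in
it (LoggedProcess, sys_read, and so on) is hypothetical.

```python
import copy
import random


class LoggedProcess:
    """Toy process whose only nondeterministic input is sys_read()."""

    def __init__(self, source):
        self.source = source              # stand-in for the operating system
        self.state = {"total": 0, "steps": 0}
        self.saved = copy.deepcopy(self.state)
        self.log = []                     # results audited since the checkpoint
        self.pending = []                 # results waiting to be replayed

    def sys_read(self):
        # During replay, feed back the recorded result instead of asking
        # the "OS" again; either way the result goes into the audit log.
        value = self.pending.pop(0) if self.pending else self.source()
        self.log.append(value)
        return value

    def step(self):
        self.state["total"] += self.sys_read()
        self.state["steps"] += 1

    def checkpoint(self):
        # Save the state and start a fresh audit log.
        self.saved = copy.deepcopy(self.state)
        self.log = []

    def recover(self):
        # Restart from the checkpoint, then replay every logged result
        # to bring the process back to the point of failure.
        self.pending = list(self.log)
        self.log = []
        self.state = copy.deepcopy(self.saved)
        for _ in range(len(self.pending)):
            self.step()


if __name__ == "__main__":
    rng = random.Random(1999)
    proc = LoggedProcess(lambda: rng.randint(1, 100))
    for i in range(5):
        proc.step()
        if i == 2:
            proc.checkpoint()
    before = dict(proc.state)
    proc.state = {"total": -1, "steps": -1}   # simulate losing the state
    proc.recover()
    print(proc.state == before)
```

The application code (step) never sees the difference between a live
run and a replay, which is the "application transparent" part; a real
system also has to cope with signals, shared memory, and timing, all of
which this sketch ignores.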

Unisys patented the technique in the early 1990s.

|> 
|> > > The one place where kernel support is needed is distributed
|> > > redundant filesystems, CODA and AFS pop up in my mind -- fast enough?
|> > 
|> > No, not appropriate at all. CODA and AFS are designed for large
|> > networks. CODA at least does not support locking. Neither supports
|> > simple sharing of disk space across the machines. You would want
|> > data to migrate to where it is needed.
|> 
|> Yes, that was my intended functionality; I had the delusion that both of them
|> could do this. A new fs "clusterfs" might be needed, mostly for the shm apps
|> though.

GFS.

|>  
|> > Kernel support is also needed for process ID allocation and scheduling.
|> 
|> For distributed-ipc: yes. For message-passing: no. What will the cost be for
|> the common case if such modifications are made??

Depends on whether it is an integral part of the system design,
or an add-on patch.

scott lurndal
silicon graphics, inc.			(I speak for myself)
-
Linux-future: thinking about the future of the Linux kernel
Archive:      http://humbolt.nl.linux.org/lists/
Wish list:    http://users.ox.ac.uk/~mert0236/linux-future.html