Re: Clustering for Linux 2.3.x?!
Quoting Scott Lurndal (slurn@griffin.engr.sgi.com):
> In article <7bmn82$aqrfl@fido.engr.sgi.com>, you write:
> |> Quoting Albert D. Cahalan (acahalan@cs.uml.edu):
> |> > Erik writes:
> |> >
> |> > > I consider this to be exactly the /wrong/ way to go about it! Why? It'll
> |> > > make applications lazy. And why is that bad? Because redundancy is needed
> |> > > in apps too, and we'll end up implementing clustering both in kernel and
> |> > > in the apps.
> |> >
> |> > Some people need redundancy in apps. Some people don't.
> |>
> |> Could you give an example of a large clustered app where redundancy wouldn't
> |> be any benefit at all? Failing that (the question is slightly unfair) please
> |> describe the app that requires the least redundancy. Note that more nodes
> |> increase the rate of failure...
>
> Any application that runs on a system which provides application
> transparent fault tolerance, of course :-)
:-)
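
The remark above that more nodes increase the rate of failure can be made
concrete with a minimal sketch. It assumes nodes fail independently and with
the same probability, which real clusters only approximate; the function name
and the 1% figure are made up for illustration.

```python
def p_any_node_fails(n_nodes, p_node):
    """Probability that at least one of n_nodes fails in some interval.

    Assumption: failures are independent and every node has the same
    per-interval failure probability p_node.
    """
    return 1.0 - (1.0 - p_node) ** n_nodes


if __name__ == "__main__":
    # Hypothetical per-node failure probability of 1% per interval.
    for n in (1, 10, 100):
        print(n, "nodes ->", p_any_node_fails(n, 0.01))
```

Under these assumptions the chance that something in the cluster is broken
grows quickly with the node count, which is the reason for asking which large
clustered app could do without redundancy.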
> That was a goal of the Unisys OPUS system (R.I.P.); using a little
> bit of dedicated memory, and a feature Unisys asked Intel for on
> the P6, one could efficiently audit interactions between a process
> and the operating system, checkpoint the process periodically, and
> replay it to the point of failure after restarting the checkpoint.
> (There are a lot of complicated details I'm glossing over here :-)
> (We could even deliver signals at the same space/time in the restarted
> copy of the checkpointed app).
>
> Unisys patented the technique in the early 90's.
Now you have the system providing the redundancy; I asked for an app that didn't
want it _at_all_ :-)
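
For readers unfamiliar with the audit/checkpoint/replay idea quoted above,
here is a toy sketch of the general technique. It is not the OPUS design (the
quoted text deliberately glosses over the details); the class and method names
are hypothetical, and it assumes the process is deterministic given its logged
inputs.

```python
import copy


class ReplayableProcess:
    """Toy 'process' whose state changes only in response to logged inputs."""

    def __init__(self):
        self.state = {"total": 0, "steps": 0}
        self._checkpoint = copy.deepcopy(self.state)
        self._log = []  # inputs received since the last checkpoint

    def _apply(self, value):
        # Deterministic transition: same state + same input -> same result.
        self.state["total"] += value
        self.state["steps"] += 1

    def handle(self, value):
        # Audit step: record the input before acting on it.
        self._log.append(value)
        self._apply(value)

    def checkpoint(self):
        # Save a copy of the state; older log entries are no longer needed.
        self._checkpoint = copy.deepcopy(self.state)
        self._log.clear()

    def recover(self):
        # Restart from the checkpoint, then replay the logged inputs in order.
        self.state = copy.deepcopy(self._checkpoint)
        for value in self._log:
            self._apply(value)


proc = ReplayableProcess()
proc.handle(5)
proc.checkpoint()
proc.handle(7)
proc.handle(11)
before_crash = copy.deepcopy(proc.state)

proc.state = None  # simulate losing the live process
proc.recover()
assert proc.state == before_crash
```

The application code in `_apply` knows nothing about recovery; the logging and
replay sit around it, which is the sense in which the fault tolerance is
transparent to the application.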
> |> > > The one place where kernel support is needed is distributed
> |> > > redundant filesystems; CODA and AFS pop into my mind -- fast enough?
> |> >
> |> > No, not appropriate at all. CODA and AFS are designed for large
> |> > networks. CODA at least does not support locking. Neither supports
> |> > simple sharing of disk space across the machines. You would want
> |> > data to migrate to where it is needed.
> |>
> |> Yes, that was the functionality I intended; I was under the delusion that both
> |> of them could do this. A new fs "clusterfs" might be needed, mostly for the shm
> |> apps, though.
>
> GFS.
Pointer please?
> |>
> |> > Kernel support is also needed for process ID allocation and scheduling.
> |>
> |> For distributed-ipc: yes. For message-passing: no. What will the cost be for
> |> the common case if such modifications are made?
>
> Depends on whether it is an integral part of the system design,
> or an add-on patch.
I don't like add-on patches; think "workstation when a human is present -> cluster
when no human is present".
Erik
--
Please tell me the errors of my ways so I might correct them.
-
Linux-future: thinking about the future of the Linux kernel
Archive: http://humbolt.nl.linux.org/lists/
Wish list: http://users.ox.ac.uk/~mert0236/linux-future.html