Re: available resource declaration language(s)
David Santo Orcero wrote:
>
> Hello, all!
[snip]
> I am beginning to think that we will need two different APIs, that will be
> different kernel options. One is HA clusters and the other HP clusters. They
> are SO different -you are showing me this- that I find it quite difficult to
> do a HP+HA API.
I would suggest that we not give up quickly on the idea of having common
components and APIs. I believe that some implementations of components will
be better for some applications and configurations than others, but in many
cases they serve the same purpose - but in different ways.
And, just as you pointed out, my view isn't sufficient for everyone, and I'm
sure you would agree that your view isn't sufficient for everyone either.
For example, we've talked about membership. Most (all?) cluster
applications need some form of membership services. But different
implementations of membership services provide different characteristics in
their implementation. By analogy, one might call them quality of service
(QOS). Using QOS as an analogy, most applications need networking, but some
need low latency, and others need predictable packet times, and others need
high bandwidth.
But, most use TCP/IP for the transport, in spite of their different needs.
This analogy is stretched a little thin, but there are similarities.
In our case, your cluster might need a very low bandwidth solution, and mine
might need quick discovery of dead nodes. But - we both need to be able to
tell which machines are in the cluster, and which ones are out of it.
So, if we design a framework which would allow us to plug in different
loadable modules to provide these services, then one could assemble a
cluster out of one's favorite components - and create a solution which
solves one's problem better than any fixed solution can.
Identifying the right components and designing a sufficiently flexible,
lightweight, and general framework (APIs, base software, etc.) is not simple,
unfortunately.
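To make the plug-in idea a little more concrete, here is a rough sketch in
Python. Everything in it is an assumption made up for illustration (the
names MembershipModule, HeartbeatMembership, StaticMembership and
load_membership are hypothetical, not an existing API); it only shows two
membership modules with different characteristics behind one interface, so
that the client code does not care which one is plugged in.

```python
from abc import ABC, abstractmethod


class MembershipModule(ABC):
    """Hypothetical plug-in interface; every membership module implements it."""

    @abstractmethod
    def report_alive(self, node, now):
        """Record that `node` was heard from at time `now` (in seconds)."""

    @abstractmethod
    def members(self, now):
        """Return the set of nodes considered in the cluster at time `now`."""


class HeartbeatMembership(MembershipModule):
    """Quick discovery of dead nodes: silent for more than `timeout` means out."""

    def __init__(self, timeout):
        self.timeout = timeout
        self.last_seen = {}

    def report_alive(self, node, now):
        self.last_seen[node] = now

    def members(self, now):
        return {node for node, seen in self.last_seen.items()
                if now - seen <= self.timeout}


class StaticMembership(MembershipModule):
    """Very low bandwidth: a configured node list, changed only by hand."""

    def __init__(self, nodes):
        self.nodes = set(nodes)

    def report_alive(self, node, now):
        pass  # this module generates and needs no liveness traffic

    def remove(self, node):
        self.nodes.discard(node)

    def members(self, now):
        return set(self.nodes)


# Hypothetical registry standing in for "loadable modules".
MODULES = {"heartbeat": HeartbeatMembership, "static": StaticMembership}


def load_membership(name, **options):
    """Instantiate the membership module registered under `name`."""
    return MODULES[name](**options)


def describe_cluster(membership, now):
    """Client code: works with any module, without knowing which one it has."""
    return sorted(membership.members(now))
```

A cluster that wants fast failure detection would call
load_membership("heartbeat", timeout=2.0); one that cannot afford the
traffic would call load_membership("static", nodes=[...]). The same
describe_cluster() works with either.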
> > The most important conclusion I draw from this interchange is that we MUST
> > create a framework into which we can plug various methods, and have the
> > client applications not care at all. If we create such a framework, then
> > the technologies can fight it out, and the winner will always be the user.
>
> Perfect. I am not going to talk about HA, but HP; and I think that in
> that case the framework would have the following guidelines -it is my
> proposal-. I will use as reference the four things that I have used most
> -MPI, PVM, Mosix and Beowulf-. The four are completely different things,
> but I will not talk about implementation, but features; it is a wish list.
>
> 1) The cluster has to have a semantics. PVM has a semantics, MPI has a
> semantics, Mosix has a semantics. Maybe the Mosix one is better -the whole
> cluster is shown to the user as an SMP machine-. Mosix does this, thus it
> is possible. Maybe one of the hot points of the discussion is deciding
> which semantics is better for an HP cluster.
>
> 2) There should be an efficient method to send a task from the beginning to
> the least loaded node of the cluster. PVM has this, Mosix does not -in
> Mosix the task can migrate after being launched, but its kernel part will
> be executed on the launch node-.
A cluster batch scheduler presumably could be a help here...
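As a minimal sketch of the placement decision such a scheduler would make
(in Python; the function names and the load table are assumptions for
illustration, and the load figures are presumed to come from some
monitoring service that is not shown):

```python
def pick_least_loaded(loads):
    """Return the node with the smallest load; ties are broken by node name."""
    if not loads:
        raise ValueError("no nodes available")
    # Iterating names in sorted order makes min() pick the first name on ties.
    return min(sorted(loads), key=loads.__getitem__)


def submit(task, loads, cost=1.0):
    """Place `task` on the least loaded node and account for its assumed cost."""
    node = pick_least_loaded(loads)
    loads[node] += cost  # hypothetical bookkeeping; a real scheduler would measure
    return node
```

The point is only that the task starts on the chosen node, rather than
being launched locally and migrated afterwards.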
> 3) It should be portable between different Linux architectures. Mosix is
> not, the others are. (For me, it does not matter; but I know groups that
> will find this great).
>
> 4) The network should be as transparent as we can make it. Mosix is great
> for this, PVM and MPI do a good job, and Beowulf does nothing.
>
> 5) It must allow cheap hardware to be run efficiently. This rules out
> broadcast protocols, sorry. ;-)
I would state this differently. It must be possible to assemble a set of
components that allows it to work efficiently on cheap hardware. I would
also argue that it must be possible to assemble a set of components that
allows it to take advantage of clusters with more bandwidth.
There is also a class of applications (like weather prediction) where the
system needs to be HA/HPC. The US weather bureau wants to perform a set of
calculations and always have it finish on time, including automatically
completing successfully when nodes fail in the middle. If they need to buy
more hardware for redundancy, they will. There are other examples as well.
>
> 6) Migrating running tasks is great. Mosix does a good job here, but not
> perfect -sockets and shared memory code cannot migrate-.
For HA, automatic process migration in the kernel is a hindrance - not a
help. It makes it difficult to figure out what has failed, and to restart
it on still-working nodes. MOSIX's current implementation is what I'd call
low-availability: For a 2-node cluster, having one node fail can cause all
processes on both nodes to die. This is not good for HA.
It also makes performance unpredictable. If you're short on cycles (as
you describe yourself), then this can be a big problem.
Application-directed restarts are much harder to implement, but often
perform better as well. If you have more human than technological resources,
this might be a better choice.
Nevertheless, there is a class of services which all clusters have in
common.
The following examples come to mind:
You need control communication, you need membership, you need high-bandwidth
communication, reset services, etc.
-- Alan Robertson
alanr@unix.sh
Linux-cluster: generic cluster infrastructure for Linux
Archive: http://mail.nl.linux.org/linux-cluster/