
Re: Clusterwide pids



* Albert D. Cahalan (acahalan@cs.uml.edu) wrote:
> Lars Marowsky-Brée writes:
> >    "Albert D. Cahalan" <acahalan@cs.uml.edu> said:
> 
> >>> This leaves us with the chicken and egg problem - how do you
> >>> boot a node which is - at the time of boot - unable to contact
> >>> the cluster?
> >>
> >> You don't. It is a mistake to design for this perversion.
> >
> > Uh. A node not being able to join the cluster is a perfectly
> > reasonable exception, and you want it to boot so that you can
> > fix it over the network. It makes sense not to start any cluster
> > services, that is true.
> 
> 1. boot the single node without joining the cluster
> 2. fix the node
> 3. reboot to join the cluster

sounds like a windows solution ;-)  seriously though, if you can boot
and run processes that are local only (no cluster yet, as you haven't
been able to join for whatever reason), how does this cause problems when
you join the cluster?  sure you join the cluster with some machine state
other than a fresh reset, but the state is relative to the resources you
can share with the cluster.  and the act of joining the cluster should
initialize any cluster-specific state on the node, no?
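
to make that concrete, here is a toy sketch of what i have in mind.  every
name in it (Node, start_local, join, leave) is made up for illustration and
is not taken from any existing cluster code.  it only shows that joining
sets up cluster state and leaves local state alone, so no reboot is needed:

```python
class Node:
    """Toy model of a node; all names here are hypothetical."""

    def __init__(self, name):
        self.name = name
        self.local_procs = []      # processes started while standalone
        self.cluster_state = None  # None means "not a cluster member"

    def start_local(self, cmd):
        # Local-only work is allowed whether or not we are a member.
        self.local_procs.append(cmd)

    def join(self, cluster_id):
        # Joining initializes only the cluster-specific state; whatever
        # was running locally before the join is left untouched.
        self.cluster_state = {"cluster_id": cluster_id, "shared_procs": []}

    def leave(self):
        # Leaving discards the cluster-specific state again.
        self.cluster_state = None


node = Node("node7")
node.start_local("sshd")   # booted standalone, fixed over the network
node.join("cluster-a")     # joined later, without a reboot
```

after the join the node still has its local sshd, and its cluster state
starts out fresh, the same as it would after a reset.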

<snip>
> >> Now what am I supposed to do with the "ps" program I wrote?
> >> Invisible processes? Nope, not at all OK. This is crap too:
> >> 
> >>   PID TTY          TIME CMD
> >>     1 ?        00:00:03 init
> >>     1 ?        00:00:03 init
> >>     1 ?        00:00:03 init
> >>     1 ?        00:00:03 init
> >
> > You won't see this.
> >
> > init is almost guaranteed to be a "local" process, and thus
> > not visible to other nodes.
> 
> Eeew.
> 
> Share, or do not share.

are you suggesting that it is not legitimate to have local-only processes?
i would expect some processes to be necessarily tied to a node.  like those
responsible for reporting the state of the node...

it seems useful to me to be able to distinguish local from cluster
processing.  for ps, perhaps you could use different flags as mentioned
earlier, and scan /proc/cluster/ for cluster processes.
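
roughly like this, as a sketch only.  it assumes /proc/cluster/ would hold
one numeric entry per cluster process, the same way /proc does for local
ones; that layout is my guess, and the function and flag names are made up:

```python
import os


def list_pids(proc_dir):
    """Return the numeric entries of proc_dir as a sorted list of ints."""
    try:
        entries = os.listdir(proc_dir)
    except OSError:
        # Directory is missing, e.g. the node has not joined a cluster.
        return []
    return sorted(int(e) for e in entries if e.isascii() and e.isdigit())


def ps(show_local=True, show_cluster=False, proc_root="/proc"):
    """List (scope, pid) pairs; the /proc/cluster layout is an assumption."""
    rows = []
    if show_local:
        rows += [("local", pid) for pid in list_pids(proc_root)]
    if show_cluster:
        cluster_dir = os.path.join(proc_root, "cluster")
        rows += [("cluster", pid) for pid in list_pids(cluster_dir)]
    return rows
```

with no flags you get the usual local view, so init shows up once.  the
cluster view is opt-in and comes back empty on a node that never joined.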

perhaps i'm missing something...
-chris

Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/