
Re: available resource declaration language(s)





 Hello, All:

> >  Can the three problems be solved in a broadcast net, as proposed? No! If
> > you have enough nodes, you will flood the network; that is why I am
> > really sure that using broadcast features will not work.
>
> This is a common misconception.  I've done the calculations on this, and

 No, this is not a misconception. It is my own experience developing
parallel software (OK, all of it at user level, but the network cable
can't tell the difference between a packet generated by a kernel hacker
and a packet generated by a developer of scientific software).

 You are transmitting a short packet and waiting to see whether a NACK
arrives. Maybe you never need to establish a TCP connection, nor even a
simple UDP transmission; ICMP packets are enough for you. Between this and
a full exchange of information about the system (uptime, system
performance, number of processes, number of empty slots in the system
tables, free memory and so on) there is a HUGE difference. Test it. No
matter how small your packet is, you will have to use UDP or TCP. TCP is
impossible due to kernel table limitations, and UDP is a lot bigger than
ICMP.



> For a 1000 node system and a 150 byte heartbeat packet, and a 1 second
> heartbeat interval, the bandwidth is approximately 1.2% of the bandwidth
> available on an unswitched 100 Mbit network.

 Did you calculate this mathematically, or did you run the test on a
cluster? The results may be completely different, due to collisions.
There is no such thing as "the amount of information that a channel can
transmit"; rather, "a packet on the network occupies the physical medium,
and while one packet is traveling, the rest must stop". And on the most
widespread networks they don't stop, and they collide. Well, we can tell
the user "you can't use Ethernet to do clustering", but this will leave
most people out of the game.

> If you double the packet size, it would rise to 2.5%.  This is pretty small

 You are calculating mathematically, dividing the number of bits that you
transmit by the peak bit rate that the network can deliver! If you double
the packet size, the network usage on cheap networks _never_ simply
multiplies by 2, due to collisions! As a clear example that anybody can
test: if you send 3 Mb per second from one node and another 3 Mb per
second from a second node, it is not true that the information will arrive
at a third node at 6 Mb/s! In fact, you will have lots of collisions on
the channel.
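
 For reference, the purely mathematical calculation being disputed can be
reproduced with a short sketch. It uses the quoted figures (1000 nodes,
150-byte heartbeat, 1 second interval, 100 Mbit network); the function
name is made up for illustration, and the model deliberately ignores
collisions and framing overhead, which is exactly the objection above.

```python
def ideal_heartbeat_share(nodes, packet_bytes, interval_s, link_bits_per_s):
    """Share of the link's peak bit rate used by heartbeats.

    Idealized: assumes no collisions, no retransmissions and no
    per-packet framing overhead (hypothetical helper, not a measurement).
    """
    bits_per_second = nodes * packet_bytes * 8 / interval_s
    return bits_per_second / link_bits_per_s


base = ideal_heartbeat_share(1000, 150, 1.0, 100e6)
doubled = ideal_heartbeat_share(1000, 300, 1.0, 100e6)

print(f"150-byte heartbeat: {base:.1%}")     # 1.2%
print(f"300-byte heartbeat: {doubled:.1%}")  # 2.4%
```

 The idealized result scales linearly with packet size (2.4% for the
doubled packet, close to the quoted 2.5%). Whether a real shared Ethernet
behaves this way under load is the open question, and only a test on a
cluster can answer it.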


 Let's assume a network that allows broadcast of node information
and a full exchange of information between your 1000 nodes.
Remember that the MAINTENANCE of the data collected in a non-P2P solution
will also be a problem: O(n^2). Let's assume that you have a more
efficient algorithm, O(n). You will still have the 1000 nodes constantly
sending information, your own node constantly sending information, and
constant modification of the table. Maybe it will be a good solution for
making Linux as efficient as Amoeba.

  If you do not broadcast, and instead do P2P with random polling, we will
send only a few packets per second on your 1000-node network, and we will
load the kernel with only an O(k) algorithm.
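
 The difference in per-node work can be sketched as a simple count. This
is only a model under stated assumptions: with broadcast, every node hears
every other node once per interval; with random polling, every node polls
k peers per interval and gets one reply from each. The value k = 3 and the
function names are hypothetical, and the sketch counts only the packets
each node must process, not how long the wire is occupied.

```python
def broadcast_packets_per_node(n):
    """Packets one node must process per interval when all n nodes broadcast."""
    return n - 1


def random_poll_packets_per_node(k):
    """Packets one node must process per interval with random polling.

    Assumes each node polls k random peers and receives k replies, and on
    average is itself polled k times (k replies + k incoming polls).
    """
    return 2 * k


n = 1000  # cluster size from the discussion
k = 3     # hypothetical number of peers polled per interval

per_node_broadcast = broadcast_packets_per_node(n)
per_node_poll = random_poll_packets_per_node(k)

# Cluster-wide processing work per interval.
total_broadcast = n * per_node_broadcast  # grows as O(n^2)
total_poll = n * per_node_poll            # grows as O(n * k)

print(per_node_broadcast, per_node_poll)  # 999 6
print(total_broadcast, total_poll)        # 999000 6000
```

 Under these assumptions the per-node cost of polling does not depend on
the cluster size at all, while the broadcast cost grows with every node
added.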


 Yours:

David


---------------------
http://www.orcero.org
  irbis@orcero.org
---------------------


Linux-cluster: generic cluster infrastructure for Linux
Archive:       http://mail.nl.linux.org/linux-cluster/