Yes, MPI_Barrier runs perfectly.<br><br>There is another question: is it normal to get 100 seconds for parallel processing over a 10/100 network, 20-30 seconds on a multiprocessor machine over the sock channel, and 2-3 seconds over shared memory, for the same task?
<br><br>Is such a big difference normal?<br><br>Thanks.<br><br><div><span class="gmail_quote">On 3/14/07, <b class="gmail_sendername">Rajeev Thakur</b> <<a href="mailto:thakur@mcs.anl.gov">thakur@mcs.anl.gov</a>
> wrote:</span><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial" size="2">With 200,000 iterations, the sleep(1) probably causes some
skew that causes the bcast and gather to go out of sync. Another experiment to
try is to add an MPI_Barrier either at the beginning or end of the loop (for
each iteration), with the sleep(1) still there.</font></span></div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial" size="2"></font></span> </div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial" size="2">Rajeev</font></span></div>
<div dir="ltr" align="left"><span> </span></div><br>
<blockquote style="border-left: 2px solid rgb(0, 0, 255); padding-left: 5px; margin-left: 5px; margin-right: 0px;">
<div dir="ltr" align="left" lang="en-us">
<hr>
<font face="Tahoma" size="2"><span class="q"><b>From:</b> Bruno Simioni
[mailto:<a href="mailto:brunosimioni@gmail.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">brunosimioni@gmail.com</a>] <br></span><b>Sent:</b> Wednesday, March 14, 2007
12:54 PM<br><b>To:</b> Rajeev Thakur<br><b>Subject:</b> Re: [MPICH2 Req #3260]
Re: [MPICH] About ch3:nemesis.<br></font><br></div><div><span class="e" id="q_1115198ee37d987b_3">
<div></div>Hey Rajeev,<br><br>About your questions:<br><br>Yes, about 200,000
iterations.<br><br>With a small number of iterations I can't reproduce the
problem; the large number of iterations accumulates it.<br><br>Dummy computation: I'll put in a for() loop later; right now the lab
is busy. Do you believe that the sleep causes that
delay? The same program running on a single machine is much faster
than over the network.<br><br>Bruno.<br><br>
<div><span class="gmail_quote">On 3/14/07, <b class="gmail_sendername">Rajeev
Thakur</b> <<a href="mailto:thakur@mcs.anl.gov" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)"> thakur@mcs.anl.gov</a>>
wrote:</span>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial" size="2">Are you
running for a large number of iterations of the for() loop? What happens if
you run just 1 iteration or a small number of iterations (say 5)? Also, what
happens if you replace the sleep(1) with some dummy computation that takes 1
sec?</font></span></div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial" size="2"></font></span> </div>
<div dir="ltr" align="left"><span><font color="#0000ff" face="Arial" size="2">Rajeev</font></span></div>
<div dir="ltr" align="left"><span></span> </div><br>
<blockquote style="border-left: 2px solid rgb(0, 0, 255); padding-left: 5px; margin-left: 5px; margin-right: 0px;">
<div dir="ltr" align="left" lang="en-us">
<hr>
<font face="Tahoma" size="2"><b>From:</b> Bruno Simioni [mailto:<a href="mailto:brunosimioni@gmail.com" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">brunosimioni@gmail.com</a>] <br><b>Sent:
</b> Tuesday, March
13, 2007 9:37 PM<br><b>To:</b> Darius Buntinas<br><b>Cc:</b> <a href="mailto:mpich2-maint@mcs.anl.gov" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">mpich2-maint@mcs.anl.gov</a><br><b>Subject:
</b> [MPICH2 Req
#3260] Re: [MPICH] About ch3:nemesis.<br><b>Importance:</b>
High<br></font><br></div>
<div><span>
<div></div>Hi!<br><br>Yeah, you're correct. My problem is the
second situation you describe.<br><br>3 nodes, with one processor per node and one
process per processor.<br><br>The program uses not MPI_Recv or MPI_Send
but MPI_Gather and MPI_Bcast:<br><br>if (myid == 0)<br>
{<br>
    stuff...<br>
    for (...)<br>
    {<br>
        /* Receive information from all nodes of the communicator. */<br>
        MPI_Gather(&rx, 1, MPI_DOUBLE, &r, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);<br>
        Calculate fr, using the rx values from MPI_Gather.<br>
        /* Send fr to everybody. */<br>
        MPI_Bcast(&fr, nn+1, MPI_DOUBLE, 0, MPI_COMM_WORLD);<br>
        Calculate something and write file.<br>
    }<br>
    MPI_Finalize();<br>
}<br>
else<br>
{<br>
    stuff...<br>
    for (...)<br>
    {<br>
        /* Send rx to root. */<br>
        MPI_Gather(&rx, 1, MPI_DOUBLE, &r, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);<br>
        Calculate something.<br>
        Sleep(1); /* To stretch the program. Later I'll replace this with a for() loop and compare results. */<br>
        /* Receive fr from root. */<br>
        MPI_Bcast(&fr, nn+1, MPI_DOUBLE, 0, MPI_COMM_WORLD);<br>
        Calculate something and write file.<br>
    }<br>
    MPI_Finalize();<br>
}<br>}<br><br><br>Basically, that is the code.<br><br>And these are the
results:<br><br>I time the main for() loop to estimate and
compare the parallel runs.<br><br>the program running on 3
machines - 77 seconds<br>the program running as 3 processes on the same
machine using the sock channel - 109 s<br><br>the program running on 3
machines AND the Sleep(1) line - 1565 s<br>the program running as 3 processes
on the same machine using the sock channel AND the Sleep(1) line -
295 s<br><br>How do I explain these results?<br><br>If the Sleep(1)
line is not clear, I'll explain: the algorithm is not finished yet, and a
lot of operations are missing, so I put in the Sleep(1) as a stand-in
and tested.<br><br>It appears that if a node spends a lot of time
without communicating, getting the communication going again takes a long time, right?
<br><br>Bruno.<br><br><br>
<div><span class="gmail_quote">On 3/13/07, <b class="gmail_sendername">Darius
Buntinas</b> <<a href="mailto:buntinas@mcs.anl.gov" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">buntinas@mcs.anl.gov</a>> wrote:</span>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><br>Just
so I understand, the master is doing something like
this:<br><br>MPI_Send(small msg to slave)<br>MPI_Recv(small answer from
slave)<br><br>If the slave does something like
this:<br><br>MPI_Recv(small msg from master)<br>/* no processing
*/<br>MPI_Send(small answer to master)<br><br>then the time for the master to
complete the send and receive is relatively short. But if
the slave does something like this:<br><br>MPI_Recv(small msg from
master)<br>sleep(120)<br>MPI_Send(small answer to master)<br><br>then
the time for the master to complete the send and receive is
much more than 120 seconds longer than in the first case.<br><br>Is this
right?<br><br>How many nodes are you using?<br>How many processors does
each node have?<br>How many processes are running on each
node?<br><br>If you can send us the simplest program that demonstrates
this behavior we can take a look at it.<br><br>Darius<br><br>On Tue,
13 Mar 2007, Bruno Simioni wrote:<br><br>> Hey
Darius,<br>><br>> Thank you for your help. That really cleared up the
concept for me.<br>><br>> There is another thing.<br>><br>>
It is related to a speed and performance problem.<br>><br>> In my
programs, I notice that if I send TCP packets across the network<br>>
several times, one after another, without any delay, the communication
runs<br>> perfectly, but if some node does some complex computation that
takes a while,<br>> the communication has a large
delay.<br>><br>> For example:<br>><br>> The master sends
packets to a node several times. The node processes some little<br>>
thing and answers the master, sending a packet. (OK, the communication is
<br>> perfect.)<br>><br>> The trouble situation:<br>><br>>
The master sends a packet to a node. The node processes for a long time, and
answers.<br>> The answer takes the processing time plus some additional
time. A kind of <br>> overhead. It sounds like something "halted"
the network, and when requested<br>> it had to be "turned up"
again.<br>><br>> Am I correct?<br>><br>> I'm using Windows
XP and the latest version of MPICH2.<br>><br>>
Thanks.<br>><br>> Bruno, from Brazil.<br>><br>> On 3/13/07,
Darius Buntinas <<a href="mailto:buntinas@mcs.anl.gov" target="_blank" onclick="return top.js.OpenExtLink(window,event,this)">buntinas@mcs.anl.gov</a>>
wrote:<br>>><br>>><br>>> A channel is a
communication method. For example, the default channel,
<br>>> sock, uses tcp sockets for communication, while
the shm channel<br>>> communicates using
shared-memory.<br>>><br>>> Nemesis is a channel
that uses shared-memory to communicate within a node,
<br>>> and a network to communicate between
nodes. Currently Nemesis supports<br>>> tcp,
gm, mx, and elan networks. (Eventually these will be
selectable at<br>>> runtime, but for now the network
has to be selected when MPICH2 is
<br>>> compiled.)<br>>><br>>> Does
that
help?<br>>><br>>> -d<br>>><br>>> On
Tue, 13 Mar 2007, Bruno Simioni wrote:<br>>><br>>>
> Hi!<br>>> ><br>>> > Can
anyone explain to me what channel is, in mpich2? and what for is
<br>>> that<br>>>
> used?<br>>> ><br>>> > The
next question: What channel nemesis is?<br>>> ><br>>>
> thanks.<br>>> ><br>>>
><br>>><br>><br>><br>><br>><br></blockquote></div><br><br clear="all"><br>-- <br>Bruno.
</span></div></blockquote></div></blockquote></div><br><br clear="all"><br>--
<br>Bruno. </span></div></blockquote></div>
</blockquote></div><br><br clear="all"><br>-- <br>Bruno.