Oversubscription is an issue, but a rather tiny one in my application. For -n 4 there are 20 processes running at startup, and 16 of those take less than 0.5 sec total to get into idle mode. That had not been a problem until I ran into this combination of thread multiple, this machine, and this particular test case. BTW, I had not been using that machine for a long time; it is at the lowest end of the HW spectrum for the problem my application is trying to solve. Finding this combo was pure luck.

I hope these 2 issues provide a good research opportunity for the MPICH2 team.

thanks
tan

font-size: 13px;"><font face="Tahoma" size="2"><hr size="1"><b><span style="font-weight: bold;">From:</span></b> Darius Buntinas <buntinas@mcs.anl.gov><br><b><span style="font-weight: bold;">To:</span></b> mpich-discuss@mcs.anl.gov<br><b><span style="font-weight: bold;">Sent:</span></b> Thursday, July 16, 2009 7:52:21 AM<br><b><span style="font-weight: bold;">Subject:</span></b> Re: [mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period<br></font><br>
Thanks for letting us know about this.

-d

On 07/15/2009 05:42 PM, chong tan wrote:

I just completed building without --enable-threads=multiple; the slow startup and nap problems went away.

Regarding the slow startup, it may have something to do with my application. When my application is run, it actually starts 4 other processes (licensing, recording, etc.), and I can see each of these processes being run one after another. BTW, I am using processor affinity, and that may be making the situation worse.

tan

------------------------------------------------------------------------
From: Darius Buntinas <buntinas@mcs.anl.gov>
To: mpich-discuss@mcs.anl.gov
Sent: Tuesday, July 14, 2009 8:05:00 AM
Subject: Re: [mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period

Can you attach a debugger to process 0 and see what it's doing during this nap? Once process 0 finishes the nap, does it send/receive the messages to/from the other processes, and do things continue normally?

Thanks,
-d

On 07/13/2009 08:57 PM, chong tan wrote:

This is the sequence of MPI calls that leads to the 'nap' (all numbers are proc IDs as assigned by MPICH2):

0 sends to 1, received by 1
0 sends to 2, received by 2
0 sends to 3, received by 3
<application activities, shm called>
1 blocking send to 0, send buffered
1 calls blocking receive from 0
3 blocking send to 0, send buffered
3 calls blocking receive from 0
2 blocking send to 0, send buffered
2 calls blocking receive from 0
<proc 0 executes some application activities>
proc 0 becomes idle
<nap time>
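For concreteness, here is a minimal sketch of that message pattern (the shape is assumed; tags, payloads, and the application work are placeholders, not the real application):

    #include <mpi.h>

    int main(int argc, char **argv)
    {
        int rank, size, buf = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);

        if (rank == 0) {
            /* 0 sends one message to each of ranks 1..size-1. */
            for (int dst = 1; dst < size; dst++)
                MPI_Send(&buf, 1, MPI_INT, dst, 0, MPI_COMM_WORLD);
            /* <application activities> */
            /* Receive each worker's (buffered) send, then answer its
             * pending blocking receive. The 'nap' is observed around
             * the point where proc 0 goes idle. */
            for (int i = 1; i < size; i++) {
                MPI_Status st;
                MPI_Recv(&buf, 1, MPI_INT, MPI_ANY_SOURCE, 0,
                         MPI_COMM_WORLD, &st);
                MPI_Send(&buf, 1, MPI_INT, st.MPI_SOURCE, 0,
                         MPI_COMM_WORLD);
            }
        } else {
            MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            /* Small blocking send to 0 (eagerly buffered), then a
             * blocking receive from 0, as in the trace above. */
            MPI_Send(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD);
            MPI_Recv(&buf, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        }

        MPI_Finalize();
        return 0;
    }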
This is rather strange; it only happens on this particular test. Hope this info helps.

tan

------------------------------------------------------------------------
From: Darius Buntinas <buntinas@mcs.anl.gov>
To: mpich-discuss@mcs.anl.gov
Sent: Monday, July 13, 2009 11:47:38 AM
Subject: Re: [mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period

Is there a simpler example of this that you can send us? If nothing else, a binary would be OK.

Does the program that takes the 1-minute "nap" use threads? If so, how many threads does each process create?

Can you find out what the processes (or threads, if it's multithreaded) are doing during this time? E.g., are they in an MPI call? Are they blocking on a mutex? If so, can you tell us what line number they're blocked on?

Can you try this without shared memory, by setting the environment variable MPICH_NO_LOCAL to 1, and see if you get the same problem?

MPICH_NO_LOCAL=1 mpiexec -n 4 ...

Thanks,
-d

On 07/13/2009 01:35 PM, chong tan wrote:

Sorry, can't do that. The benchmark involves 2 things, one of them from my customer, which I am not allowed to distribute. I may be able to get a limited license of my product for you to try, but I definitely cannot send source code.

tan

------------------------------------------------------------------------
From: Darius Buntinas <buntinas@mcs.anl.gov>
To: mpich-discuss@mcs.anl.gov
Sent: Monday, July 13, 2009 10:54:50 AM
Subject: Re: [mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period

Can you send us the benchmark you're using? This will help us figure out what's going on.

Thanks,
-d

On 07/13/2009 12:36 PM, chong tan wrote:

thanks darius,

When I did the comparison (or benchmarking), I had 2 identical source trees. Everything was recompiled from the ground up and compiled/linked against the version of MPICH2 to be used.

I have many tests; this is the only one showing this behavior, and it is predictably repeatable. Most of my tests show comparable performance, and many do better with 1.1.

The 'weirdest' thing is the ~1 minute span where there is no activity on the box at all, zero activity except 'top', with the machine load at around 0.12. I don't know how to explain this 'behavior', and I am extremely curious whether anyone can.

I can't repeat this on AMD boxes, as I don't have one that has only 32 GB of memory. I can't repeat this on a Niagara box, as thread multiple won't build there.

I will try to rebuild 1.1 without thread multiple. Will keep you posted.

Meanwhile, if anyone has any speculations on this, please bring them up.

thanks
tan

------------------------------------------------------------------------
From: Darius Buntinas <buntinas@mcs.anl.gov>
To: mpich-discuss@mcs.anl.gov
Sent: Monday, July 13, 2009 8:30:19 AM
Subject: Re: [mpich-discuss] version 1.1 strange behavior : all processes become idle for extensive period

Tan,

Did you just re-link the applications, or did you recompile them? Version 1.1 is most likely not binary compatible with 1.0.6, so you really need to recompile the application.

Next, don't use the --enable-threads=multiple flag when configuring mpich2. By default, mpich2 supports all thread levels and will select the thread level at run time (depending on the parameters passed to MPI_Init_thread). By allowing the thread level to be selected automatically at run time, you'll avoid the overhead of thread safety when it's not needed, allowing your non-threaded applications to run faster.
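As a sketch, run-time selection looks like this (standard MPI calls; error handling omitted):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;

        /* Request the level the application actually needs; a
         * single-threaded run can ask for MPI_THREAD_SINGLE instead. */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

        if (provided < MPI_THREAD_MULTIPLE)
            printf("MPI_THREAD_MULTIPLE not available (got %d)\n",
                   provided);

        MPI_Finalize();
        return 0;
    }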
Let us know if either of these fixes the problem, especially if just removing the --enable-threads option fixes this.

Thanks,
-d

On 07/10/2009 06:19 PM, chong tan wrote:

I am seeing this funny situation, which I did not see on 1.0.6 or 1.0.8. Some background:

machine: Intel 4x Core 2

running mpiexec -n 4

machine has 32 GB of memory.

When my application runs, almost all memory is used. However, there is no swapping.
I have exclusive use of the machine, so contention is not an issue.

issue #1: processes take extra long to initialize, compared to 1.0.6.
issue #2: during the run, at times all of them become idle at the same time, for almost a minute. We never observed this with 1.0.6.

The code is the same, only linked with different versions of MPICH2.

MPICH2 was built with --enable-threads=multiple for 1.1, and without it for 1.0.6 and 1.0.8.

MPI calls are all in the main application thread. I used only 4 MPI functions:
Init(), Send(), Recv() and Barrier().

Any suggestions?

thanks
tan