[mpich-discuss] mpich2 MPI_TEST errors

Pavan Balaji balaji at mcs.anl.gov
Sun Mar 15 21:31:46 CDT 2009


> One thing I know is that this application is highly threaded.
> Does that ring a bell?

Aha. How many cores do you have, and how many processes/threads in all 
are you running on each node? If you are overprovisioning the node (that 
is, running more processes/threads than there are cores available), 
then the processes/threads may be thrashing on the available cores. 
That will kill your performance, and the application can look as if it 
is hanging when it is actually making very slow progress. You can try 
two things:

1. Can you control the number of threads that are spawned? If yes, can 
you try reducing the number of threads to see if the application finishes?

2. Configure MPICH2 with --with-device=ch3:sock and rebuild. The 
ch3:sock device has better support for overprovisioned cases such as 
this (assuming that you are actually overprovisioning).
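
As a rough way to check for overprovisioning, compare the total number of 
threads on a node with its core count. The sketch below is only an 
illustration: the function names and the example counts are hypothetical, 
and os.cpu_count() reports logical CPUs, which may be more than the number 
of physical cores if hyperthreading is enabled.

```python
import os


def load_factor(procs_per_node, threads_per_proc, cores=None):
    """Return (total threads on the node) / (cores on the node).

    All names here are hypothetical helpers for illustration only.
    If cores is not given, fall back to os.cpu_count(), which counts
    logical CPUs (an assumption that may overstate physical cores).
    """
    if cores is None:
        cores = os.cpu_count() or 1
    if procs_per_node < 1 or threads_per_proc < 1 or cores < 1:
        raise ValueError("all counts must be positive")
    return (procs_per_node * threads_per_proc) / cores


def is_overprovisioned(procs_per_node, threads_per_proc, cores=None):
    """True when more threads are requested than cores are available."""
    return load_factor(procs_per_node, threads_per_proc, cores) > 1.0


if __name__ == "__main__":
    # Example numbers are made up: 4 MPI processes per node, 4 threads each.
    factor = load_factor(4, 4)
    print("threads per core on this node:", factor)
    print("overprovisioned:", is_overprovisioned(4, 4))
```

A factor above 1 means the node is overprovisioned in the sense described 
above; reducing the thread count until the factor is at most 1 is one way 
to try the first suggestion.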

  -- Pavan

-- 
Pavan Balaji
http://www.mcs.anl.gov/~balaji

