[MPICH] mpirun vs. mpiexec

Steve Young chemadm at hamilton.edu
Tue Jun 5 18:46:45 CDT 2007


Hey all,
	I still have an issue running one part of Amber (REMD). This one
puzzles me, and it goes back to the original problem.

OK, so I got the OSC version of mpiexec. It appears to work very well
when running normal sander.MPI: requesting 16 CPUs, we get good output
and near 100% utilization on all 16 processes. The next thing we want to
use is another part of Amber called Replica Exchange, which is basically
the same sander.MPI program run with different arguments. When I run
this part of the program I end up with the following results:

 Error: specified more groups (           8 ) than the number of
processors (
           1 ) !
[unset]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
 Error: specified more groups (           8 ) than the number of
processors (
           1 ) !
[unset]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, 1) - process 0
 Error: specified more groups (           8 ) than the number of
processors (
           1 ) !
<.... snip....>

I realize I should probably be posting to the Amber list, since this
looks like an Amber-related problem, and I would tend to believe that
myself. What I can't explain is why the program runs fine with the
exact same files when I switch back to the original version of mpirun.
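
One possible reading of the output above is that every copy of
sander.MPI reports itself as process 0 and sees only 1 processor, as if
each copy had started as its own single-process MPI job instead of
joining one 16-process job. That is an interpretation, not something
the log proves. The sketch below only summarizes such a log; the
function names are hypothetical and it assumes the message text is
exactly as shown above, including the mail-wrapped line breaks.

```python
import re

# Matches the sander message even when it has been wrapped across lines,
# as in the mail above. The wording is taken from that output only.
GROUP_ERROR = re.compile(
    r"specified more groups \(\s*(\d+)\s*\) than the number of\s+"
    r"processors \(\s*(\d+)\s*\)"
)


def summarize_group_errors(log_text):
    """Return one (groups, processors) pair per error message found."""
    return [(int(g), int(p)) for g, p in GROUP_ERROR.findall(log_text)]


def looks_like_singletons(log_text):
    """True if at least one error was found and every one reports 1 processor."""
    pairs = summarize_group_errors(log_text)
    return bool(pairs) and all(procs == 1 for _, procs in pairs)
```

If looks_like_singletons() returns True for the captured stderr, comparing
how mpiexec and the MPI library that sander.MPI was built against start
processes would be a reasonable next thing to check.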

The problem with that option is that our Torque node allocations aren't
in sync with the nodes MPICH actually runs on. Perhaps I need to talk to
the OSC guys about mpiexec now?
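
One common way to keep a launcher on the nodes Torque actually allocated
is to build a machine file from the job's node file. The sketch below
assumes that Torque exports PBS_NODEFILE with one hostname per allocated
processor slot, and that the launcher in use accepts "host:count" lines
in a machine file. Both are assumptions to check against the local
install, and the function names are hypothetical.

```python
import os
from collections import Counter


def machinefile_lines(nodefile_text):
    """Collapse repeated hostnames into 'host:count' lines, in first-seen order."""
    counts = Counter()
    for line in nodefile_text.splitlines():
        host = line.strip()
        if host:  # skip blank lines
            counts[host] += 1
    # Counter is a dict subclass, so iteration follows insertion order.
    return [f"{host}:{n}" for host, n in counts.items()]


def write_machinefile(out_path, nodefile_path=None):
    """Write a machine file and return the total number of slots listed.

    Assumption: when nodefile_path is not given, the PBS_NODEFILE
    environment variable points at the node list for the current job.
    """
    if nodefile_path is None:
        nodefile_path = os.environ["PBS_NODEFILE"]
    with open(nodefile_path) as src:
        lines = machinefile_lines(src.read())
    with open(out_path, "w") as dst:
        dst.write("".join(line + "\n" for line in lines))
    return sum(int(line.rsplit(":", 1)[1]) for line in lines)
```

Inside a job script, the returned slot count could be passed as the
process count and the written file as the machine file, provided the
launcher supports both options.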

I guess what still puzzles me is the difference between mpiexec and
mpirun. Why couldn't mpirun be written so that it could be given a
-machinefile argument? Or rather, why are there two programs at all?

Thanks in advance for all your advice; it has helped immensely already.

-Steve




