[mpich-discuss] Problem running mpi program

Reuti reuti at staff.uni-marburg.de
Wed Oct 24 17:32:47 CDT 2012


On 24.10.2012 at 20:27, gilbert wrote:

> But then the MPI code should run the same whichever mpiexec is used (not likely to happen), or mpiexec should detect which version of MPI was used, inform the user, and quit. The only way I knew there was a problem was that one job took three times as long as another when running on nearly identical computers. This mistake is too easy to make with all the different codes available, and most users would just conclude that MPI does not work because there is no speedup and no error message.

This issue also exists between different versions of the same MPI library. Using the right MPI library's `mpiexec`, but from a wrong (maybe older) version, can have the same effect or even crash the application. On top of that, you have to use the matching shared libraries and set LD_LIBRARY_PATH accordingly (the same applies to the libraries of the compiler version you used).
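As a rough sketch of how to see which `mpiexec` and `mpicc` are actually picked up, the following lists every match on PATH in lookup order, so the first line printed is the one the shell would run. The function name `list_on_path` is made up for this example, and `./lmp_mpi` is simply the binary from the original report; the `ldd` step only says something useful for a dynamically linked binary.

```shell
#!/bin/sh
# List every executable called "$1" found on PATH, in lookup order.
# The first line printed is the one the shell would actually run.
# (list_on_path is a hypothetical helper name, not part of any MPI package.)
list_on_path() {
    name=$1
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        # An empty PATH entry means the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
        fi
    done
    IFS=$old_ifs
}

list_on_path mpiexec
list_on_path mpicc

# For a dynamically linked binary, show which MPI shared libraries it resolves to.
if [ -x ./lmp_mpi ]; then
    ldd ./lmp_mpi | grep -i mpi || echo "no MPI shared library found (static binary?)"
fi
```

If more than one `mpiexec` shows up, calling the intended one by its full path avoids the mix-up.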

`mpiexec` and the compiled MPI application might both be statically linked, so there may be no libraries at all that could be checked. One option would be some kind of "magic header" in the compiled binary, but searching for it takes time, and it is not in the MPI standard. In fact, you can start any application with `mpiexec`, such as `hostname`.

Depending on the platform, you could either name the compiled application foobar.mpich to reflect the MPI version used, or attach file attributes that you check in a wrapper around `mpiexec`:

$ setfattr -n user.mpi -v mpich2 baz
$ setfattr -n user.mpi-version -v 1.5 baz
$ getfattr -d baz
# file: baz
user.mpi="mpich2"
user.mpi-version="1.5"

$ cp --preserve=xattr baz biz
$ getfattr -d biz
# file: biz
user.mpi="mpich2"
user.mpi-version="1.5"
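Such a wrapper could look roughly like the sketch below. It is only an illustration under several assumptions: the names `mpiexec_checked`, `check_mpi_tag` and `read_mpi_attr`, the variables `EXPECTED_MPI` and `REAL_MPIEXEC`, and the install path are all invented here. It treats the first argument that is an executable file as the program to launch, which is a simplification, and it lets untagged programs (like `hostname`) through.

```shell
#!/bin/sh
# Hypothetical wrapper around mpiexec: refuse to launch a binary whose
# user.mpi attribute names a different MPI implementation.

# Assumptions: the implementation this mpiexec belongs to, and where it lives.
EXPECTED_MPI=${EXPECTED_MPI:-mpich2}
REAL_MPIEXEC=${REAL_MPIEXEC:-/opt/mpich2/bin/mpiexec}

# Print the value of the user.mpi attribute, or nothing if it is not set.
read_mpi_attr() {
    getfattr --only-values -n user.mpi "$1" 2>/dev/null
}

# Return 0 if the binary may be launched, 1 if it is tagged for another MPI.
check_mpi_tag() {
    tag=$(read_mpi_attr "$1")
    if [ -z "$tag" ]; then
        # Untagged programs are allowed through.
        return 0
    fi
    if [ "$tag" != "$EXPECTED_MPI" ]; then
        echo "mpiexec wrapper: $1 is tagged '$tag', expected '$EXPECTED_MPI'" >&2
        return 1
    fi
    return 0
}

# Check the first argument that is an executable file, then hand all
# arguments unchanged to the real mpiexec.
mpiexec_checked() {
    for arg in "$@"; do
        if [ -f "$arg" ] && [ -x "$arg" ]; then
            check_mpi_tag "$arg" || return 1
            break
        fi
    done
    "$REAL_MPIEXEC" "$@"
}
```

With these definitions loaded, `mpiexec_checked -np 3 ./lmp_mpi < in.lj` would stop with a message if `lmp_mpi` were tagged for another implementation. The same pattern could be extended to compare `user.mpi-version` as well.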

-- Reuti



> Kevin
> 
> 
> 
> On Wed, 24 Oct 2012 11:34:21 -0500, Rajeev Thakur wrote:
>> mpiexec is the name suggested by the MPI standard.
>> 
>> Rajeev
>> 
>> On Oct 24, 2012, at 11:31 AM, gilbert wrote:
>> 
>>> Thanks, that solved the problem. There was an implementation of openmpi installed and its version of mpiexec was being used. It is unfortunate that both implementations use the same driver name. The last time I ran into this problem the lammps code would not run at all with the openmpi driver.
>>> 
>>> Kevin
>>> 
>>> On Wed, 24 Oct 2012 11:02:37 -0500, Darius Buntinas wrote:
>>>> You might have another MPI implementation installed that's confusing
>>>> things.  Check to make sure that you're using the mpich2 mpiexec
>>>> (specify the full path).  Also make sure that the executables were
>>>> compiled with the mpich2 compiler wrappers, e.g., mpicc (again specify
>>>> the full path).
>>>> 
>>>> -d
>>>> 
>>>> On Oct 24, 2012, at 10:59 AM, gilbert wrote:
>>>> 
>>>>> I am running Lammps, compiled with mpich2-1.4.1p1, on two Intel i7 computers, both running OpenSuse 12.2 and both using gcc 4.7. I have compiled mpich2-1.4.1p1 on both machines. When I run the Lammps benchmark test on the i7-3760 computer with the command "mpiexec -np 3 ./lmp_mpi < in.lj", the output tells me I am running on three processors. If I run the same command on an i7-2600, Lammps says it is only running on one processor. I can copy the Lammps executable from one machine to the other, and the i7-3760 always runs on three (or however many processors I specify) and the i7-2600 always runs on one.
>>>>> 
>>>>> I have tried uninstalling mpich2 and redone the complete installation.
>>>>> The firewall is turned off on both machines.
>>>>> Both machines have the same installation of opensuse.
>>>>> There are no error messages from compilation or running.
>>>>> 
>>>>> So what am I missing? Is there a system service that needs to be running? I have another i7-2600 computer running Windows 7 and it runs the mpi code with no problems.
>>>>> 
>>>>> Kevin
>>>>> _______________________________________________
>>>>> mpich-discuss mailing list     mpich-discuss at mcs.anl.gov
>>>>> To manage subscription options or unsubscribe:
>>>>> https://lists.mcs.anl.gov/mailman/listinfo/mpich-discuss


