[mpich-discuss] mpich2 MPI_TEST errors
Samir Khanal
skhanal at bgsu.edu
Sun Mar 15 15:40:05 CDT 2009
Hi
I found the culprit function.
It was indeed a problem with the MPI_Test call. I tracked it down, and the program works now.
But now I am having a hard time running the same program with mpich2 1.0.8 under PBS on an x86_64 system.
It compiles and runs perfectly as a single process,
i.e., mpiexec -n 1 ./Ring
executes and generates output.
But as soon as I use mpiexec -n 2 or more, it just waits, and eventually the job is thrown out of the queue.
I am using the mpiexec that came with mpich2.
The previous system was a single core system.
Does mpich2 need any special configuration for multi-core machines?
Any tips on job submission or compiling?
I just used
./configure --with-device=ch3:nemesis
Right now I submit the job this way:
#PBS -l walltime=3:00:00
#PBS -N my_job
#PBS -j oe
#PBS -l nodes=6
echo `hostname`
echo Directory is `pwd`
echo This job is running on following Processors
export LD_LIBRARY_PATH=/home/skhanal/bgtw/lib:$LD_LIBRARY_PATH
time /home/skhanal/mpich2/bin/mpiexec -n 4 ./Ring
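
[Editor's sketch] The script above hard-codes -n 4 and does not show how a process-manager ring is started across the allocated nodes. The following is a minimal sketch of one common pattern, not a confirmed fix for the hang. It assumes the MPD process manager (the default in MPICH2 1.0.x), that PBS sets $PBS_NODEFILE and $PBS_O_WORKDIR, and that mpdboot and mpdallexit are installed next to mpiexec. The helper names count_slots and count_hosts and the file name mpd.hosts are hypothetical.

```shell
#!/bin/sh
# Sketch only: assumes MPD is the process manager and PBS provides
# $PBS_NODEFILE (one line per assigned slot) and $PBS_O_WORKDIR.

# Number of slots PBS assigned to this job (non-empty lines in the node file).
count_slots() {
    grep -c . "$1"
}

# Number of distinct hosts in the node file; MPD needs one daemon per host.
count_hosts() {
    sort -u "$1" | grep -c .
}

# Only attempt a launch when running inside a PBS job.
if [ -n "${PBS_NODEFILE:-}" ]; then
    MPI_BIN=/home/skhanal/mpich2/bin          # install prefix taken from the script above
    cd "$PBS_O_WORKDIR" || exit 1
    sort -u "$PBS_NODEFILE" > mpd.hosts       # hypothetical file name: one host per line
    "$MPI_BIN/mpdboot" -n "$(count_hosts "$PBS_NODEFILE")" -f mpd.hosts
    "$MPI_BIN/mpiexec" -n "$(count_slots "$PBS_NODEFILE")" ./Ring
    "$MPI_BIN/mpdallexit"                     # shut the ring down when the run ends
fi
```

Deriving the process count from the node file keeps the -n value in step with the #PBS -l nodes request, so the two cannot drift apart.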
Your help is really appreciated.
Thanks
Samir
________________________________________
From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Pavan Balaji [balaji at mcs.anl.gov]
Sent: Sunday, March 15, 2009 1:58 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] mpich2 MPI_TEST errors
Hi Samir,
Can you reproduce the problem with a smaller standalone program? It'll
be difficult to setup your library here and try it out.
-- Pavan
Samir Khanal wrote:
> Hi Pavan
>
> Thanks for the quick response; I am glad it reached the right person.
>
> I downloaded mpich2 version 1.0.8 from the official website.
>
> I am compiling a time-warp library with the mpich2 ch3:nemesis channel.
>
> I can run simple programs (such as cpi), but the problem appears when I run the program that uses this library (which works perfectly with mpich 1.2.7): it fails with an MPI_Test error.
>
> If you would like to compile the library and test it, I can send the whole package (about 1 MB) to your mcs email.
>
> I am really stuck with this.
>
> Samir
>
>
> ________________________________________
> From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Pavan Balaji [balaji at mcs.anl.gov]
> Sent: Sunday, March 15, 2009 1:47 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] mpich2 MPI_TEST errors
>
> Samir,
>
> Can you send the test program? If all channels are failing, it is very
> likely either a problem with the test program or your configuration.
> Btw, which version of MPICH2 are you using?
>
> Regarding the new cluster, all you need to use is ch3:nemesis. In a lot
> of ways, nemesis is a superset of sock, ssm and shm. The few features
> that it is missing (which you'll only notice in special circumstances)
> are being added in the 1.1 release of MPICH2.
>
> -- Pavan
>
> Samir Khanal wrote:
>> Hi all
>>
>> I was trying to get a library compiled using mpich2,
>> It works OK (compiles and executes) with mpich 1.2.7 (ch3_p4 channel) and 1.2.5, but I have a problem running it with mpich2.
>> Isn't mpich2 backward compatible with mpich 1.2.7? I assumed no specific rewrites would be needed to work with mpich2.
>> I compiled mpich2 with
>> --with-device=ch3:nemesis
>>
>> and all other 3 options
>> sock
>> ssm
>> shm
>>
>> Which channel should I use to be compatible with ch3_p4 as in mpich-1.2.7?
>>
>> The error I receive is:
>>
>> mpiexec -n 1 ./test
>> Fatal error in MPI_Test: Invalid MPI_Request, error stack:
>> MPI_Test(152): MPI_Test(request=0x869e1dc, flag=0xbfa134f8, status=0xbfa134d0) failed
>> MPI_Test(75).: Invalid MPI_Request
>> rank 0 in job 19 protos.cs.bgsu.edu_46623 caused collective abort of all ranks
>> exit status of rank 0: killed by signal 9
>>
>> mpiexec -n 2 ./test
>> Fatal error in MPI_Test: Invalid MPI_Request, error stack:
>> MPI_Test(152): MPI_Test(request=0x92451dc, flag=0xbf8c4938, status=0xbf8c4910) failed
>> MPI_Test(75).: Invalid MPI_Request
>>
>> Fatal error in MPI_Test: Invalid MPI_Request, error stack:
>> MPI_Test(152): MPI_Test(request=0x8bdc1dc, flag=0xbfb77be8, status=0xbfb77bc0) failed
>> MPI_Test(75).: Invalid MPI_Request
>> rank 0 in job 20 protos.cs.bgsu.edu_46623 caused collective abort of all ranks
>> exit status of rank 0: killed by signal 9
>>
>>
>>
>> I am using the correct mpicxx, mpicc, and mpi.h versions in the includes.
>> In fact, I use
>> #include "mpi.h" so that the appropriate version of the header gets pulled in automatically.
>>
>> I also have another question
>>
>> I want to compile mpich2 for another cluster of 6 PCs with Intel Core 2 Quad processors. Which channels would be appropriate?
>>
>> Thanks
>> Samir
>
> --
> Pavan Balaji
> http://www.mcs.anl.gov/~balaji
--
Pavan Balaji
http://www.mcs.anl.gov/~balaji