[mpich-discuss] Fatal error in MPI_Test: Invalid MPI_Request

Samir Khanal skhanal at bgsu.edu
Thu Feb 26 16:15:10 CST 2009


Hi Gus,
I am using absolute paths for compilation.
As per Rajeev's suggestion, the compiler did pick up the proper includes, but the problem still persists:
when n=1 there is no problem,
but when n>1 I get the error below.

[skhanal at comet ~]$ ~/mpich2/bin/mpiexec -n 1 ./Ring
[skhanal at comet ~]$ ~/mpich2/bin/mpiexec -n 2 ./Ring
Fatal error in MPI_Test: Invalid MPI_Request, error stack:
MPI_Test(152): MPI_Test(request=0xa964388, flag=0x7fffb06b94d4, status=0x7fffb06b9440) failed
MPI_Test(75).: Invalid MPI_Request
rank 0 in job 10  comet.cs.bgsu.edu_60252   caused collective abort of all ranks
  exit status of rank 0: killed by signal 9

Samir

________________________________________
From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Gus Correa [gus at ldeo.columbia.edu]
Sent: Thursday, February 26, 2009 5:12 PM
To: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Fatal error in MPI_Test: Invalid MPI_Request

Hi Samir

As per your previous message, you were launching your
Ring test program directly from the Linux prompt,
without any mpiexec.
MPI programs require mpiexec to run properly.
(See my Feb/23/09 reply to you below.)

Did you change this?
I.e. which command do you use to launch your program?

It may also help if you send the commands that you use to
compile your library and to compile the Ring test program.

Among other things, you need to make sure that your library,
if it calls MPI functions,
is compiled with the same MPICH2 as the test program (Ring).
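(The reason is that opaque handles such as MPI_Request are implementation-specific
types. As a rough illustration, not copied from any actual header, one MPI's mpi.h
may declare the handle as a plain integer while another declares it as a pointer:

    /* implementation A's mpi.h (illustrative only) */
    typedef int MPI_Request;

    /* implementation B's mpi.h (illustrative only) */
    typedef struct internal_request *MPI_Request;

A library compiled against one layout and a test program compiled against the
other will hand MPI_Test a request it cannot interpret, which typically surfaces
as the kind of "Invalid MPI_Request" error you are seeing.)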

Also, using full path names may help you.
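
If it helps to isolate the problem, you could also try a tiny stand-alone
non-blocking test that exercises MPI_Test without your bgtw library at all.
This is only a generic sketch (the file name ring_check.cpp is made up, it is
not your bgtwRingTest.cpp):

#include <mpi.h>
#include <cstdio>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;          /* neighbor to send to   */
    int left  = (rank + size - 1) % size;   /* neighbor to recv from */

    int sendval = rank, recvval = -1;
    MPI_Request reqs[2];

    /* Both calls return valid request handles for MPI_Test to poll. */
    MPI_Irecv(&recvval, 1, MPI_INT, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(&sendval, 1, MPI_INT, right, 0, MPI_COMM_WORLD, &reqs[1]);

    int done = 0;
    MPI_Status status;
    while (!done)
        MPI_Test(&reqs[0], &done, &status);   /* poll the receive */
    MPI_Wait(&reqs[1], MPI_STATUS_IGNORE);    /* complete the send */

    std::printf("rank %d received %d from rank %d\n", rank, recvval, left);

    MPI_Finalize();
    return 0;
}

If you build and run it entirely with one install, for example

  ~/mpich2/bin/mpicxx ring_check.cpp -o ring_check
  ~/mpich2/bin/mpiexec -n 2 ./ring_check

and it runs cleanly, the MPICH2 installation itself is fine, and the problem is
most likely in how the library or the test program was built.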

Gus Correa
---------------------------------------------------------------------
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
---------------------------------------------------------------------


On Feb/23/2009 Gus Correa wrote:
 > Hi Samir, list
 >
 > You must launch your Ring program with mpiexec,
 > not just "[skhanal at comet ~]$ ./Ring" as you did.
 >
 > Please, use full path for mpiexec also, to make sure the same
 > build of MPICH2 is being used all the way through.
 > This may not solve the problem, but it will avoid a lot of confusion
 > when diagnosing the real source of the error.
 >
 > Gus Correa


Samir Khanal wrote:
> Hi Rajeev, list
>
> I am using the mpi.h from mpich2.
> The program compiled fine (I am building a library) on an x86 box and ran without error with mpiexec 0.82 and mpich 1.2.7.
> I am trying to replicate the same on an x86_64 system with mpich 1.2.7 and mpiexec 0.83,
> and with mpich2 and its mpiexec.
> The library compiles, but I get the same error as in my previous email.
>
> With mpich 1.2.7 / mpiexec 0.82 I get a P4 sigx 15 error (which Gus suggested was due to the old library, so I compiled mpich2 with the nemesis channel).
> With mpich2 and its mpiexec, the job output shows the following.
>
> This job is running on following Processors
> compute-0-3 compute-0-3 compute-0-3 compute-0-3 compute-0-2 compute-0-2 compute-0-2 compute-0-2
>
> Fatal error in MPI_Test: Invalid MPI_Request, error stack:
> MPI_Test(152): MPI_Test(request=0x7098388, flag=0x7fffdda3ea34, status=0x7fffdda3e9a0) failed
> MPI_Test(75).: Invalid MPI_Request
> Fatal error in MPI_Test: Invalid MPI_Request, error stack:
> MPI_Test(152): MPI_Test(request=0x3b95388, flag=0x7fffb21504b4, status=0x7fffb2150420) failed
> MPI_Test(75).: Invalid MPI_Request
> rank 3 in job 1  compute-0-3.local_43455   caused collective abort of all ranks
>   exit status of rank 3: killed by signal 9
>
> FYI, this application was written to run on a Gentoo box with mpich 1.2.5/7 and mpiexec (from OSC) v0.75.
> I am trying to port it to a new 64-bit cluster, with all sorts of problems.
>
> :-(
> Samir
>
> ________________________________________
> From: mpich-discuss-bounces at mcs.anl.gov [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Rajeev Thakur [thakur at mcs.anl.gov]
> Sent: Monday, February 23, 2009 2:07 PM
> To: mpich-discuss at mcs.anl.gov
> Subject: Re: [mpich-discuss] Fatal error in MPI_Test: Invalid MPI_Request
>
> This can happen if you use an mpif.h or mpi.h from some other
> implementation. Remove any mpi*.h in the application directory and don't
> provide any paths to mpi*.h. mpic* will pick up the right file.
>
> Rajeev
>
>
>> -----Original Message-----
>> From: mpich-discuss-bounces at mcs.anl.gov
>> [mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Samir Khanal
>> Sent: Monday, February 23, 2009 11:35 AM
>> To: mpich-discuss at mcs.anl.gov
>> Subject: Re: [mpich-discuss] Fatal error in MPI_Test: Invalid
>> MPI_Request
>>
>> Hi
>>
>> [skhanal at comet ~]$ g++ -v
>> Using built-in specs.
>> Target: x86_64-redhat-linux
>> Configured with: ../configure --prefix=/usr
>> --mandir=/usr/share/man --infodir=/usr/share/info
>> --enable-shared --enable-threads=posix
>> --enable-checking=release --with-system-zlib
>> --enable-__cxa_atexit --disable-libunwind-exceptions
>> --enable-libgcj-multifile
>> --enable-languages=c,c++,objc,obj-c++,java,fortran,ada
>> --enable-java-awt=gtk --disable-dssi --enable-plugin
>> --with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
>> --with-cpu=generic --host=x86_64-redhat-linux
>> Thread model: posix
>> gcc version 4.1.2 20071124 (Red Hat 4.1.2-42)
>> [skhanal at comet ~]$ which mpicxx
>> ~/mpich2/bin/mpicxx
>> [skhanal at comet ~]$ which mpicc
>> ~/mpich2/bin/mpicc
>> [skhanal at comet ~]$ which mpiexec
>> ~/mpich2/bin/mpiexec
>>
>> I have installed everything in my home directory.
>>
>> When I compile I do:
>> [skhanal at comet ~]$ /home/skhanal/mpich2/bin/mpicxx -L /home/skhanal/bgtw/lib -lbgtw bgtwRingTest.cpp -o Ring
>>
>> [skhanal at comet ~]$ ./Ring
>> Fatal error in MPI_Test: Invalid MPI_Request, error stack:
>> MPI_Test(152): MPI_Test(request=0x16ae9388,
>> flag=0x7fff7a7599c4, status=0x7fff7a759930) failed
>> MPI_Test(75).: Invalid MPI_Request
>>
>> The library needs an mpi.h file to include; I gave
>> /home/skhanal/mpich2/include/mpi.h as an absolute path.
>>
>> Any clues?
>>
>> Thanks
>> Samir
>>
>> ________________________________________
>> From: mpich-discuss-bounces at mcs.anl.gov
>> [mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Gus Correa
>> [gus at ldeo.columbia.edu]
>> Sent: Monday, February 23, 2009 12:32 PM
>> To: Mpich Discuss
>> Subject: Re: [mpich-discuss] Fatal error in MPI_Test: Invalid
>> MPI_Request
>>
>> Hi Samir, list
>>
>> I am wondering if the mpicxx and mpiexec you are using
>> belong to the same MPICH2 build (considering the problems you
>> reported before).
>>
>> What is the output of "which mpicxx" and "which mpiexec"?
>>
>> You may want to use full path names to mpicxx and mpiexec,
>> as Anthony Chan recommended in another email.
>> Problems with PATH and with the multiple versions and builds of MPI
>> that hang around on all Linux computers
>> have been an endless source of frustration for many.
>> I myself prefer to use full path names when I am testing
>> MPI programs, to avoid any confusion and distress.
>>
>> I hope this helps,
>> Gus Correa
>> ---------------------------------------------------------------------
>> Gustavo Correa
>> Lamont-Doherty Earth Observatory - Columbia University
>> Palisades, NY, 10964-8000 - USA
>> ---------------------------------------------------------------------
>>
>> Samir Khanal wrote:
>>> Hi All
>>> I tried and did the following.
>>>
>>> [skhanal at comet ~]$ mpicxx -L /home/skhanal/bgtw/lib -lbgtw bgtwRingTest.cpp -o Ring
>>> [skhanal at comet ~]$ mpiexec -n 4 ./Ring
>>> Fatal error in MPI_Test: Invalid MPI_Request, error stack:
>>> MPI_Test(152): MPI_Test(request=0x1f820388, flag=0x7fffb8236134, status=0x7fffb82360a0) failed
>>> MPI_Test(75).: Invalid MPI_Request
>>> rank 0 in job 35  comet.cs.bgsu.edu_35155   caused collective abort of all ranks
>>>   exit status of rank 0: killed by signal 9
>>>
>>> What does this mean?
>>> Samir
>>>
>>> Ps: I am using mpich2 1.0.8
>>


