[mpich-discuss] Fatal error in MPI_Init

Jayesh Krishna jayesh at mcs.anl.gov
Wed Aug 5 17:15:41 CDT 2009


Hi,
 A user can choose the MPICH2 channel at runtime by using the "-channel"
option of mpiexec. This requires linking with mpich2mpi.dll (with
mpich2.dll you are linking only with the default channel).
 So in your case you can do the following:
 
# Create your MPI libs from mpich2mpi.dll and fmpich2.dll using wlib.
# Distribute all the MPICH2 DLLs with your application (look for
mpich2*.dll, mpe*.dll & fmpich*.dll in the c:\windows\system32 directory
on the machine where you installed MPICH2). You can leave the MPICH2 DLLs
in the directory containing your executable or copy them to the user's
"c:\windows\system32" directory (note that the SYSTEM32 directory is
actually %SystemRoot%\system32).
 
 Let us know if you still have questions.
 
(PS: When building your own wrapper libraries from a third-party DLL, it's
always a good idea to use "Dependency Walker" (depends.exe), which is
available with Visual Studio. Dependency Walker shows all the
dependencies of an executable. In your case, compile a simple MPI program,
say hellow.exe, with MPICH2 and use Dependency Walker to check the
dependencies of hellow.exe.)
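
As a rough complement to Dependency Walker, the lookup can be approximated
in a few lines of Python. This is only a sketch under assumptions: the
helper names find_dll and missing_dlls are hypothetical, and the search is
simplified to the application's directory followed by the PATH entries,
whereas the real Windows loader also checks other locations such as the
system directories.

```python
import os


def find_dll(name, app_dir, path_value):
    """Return the first directory that contains `name`, or None.

    Simplified search (an assumption, not the full Windows rules):
    the application's directory first, then each PATH-style entry.
    """
    candidates = [app_dir] + [d for d in path_value.split(os.pathsep) if d]
    for directory in candidates:
        if os.path.isfile(os.path.join(directory, name)):
            return directory
    return None


def missing_dlls(names, app_dir, path_value):
    """List the DLL names that the simplified search cannot locate."""
    return [n for n in names if find_dll(n, app_dir, path_value) is None]


if __name__ == "__main__":
    # DLL names taken from this thread; replace them with whatever
    # Dependency Walker reports for your executable.
    wanted = ["mpich2mpi.dll", "fmpich2.dll"]
    print(missing_dlls(wanted, os.getcwd(), os.environ.get("PATH", "")))
```

An empty list only means that each name was found somewhere in this
simplified search; Dependency Walker remains the tool to trust for the
full dependency tree.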
 
Regards,
Jayesh

  _____  

From: lccostajr at gmail.com [mailto:lccostajr at gmail.com] On Behalf Of Luiz
Carlos da Costa Junior
Sent: Wednesday, August 05, 2009 4:39 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Fatal error in MPI_Init


Hi,

I was able to make the program work, but there are some points I would
still like to clarify.

The problem is that my program also links to another library through its
DLL files. However, in my development environment, these DLLs are not
placed in the same directory where I build my application. Instead, I keep
them in the library's installation directory, and my PATH environment
variable points to this folder.

I was only able to track down the problem because I was trying to reduce
my application's size and, in order to do that, I replaced many of the
subroutines with dummy ones. When I disabled the part of my code related
to the second library, it worked.

So, the solution I found was to copy the needed DLLs to my application's
folder. But I'm not comfortable with this, because the previous scheme
(DLLs located through PATH) had always worked, even with MPICH1 (i.e.,
when I link my application with MPICH1 and run it with MPIRUN).

Is such behavior expected? Why does mpiexec work differently from mpirun?
Is there a mistake on my side?

Another point is that I usually include 'mpif.h' in my Fortran routines
that use MPICH2, and I link my application with mpich2.lib and
fmpich2.lib, which are the import libraries I created from mpich2.dll
and fmpich2.dll using the OpenWatcom "wlib" utility.

I didn't know that the correct library to use was mpich2mpi.dll. Sorry for
the question, but what are the differences?
Since I could make it run with mpich2.dll and fmpich2.dll, what kind of
errors should I expect?

Thanks again.
LC



2009/7/28 Jayesh Krishna <jayesh at mcs.anl.gov>


Hi,
 
>> create library files from mpich2.dll and fmpich2.dll (distributed with
MPICH2 setup),
  How are you creating the lib files from the DLLs? (In any case, you
should be using mpich2mpi.dll instead of mpich2.dll.)
 
>> compile all my code with the MPICH2 libraries
  Are you compiling your code with the libraries distributed with MPICH2,
or are you using the libs you created yourself?
 
>> run my code using "mpiexec -host hostname -n N myapplication"
  Can you run your code on the localhost (does "mpiexec -n N
myapplication" work)?
 
Regards,
Jayesh
 

  _____  

From: lccostajr at gmail.com [mailto:lccostajr at gmail.com] On Behalf Of Luiz
Carlos da Costa Junior
Sent: Monday, July 27, 2009 8:06 PM
To: Jayesh Krishna
Cc: mpich-discuss at mcs.anl.gov
Subject: Re: [mpich-discuss] Fatal error in MPI_Init


Hi Jayesh,

I compile my code with OpenWatcom compiler.
The procedure I use is:


1.	create library files from mpich2.dll and fmpich2.dll (distributed
with MPICH2 setup), 

2.	compile all my code with the MPICH2 libraries 

3.	run my code using "mpiexec -host hostname -n N myapplication"

I had already used this procedure successfully before, but I don't know
what is happening now.
I have tried with MPICH2 version 1.0.8 and the newer 1.1.1, but the
error is the same.

I also tried to repeat this procedure on a computer without an MPI
installation, but I got the same error.

As I said before, I am able to run CPI and also non-MPI programs.
Do you have any clue that could help me?

Thanks


2009/7/27 Jayesh Krishna <jayesh at mcs.anl.gov>


Hi,
 Did you recompile your code with MPICH2? Which version of MPICH2 are you
using? How are you launching your MPI job (the command used to launch
your job)? Are you able to run non-MPI programs (does "mpiexec -n 2
hostname" work)?
 
(PS: The error message)
Regards,
Jayesh

  _____  

From: mpich-discuss-bounces at mcs.anl.gov
[mailto:mpich-discuss-bounces at mcs.anl.gov] On Behalf Of Luiz Carlos da
Costa Junior
Sent: Friday, July 24, 2009 10:56 PM
To: MPICH Discuss
Subject: [mpich-discuss] Fatal error in MPI_Init


Dear sirs,

I got the following message when I tried to run my code under Windows XP:


Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(394): Initialization failed
MPID_Init(90)........: channel initialization failed
MPID_Init(357).......: PMI_Init returned -1
[0] PMI_Init failed: FAIL - init called when another process has exited
without calling init

job aborted:
rank: node: exit code[: error message]
0: amsterdam: 1: Fatal error in MPI_Init: Other MPI error, error stack:
MPIR_Init_thread(394): Initialization failed
MPID_Init(90)........: channel initialization failed
MPID_Init(357).......: PMI_Init returned -1
1: amsterdam: -1073741515
2: amsterdam: -1073741515
3: amsterdam: -1073741515
4: amsterdam: -1073741515
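
A note on the exit code reported for ranks 1-4: -1073741515 is the signed
32-bit form of 0xC0000135, which Windows defines as STATUS_DLL_NOT_FOUND.
A minimal Python sketch of the conversion follows; the helper name
to_ntstatus is only illustrative and is not part of MPICH2 or Windows.

```python
def to_ntstatus(exit_code):
    """Reinterpret a signed 32-bit exit code as an unsigned NTSTATUS value."""
    return exit_code & 0xFFFFFFFF


code = -1073741515  # exit code reported for ranks 1-4 in the output above
print(hex(to_ntstatus(code)))  # prints 0xc0000135
```

That status generally means a DLL required by the executable could not be
located when the process started.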


This software can be compiled for Linux (using the Intel compilers) and
for Windows (using Open Watcom).
Under Linux, it works normally.
Under Windows, I get the message above.
SMPD seems to be OK, since the distributed CPI program works normally.
If I link my code with MPICH1-1.2.6 and run it with "mpirun", it also
works fine.

The weird thing is that I was previously able to make it work with
MPICH2/Windows/Open Watcom.

Do you have any clue about the cause of this error?
What could I be doing wrong?

Thanks in advance,
LC



