[mpich-discuss] forrtl errors

Christopher Tanner christopher.tanner at gatech.edu
Wed Oct 8 10:38:13 CDT 2008


Thanks for your help, guys -- fixing the typo fixed the problem. Sorry
for such a rookie mistake.

-------------------------------------------
Chris Tanner
Space Systems Design Lab
Georgia Institute of Technology
christopher.tanner at gatech.edu
-------------------------------------------



On Oct 7, 2008, at 7:54 PM, Gus Correa wrote:

> Hi Christopher and list
>
> In a thread parallel to this one, answering your question about
> "mpif77 with ifort",
> Rajeev suggested that you fix the MPICH2 configuration error
> by changing F70=ifort to F77=ifort.
> Besides, your configuration script has yet another typo:
> --enable-f70 instead of --enable-f77.
> There is a non-negligible chance that this is part of the problem
> with your I/O too.
>
> The best way to go about it would be to do a "make distclean" in  
> your MPICH2 directory,
> to wipe off any old mess, and rebuild a fresh MPICH2 with the right  
> configuration options.
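> For concreteness, the rebuild might look like the sketch below
> (the prefix, compilers, and flags are taken from the configure
> command quoted at the bottom of this thread, with the f70 typos
> corrected; the source path is a placeholder):

```shell
# Wipe any leftovers from the earlier, mistyped configuration
cd /path/to/mpich2-source   # placeholder: wherever MPICH2 was unpacked
make distclean

# Reconfigure with the Fortran options spelled correctly (f77, not f70)
./configure --prefix=/usr/local/mpi/mpich2 \
    --enable-f77 --enable-f90 --enable-cxx \
    --enable-sharedlibs=gcc --enable-fast=defopt \
    CC=icc CFLAGS=-m64 CXX=icpc CXXFLAGS=-m64 \
    F77=ifort FFLAGS=-m64 F90=ifort F90FLAGS=-m64

make
make install
```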
>
> After you build MPICH2 fresh, it is a good idea to compile and run
> some of the example programs
> (in the "examples" directory and subdirectories): cpi.c, f77/fpi.f,
> f90/pi3f90.f90, and cxx/cxxpi.cxx,
> just to make sure the build was right, and that you can run the
> programs on your cluster or "NOW".
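> As a sketch, checking the fresh build with the bundled examples
> might look like this (assuming the /usr/local/mpi/mpich2 prefix
> from your configure line; the source path is a placeholder):

```shell
MPI=/usr/local/mpi/mpich2            # install prefix used at configure time
cd /path/to/mpich2-source/examples   # placeholder source path

# Compile one example per language with the freshly built wrappers
$MPI/bin/mpicc  cpi.c          -o cpi
$MPI/bin/mpif77 f77/fpi.f      -o fpi
$MPI/bin/mpif90 f90/pi3f90.f90 -o pi3f90
$MPI/bin/mpicxx cxx/cxxpi.cxx  -o cxxpi

# Run one of them; cpi prints an approximation of pi if all is well
$MPI/bin/mpiexec -n 4 ./cpi
```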
>
> Also, when you compile and run, better use full path names to the  
> compiler wrappers (mpicc, etc),
> and to mpiexec, to avoid any confusion with other versions of MPI  
> that may be hanging around on
> your computer (or make sure your PATH variable is neatly organized).
> This is a very common problem, as most Linux distributions and  
> commercial
> compilers flood our computers with a variety of MPI stuff.
> Very often people want to use, say, MPICH2, but inadvertently
> compile with LAM MPI's mpicc,
> then run with mpiexec from MPICH-1, and the like,
> because their PATH is not what they think it is.
> (Check it with "which mpicc", "which mpiexec", etc.)
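> For example (a sketch; the program name is a placeholder):

```shell
# See which wrappers and launcher the PATH actually resolves to
which mpicc mpif77 mpiexec

# Or sidestep PATH issues altogether with full path names
/usr/local/mpi/mpich2/bin/mpicc myprog.c -o myprog
/usr/local/mpi/mpich2/bin/mpiexec -n 4 ./myprog
```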
>
> Then, you can remove any leftovers from previous compilations of
> your program
> (make cleanall, if you have a proper makefile, or simply remove the
> executable and any object
> files, pre-processed files, etc., by hand).
>
> The next step is a fresh recompilation of your program, using the  
> newly and correctly built mpif77.
> Finally run the program again and see how it goes.
> It is not guaranteed to work, but at least you can rule out
> problems with how MPICH2 was built,
> and with how you launch MPI programs on your computers.
> This is painful, but likely to prevent a misleading superposition of  
> different errors,
> and may narrow your search.
>
> My two cents,
> Gus Correa
>
> -- 
> ---------------------------------------------------------------------
> Gustavo J. Ponce Correa, PhD - Email: gus at ldeo.columbia.edu
> Lamont-Doherty Earth Observatory - Columbia University
> P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
> ---------------------------------------------------------------------
>
>
> Christopher Tanner wrote:
>
>> Gus / All -
>>
>> I've NFS mounted the /home directory on all nodes. To ensure that
>> the permissions and the NFS export mechanism are
>> correct, I ssh'ed to each node and made sure I could read and
>> write files in the /home/<user> directory. Is this sufficient to
>> ensure that mpiexec can read and write to the home directory?
>>
>> I'm launching the process from the /home/<user> directory, where
>> the data files to be read/written are. The executable is in the
>> NFS-exported directory /usr/local/bin.
>>
>> Regarding the code itself causing the I/O errors, this is what I
>> assumed initially. Since it occurred in two different
>> applications, I'm assuming it's the MPI and not the application,
>> but I could be wrong.
>>
>> Does anything here stand out as bad?
>>
>> I emailed out the output from my configure command -- hopefully  
>> this  may shed some light on the issue.
>>
>> -------------------------------------------
>> Chris Tanner
>> Space Systems Design Lab
>> Georgia Institute of Technology
>> christopher.tanner at gatech.edu
>> -------------------------------------------
>>
>>
>>
>> On Oct 7, 2008, at 3:37 PM, Gus Correa wrote:
>>
>>> Hi Christopher and list
>>>
>>> A number of different problems can generate I/O errors in a  
>>> parallel  environment.
>>> Some that I have come across (there are certainly more):
>>>
>>> 1) Permissions on the target directory. (Can you read and write   
>>> there?)
>>> 2) If you are running on separate hosts (a cluster or a "NOW"),
>>> are you doing I/O to local disks/filesystems, or to an NFS-mounted
>>> directory?
>>> 2.A) If local disks, are the presumed directories already created   
>>> there, and with the right permissions?
>>> 2.B) If NFS, is the export/mount mechanism operating properly?
>>> 3) In which directory do your processes start on each execution
>>> host?
>>> The same as on the host where you launch the mpiexec command, or
>>> a different directory? (See the mpiexec -wdir option, assuming
>>> you are using the mpiexec that comes with MPICH2. There are other
>>> mpiexec commands, though.)
>>> 4) Code (Fortran code, I presume) that makes wrong assumptions
>>> about file status,
>>> e.g. "open(fin,file='myfile',status='old')" but 'myfile' doesn't
>>> exist yet.
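>>> As an illustration of point 3 (a sketch; the application path is
>>> a placeholder, and -wdir assumes MPICH2's mpiexec):

```shell
# Start all 4 processes in /home/<user>, where the data files live
mpiexec -wdir /home/<user> -n 4 /usr/local/bin/yourapp
```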
>>>
>>> Writing a very simple MPI test program where each process
>>> opens/creates, writes, and closes
>>> a different file may help you sort this out.
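>>> A minimal sketch of such a test in C (the file names are made up;
>>> each rank opens/creates, writes, and closes its own file):

```c
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each process writes to its own file in the current working directory */
    char name[64];
    snprintf(name, sizeof(name), "iotest_rank%d.txt", rank);
    FILE *f = fopen(name, "w");
    if (f == NULL) {
        /* A failure here points at permissions or the working directory */
        fprintf(stderr, "rank %d: cannot create %s\n", rank, name);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    fprintf(f, "hello from rank %d\n", rank);
    fclose(f);

    MPI_Finalize();
    return 0;
}
```

>>> If this runs cleanly under the same mpiexec and hosts but the
>>> applications still fail, the problem is more likely in the
>>> applications' own file handling.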
>>>
>>> Also, I wonder if your precompiled commercial applications are  
>>> using  the same MPICH2 that
>>> you configured, or some other MPI version.
>>>
>>> I hope this helps,
>>> Gus Correa
>>>
>>> -- 
>>> ---------------------------------------------------------------------
>>> Gustavo J. Ponce Correa, PhD - Email: gus at ldeo.columbia.edu
>>> Lamont-Doherty Earth Observatory - Columbia University
>>> P.O. Box 1000 [61 Route 9W] - Palisades, NY, 10964-8000 - USA
>>> ---------------------------------------------------------------------
>>>
>>>
>>> Christopher Tanner wrote:
>>>
>>>> Hi All -
>>>>
>>>> I am receiving the same errors in multiple applications when I  
>>>> try  to  run them over MPICH2. They all read:
>>>>
>>>> forrtl: Input/output error
>>>> forrtl: No such file or directory
>>>> forrtl: severe ...
>>>>
>>>> This doesn't happen when I try to run any tests (i.e.
>>>> mpiexec ... hostname), only whenever I run the applications.
>>>> Additionally, it happens with pre-compiled (i.e. commercial)
>>>> applications as well as applications compiled on the machine
>>>> (i.e. open-source applications). At first I thought it was
>>>> something to do with the application; now I'm starting to think
>>>> I've done something wrong with MPICH2. Below is the configure
>>>> command I used:
>>>>
>>>> ./configure --prefix=/usr/local/mpi/mpich2 --enable-f77 --enable-f90 --enable-cxx --enable-sharedlibs=gcc --enable-fast=defopt CC=icc CFLAGS=-m64 CXX=icpc CXXFLAGS=-m64 F77=ifort FFLAGS=-m64 F90=ifort F90FLAGS=-m64
>>>>
>>>> Anyone have any clues? Thanks!
>>>>
>>>> -------------------------------------------
>>>> Chris Tanner
>>>> Space Systems Design Lab
>>>> Georgia Institute of Technology
>>>> christopher.tanner at gatech.edu
>>>> -------------------------------------------
>>>>
>>>>
>>>
>



