[MPICH] Parallel I/O problems on 64-bit machine ( please help :-( )

Peter Diamessis pjd38 at cornell.edu
Mon May 22 13:33:56 CDT 2006


Hello folks,

I'm writing this note to ask for some help with running MPI on
a dual proc. 64-bit Linux box I just acquired. I've written a similar
note to the mpi-bugs address but would appreciate any additional
help from anyone else in the community.

I'm using MPICH v1.2.7p1,
which, when tested, seems to work wonderfully with everything except for
some specific parallel I/O calls.

Specifically, whenever there is a call to MPI_FILE_WRITE_ALL
or MPI_FILE_READ_ALL a SIGSEGV error pops up. Note that
these I/O dumps are part of a greater CFD code which
has worked fine on either a 32-bit dual proc. Linux workstation
or the USC-HPCC Linux cluster (where I was a postdoc).

In my message to mpi-bugs, I attached a variety of files that
could provide additional insight. In this case I'm attaching only
the Fortran source code. I can gladly provide more material to
anyone who may be interested. The troublesome Fortran call is:

   call MPI_FILE_WRITE_ALL(fh, tempout, local_array_size,
> MPI_REAL,
> MPI_STATUS_IGNORE)

Upon calling this, the program crashes with a SIGSEGV 11 error. Evidently,
some memory is being accessed out of bounds?

Tempout is a single precision (Real with kind=4) 3-D array, which has a
total local number of elements on each processor equal to local_array_size.
If I change MPI_STATUS_IGNORE to status_array,ierr (where
status_array is appropriately dimensioned) I find that upon error,
printing out the elements of status_array yields huge values.
This error is always localized on processor (N+1)/2 (proc. numbering
goes from 0 to N-1).
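
A minimal sketch of the status-checking variant described above, assuming
each processor writes one contiguous block. The array dimensions (nx, ny, nz),
the file name test.dat and the file view are placeholders and are not taken
from the attached source. In the Fortran binding, the status array
(dimensioned MPI_STATUS_SIZE) and ierr are the final two arguments:

```fortran
      program write_all_check
      implicit none
      include 'mpif.h'
c     Hypothetical local block size; the real code uses its own.
      integer nx, ny, nz
      parameter (nx = 8, ny = 8, nz = 4)
      real(kind=4) tempout(nx, ny, nz)
      integer status_array(MPI_STATUS_SIZE)
      integer fh, ierr, ierr2, myid
      integer local_array_size, nwritten, reslen
      integer(kind=MPI_OFFSET_KIND) disp
      character*(MPI_MAX_ERROR_STRING) msg

      call MPI_INIT(ierr)
      call MPI_COMM_RANK(MPI_COMM_WORLD, myid, ierr)
      local_array_size = nx*ny*nz
      tempout = real(myid)

c     'test.dat' is a placeholder file name.
      call MPI_FILE_OPEN(MPI_COMM_WORLD, 'test.dat',
     >     MPI_MODE_CREATE + MPI_MODE_WRONLY,
     >     MPI_INFO_NULL, fh, ierr)
c     Assumed layout: rank-ordered blocks, 4 bytes per real.
      disp = int(myid, MPI_OFFSET_KIND) * local_array_size * 4
      call MPI_FILE_SET_VIEW(fh, disp, MPI_REAL, MPI_REAL,
     >     'native', MPI_INFO_NULL, ierr)

c     Explicit status and ierr as the last two arguments.
      call MPI_FILE_WRITE_ALL(fh, tempout, local_array_size,
     >     MPI_REAL, status_array, ierr)
      if (ierr .ne. MPI_SUCCESS) then
c        Translate the error code into a readable message.
         call MPI_ERROR_STRING(ierr, msg, reslen, ierr2)
         write(*,*) 'rank', myid, ': ', msg(1:reslen)
      else
c        Ask the status how many reals were actually written.
         call MPI_GET_COUNT(status_array, MPI_REAL, nwritten,
     >        ierr)
         write(*,*) 'rank', myid, ' wrote', nwritten, ' reals'
      end if

      call MPI_FILE_CLOSE(fh, ierr)
      call MPI_FINALIZE(ierr)
      end
```

Since file operations use the MPI_ERRORS_RETURN error handler by default,
ierr can be inspected after the call, and MPI_GET_COUNT is a safer way to
read the status than printing its raw elements.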

I installed MPICH2 only to observe the same results.
Calls to MPI_FILE_READ_ALL will also produce identical effects.
I'll reiterate that we've never had problems with this code on 32-bit
machines.

Note that uname -a returns:

Linux pacific.cee.cornell.edu 2.6.9-5.ELsmp #1 SMP Wed Jan 5 19:29:47 EST
2005 x86_64 x86_64 x86_64 GNU/Linux

Am I running into problems because I've got a 64-bit configured Linux on a
64-bit machine?

Any help would be HUGELY appreciated. The ability to use MPI2 parallel I/O on
our workstation would greatly help us crunch through some existing large
datafiles generated on 32-bit machines.

Cheers,

Peter

-------------------------------------------------------------
Peter Diamessis
Assistant Professor
Environmental Fluid Mechanics & Hydrology
School of Civil and Environmental Engineering
Cornell University
Ithaca, NY 14853
Phone: (607)-255-1719 --- Fax: (607)-255-9004
pjd38 at cornell.edu
http://www.cee.cornell.edu/fbxk/fcbo.cfm?pid=494

-------------- next part --------------
A non-text attachment was scrubbed...
Name: output_problems.f
Type: application/octet-stream
Size: 9700 bytes
Desc: not available
URL: <http://lists.mcs.anl.gov/pipermail/mpich-discuss/attachments/20060522/64933647/attachment.obj>

