[mpich-discuss] File I/O causing collective abort of all ranks

Brian Harker brian.harker at gmail.com
Tue Sep 23 12:13:51 CDT 2008


Hello list-

Process 0 in my program is unable to open a file for writing and
subsequently write to it.  The pertinent section of code is as
follows:

========================================
if ( proc_id == 0 ) then

  open( unit = 1, file = "fubar.dat", status="new" )
  do j = 1, ny
    write(1,*) ( array(i,j), i = 1, nx )
  end do
  close(1)

end if
========================================

When this part of the code is reached, the program seems to hang for a
long time while trying to open the file, then spits out the following
error message:

rank 0 in job 11  $HOSTNAME_#####  caused collective abort of all ranks
   exit status of rank 0: killed by signal 9

I am confused about this error, because it is seemingly isolated to
this particular write-to-file by process 0.  During execution, my
slave processes write out other files using this exact same syntax.
Has anyone run across this?  I can't seem to find any useful
information online.  I have run into this problem with both
MPICH2-1.0.6p1 and MPICH2-1.0.7.  I am using the Intel Fortran
compiler, ifort 10.1.012.
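
One way to separate the file I/O from MPI would be a serial test along
these lines.  This is only a sketch under assumptions, not my actual
code: the array sizes, the unit number 10, and status="replace" are
chosen for illustration, and there are no MPI calls in it.  It uses
separate indices for the outer loop and the implied-do, and checks
iostat so that a failed open or write is reported instead of killing
the program.

```fortran
program write_check
  implicit none
  integer, parameter :: nx = 4, ny = 3   ! assumed sizes, illustration only
  integer, parameter :: out_unit = 10    ! assumed unit number
  real    :: array(nx, ny)
  integer :: i, j, ios

  ! Fill the array with recognizable values.
  do j = 1, ny
    do i = 1, nx
      array(i, j) = real(i) + 10.0 * real(j)
    end do
  end do

  ! status="replace" succeeds whether or not the file already exists;
  ! iostat returns a nonzero code instead of aborting on failure.
  open( unit = out_unit, file = "fubar.dat", status = "replace", &
        action = "write", iostat = ios )
  if ( ios /= 0 ) then
    write(*,*) "open failed, iostat = ", ios
    stop 1
  end if

  ! One record of nx values per value of j, ny records in total.
  do j = 1, ny
    write(out_unit, *, iostat = ios) ( array(i, j), i = 1, nx )
    if ( ios /= 0 ) then
      write(*,*) "write failed at j = ", j, ", iostat = ", ios
      stop 1
    end if
  end do

  close(out_unit)
end program write_check
```

If a test like this runs cleanly on the same node and file system, the
open and write statements themselves are probably not the cause, and
the difference would have to lie in what the MPI run does around them.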

Thanks in advance for any input!



-- 
Cheers,
Brian
brian.harker at gmail.com


"In science, there is only physics; all the rest is stamp-collecting."
 -Ernest Rutherford



