[mpich-discuss] File I/O causing collective abort of all ranks

The Source thesourcehim at gmail.com
Tue Sep 23 13:22:57 CDT 2008


This error appears when a process exits without calling
MPI_Finalize(). Check whether the process is crashing, for example.
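
One way to check is to capture the I/O status yourself instead of letting the runtime terminate the process. Without iostat=, a failed open or write stops the program immediately, so that rank never reaches MPI_Finalize(). Note that status="new" is an error if fubar.dat already exists. The following is only a minimal sketch in plain Fortran with no MPI calls. The program name, the array sizes, the fill value, and unit number 10 are my assumptions, not taken from your code, and this may not be what is actually killing your rank 0.

```fortran
program write_check
  implicit none
  integer, parameter :: nx = 4, ny = 3   ! hypothetical sizes for illustration
  real :: array(nx, ny)
  integer :: i, j, ios

  array = 1.0                            ! placeholder data

  ! status="new" is an error if fubar.dat already exists;
  ! iostat= returns a nonzero code instead of terminating the program.
  open(unit=10, file="fubar.dat", status="new", iostat=ios)
  if (ios /= 0) then
    print *, "open failed, iostat = ", ios
    stop 1
  end if

  do j = 1, ny
    ! Write one line per value of j; the implied DO runs over i.
    write(10, *, iostat=ios) (array(i, j), i = 1, nx)
    if (ios /= 0) then
      print *, "write failed at j = ", j, ", iostat = ", ios
      exit
    end if
  end do

  close(10)
end program write_check
```

In the real MPI program you would replace the stop with a call to MPI_Abort(), or skip the write and fall through to MPI_Finalize(), so the failure is reported cleanly. If rank 0 is being killed from outside (signal 9), iostat will not catch it, but a clean status at least rules out an ordinary I/O error.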

Brian Harker wrote:
> Hello list-
>
> I have a problem with process 0 being able to open a file for writing
> and subsequently write to it.  The pertinent section of code looks as
> follows:
>
> ========================================
> if ( proc_id == 0 ) then
>
>   open( unit = 1, file = "fubar.dat", status="new" )
>   do j = 1, ny
>     write(1,*) ( array(i,j), i = 1, nx )
>   end do
>   close(1)
>
> end if
> ========================================
>
> When this part of the code is reached, the program seems to hang for a
> long time while trying to open the file, then spits out the following
> error message:
>
> rank 0 in job 11  $HOSTNAME_#####  caused collective abort of all ranks
>    exit status of rank 0: killed by signal 9
>
> I am confused about this error, because it is seemingly isolated to
> this particular write-to-file by process 0.  During execution, my
> slave processes write out other files using this exact same syntax.
> Has anyone run across this?  I can't seem to find any useful
> information on the interweb.  I have run into this problem with both
> MPICH2-1.0.6p1 and MPICH2-1.0.7.  I am using the Intel Fortran
> compiler, ifort 10.1.012.
>
> Thanks in advance for any input!



