[mpich-discuss] File I/O causing collective abort of all ranks
The Source
thesourcehim at gmail.com
Tue Sep 23 13:22:57 CDT 2008
This error appears when a process exits without calling
MPI_Finalize(). Check whether rank 0 is crashing or being killed, for example.
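
One way to narrow this down is to make the file I/O on rank 0 report its
own failures instead of letting the process die silently. The following is
only a minimal sketch, not a diagnosis of the actual problem: the program
name, the array sizes, the unit number 11 and the use of status="replace"
are assumptions made for illustration. It also assumes array is dimensioned
(nx, ny), so the outer loop runs over j.

```fortran
program write_on_rank0
  use mpi
  implicit none

  ! Hypothetical sizes and unit number, chosen only for this sketch.
  integer, parameter :: nx = 4, ny = 3
  integer, parameter :: out_unit = 11
  real    :: array(nx, ny)
  integer :: ierr, proc_id, ios, i, j

  call MPI_Init(ierr)
  call MPI_Comm_rank(MPI_COMM_WORLD, proc_id, ierr)

  array = real(proc_id)   ! dummy data so there is something to write

  if (proc_id == 0) then
     ! status="replace" succeeds even if fubar.dat is left over from an
     ! earlier run; status="new" fails when the file already exists.
     open(unit=out_unit, file="fubar.dat", status="replace", &
          action="write", iostat=ios)
     if (ios /= 0) then
        write(*,*) "rank 0: open failed, iostat = ", ios
        call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
     end if

     do j = 1, ny
        ! One row of nx values per record.
        write(out_unit, *, iostat=ios) (array(i, j), i = 1, nx)
        if (ios /= 0) then
           write(*,*) "rank 0: write failed at row ", j, ", iostat = ", ios
           call MPI_Abort(MPI_COMM_WORLD, 1, ierr)
        end if
     end do
     close(out_unit)
  end if

  ! Every rank reaches this call on the normal path.
  call MPI_Finalize(ierr)
end program write_on_rank0
```

Note that iostat only reports errors the Fortran runtime itself detects.
Signal 9 (SIGKILL) cannot be caught by the program, so if rank 0 is still
killed with this check in place, the cause is probably outside the open and
write statements (memory use on that node would be one thing to look at).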
Brian Harker wrote:
> Hello list-
>
> I have a problem with process 0 being able to open a file for writing
> and subsequently write to it. The pertinent section of code looks as
> follows:
>
> ========================================
> if ( proc_id == 0 ) then
>
>    open( unit = 1, file = "fubar.dat", status="new" )
>    do j = 1, ny
>       write(1,*) ( array(i,j), i = 1, nx )
>    end do
>    close(1)
>
> end if
> ========================================
>
> When this part of the code is reached, the program seems to hang for a
> long time while trying to open the file, then spits out the following
> error message:
>
> rank 0 in job 11 $HOSTNAME_##### caused collective abort of all ranks
> exit status of rank 0: killed by signal 9
>
> I am confused about this error, because it is seemingly isolated to
> this particular write-to-file by process 0. During execution, my
> slave processes write out other files using this exact same syntax.
> Has anyone run across this? I can't seem to find any useful
> information on the interweb. I have run into this problem with both
> MPICH2-1.0.6p1 and MPICH2-1.0.7. I am using the Intel fortran
> compiler, ifort 10.1.012.
>
> Thanks in advance for any input!