newbie errors

Rob Ross rross at mcs.anl.gov
Wed Apr 7 09:50:35 CDT 2004


Hi,

Is this a program that has been run on other machines or used for a long 
period of time, or is this a new code?  Can you get a traceback of where 
that segfault occurred?

Thanks,

Rob

On Wed, 7 Apr 2004 jabencke at ncsu.edu wrote:

> Rob,
> First thanks for your attention.
> 
> I've done what you asked and I'm still getting errors but they are
> different.  This code I've sent you is not the main program but just a
> subroutine.  Now I'm seeing errors like:
> 
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> /opt/mpich/ethernet/icc/bin/mpirun: line 1:  8391 Broken pipe
> /home/jabencke/pssdw/vhone -p4pg /home/jabencke/pssdw/PI8171 -p4wd
> /home/jabencke/pssdw
> 
> AND
> 
> p0_8391:  p4_error: interrupt SIGSEGV: 11
> 
> 
> Joseph Benckert
> 
> > The reason I think that you might need to is that your error
> > "Intercommunicator is not allowed" looks like the result of getting the
> > wrong value for MPI_COMM_WORLD.
> >
> > In general you should include the MPI headers in MPI programs.  Can you
> > try it?
> >
> > Thanks,
> >
> > Rob
> >
> > On Tue, 6 Apr 2004 jabencke at ncsu.edu wrote:
> >
> >> Rob,
> >> Thanks for the quick response.  I don't believe I need to include it.  I
> >> don't think the compiler would recognize the MPI_COMM_WORLD, otherwise.
> >> Anyway, we are using a Linux cluster (Rocks), and mpich.
> >>
> >> Joseph Benckert
> >> Department of Physics
> >> North Carolina State University
> >> jabencke at unity.ncsu.edu
> >>
> >>
> >> > Hi,
> >> >
> >> > What machine and MPI are you using here?  You should be including an
> >> > MPI header too; that might be the cause (or you might have just
> >> > neglected to include that).
> >> >
> >> > Thanks,
> >> >
> >> > Rob
> >> >
> >> > On Tue, 6 Apr 2004 jabencke at ncsu.edu wrote:
> >> >
> >> >> I'm not sure what's causing the errors I'm having, listed below.  I'm
> >> >> trying to baby step here and just create the file to start things
> >> >> off.  Below the errors is the code that causes the problem.  It's
> >> >> repeated 8 times because it's an 8 processor test job.  Any help
> >> >> would be fantastic.
> >> >>
> >> >>  Can not open/create file
> >> >>
> >> >>  Can not open/create file
> >> >>
> >> >>  Can not open/create file
> >> >>
> >> >>  Can not open/create file
> >> >>
> >> >>  Can not open/create file
> >> >>
> >> >> -1073749200: MPI_File_open error = Intercommunicator is not allowed
> >> >>  Can not open/create file
> >> >>
> >> >> -1073746896: MPI_File_open error = Intercommunicator is not allowed
> >> >>  Can not open/create file
> >> >>
> >> >> -1073751120: MPI_File_open error = Intercommunicator is not allowed
> >> >>  Can not open/create file
> >> >>
> >> >> -1073747536: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073749456: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073746512: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073750224: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073747248: MPI_File_open error = Intercommunicator is not allowed
> >> >>
> >> >>
> >> >> Code here:
> >> >>
> >> >>       subroutine prin(prefix)
> >> >>
> >> >> ! Outputs ascii array if ndim = 1, else if ndim > 1 then
> >> >> ! write out hdf5 data file containing all variables (plus time).
> >> >>
> >> >>       include 'pnetcdf.inc'
> >> >>       include 'global.h'
> >> >>       include 'sweep.h'
> >> >>       include 'zone.h'
> >> >>
> >> >>       ! integer(HID_T) :: hdf_file !file id for hdf file
> >> >>       integer :: hdf_error !error var for hdf5 file
> >> >>
> >> >>       character(LEN=1) :: char
> >> >>       character(LEN=1) :: coord
> >> >>       character(LEN=4) :: tmp1, tmp2
> >> >>       character(LEN=5) :: prefix
> >> >>       character(LEN=15) :: filename
> >> >>
> >> >> !     Added (Fortran 90 style) for hdf5 stuff
> >> >>
> >> >>       ! INTEGER(HID_T) :: dsp_id  ! Dataspace ID
> >> >>       ! INTEGER(HID_T) :: dset_id !Dataset ID
> >> >>       INTEGER, DIMENSION(3) :: dims
> >> >>       INTEGER, DIMENSION(3) :: dimids
> >> >>       INTEGER :: status, ncid
> >> >>       INTEGER :: xDimID, yDimID, zDimID
> >> >>       INTEGER :: yMaxDimID, tsDimID
> >> >>       INTEGER :: density_varID, pressure_varID
> >> >>       INTEGER :: XVelocity_varID, YVelocity_varID, ZVelocity_varID
> >> >>       INTEGER :: XScale_varID, YScale_varID, ZScale_varID, time_varID
> >> >>
> >> >> !------------------------------------------------------------------------------
> >> >>
> >> >> ! Create filename from integer nfile (in global.h), mype, and prefix
> >> >> ! such that filename looks like prefx_1000.0003, where 1000 is the
> >> >> ! value of nfile and 0003 is the value of mype (both zero-padded)
> >> >>
> >> >>       write(tmp1,910) nfile
> >> >>       write(tmp2,910) mype
> >> >>  910  format(i4)
> >> >>       do i = 1, 4
> >> >>          if ((tmp1(i:i)) .eq. ' ') tmp1(i:i) = '0'
> >> >>          if ((tmp2(i:i)) .eq. ' ') tmp2(i:i) = '0'
> >> >>       enddo
> >> >>       filename = prefix(1:5) // '_' // tmp1(1:4) // '.' // tmp2(1:4)
> >> >>       nfile = nfile + 1
> >> >>
> >> >>       if (ndim .eq. 1) then
> >> >>
> >> >> ! Keep 1D output simple, just write out in ascii...
> >> >>         open(unit=3,file=filename,form='formatted')
> >> >>         do i = 1, imax
> >> >>           write(3, 1003) zxa(i), zro(i,1,1),zpr(i,1,1), zux(i,1,1)
> >> >>         enddo
> >> >>         close(3)
> >> >>
> >> >>       else
> >> >>
> >> >>
> >> >> !     Initialize Dimensions
> >> >>       dims(1) = imax
> >> >>       dims(2) = js
> >> >>       if(ndim.eq.3) dims(3) = kmax
> >> >>
> >> >>       status = nfmpi_create(MPI_COMM_WORLD, filename,
> >> >>      &                      MPI_INFO_NULL, nf90_Clobber, ncid)
> >> >>
> >> >>
> >> >>
> >> >>       if (status /= nf90_NoErr ) print *, nfmpi_strerror(status)
> >> >>
> >> >>       ! always a necessary statement to flush output
> >> >>       status = nfmpi_close(ncid)
> >> >>
> >> >>       endif
> >> >>
> >> >>       write(8,6000) filename, time, ncycle
> >> >>
> >> >>  6000 format('Wrote ',a15,' to disk at time =',1pe12.5,' (ncycle =',
> >> >>      &        i6,')')
> >> >>  1003 format(' ',4e13.5)
> >> >>
> >> >>       return
> >> >>       end
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >>
> >>
> >
> >
> 
> 
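
A note on the "Intercommunicator is not allowed" error quoted above, for
anyone who finds this thread later.  The fact that the subroutine compiles
does not show that the compiler recognizes MPI_COMM_WORLD.  The code has no
"implicit none", so under Fortran's default implicit typing any undeclared
name beginning with I through N is silently treated as an integer variable.
Without mpif.h, MPI_COMM_WORLD would then be an uninitialized local whose
value is garbage, which is consistent with the wrong-communicator diagnosis.
The following is a minimal sketch of that language rule only, not the
original code.  The program name implicit_demo is made up for illustration,
and the sketch assumes global.h, sweep.h, and zone.h do not already pull in
mpif.h (the thread does not say).

```fortran
      program implicit_demo
! Deliberately no "implicit none" and no "include 'mpif.h'".
! Under default implicit typing, an undeclared name starting with
! I through N is an INTEGER, so MPI_COMM_WORLD below is just an
! ordinary, uninitialized local variable, not the MPI constant.
! kind() only inspects the type, so no undefined value is read.
      if (kind(MPI_COMM_WORLD) == kind(0)) then
         print '(a)', 'MPI_COMM_WORLD compiled as a default integer'
      else
         print '(a)', 'MPI_COMM_WORLD has some other type'
      end if
      end program implicit_demo
```

Adding "implicit none" to the program turns the undeclared name into a
compile-time error, and adding include 'mpif.h' to the include list in prin
supplies the real constants.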



