newbie errors
Rob Ross
rross at mcs.anl.gov
Wed Apr 7 09:50:35 CDT 2004
Hi,
Is this a program that has been run on other machines or used for a long
period of time, or is this a new code? Can you get a traceback of where
that segfault occurred?
Thanks,
Rob
On Wed, 7 Apr 2004 jabencke at ncsu.edu wrote:
> Rob,
> First, thanks for your attention.
>
> I've done what you asked and I'm still getting errors but they are
> different. This code I've sent you is not the main program but just a
> subroutine. Now I'm seeing errors like:
>
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> Killed by signal 2.
> /opt/mpich/ethernet/icc/bin/mpirun: line 1: 8391 Broken pipe
> /home/jabencke/pssdw/vhone -p4pg /home/jabencke/pssdw/PI8171 -p4wd
> /home/jabencke/pssdw
>
> AND
>
> p0_8391: p4_error: interrupt SIGSEGV: 11
>
>
> Joseph Benckert
>
> > The reason I think that you might need to is that your error
> > "Intercommunicator is not allowed" looks like the result of getting the
> > wrong value for MPI_COMM_WORLD.
> >
> > In general you should include the MPI headers in MPI programs. Can you
> > try it?
> >
> > Thanks,
> >
> > Rob
> >
> > On Tue, 6 Apr 2004 jabencke at ncsu.edu wrote:
> >
> >> Rob,
> >> Thanks for the quick response. I don't believe I need to include it. I
> >> don't think the compiler would recognize the MPI_COMM_WORLD, otherwise.
> >> Anyway, we are using a Linux cluster (Rocks), and mpich.
> >>
> >> Joseph Benckert
> >> Department of Physics
> >> North Carolina State University
> >> jabencke at unity.ncsu.edu
> >>
> >>
> >> > Hi,
> >> >
> >> > What machine and MPI are you using here? You should be including an
> >> > MPI header too; that might be the cause (or you might have just
> >> > neglected to include that).
> >> >
> >> > Thanks,
> >> >
> >> > Rob
> >> >
> >> > On Tue, 6 Apr 2004 jabencke at ncsu.edu wrote:
> >> >
> >> >> I'm not sure what's causing the errors I'm having, listed below. I'm
> >> >> trying to baby step here and just create the file to start things
> >> >> off. Below the errors is the code that causes the problem. It's
> >> >> repeated 8 times because it's an 8 processor test job. Any help would
> >> >> be fantastic.
> >> >>
> >> >> Can not open/create file
> >> >>
> >> >> Can not open/create file
> >> >>
> >> >> Can not open/create file
> >> >>
> >> >> Can not open/create file
> >> >>
> >> >> Can not open/create file
> >> >>
> >> >> -1073749200: MPI_File_open error = Intercommunicator is not allowed
> >> >> Can not open/create file
> >> >>
> >> >> -1073746896: MPI_File_open error = Intercommunicator is not allowed
> >> >> Can not open/create file
> >> >>
> >> >> -1073751120: MPI_File_open error = Intercommunicator is not allowed
> >> >> Can not open/create file
> >> >>
> >> >> -1073747536: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073749456: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073746512: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073750224: MPI_File_open error = Intercommunicator is not allowed
> >> >> -1073747248: MPI_File_open error = Intercommunicator is not allowed
> >> >>
> >> >>
> >> >> Code here:
> >> >>
> >> >> subroutine prin(prefix)
> >> >>
> >> >> ! Outputs ascii array if ndim = 1, else if ndim > 1 then
> >> >> ! write out hdf5 data file containing all variables (plus time).
> >> >>
> >> >> include 'pnetcdf.inc'
> >> >> include 'global.h'
> >> >> include 'sweep.h'
> >> >> include 'zone.h'
> >> >>
> >> >> ! integer(HID_T) :: hdf_file !file id for hdf file
> >> >> integer :: hdf_error !error var for hdf5 file
> >> >>
> >> >> character(LEN=1) :: char
> >> >> character(LEN=1) :: coord
> >> >> character(LEN=4) :: tmp1, tmp2
> >> >> character(LEN=5) :: prefix
> >> >> character(LEN=15) :: filename
> >> >>
> >> >> ! Added (Fortran 90 style) for hdf5 stuff
> >> >>
> >> >> ! INTEGER(HID_T) :: dsp_id ! Dataspace ID
> >> >> ! INTEGER(HID_T) :: dset_id !Dataset ID
> >> >> INTEGER, DIMENSION(3) :: dims
> >> >> INTEGER, DIMENSION(3) :: dimids
> >> >> INTEGER :: status, ncid
> >> >> INTEGER :: xDimID, yDimID, zDimID
> >> >> INTEGER :: yMaxDimID, tsDimID
> >> >> INTEGER :: density_varID, pressure_varID
> >> >> INTEGER :: XVelocity_varID, YVelocity_varID, ZVelocity_varID
> >> >> INTEGER :: XScale_varID, YScale_varID, ZScale_varID, time_varID
> >> >>
> >> >> !------------------------------------------------------------------------------
> >> >>
> >> >> ! Create filename from integer nfile (in global.h) and prefix such
> >> >> ! that filename looks like prefx.1000 where 1000 is the value of nfile
> >> >>
> >> >> write(tmp1,910) nfile
> >> >> write(tmp2,910) mype
> >> >> 910 format(i4)
> >> >> do i = 1, 4
> >> >> if ((tmp1(i:i)) .eq. ' ') tmp1(i:i) = '0'
> >> >> if ((tmp2(i:i)) .eq. ' ') tmp2(i:i) = '0'
> >> >> enddo
> >> >> filename = prefix(1:5) // '_' // tmp1(1:4) // '.' // tmp2(1:4)
> >> >> nfile = nfile + 1
> >> >>
> >> >> if (ndim .eq. 1) then
> >> >>
> >> >> ! Keep 1D output simple, just write out in ascii...
> >> >> open(unit=3,file=filename,form='formatted')
> >> >> do i = 1, imax
> >> >> write(3, 1003) zxa(i), zro(i,1,1),zpr(i,1,1), zux(i,1,1)
> >> >> enddo
> >> >> close(3)
> >> >>
> >> >> else
> >> >>
> >> >>
> >> >> ! Initialize Dimensions
> >> >> dims(1) = imax
> >> >> dims(2) = js
> >> >> if(ndim.eq.3) dims(3) = kmax
> >> >>
> >> >> status = nfmpi_create(MPI_COMM_WORLD, filename,
> >> >> & MPI_INFO_NULL, nf90_Clobber, ncid)
> >> >>
> >> >>
> >> >>
> >> >> if (status /= nf90_NoErr ) print *, nfmpi_strerror(status)
> >> >>
> >> >> ! always a necessary statement to flush output
> >> >> status = nfmpi_close(ncid)
> >> >>
> >> >> endif
> >> >>
> >> >> write(8,6000) filename, time, ncycle
> >> >>
> >> >> 6000 format('Wrote ',a10,' to disk at time =',1pe12.5,' (ncycle =',
> >> >> & i6,')')
> >> >> 1003 format(' ',4e13.5)
> >> >>
> >> >> return
> >> >> end
> >> >>
> >> >>
> >> >>
> >> >
> >> >
> >>
> >>
> >
> >
>
>
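On the header question discussed in the thread: Fortran's implicit typing is
why a compiler can accept MPI_COMM_WORLD even when no MPI header is included.
Names beginning with I through N default to INTEGER, so an undeclared
MPI_COMM_WORLD becomes an ordinary local variable rather than the constant the
MPI library expects. The sketch below is an illustration only, not code from
the thread. It assumes free-form source, uses a hypothetical helper named
show_comm, and needs no MPI library to build or run.

```fortran
! Sketch only: shows why an undeclared MPI_COMM_WORLD still compiles.
! show_comm is a hypothetical stand-in for any routine taking a communicator.
program implicit_typing_demo
  ! Deliberately no "implicit none" and no include 'mpif.h' here.
  ! Implicit typing makes MPI_COMM_WORLD a default INTEGER local variable.
  ! In a real code nothing would assign it, so its value would be
  ! indeterminate; it is assigned here only to keep the demo well defined.
  MPI_COMM_WORLD = 12345
  call show_comm(MPI_COMM_WORLD)
contains
  subroutine show_comm(comm)
    implicit none
    integer, intent(in) :: comm
    ! Whatever the caller's variable held is what the library would see.
    print *, 'value passed as communicator:', comm
  end subroutine show_comm
end program implicit_typing_demo
```

Adding "implicit none" right after the program statement should turn the
undeclared name into a compile-time error. Including 'mpif.h', as Rob suggests,
supplies the real declaration of MPI_COMM_WORLD.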