bombing out writing large scratch files
Matthew Knepley
knepley at gmail.com
Sat May 27 19:30:52 CDT 2006
What is the error? PETSc always shows an error when it cannot pop up the window.
This sounds like a problem with the batch environment being different from the
interactive node. Computer centers are the worst-run things in the world.
Matt
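
One way to narrow down a batch-versus-interactive difference is to run the
same small check from a login shell and from inside the job script, then
compare the output. The sketch below is not part of PETSc. check_display is a
hypothetical helper name, and it assumes xterm is installed on the node that
runs the script and that the argument is the same string passed to -display.

```shell
#!/bin/sh
# Hypothetical helper (not a PETSc tool): report whether an X window can be
# opened on the given display from the node where this script runs.
check_display() {
    disp="$1"
    if [ -z "$disp" ]; then
        echo "no display given"
        return 1
    fi
    # Assumption: xterm is the X client used for the test.
    if ! command -v xterm >/dev/null 2>&1; then
        echo "xterm not found in PATH"
        return 2
    fi
    # Open an xterm that runs 'true' and exits immediately.
    if xterm -display "$disp" -e true >/dev/null 2>&1; then
        echo "display $disp reachable"
        return 0
    fi
    echo "cannot open display $disp"
    return 3
}
```

Calling check_display 24.5.142.138:0.0 near the top of the command file,
before mpirun, would show whether the node running the job script can open a
window at all. It says nothing about the other compute nodes.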
On 5/27/06, Randall Mackie <randy at geosystem.us> wrote:
>
> I can't seem to get the debugger to pop up on my screen.
>
> When I'm logged into the cluster I'm working on, I can
> type xterm &, and an xterm pops up on my display. So I know
> I can get something from the remote cluster.
>
> Now, when I try this using PETSc, I'm getting the following error
> message, for example:
>
> ------------------------------------------------------------------------
> [17]PETSC ERROR: PETSC: Attaching gdb to
> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc of pid 3628 on display
> 24.5.142.138:0.0 on
> machine compute-0-23.local
> ------------------------------------------------------------------------
>
> I'm using this in my command file:
>
> source ~/.bashrc
> time /opt/mpich/intel/bin/mpirun -np 20 -nolocal -machinefile machines \
> /home/randy/d3inv/PETSC_V3.3/d3inv_3_3_petsc \
> -start_in_debugger \
> -debugger_node 1 \
> -display 24.5.142.138:0.0 \
> -em_ksp_type bcgs \
> -em_sub_pc_type ilu \
> -em_sub_pc_factor_levels 8 \
> -em_sub_pc_factor_fill 4 \
> -em_sub_pc_factor_reuse_ordering \
> -em_sub_pc_factor_reuse_fill \
> -em_sub_pc_factor_mat_ordering_type rcm \
> -divh_ksp_type cr \
> -divh_sub_pc_type icc \
> -ppc_sub_pc_type ilu \
> << EOF
> ...
>
>
> Randy
>
>
> Matthew Knepley wrote:
> > 1) Make sure ssh is forwarding X (-Y I think)
> >
> > 2) -start_in_debugger
> >
> > 3) -display <your machine>:0.0
> >
> > should do it.
> >
> > Matt
> >
> > On 5/27/06, Randall Mackie <randy at geosystem.us> wrote:
> >
> > This is a stupid question, but how do I start in the debugger if I'm
> > running on a cluster half-way around the world and I'm working on that
> > cluster via ssh?
> >
> > Randy
> >
> >
> > Matthew Knepley wrote:
> > > The best thing to do here is get a stack trace from the debugger.
> > > From the description, it is hard to tell what statement is trying to
> > > access which illegal memory.
> > >
> > > Matt
> > >
> > > On 5/27/06, Randall Mackie <randy at geosystem.us> wrote:
> > >
> > > In my PETSc based modeling code, I write out intermediate results to
> > > a scratch file, and then read them back later. This has worked fine up
> > > until today, when for a large model, this seems to be causing my
> > > program to crash with errors like:
> > >
> > >
> > > ------------------------------------------------------------------------
> > > [9]PETSC ERROR: Caught signal number 11 SEGV: Segmentation
> > > Violation, probably memory access out of range
> > >
> > >
> > > I've tracked down the offending code to:
> > >
> > > IF (rank == 0) THEN
> > >    irec=(iper-1)*2+ipol
> > >    write(7,rec=irec) (xvec(i),i=1,np)
> > > END IF
> > >
> > > It writes out xvec for the first record, but then on the second
> > > record my program is crashing.
> > >
> > > The record length (from an inquire statement) is recl 22626552
> > >
> > > The size of the scratch file when my program crashes is 98M.
> > >
> > > PETSc is compiled using the Intel compilers (v9.0 for Fortran),
> > > and the user's manual says that you can have record lengths of
> > > up to 2 billion bytes.
> > >
> > > I'm kind of stuck as to what might be the cause. Any ideas from
> > > anyone would be greatly appreciated.
> > >
> > > Randy Mackie
> > >
> > > ps. I've tried both the optimized and debugging versions of the
> > > PETSc libraries, with the same result.
> > >
> > >
> > > --
> > > Randall Mackie
> > > GSY-USA, Inc.
> > > PMB# 643
> > > 2261 Market St.,
> > > San Francisco, CA 94114-1600
> > > Tel (415) 469-8649
> > > Fax (415) 469-5044
> > >
> > > California Registered Geophysicist
> > > License No. GP 1034
> > >
> > >
> > >
> > >
> > > --
> > > "Failure has a thousand explanations. Success doesn't need one"
> > > -- Sir Alec Guinness
> >
>
--
"Failure has a thousand explanations. Success doesn't need one" -- Sir Alec
Guinness
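
For the original scratch-file question further down the thread, it may help
to compare the reported record length against the observed file size. The
sketch below only does arithmetic on the recl value quoted in the thread.
expected_bytes is a hypothetical helper name. The unit size is an assumption:
some compilers, Intel Fortran among them by default, count the record length
of unformatted direct-access files in 4-byte units rather than bytes, and
whether that applies to this build is not stated in the thread.

```shell
#!/bin/sh
# Record length reported by the inquire statement in the thread.
recl=22626552

# expected_bytes NRECORDS UNIT_BYTES
# Size in bytes of a direct-access file holding NRECORDS full records,
# assuming one record-length unit is UNIT_BYTES bytes (an assumption).
expected_bytes() {
    echo $(( $1 * recl * $2 ))
}

# Tabulate one and two records for byte units and 4-byte units.
for unit in 1 4; do
    for nrec in 1 2; do
        bytes=$(expected_bytes "$nrec" "$unit")
        echo "unit=${unit}B records=${nrec} bytes=${bytes} MiB=$(( bytes / 1048576 ))"
    done
done
```

Setting these figures next to the 98M file size reported at the crash may
hint at which unit is in effect, but it does not by itself explain the
segmentation violation. A stack trace is still the more direct route.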