[petsc-users] Code sometimes work, sometimes hang when increase cpu usage

Matthew Knepley knepley at gmail.com
Thu Dec 24 10:33:42 CST 2015


It sounds like you have memory corruption in a different part of the code.
Run the code under valgrind.
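
A minimal sketch of such a run, assuming the 40-rank job and the ./a.out
executable from the report below. The memcheck flags and -malloc off follow
the invocation suggested in the PETSc FAQ for this release series; the log
file name is only an illustrative choice, and the solver options should be
whatever you normally pass:

```shell
# Start one valgrind (memcheck) instance per MPI rank.
# %p expands to the process ID, so each rank writes its own log file.
# -malloc off disables PETSc's own malloc wrapper so that valgrind
# tracks the raw allocations directly.
mpiexec -n 40 valgrind --tool=memcheck -q --num-callers=20 \
    --log-file=valgrind.log.%p ./a.out -malloc off \
    -poisson_pc_type gamg -poisson_pc_gamg_agg_nsmooths 1
```

In the resulting logs, "Invalid read" or "Invalid write" reports that point
into your own routines and appear before the crash are the ones to look at
first.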

  Matt

On Thu, Dec 24, 2015 at 10:14 AM, TAY wee-beng <zonexo at gmail.com> wrote:

> Hi,
>
> I have a strange error. I converted my CFD code from a z-direction-only
> partition to a yz-direction partition. The code works fine, but when I
> increase the number of CPUs, strange things happen when solving the
> Poisson equation.
>
> I increased the number of CPUs from 24 to 40.
>
> Sometimes it works, sometimes it doesn't. When it doesn't, it either hangs
> with no output or gives the error shown at the end of this message.
>
> Using MPI_Barrier during debugging shows that it hangs at
>
> call KSPSolve(ksp,b_rhs,xx,ierr).
>
> I use hypre BoomerAMG and GAMG (-poisson_pc_gamg_agg_nsmooths 1
> -poisson_pc_type gamg)
>
>
> Why is this so random? Also, how do I debug this type of problem?
>
>
> [32]PETSC ERROR:
> ------------------------------------------------------------------------
> [32]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> [32]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [32]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> [32]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
> X to find memory corruption errors
> [32]PETSC ERROR: likely location of problem given in stack below
> [32]PETSC ERROR: ---------------------  Stack Frames
> ------------------------------------
> [32]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> [32]PETSC ERROR:       INSTEAD the line number of the start of the function
> [32]PETSC ERROR:       is given.
> [32]PETSC ERROR: [32] HYPRE_SetupXXX line 174
> /home/wtay/Codes/petsc-3.6.2/src/ksp/pc/impls/hypre/hypre.c
> [32]PETSC ERROR: [32] PCSetUp_HYPRE line 122
> /home/wtay/Codes/petsc-3.6.2/src/ksp/pc/impls/hypre/hypre.c
> [32]PETSC ERROR: [32] PCSetUp line 945
> /home/wtay/Codes/petsc-3.6.2/src/ksp/pc/interface/precon.c
> [32]PETSC ERROR: [32] KSPSetUp line 247
> /home/wtay/Codes/petsc-3.6.2/src/ksp/ksp/interface/itfunc.c
> [32]PETSC ERROR: [32] KSPSolve line 510
> /home/wtay/Codes/petsc-3.6.2/src/ksp/ksp/interface/itfunc.c
> [32]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> [32]PETSC ERROR: Signal received
> [32]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html
> for trouble shooting.
> [32]PETSC ERROR: Petsc Release Version 3.6.2, Oct, 02, 2015
> [32]PETSC ERROR: ./a.out on a petsc-3.6.2_shared_gnu_debug named n12-40 by
> wtay Thu Dec 24 17:01:51 2015
> [32]PETSC ERROR: Configure options --with-mpi-dir=/opt/ud/openmpi-1.8.8/
> --download-fblaslapack=1 --with-debugging=1 --download-hypre=1
> --prefix=/home/wtay/Lib/petsc-3.6.2_shared_gnu_debug --known-mpi-shared=1
> --with-shared-libraries --with-fortran-interfaces=1
> [32]PETSC ERROR: #1 User provided function() line 0 in  unknown file
> --------------------------------------------------------------------------
> MPI_ABORT was invoked on rank 32 in communicator MPI_COMM_WORLD
> with errorcode 59.
>
> --
> Thank you.
>
> Yours sincerely,
>
> TAY wee-beng
>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener