[petsc-users] Code sometimes works, sometimes hangs when CPU count is increased

Matthew Knepley knepley at gmail.com
Fri Dec 25 08:29:14 CST 2015


It appears that you have an uninitialized variable (or more than one). When
compiled with debugging, variables
are normally initialized to zero.
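
A quick way to smoke that out (a sketch, assuming gfortran and a
hypothetical file name; with ifort, -check uninit serves the same purpose)
is to compile with flags that poison uninitialized reals instead of
zeroing them:

    # initialize local reals to signaling NaN and trap on first use
    gfortran -O2 -g -finit-real=snan -ffpe-trap=invalid,zero,overflow \
        -Wall -Wuninitialized mycode.F90

If the optimized build then stops with a floating-point exception, an
uninitialized variable is the likely culprit.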

  Thanks,

     Matt

On Fri, Dec 25, 2015 at 5:41 AM, TAY wee-beng <zonexo at gmail.com> wrote:

> Hi,
>
> Sorry, there seem to be some problems with my valgrind. I have run it
> again with both the optimized and debug versions.
>
> Thank you.
>
> Yours sincerely,
>
> TAY wee-beng
>
> On 25/12/2015 12:42 PM, Barry Smith wrote:
>
>> On Dec 24, 2015, at 10:37 PM, TAY wee-beng <zonexo at gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> I tried valgrind with MPI, but it aborts very early with an error
>>> message regarding PETSc initialization.
>>>
>>    It shouldn't "abort"; it should print an error message and continue.
>> Please send all the output from running with valgrind.
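>>
>> A typical invocation, following the PETSc FAQ (adjust the rank count to
>> your run; -malloc off keeps PETSc's own allocator out of valgrind's way):
>>
>>     mpiexec -n 40 valgrind --tool=memcheck -q --num-callers=20 \
>>         --log-file=valgrind.log.%p ./a.out -malloc off
>>
>> The %p expands to each process id, giving one log file per rank.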
>>
>>     It is possible you are solving a large enough problem to require
>> configuring with --with-64-bit-indices. Does that resolve the problem?
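>>
>> For example, reusing the configure options from the log below with
>> 64-bit indices added:
>>
>>     ./configure --with-mpi-dir=/opt/ud/openmpi-1.8.8/ \
>>         --download-fblaslapack=1 --download-hypre=1 \
>>         --with-shared-libraries --with-fortran-interfaces=1 \
>>         --with-64-bit-indices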
>>
>>    Barry
>>
>> I retried, using a lower resolution.
>>>
>>> GAMG works, but BoomerAMG and hypre don't. Increasing the CPU count too
>>> far (to 80) also causes it to hang; 60 works fine.
>>>
>>> My grid size is 98x169x169
>>>
>>> But when I increase the resolution, GAMG stops working again.
>>>
>>> I tried increasing the number of CPUs, but it still doesn't work.
>>>
>>> Previously, using a single z-direction partition, it worked with both
>>> GAMG and hypre. So what could be the problem?
>>> Thank you.
>>>
>>> Yours sincerely,
>>>
>>> TAY wee-beng
>>>
>>> On 25/12/2015 12:33 AM, Matthew Knepley wrote:
>>>
>>>> It sounds like you have memory corruption in a different part of the
>>>> code. Run it under valgrind.
>>>>
>>>>    Matt
>>>>
>>>> On Thu, Dec 24, 2015 at 10:14 AM, TAY wee-beng <zonexo at gmail.com>
>>>> wrote:
>>>> Hi,
>>>>
>>>> I have this strange error. I converted my CFD code from a z-direction-only
>>>> partition to a yz-direction partition. The code works fine, but when I
>>>> increase the number of CPUs, strange things happen when solving the
>>>> Poisson equation.
>>>>
>>>> I increased the number of CPUs from 24 to 40.
>>>>
>>>> Sometimes it works, sometimes it doesn't. When it doesn't, it just
>>>> hangs there with no output, or it gives the error below:
>>>>
>>>> Using MPI_Barrier during debugging shows that it hangs at
>>>>
>>>> call KSPSolve(ksp,b_rhs,xx,ierr).
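>>>>
>>>> A minimal sketch of that barrier bracketing (assuming myid holds the
>>>> MPI rank and ierr is an integer):
>>>>
>>>>     call MPI_Barrier(MPI_COMM_WORLD,ierr)
>>>>     print *, 'rank', myid, ': before KSPSolve'
>>>>     call KSPSolve(ksp,b_rhs,xx,ierr)
>>>>     call MPI_Barrier(MPI_COMM_WORLD,ierr)
>>>>     print *, 'rank', myid, ': after KSPSolve'
>>>>
>>>> If every rank prints 'before' but none prints 'after', the hang is
>>>> inside KSPSolve.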
>>>>
>>>> I use hypre BoomerAMG and GAMG (-poisson_pc_gamg_agg_nsmooths 1
>>>> -poisson_pc_type gamg).
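>>>>
>>>> For those options to reach this solver, the KSP presumably carries the
>>>> "poisson_" prefix; a minimal sketch, with A_mat standing in for the
>>>> actual Poisson matrix:
>>>>
>>>>     call KSPCreate(PETSC_COMM_WORLD,ksp,ierr)
>>>>     call KSPSetOptionsPrefix(ksp,'poisson_',ierr)
>>>>     call KSPSetOperators(ksp,A_mat,A_mat,ierr)
>>>>     call KSPSetFromOptions(ksp,ierr)
>>>>
>>>> With that prefix set, -poisson_pc_type gamg selects GAMG for this KSP
>>>> only.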
>>>>
>>>>
>>>> Why is this so random? Also, how do I debug this type of problem?
>>>>
>>>>
>>>> [32]PETSC ERROR:
>>>> ------------------------------------------------------------------------
>>>> [32]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>>> probably memory access out of range
>>>> [32]PETSC ERROR: Try option -start_in_debugger or
>>>> -on_error_attach_debugger
>>>> [32]PETSC ERROR: or see
>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>> [32]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
>>>> OS X to find memory corruption errors
>>>> [32]PETSC ERROR: likely location of problem given in stack below
>>>> [32]PETSC ERROR: ---------------------  Stack Frames
>>>> ------------------------------------
>>>> [32]PETSC ERROR: Note: The EXACT line numbers in the stack are not
>>>> available,
>>>> [32]PETSC ERROR:       INSTEAD the line number of the start of the
>>>> function
>>>> [32]PETSC ERROR:       is given.
>>>> [32]PETSC ERROR: [32] HYPRE_SetupXXX line 174
>>>> /home/wtay/Codes/petsc-3.6.2/src/ksp/pc/impls/hypre/hypre.c
>>>> [32]PETSC ERROR: [32] PCSetUp_HYPRE line 122
>>>> /home/wtay/Codes/petsc-3.6.2/src/ksp/pc/impls/hypre/hypre.c
>>>> [32]PETSC ERROR: [32] PCSetUp line 945
>>>> /home/wtay/Codes/petsc-3.6.2/src/ksp/pc/interface/precon.c
>>>> [32]PETSC ERROR: [32] KSPSetUp line 247
>>>> /home/wtay/Codes/petsc-3.6.2/src/ksp/ksp/interface/itfunc.c
>>>> [32]PETSC ERROR: [32] KSPSolve line 510
>>>> /home/wtay/Codes/petsc-3.6.2/src/ksp/ksp/interface/itfunc.c
>>>> [32]PETSC ERROR: --------------------- Error Message
>>>> --------------------------------------------------------------
>>>> [32]PETSC ERROR: Signal received
>>>> [32]PETSC ERROR: See
>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>>>> shooting.
>>>> [32]PETSC ERROR: Petsc Release Version 3.6.2, Oct, 02, 2015
>>>> [32]PETSC ERROR: ./a.out on a petsc-3.6.2_shared_gnu_debug named n12-40
>>>> by wtay Thu Dec 24 17:01:51 2015
>>>> [32]PETSC ERROR: Configure options
>>>> --with-mpi-dir=/opt/ud/openmpi-1.8.8/ --download-fblaslapack=1
>>>> --with-debugging=1 --download-hypre=1
>>>> --prefix=/home/wtay/Lib/petsc-3.6.2_shared_gnu_debug --known-mpi-shared=1
>>>> --with-shared-libraries --with-fortran-interfaces=1
>>>> [32]PETSC ERROR: #1 User provided function() line 0 in  unknown file
>>>>
>>>> --------------------------------------------------------------------------
>>>> MPI_ABORT was invoked on rank 32 in communicator MPI_COMM_WORLD
>>>> with errorcode 59.
>>>>
>>>> --
>>>> Thank you.
>>>>
>>>> Yours sincerely,
>>>>
>>>> TAY wee-beng
>>>>
>>>>
>>>
>


-- 
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener