<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
Hi,<br>
<br>
I've been debugging and have removed some errors. The code now works on
most cluster nodes but fails on 2 of them. The strange thing is that
I'm using the same GNU compiler and am only deploying to the newly
set up nodes.<br>
<br>
The newer nodes work with my old code, which is similar except
that its domain partition is only in the z direction. The new code
partitions in the y and z directions.<br>
<br>
It fails when solving the Poisson equation. Is there any way I can
find out why this is happening?<br>
<pre class="moz-signature" cols="72">Thank you.
Yours sincerely,
TAY wee-beng</pre>
<div class="moz-cite-prefix">On 25/12/2015 10:29 PM, Matthew Knepley
wrote:<br>
</div>
<blockquote
cite="mid:CAMYG4GnkmEi_85A-E+GTvt7M9Ux=Qysfzaw2Xq-4HZh4rT9tXA@mail.gmail.com"
type="cite">
<div dir="ltr">It appears that you have an uninitialized variable
(or more than one). When compiled with debugging, variables
<div>are normally initialized to zero.</div>
<div><br>
</div>
<div> Thanks,</div>
<div><br>
</div>
<div> Matt</div>
</div>
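Matt's point above can be checked directly. As a sketch (the source file name is hypothetical), recompiling with gfortran flags that poison uninitialized reals, instead of leaving them at whatever happens to be in memory, will usually expose the offending variable at its first use:

```shell
# Hypothetical recompile: poison uninitialized REALs with signaling NaN
# so the first arithmetic use traps, instead of silently reading garbage.
#   -finit-real=snan   initialize REAL variables to signaling NaN
#   -ffpe-trap=...     abort on the first invalid floating-point operation
#   -fcheck=all        run-time array-bounds and pointer checks
gfortran -g -O0 -finit-real=snan -ffpe-trap=invalid,zero,overflow \
         -fcheck=all -Wall -Wuninitialized -c solver.f90
```

With these flags the run should stop at the first statement that reads an uninitialized value, rather than behaving differently between debug and optimized builds.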
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Dec 25, 2015 at 5:41 AM, TAY
wee-beng <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>
<br>
Sorry, there seem to be some problems with my valgrind. I have
run it again with both the optimized and debug versions.
<div class="HOEnZb">
<div class="h5"><br>
<br>
<br>
<br>
<br>
Thank you.<br>
<br>
Yours sincerely,<br>
<br>
TAY wee-beng<br>
<br>
On 25/12/2015 12:42 PM, Barry Smith wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
On Dec 24, 2015, at 10:37 PM, TAY wee-beng <<a
moz-do-not-send="true"
href="mailto:zonexo@gmail.com" target="_blank"><a class="moz-txt-link-abbreviated" href="mailto:zonexo@gmail.com">zonexo@gmail.com</a></a>>
wrote:<br>
<br>
Hi,<br>
<br>
I tried valgrind with MPI but it aborts very early,
with an error message regarding PETSc initialization.<br>
</blockquote>
It shouldn't "abort"; it should print an error
message and continue. Please send all the output from
running with valgrind.<br>
<br>
It is possible you are solving a large enough
problem that it requires configuring with --with-64-bit-indices.
Does that resolve the problem?<br>
<br>
Barry<br>
<br>
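Barry's two suggestions can be sketched as follows (the rank count, executable name, and trailing configure options are illustrative, not taken from this thread):

```shell
# Run one valgrind memcheck instance per MPI rank and keep all output:
mpiexec -n 4 valgrind -q --tool=memcheck ./a.out

# If the global problem size pushes matrix indices past 2^31 - 1,
# PETSc must be reconfigured and rebuilt with 64-bit indices:
./configure --with-64-bit-indices --download-hypre=1 ...
```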
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
I retried, using a lower resolution.<br>
<br>
GAMG works, but BoomerAMG and hypre don't.
Increasing the CPU count too high (80) also causes it to hang;
60 works fine.<br>
<br>
My grid size is 98x169x169<br>
<br>
But when I increase the resolution, GAMG stops working
again.<br>
<br>
I tried to increase the CPU count but it still doesn't
work.<br>
<br>
Previously, using a single z-direction partition, it
worked with GAMG and hypre. So what could be the
problem?<br>
Thank you.<br>
<br>
Yours sincerely,<br>
<br>
TAY wee-beng<br>
<br>
On 25/12/2015 12:33 AM, Matthew Knepley wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
It sounds like you have memory corruption in a
different part of the code. Run in valgrind.<br>
<br>
Matt<br>
<br>
On Thu, Dec 24, 2015 at 10:14 AM, TAY wee-beng
<<a moz-do-not-send="true"
href="mailto:zonexo@gmail.com" target="_blank">zonexo@gmail.com</a>>
wrote:<br>
Hi,<br>
<br>
I have this strange error. I converted my CFD code
from a z-direction-only partition to a yz-direction
partition. The code works fine, but when
I increase the CPU count, strange things happen when
solving the Poisson equation.<br>
<br>
I increased the CPU count from 24 to 40.<br>
<br>
Sometimes it works, sometimes it doesn't. When it
doesn't, it just hangs there with no output, or it
gives the error below:<br>
<br>
Using MPI_Barrier calls during debugging shows that it hangs
at<br>
<br>
call KSPSolve(ksp,b_rhs,xx,ierr).<br>
<br>
I use hypre BoomerAMG and GAMG
(-poisson_pc_gamg_agg_nsmooths 1 -poisson_pc_type
gamg)<br>
<br>
<br>
Why is this so random? Also, how do I debug this
type of problem?<br>
<br>
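The debugger options that PETSc itself suggests in the error output below can be added to the launch line. A sketch, where the executable name and rank counts are assumptions:

```shell
# Attach a debugger to the failing rank instead of aborting on SIGSEGV:
mpiexec -n 40 ./a.out -poisson_pc_type gamg \
        -poisson_pc_gamg_agg_nsmooths 1 -on_error_attach_debugger

# Or start every rank in a debugger from the outset (small runs only):
mpiexec -n 2 ./a.out -start_in_debugger
```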
<br>
[32]PETSC ERROR:
------------------------------------------------------------------------<br>
[32]PETSC ERROR: Caught signal number 11 SEGV:
Segmentation Violation, probably memory access out
of range<br>
[32]PETSC ERROR: Try option -start_in_debugger or
-on_error_attach_debugger<br>
[32]PETSC ERROR: or see <a moz-do-not-send="true"
href="http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind"
rel="noreferrer" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind</a><br>
[32]PETSC ERROR: or try <a moz-do-not-send="true"
href="http://valgrind.org" rel="noreferrer"
target="_blank">http://valgrind.org</a> on
GNU/linux and Apple Mac OS X to find memory
corruption errors<br>
[32]PETSC ERROR: likely location of problem given
in stack below<br>
[32]PETSC ERROR: --------------------- Stack
Frames ------------------------------------<br>
[32]PETSC ERROR: Note: The EXACT line numbers in
the stack are not available,<br>
[32]PETSC ERROR: INSTEAD the line number of
the start of the function<br>
[32]PETSC ERROR: is given.<br>
[32]PETSC ERROR: [32] HYPRE_SetupXXX line 174
/home/wtay/Codes/petsc-3.6.2/src/ksp/pc/impls/hypre/hypre.c<br>
[32]PETSC ERROR: [32] PCSetUp_HYPRE line 122
/home/wtay/Codes/petsc-3.6.2/src/ksp/pc/impls/hypre/hypre.c<br>
[32]PETSC ERROR: [32] PCSetUp line 945
/home/wtay/Codes/petsc-3.6.2/src/ksp/pc/interface/precon.c<br>
[32]PETSC ERROR: [32] KSPSetUp line 247
/home/wtay/Codes/petsc-3.6.2/src/ksp/ksp/interface/itfunc.c<br>
[32]PETSC ERROR: [32] KSPSolve line 510
/home/wtay/Codes/petsc-3.6.2/src/ksp/ksp/interface/itfunc.c<br>
[32]PETSC ERROR: --------------------- Error
Message
--------------------------------------------------------------<br>
[32]PETSC ERROR: Signal received<br>
[32]PETSC ERROR: See <a moz-do-not-send="true"
href="http://www.mcs.anl.gov/petsc/documentation/faq.html"
rel="noreferrer" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html</a>
for trouble shooting.<br>
[32]PETSC ERROR: Petsc Release Version 3.6.2, Oct,
02, 2015<br>
[32]PETSC ERROR: ./a.out on a
petsc-3.6.2_shared_gnu_debug named n12-40 by wtay
Thu Dec 24 17:01:51 2015<br>
[32]PETSC ERROR: Configure options
--with-mpi-dir=/opt/ud/openmpi-1.8.8/
--download-fblaslapack=1 --with-debugging=1
--download-hypre=1
--prefix=/home/wtay/Lib/petsc-3.6.2_shared_gnu_debug
--known-mpi-shared=1 --with-shared-libraries
--with-fortran-interfaces=1<br>
[32]PETSC ERROR: #1 User provided function() line
0 in unknown file<br>
--------------------------------------------------------------------------<br>
MPI_ABORT was invoked on rank 32 in communicator
MPI_COMM_WORLD<br>
with errorcode 59.<br>
<br>
-- <br>
Thank you.<br>
<br>
Yours sincerely,<br>
<br>
TAY wee-beng<br>
<br>
<br>
<br>
<br>
-- <br>
What most experimenters take for granted before
they begin their experiments is infinitely more
interesting than any results to which their
experiments lead.<br>
-- Norbert Wiener<br>
</blockquote>
</blockquote>
</blockquote>
<br>
</div>
</div>
</blockquote>
</div>
<br>
<br clear="all">
<div><br>
</div>
-- <br>
<div class="gmail_signature">What most experimenters take for
granted before they begin their experiments is infinitely more
interesting than any results to which their experiments lead.<br>
-- Norbert Wiener</div>
</div>
</blockquote>
<br>
</body>
</html>