<div dir="auto">I am adding Ed D'Azevedo, Steve Abbott and Robert Hager to this email thread.<div dir="auto">I hope this issue gets resolved soon so that we can run XGC on the full scale Cori.<br><div dir="auto">Thanks.</div><div dir="auto">CS</div></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Mar 8, 2017 4:57 PM, "Richard Mills" <<a href="mailto:richardtmills@gmail.com">richardtmills@gmail.com</a>> wrote:<br type="attribution"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div><div>Hi Mark,<br><br></div>Is your application threaded? I seem to recall having seen these "Logging event had unbalanced begin/end pairs" with threaded codes that call PETSc.<br><br></div>--Richard<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Mar 8, 2017 at 1:33 PM, Mark Adams <span dir="ltr"><<a href="mailto:mfadams@lbl.gov" target="_blank">mfadams@lbl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Our code is having scaling problems on KNL (Cori), when we get up to<br>
about 1K sockets.<br>
<br>
We have isolated the problem to a certain VecScatter. This code stores<br>
the data redundantly. Scattering into the solver is just a local copy,<br>
but scattering out requires that each process send all of its data to<br>
every other process. It is this second scatter (out of the solver) that is not scaling well.<br>
<br>
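To make the scaling concern concrete, here is a rough sketch of how the traffic grows when every process sends all of its data to every other process. It assumes naive point-to-point sends; the function name, the process counts and the local vector size are hypothetical, and the actual VecScatter implementation may communicate differently.<br>
<pre>
```python
def scatter_out_cost(num_procs, local_entries, bytes_per_entry=8):
    """Estimate traffic when every process sends all its data to every other process.

    Assumes naive point-to-point sends. A real MPI library or PETSc may use a
    different algorithm, so treat these numbers as a rough estimate only.
    """
    sends_per_proc = num_procs - 1  # one message to each other process
    total_messages = num_procs * sends_per_proc
    bytes_received_per_proc = sends_per_proc * local_entries * bytes_per_entry
    total_bytes = num_procs * bytes_received_per_proc
    return {
        "total_messages": total_messages,
        "bytes_received_per_proc": bytes_received_per_proc,
        "total_bytes": total_bytes,
    }


if __name__ == "__main__":
    # Hypothetical process counts and local vector size, for illustration only.
    for p in (64, 1024, 4096):
        cost = scatter_out_cost(p, local_entries=1000)
        print(p, cost["total_messages"], cost["total_bytes"])
```
</pre>
Under these assumptions the message count grows as P(P-1), so going from 64 to 1024 processes multiplies it by roughly 260.<br>
<br>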
I wish I had more data, but this is urgent: jobs are in the queue, and<br>
this is all I have. Are there any parameters you would recommend that we<br>
test while we get more data?<br>
<br>
Also, we got this error with -log_view.<br>
<br>
I've updated their PETSc to the maint branch and we are waiting for it to run<br>
again. Apparently this error did not occur on the first time step, so the code<br>
seems to have run for a while with what looks to me like a logic bug.<br>
<br>
Thanks,<br>
Mark<br>
<br>
<br>
[4098]PETSC ERROR: --------------------- Error Message<br>
------------------------------<wbr>------------------------------<wbr>--<br>
[4098]PETSC ERROR: Object is in wrong state<br>
[4098]PETSC ERROR: Logging event had unbalanced begin/end pairs<br>
[4098]PETSC ERROR: See<br>
<a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" rel="noreferrer" target="_blank">http://www.mcs.anl.gov/petsc/d<wbr>ocumentation/faq.html</a> for trouble<br>
shooting.<br>
[4098]PETSC ERROR: Petsc Release Version 3.6.3, unknown<br>
[4098]PETSC ERROR: /global/cscratch1/sd/worleyph/<wbr>XGC1_KNL/xgc2 on a<br>
v3.6.3-arch-knl-opt64-intel named nid05668 by worleyph Mon Mar 6<br>
11:33:19 2017<br>
[4098]PETSC ERROR: Configure options COPTFLAGS="-g -O3 -fp-model fast<br>
-xMIC-AVX512 -DX2_HAVE_INTEL" CXXOPTFLAGS="-g -O3 -fp-model fast<br>
-xMIC-AVX512 -DX2_HAVE_INTEL" FOPTFLAGS="-g -O3 -fp-model fast -xMIC-AVX512<br>
-DX2_HAVE_INTEL" --download-metis=1 --download-parmetis=1<br>
--with-blas-lapack-dir=/global/common/cori/software/int<wbr>el/compilers_and_libraries_201<wbr>7.0.098/linux/mkl<br>
--with-cc=cc --with-cxx=cc --with-debugging=0 --with-fc=ftn --with-mpiexec=srun<br>
--with-batch=0 --with-memalign=64 --with-64-bit-indices<br>
--known-mpi-shared-libraries=1 PETSC_ARCH=v3.6.3-arch-knl-opt<wbr>64-intel<br>
--with-openmp=1 PETSC_DIR=/global/homes/t/tkos<wbr>kela/git/petsc<br>
[4098]PETSC ERROR: #1 PetscLogEventEndDefault() line 696 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/sys/logging/utils/eventl<wbr>og.c<br>
[4098]PETSC ERROR: #2 VecSet() line 577 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/vec/vec/interface/rvecto<wbr>r.c<br>
[4098]PETSC ERROR: #3 VecCreate_Seq() line 44 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/vec/vec/impls/seq/bvec3.<wbr>c<br>
[4098]PETSC ERROR: #4 VecSetType() line 53 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/vec/vec/interface/vecreg<wbr>.c<br>
[4098]PETSC ERROR: #5 VecDuplicate_Seq() line 786 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/vec/vec/impls/seq/bvec2.<wbr>c<br>
[4098]PETSC ERROR: #6 VecDuplicate() line 399 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/vec/vec/interface/vector<wbr>.c<br>
[4098]PETSC ERROR: #7 VecDuplicateVecs_Default() line 840 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/vec/vec/interface/vector<wbr>.c<br>
[4098]PETSC ERROR: #8 VecDuplicateVecs() line 473 in<br>
/global/u2/t/tkoskela/git/pets<wbr>c/src/vec/vec/interface/vector<wbr>.c<br>
</blockquote></div><br></div>
</blockquote></div></div>