[petsc-dev] VecScatter scaling problem on KNL

Richard Mills richardtmills at gmail.com
Wed Mar 8 15:57:24 CST 2017


Hi Mark,

Is your application threaded?  I seem to recall having seen these "Logging
event had unbalanced begin/end pairs" errors with threaded codes that call
PETSc.

--Richard
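
To illustrate how threading could produce this error, here is a minimal
sketch. It assumes the logger keeps a per-event nesting depth that is updated
without synchronization. The class and function names are hypothetical and
this is a toy model, not PETSc's logging code. The two "threads" are
interleaved by hand so the outcome is deterministic.

```python
class EventDepth:
    """Toy per-event nesting counter (hypothetical, not PETSc code)."""

    def __init__(self):
        self.depth = 0

    def end(self):
        # Leaving the event lowers the nesting depth; going below zero
        # means more ends than recorded begins.
        self.depth -= 1
        if self.depth < 0:
            raise RuntimeError("Logging event had unbalanced begin/end pairs")


def racy_begins(counter):
    """Two threads run a non-atomic `depth += 1`, interleaved by hand."""
    seen_by_a = counter.depth      # thread A reads 0
    seen_by_b = counter.depth      # thread B reads 0 before A writes
    counter.depth = seen_by_a + 1  # thread A writes 1
    counter.depth = seen_by_b + 1  # thread B also writes 1: one begin is lost


def run_demo():
    counter = EventDepth()
    racy_begins(counter)   # two begins happened, but depth is only 1
    counter.end()          # thread A ends: depth 1 -> 0
    try:
        counter.end()      # thread B ends: depth 0 -> -1, reported as an error
    except RuntimeError as err:
        return str(err)
    return "balanced"
```

Under that assumption, the begin and end calls in the application are correctly
paired in every thread, yet the shared counter still ends up unbalanced.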

On Wed, Mar 8, 2017 at 1:33 PM, Mark Adams <mfadams at lbl.gov> wrote:

> Our code is having scaling problems on KNL (Cori), when we get up to
> about 1K sockets.
>
> We have isolated the problem to a certain VecScatter. This code stores
> the data redundantly. Scattering into the solver is just a local copy,
> but scattering out requires that each process send all of its data to
> every other process. It is this second one that is not scaling well.
>
> I wish I had more data, but this is urgent and jobs are in the queue,
> so this is all I have. Can you recommend any parameters that we might
> test while we get more data?
>
> Also, we got this error with -log_view.
>
> I've updated their PETSc to maint and we are waiting for it to run
> again. Apparently the error did not occur on the first time step, so
> the code seems to have run for a while with what looks to me like a
> logic bug.
>
> Thanks,
> Mark
>
>
> [4098]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [4098]PETSC ERROR: Object is in wrong state
> [4098]PETSC ERROR: Logging event had unbalanced begin/end pairs
> [4098]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [4098]PETSC ERROR: Petsc Release Version 3.6.3, unknown
> [4098]PETSC ERROR: /global/cscratch1/sd/worleyph/XGC1_KNL/xgc2 on a
> v3.6.3-arch-knl-opt64-intel named nid05668 by worleyph Mon Mar  6
> 11:33:19 2017
> [4098]PETSC ERROR: Configure options COPTFLAGS="-g -O3 -fp-model fast
> -xMIC-AVX512 -DX2_HAVE_INTEL" CXXOPTFLAGS="-g -O3 -fp-model fast
> -xMIC-AVX512 -DX2_HAVE_INTEL" FOPTFLAGS="-g -O3 -fp-model fast
> -xMIC-AVX512 -DX2_HAVE_INTEL" --download-metis=1 --download-parmetis=1
> --with-blas-lapack-dir=/global/common/cori/software/intel/compilers_and_libraries_2017.0.098/linux/mkl
> --with-cc=cc --with-cxx=cc --with-debugging=0 --with-fc=ftn
> --with-mpiexec=srun --with-batch=0 --with-memalign=64
> --with-64-bit-indices --known-mpi-shared-libraries=1
> PETSC_ARCH=v3.6.3-arch-knl-opt64-intel --with-openmp=1
> PETSC_DIR=/global/homes/t/tkoskela/git/petsc
> [4098]PETSC ERROR: #1 PetscLogEventEndDefault() line 696 in
> /global/u2/t/tkoskela/git/petsc/src/sys/logging/utils/eventlog.c
> [4098]PETSC ERROR: #2 VecSet() line 577 in
> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/rvector.c
> [4098]PETSC ERROR: #3 VecCreate_Seq() line 44 in
> /global/u2/t/tkoskela/git/petsc/src/vec/vec/impls/seq/bvec3.c
> [4098]PETSC ERROR: #4 VecSetType() line 53 in
> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vecreg.c
> [4098]PETSC ERROR: #5 VecDuplicate_Seq() line 786 in
> /global/u2/t/tkoskela/git/petsc/src/vec/vec/impls/seq/bvec2.c
> [4098]PETSC ERROR: #6 VecDuplicate() line 399 in
> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
> [4098]PETSC ERROR: #7 VecDuplicateVecs_Default() line 840 in
> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
> [4098]PETSC ERROR: #8 VecDuplicateVecs() line 473 in
> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
>
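
The scatter-out pattern described in the quoted message, where each process
sends all of its data to every other process, can be sized with a rough
sketch. The function name, the 8-byte entry size, and the example process
counts below are assumptions for illustration only; nothing here is measured
data from the application.

```python
def scatter_out_cost(num_procs, local_entries, bytes_per_entry=8):
    """Estimate the cost of an all-to-all redundant scatter.

    Every process sends its full local block to each of the other
    processes. Returns (total point-to-point messages,
    bytes received by each process).
    """
    messages_per_proc = num_procs - 1
    total_messages = num_procs * messages_per_proc
    bytes_received_per_proc = (
        messages_per_proc * local_entries * bytes_per_entry
    )
    return total_messages, bytes_received_per_proc


# Hypothetical process counts and local size, for comparison only.
for procs in (1024, 4096):
    msgs, recv_bytes = scatter_out_cost(procs, local_entries=1000)
    print(procs, msgs, recv_bytes)
```

The total message count grows as P(P-1), so a fourfold increase in the number
of processes gives roughly sixteen times as many messages, while the scatter
into the solver stays a local copy.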