[petsc-dev] VecScatter scaling problem on KNL
Choong-Seock Chang
cschang at pppl.gov
Wed Mar 8 16:08:19 CST 2017
I am adding Ed D'Azevedo, Steve Abbott and Robert Hager to this email
thread.
I hope this issue gets resolved soon so that we can run XGC at full scale
on Cori.
Thanks.
CS
On Mar 8, 2017 4:57 PM, "Richard Mills" <richardtmills at gmail.com> wrote:
> Hi Mark,
>
> Is your application threaded? I seem to recall having seen these "Logging
> event had unbalanced begin/end pairs" with threaded codes that call PETSc.
>
> --Richard
>
> On Wed, Mar 8, 2017 at 1:33 PM, Mark Adams <mfadams at lbl.gov> wrote:
>
>> Our code is having scaling problems on KNL (Cori), when we get up to
>> about 1K sockets.
>>
>> We have isolated the problem to a certain VecScatter. This code stores
>> the data redundantly. Scattering into the solver is just a local copy,
>> but scattering out requires that each process send all of its data to
>> every other process. It is this second one that is not scaling well.
>>
>> I wish I had more data, but this is urgent: jobs are in the queue and
>> this is all I have. Can you recommend any parameters that we might
>> test while we gather more data?
>>
>> Also, we got this error with -log_view.
>>
>> I've updated their PETSc with maint and we are waiting for it to run
>> again. Apparently this was not on the first time step, so the code
>> seems to have run for a while with what looks to me like a logic bug.
>>
>> Thanks,
>> Mark
>>
>>
>> [4098]PETSC ERROR: --------------------- Error Message
>> --------------------------------------------------------------
>> [4098]PETSC ERROR: Object is in wrong state
>> [4098]PETSC ERROR: Logging event had unbalanced begin/end pairs
>> [4098]PETSC ERROR: See
>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>> shooting.
>> [4098]PETSC ERROR: Petsc Release Version 3.6.3, unknown
>> [4098]PETSC ERROR: /global/cscratch1/sd/worleyph/XGC1_KNL/xgc2 on a
>> v3.6.3-arch-knl-opt64-intel named nid05668 by worleyph Mon Mar 6
>> 11:33:19 2017
>> [4098]PETSC ERROR: Configure options COPTFLAGS="-g -O3 -fp-model fast
>> -xMIC-AVX512 -DX2_HAVE_INTEL" CXXOPTFLAGS="-g -O3 -fp-model fast
>> -xMIC-AVX512 -DX2_HAVE_INTEL" FOPTFLAGS="-g -O3 -fp-model fast
>> -xMIC-AVX512 -DX2_HAVE_INTEL" --download-metis=1 --download-parmetis=1
>> --with-blas-lapack-dir=/global/common/cori/software/intel/compilers_and_libraries_2017.0.098/linux/mkl
>> --with-cc=cc --with-cxx=cc --with-debugging=0 --with-fc=ftn
>> --with-mpiexec=srun --with-batch=0 --with-memalign=64
>> --with-64-bit-indices --known-mpi-shared-libraries=1
>> PETSC_ARCH=v3.6.3-arch-knl-opt64-intel --with-openmp=1
>> PETSC_DIR=/global/homes/t/tkoskela/git/petsc
>> [4098]PETSC ERROR: #1 PetscLogEventEndDefault() line 696 in
>> /global/u2/t/tkoskela/git/petsc/src/sys/logging/utils/eventlog.c
>> [4098]PETSC ERROR: #2 VecSet() line 577 in
>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/rvector.c
>> [4098]PETSC ERROR: #3 VecCreate_Seq() line 44 in
>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/impls/seq/bvec3.c
>> [4098]PETSC ERROR: #4 VecSetType() line 53 in
>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vecreg.c
>> [4098]PETSC ERROR: #5 VecDuplicate_Seq() line 786 in
>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/impls/seq/bvec2.c
>> [4098]PETSC ERROR: #6 VecDuplicate() line 399 in
>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
>> [4098]PETSC ERROR: #7 VecDuplicateVecs_Default() line 840 in
>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
>> [4098]PETSC ERROR: #8 VecDuplicateVecs() line 473 in
>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
>>
>
>
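[Editorial sketch, not part of the original thread.] The scatter-out
pattern Mark describes, where each process sends all of its data to every
other process, can be sized with a rough counting model. The sketch below
is only an illustration under stated assumptions: the function name
scatter_out_cost is hypothetical, the global vector is assumed to be split
evenly across ranks, and every exchange is counted as a pairwise message.
It does not describe how PETSc's VecScatter is actually implemented, and
the thread does not say how many MPI ranks "1K sockets" corresponds to.

```python
def scatter_out_cost(num_procs: int, global_len: int) -> dict:
    """Rough cost model for an all-to-all "scatter out".

    Assumptions (not from the thread): the global vector of length
    global_len is divided evenly over num_procs ranks, and each rank
    sends its whole local piece to every other rank as one message.
    """
    if num_procs < 1 or global_len % num_procs != 0:
        raise ValueError("global_len must divide evenly over num_procs")

    local_len = global_len // num_procs  # entries owned by each rank
    peers = num_procs - 1                # every rank except itself

    return {
        "messages_per_rank": peers,
        "entries_sent_per_rank": peers * local_len,
        # Each rank ends up with the full vector, so it receives
        # everything it does not already own.
        "entries_received_per_rank": global_len - local_len,
        "total_messages": num_procs * peers,
        "total_entries_moved": num_procs * peers * local_len,
    }


if __name__ == "__main__":
    # Hypothetical sizes chosen only to show the trend.
    for p in (4, 64, 1024):
        cost = scatter_out_cost(p, global_len=1024 * 100)
        print(p, cost["messages_per_rank"], cost["total_messages"])
```

Under these assumptions the number of messages per rank grows linearly
with the rank count and the total grows quadratically, while the data each
rank receives stays close to the full vector length. The scatter into the
solver, being a local copy, involves no messages at all in this model.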