[petsc-dev] VecScatter scaling problem on KNL

Mark Adams mfadams at lbl.gov
Wed Mar 8 18:35:15 CST 2017


On Wed, Mar 8, 2017 at 7:29 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
>   Mark,
>
>    Are you getting this with PETSc 3.7.5 ?

I have built a new version and we are waiting for results. As you
can see from the error message, this run used v3.6.3.



>    Is the code valgrinded?
>
>
>> On Mar 8, 2017, at 6:27 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>
>> On Wed, Mar 8, 2017 at 4:57 PM, Richard Mills <richardtmills at gmail.com> wrote:
>>> Hi Mark,
>>>
>>> Is your application threaded?  I seem to recall having seen these "Logging
>>> event had unbalanced begin/end pairs" with threaded codes that call PETSc.
>>
>> It is OMP threaded, but it should certainly not call PETSc inside of a
>> thread loop... but this does look like something that threading could
>> cause.
>>
>>
>>>
>>> --Richard
>>>
>>> On Wed, Mar 8, 2017 at 1:33 PM, Mark Adams <mfadams at lbl.gov> wrote:
>>>>
>>>> Our code is having scaling problems on KNL (Cori), when we get up to
>>>> about 1K sockets.
>>>>
>>>> We have isolated the problem to a certain VecScatter. This code stores
>>>> the data redundantly. Scattering into the solver is just a local copy,
>>>> but scattering out requires that each process send all of its data to
>>>> every other process. It is this second one that is not scaling well.
>>>>
>>>> I wish I had more data, but this is urgent: jobs are in the queue and
>>>> this is all I have. Any recommendations for parameters that we might
>>>> test while we get more data?
>>>>
>>>> Also, we got this error with -log_view.
>>>>
>>>> I've updated their PETSc with maint and we are waiting for it to run
>>>> again. Apparently this was not on the first time step, so the code
>>>> seems to have run for a while with what looks to me like a logic bug.
>>>>
>>>> Thanks,
>>>> Mark
>>>>
>>>>
>>>> [4098]PETSC ERROR: --------------------- Error Message
>>>> --------------------------------------------------------------
>>>> [4098]PETSC ERROR: Object is in wrong state
>>>> [4098]PETSC ERROR: Logging event had unbalanced begin/end pairs
>>>> [4098]PETSC ERROR: See
>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble
>>>> shooting.
>>>> [4098]PETSC ERROR: Petsc Release Version 3.6.3, unknown
>>>> [4098]PETSC ERROR: /global/cscratch1/sd/worleyph/XGC1_KNL/xgc2 on a
>>>> v3.6.3-arch-knl-opt64-intel named nid05668 by worleyph Mon Mar  6
>>>> 11:33:19 2017
>>>> [4098]PETSC ERROR: Configure options COPTFLAGS="-g -O3 -fp-model fast
>>>> -xMIC-AVX512 -DX2_HAVE_INTEL" CXXOPTFLAGS="-g -O3 -fp-model fast
>>>> -xMIC-AVX512 -DX2_HAVE_INTEL" FOPTFLAGS="-g -O3 -fp-model fast
>>>> -xMIC-AVX512 -DX2_HAVE_INTEL" --download-metis=1 --download-parmetis=1
>>>> --with-blas-lapack-dir=/global/common/cori/software/intel/compilers_and_libraries_2017.0.098/linux/mkl
>>>> --with-cc=cc --with-cxx=cc --with-debugging=0 --with-fc=ftn
>>>> --with-mpiexec=srun --with-batch=0 --with-memalign=64 --with-64-bit-indices
>>>> --known-mpi-shared-libraries=1 PETSC_ARCH=v3.6.3-arch-knl-opt64-intel
>>>> --with-openmp=1 PETSC_DIR=/global/homes/t/tkoskela/git/petsc
>>>> [4098]PETSC ERROR: #1 PetscLogEventEndDefault() line 696 in
>>>> /global/u2/t/tkoskela/git/petsc/src/sys/logging/utils/eventlog.c
>>>> [4098]PETSC ERROR: #2 VecSet() line 577 in
>>>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/rvector.c
>>>> [4098]PETSC ERROR: #3 VecCreate_Seq() line 44 in
>>>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/impls/seq/bvec3.c
>>>> [4098]PETSC ERROR: #4 VecSetType() line 53 in
>>>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vecreg.c
>>>> [4098]PETSC ERROR: #5 VecDuplicate_Seq() line 786 in
>>>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/impls/seq/bvec2.c
>>>> [4098]PETSC ERROR: #6 VecDuplicate() line 399 in
>>>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
>>>> [4098]PETSC ERROR: #7 VecDuplicateVecs_Default() line 840 in
>>>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
>>>> [4098]PETSC ERROR: #8 VecDuplicateVecs() line 473 in
>>>> /global/u2/t/tkoskela/git/petsc/src/vec/vec/interface/vector.c
>>>
>>>
>
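
To make the scaling concern in the original message concrete, here is a
rough back-of-envelope sketch of the "scatter out" pattern, where each
process sends all of its data to every other process. It assumes the
scatter is carried out as plain point-to-point sends (the actual PETSc
implementation may differ), and the rank counts and local sizes are
hypothetical, not measurements from this run.

```python
def allgather_cost(nprocs, local_entries, bytes_per_entry=8):
    """Estimate messages and bytes when every rank sends its whole
    local block to every other rank (assumed point-to-point sends)."""
    msgs_per_rank = nprocs - 1                   # one message to each peer
    total_msgs = nprocs * msgs_per_rank          # grows as O(P^2)
    recv_bytes_per_rank = msgs_per_rank * local_entries * bytes_per_entry
    return {
        "msgs_per_rank": msgs_per_rank,
        "total_msgs": total_msgs,
        "recv_bytes_per_rank": recv_bytes_per_rank,
    }


if __name__ == "__main__":
    # Hypothetical sizes: 1000 double-precision entries owned per rank.
    for p in (64, 1024, 4096):
        cost = allgather_cost(p, local_entries=1000)
        print(p, cost["msgs_per_rank"], cost["total_msgs"],
              cost["recv_bytes_per_rank"])
```

Under these assumptions the total message count grows quadratically with
the number of ranks, while the local copy in the other direction needs no
messages at all, which would be consistent with only the second scatter
scaling poorly.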


