[petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts
Mark Lohry
mlohry at gmail.com
Sat Aug 10 13:56:41 CDT 2019
Thanks Barry, been trying all of the above. I think I've homed in on an
out-of-memory condition and/or an integer overflow inside MatColoringApply,
which makes some sense since I only have a sequential coloring algorithm
working...
Is anyone out there using coloring in parallel? I still have the same
previously mentioned issues with MATCOLORINGJP (on small problems it takes
upwards of 30 minutes to run), which as far as I can see is the only
"parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on
smaller problems, MATCOLORINGGREEDY works on smaller problems if and
only if I set the weight type to MAT_COLORING_WEIGHT_LEXICAL, and all three
are failing on larger problems.
On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:
>
> There is also
>
> $ ./configure --help | grep color
> --with-is-color-value-type=<char,short>
> char, short can store 256, 65536 colors current: short
>
> I can't imagine you have over 65k colors, but it's something to check
>
>
> > On Aug 6, 2019, at 8:19 AM, Mark Lohry <mlohry at gmail.com> wrote:
> >
> > My first guess is that the code is getting integer overflow somewhere.
> 25 billion is well over the 2 billion that 32 bit integers can hold.
> >
> > Mine as well -- though in later tests I have the same issue when using
> --with-64-bit-indices. Ironically I had removed that flag at some point
> because the coloring / index set was using a serious chunk of total memory
> on medium sized problems.
>
> Understood
>
> >
> > Questions on the petsc internals there though: Are matrices indexed with
> two integers (i,j) so the max matrix dimension is (int limit) x (int limit)
> or a single integer so the max dimension is sqrt(int limit)?
> > Also I was operating under the assumption the 32 bit limit should only
> constrain per-process problem sizes (25B over 400 processes giving 62M
> non-zeros per process), is that not right?
>
> It is mostly right, but may not be right for everything in PETSc. For
> example, I don't know about the MatFD code
>
> Since using a debugger is not practical for large core counts, to find
> the point where the two processes diverge you can try
>
> -log_trace
>
> or
>
> -log_trace filename
>
> In the second case it will generate one file per core called filename.%d.
> Note it will produce a lot of output
>
> Good luck
>
>
>
> >
> > We are adding more tests to nicely handle integer overflow but it is
> not easy since it can occur in so many places
> >
> > Totally understood. I know the pain of only finding an overflow bug
> after days of waiting in a cluster queue for a big job.
> >
> > We urge you to upgrade.
> >
> > I'll do that today and hope for the best. On first tests on 3.11.3, I
> still have a couple issues with the coloring code:
> >
> > * I am still getting the nasty hangs with MATCOLORINGJP mentioned here:
> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html
> > * MatColoringSetType(coloring, MATCOLORINGGREEDY); this produces a
> wrong jacobian unless I also set MatColoringSetWeightType(coloring,
> MAT_COLORING_WEIGHT_LEXICAL);
> > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist.
> >
> > Thanks,
> > Mark
> >
> > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. <bsmith at mcs.anl.gov>
> wrote:
> >
> > My first guess is that the code is getting integer overflow
> somewhere. 25 billion is well over the 2 billion that 32 bit integers can
> hold.
> >
> > We urge you to upgrade.
> >
> > Regardless, for problems this large you likely need the ./configure
> option --with-64-bit-indices
> >
> > We are adding more tests to nicely handle integer overflow but it is
> not easy since it can occur in so many places
> >
> > Hopefully this will resolve your problem with large process counts
> >
> > Barry
> >
> >
> > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> > >
> > > I'm running some larger cases than I have previously with a working
> code, and I'm running into failures I don't see on smaller cases. Failures
> are on 400 cores, ~100M unknowns, 25B non-zero jacobian entries. Runs
> successfully on half size case on 200 cores.
> > >
> > > 1) The first error output from petsc is "MPI_Allreduce() called in
> different locations". Is this a red herring, suggesting some process failed
> prior to this and processes have diverged?
> > >
> > > 2) I don't think I'm running out of memory -- globally at least. Slurm
> output shows e.g.
> > > Memory Utilized: 459.15 GB (estimated maximum)
> > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node)
> > > I did try with and without --with-64-bit-indices.
> > >
> > > 3) The debug traces seem to vary, see below. I *think* the failure
> might be happening in the vicinity of a Coloring call. I'm using
> MatFDColoring like so:
> > >
> > > ISColoring iscoloring;
> > > MatFDColoring fdcoloring;
> > > MatColoring coloring;
> > >
> > > MatColoringCreate(ctx.JPre, &coloring);
> > > MatColoringSetType(coloring, MATCOLORINGGREEDY);
> > >
> > > // convergence stalls badly without this on small cases, don't know
> why
> > > MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL);
> > >
> > > // none of these worked.
> > > // MatColoringSetType(coloring, MATCOLORINGJP);
> > > // MatColoringSetType(coloring, MATCOLORINGSL);
> > > // MatColoringSetType(coloring, MATCOLORINGID);
> > > MatColoringSetFromOptions(coloring);
> > >
> > > MatColoringApply(coloring, &iscoloring);
> > > MatColoringDestroy(&coloring);
> > > MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring);
> > >
> > > I have had issues in the past with getting a functional coloring setup
> for finite difference jacobians, and the above is the only configuration
> I've managed to get working successfully. Have there been any significant
> development changes to that area of code since v3.8.3? I'll try upgrading
> in the meantime and hope for the best.
> > >
> > >
> > >
> > > Any ideas?
> > >
> > >
> > > Thanks,
> > > Mark
> > >
> > >
> > > *************************************
> > >
> > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429773.out
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations
> (functions) on different processors
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by
> mlohry Tue Aug 6 06:05:02 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun --with-64-bit-indices
> > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Invalid argument
> > > [0]PETSC ERROR: Enum value must be same on all processes, argument # 2
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by
> mlohry Tue Aug 6 06:05:02 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun --with-64-bit-indices
> > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or
> the batch system) has told this process to end
> > > [0]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> > > [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
> OS X to find memory corruption errors
> > > [0]PETSC ERROR: likely location of problem given in stack below
> > > [0]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> > > [0]PETSC ERROR: INSTEAD the line number of the start of the
> function
> > > [0]PETSC ERROR: is given.
> > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130
> /home/mlohry/build/external/petsc/src/sys/objects/tagm.c
> > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34
> /home/mlohry/build/external/petsc/src/sys/objects/inherit.c
> > > [0]PETSC ERROR: [0] DMCreate line 36
> /home/mlohry/build/external/petsc/src/dm/interface/dm.c
> > > [0]PETSC ERROR: [0] DMShellCreate line 983
> /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c
> > > [0]PETSC ERROR: [0] TSGetDM line 5287
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: [0] TSSetIFunction line 1310
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Signal received
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by
> mlohry Tue Aug 6 06:05:02 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun --with-64-bit-indices
> > > [0]PETSC ERROR: #4 User provided function() line 0 in unknown file
> > >
> > >
> > > *************************************
> > >
> > >
> > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429158.out
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by
> mlohry Mon Aug 5 23:58:19 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by
> mlohry Mon Aug 5 23:58:19 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c
> > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> > > [0]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> > > [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
> OS X to find memory corruption errors
> > > [0]PETSC ERROR: likely location of problem given in stack below
> > > [0]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> > > [0]PETSC ERROR: INSTEAD the line number of the start of the
> function
> > > [0]PETSC ERROR: is given.
> > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497
> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c
> > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208
> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c
> > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559
> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c
> > > [0]PETSC ERROR: [0] MatColoringApply line 357
> /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c
> > > [0]PETSC ERROR: [0] VecSetSizes line 1308
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c
> > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Signal received
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by
> mlohry Mon Aug 5 23:58:19 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #10 User provided function() line 0 in unknown file
> > >
> > >
> > >
> > > *************************
> > >
> > >
> > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429134.out
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by
> mlohry Mon Aug 5 23:24:23 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in
> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c
> > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in
> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c
> > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in
> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c
> > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > [0]PETSC ERROR: #6 VecSetType() line 51 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c
> > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Object is in wrong state
> > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by
> mlohry Mon Aug 5 23:24:23 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > >
> > >
> > >
> > > **************************************
> > >
> > >
> > >
> > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]" slurm-3429102.out
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug 5 22:50:12 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug 5 22:50:12 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug 5 22:50:12 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
> probably memory access out of range
> > > [0]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> > > [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac
> OS X to find memory corruption errors
> > > [0]PETSC ERROR: likely location of problem given in stack below
> > > [0]PETSC ERROR: --------------------- Stack Frames
> ------------------------------------
> > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> > > [0]PETSC ERROR: INSTEAD the line number of the start of the
> function
> > > [0]PETSC ERROR: is given.
> > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > [0]PETSC ERROR: Signal received
> > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug 5 22:50:12 2019
> > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > [0]PETSC ERROR: #11 User provided function() line 0 in unknown file
> > >
> > >
> > >
> >
>
>