[petsc-users] Sporadic MPI_Allreduce() called in different locations on larger core counts

Mark Lohry mlohry at gmail.com
Sun Aug 11 08:49:22 CDT 2019


Hi Barry, I made a minimal example comparing the colorings on a very small
case. You'll need to extract jacobian_sparsity.tgz to run it.

https://github.com/mlohry/petsc_miscellany

This is a sparse block system with 50x50 blocks, ~7,680 blocks in total.
Comparing the coloring types sl, lf, jp, id, and greedy, I get these
wall-clock timings running with -np 16:

SL: 1.5s
LF: 1.3s
JP: 29s !
ID: 1.4s
greedy: 2s
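
(For reference, all of these are selected at runtime through the standard
-mat_coloring_type option, roughly like the following, with ./my_driver as
a stand-in for the test executable in the repository above:

    mpiexec -np 16 ./my_driver -mat_coloring_type jp -log_view

and likewise for sl, lf, id, and greedy.)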

As far as I'm aware, JP is the only parallel coloring implemented? It
looks as though I'm simply running out of memory with the sequential
methods (I should apologize to my cluster admin for chewing up 10TB and
crashing...).

On this small problem JP takes ~30 seconds wallclock, but that time grows
exponentially with larger problems (the last time I tried it, I killed the
job after 24 hours of spinning).

Also, as I mentioned, the "greedy" method appears to produce an invalid
coloring for me unless I also specify the "lexical" weight type, yet
"-mat_coloring_test" doesn't complain. I'll have to make a different
example to actually show it's an invalid coloring; a rough sketch of the
check I have in mind is below.
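
Something like this, say, where J_fd is the Jacobian assembled through the
coloring under test and J_ref is a trusted reference (analytic or dense
finite difference) -- both hypothetical stand-ins here:

    /* entrywise difference of the two Jacobians: Jdiff = J_fd - J_ref */
    Mat       Jdiff;
    PetscReal nrm;
    MatDuplicate(J_fd, MAT_COPY_VALUES, &Jdiff);
    MatAXPY(Jdiff, -1.0, J_ref, DIFFERENT_NONZERO_PATTERN);
    MatNorm(Jdiff, NORM_FROBENIUS, &nrm);
    /* a large nrm would demonstrate the invalid coloring */
    MatDestroy(&Jdiff);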

Thanks,
Mark



On Sat, Aug 10, 2019 at 4:38 PM Smith, Barry F. <bsmith at mcs.anl.gov> wrote:

>
>   Mark,
>
>    Would you be able to cook up an example (or examples) that
> demonstrates the problem (or problems), and how to run it? If you send it
> to us and we can reproduce the problem, then we'll fix it. If need be, you
> can send large matrices to petsc-maint at mcs.anl.gov; don't send them to
> petsc-users, since it will reject large files.
>
>    Barry
>
>
> > On Aug 10, 2019, at 1:56 PM, Mark Lohry <mlohry at gmail.com> wrote:
> >
> > Thanks Barry, been trying all of the above. I think I've homed in on an
> out-of-memory condition and/or integer overflow inside MatColoringApply,
> which makes some sense since I only have a sequential coloring algorithm
> working...
> >
> > Is anyone out there using coloring in parallel? I still have the same
> previously mentioned issues with MATCOLORINGJP (on small problems it takes
> upwards of 30 minutes to run), which as far as I can see is the only
> "parallel" implementation. MATCOLORINGSL and MATCOLORINGID both work on
> moderately large problems, MATCOLORINGGREEDY works on moderately large
> problems if and only if I set the weight type to
> MAT_COLORING_WEIGHT_LEXICAL, and all 3 fail on larger problems.
> >
> > On Tue, Aug 6, 2019 at 9:36 AM Smith, Barry F. <bsmith at mcs.anl.gov>
> wrote:
> >
> >   There is also
> >
> > $ ./configure --help | grep color
> >   --with-is-color-value-type=<char,short>
> >        char, short can store 256, 65536 colors  current: short
> >
> > I can't imagine you have over 65k colors, but it's something to check.
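> >
> > A quick way to check how many colors a coloring actually produced (a
> > sketch, assuming the usual apply sequence in your code):
> >
> >   MatColoringApply(coloring, &iscoloring);
> >   ISColoringView(iscoloring, PETSC_VIEWER_STDOUT_WORLD);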
> >
> >
> > > On Aug 6, 2019, at 8:19 AM, Mark Lohry <mlohry at gmail.com> wrote:
> > >
> > > My first guess is that the code is getting integer overflow somewhere.
> 25 billion is well over the 2 billion that 32 bit integers can hold.
> > >
> > Mine as well -- though in later tests I have the same issue when using
> --with-64-bit-indices. Ironically, I had removed that flag at some point
> because the coloring / index set was using a serious chunk of total memory
> on medium-sized problems.
> >
> >   Understood
> >
> > >
> > > Questions on the petsc internals there though: are matrices indexed
> with two integers (i,j), so the max matrix dimension is (int limit) x (int
> limit), or with a single integer, so the max dimension is sqrt(int limit)?
> > > Also, I was operating under the assumption that the 32-bit limit should
> only constrain per-process problem sizes (25B over 400 processes giving
> ~62M non-zeros per process); is that not right?
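> > >
> > > (Back-of-envelope under that assumption, taking AIJ storage at an
> 8-byte scalar plus a 4-byte column index per entry: 25B non-zeros is
> roughly 300 GB globally, or ~750 MB per rank across 400 processes.)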
> >
> >    It is mostly right, but may not be right for everything in PETSc.
> For example, I don't know about the MatFD code.
> >
> >    Since using a debugger is not practical at large core counts to find
> the point where the processes diverge, you can try
> >
> > -log_trace
> >
> > or
> >
> > -log_trace filename
> >
> > In the second case it will generate one file per core called
> filename.%d; note it will produce a lot of output.
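> >
> > For example, with -log_trace trace, each rank writes trace.0, trace.1,
> > and so on, and comparing the tails of those files should show where the
> > ranks diverged:
> >
> > $ tail -n 2 trace.0 trace.1 trace.2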
> >
> >   Good luck
> >
> >
> >
> > >
> > >    We are adding more tests to nicely handle integer overflow but it
> is not easy since it can occur in so many places
> > >
> > > Totally understood. I know the pain of only finding an overflow bug
> after days of waiting in a cluster queue for a big job.
> > >
> > > We urge you to upgrade.
> > >
> > > I'll do that today and hope for the best. On first tests with 3.11.3,
> I still have a couple of issues with the coloring code:
> > >
> > > * I am still getting the nasty hangs with MATCOLORINGJP mentioned
> here:
> https://lists.mcs.anl.gov/mailman/htdig/petsc-users/2017-October/033746.html
> > > * MatColoringSetType(coloring, MATCOLORINGGREEDY); produces a wrong
> Jacobian unless I also set MatColoringSetWeightType(coloring,
> MAT_COLORING_WEIGHT_LEXICAL);
> > > * MATCOLORINGMIS mentioned in the documentation doesn't seem to exist.
> > >
> > > Thanks,
> > > Mark
> > >
> > > On Tue, Aug 6, 2019 at 8:56 AM Smith, Barry F. <bsmith at mcs.anl.gov>
> wrote:
> > >
> > >    My first guess is that the code is getting integer overflow
> somewhere. 25 billion is well over the 2 billion that 32 bit integers can
> hold.
> > >
> > >    We urge you to upgrade.
> > >
> > >    Regardless, for problems this large you likely need the ./configure
> option --with-64-bit-indices
> > >
> > >    We are adding more tests to nicely handle integer overflow but it
> is not easy since it can occur in so many places
> > >
> > >    Hopefully this will resolve your problem with large process counts
> > >
> > >    Barry
> > >
> > >
> > > > On Aug 6, 2019, at 7:43 AM, Mark Lohry via petsc-users <
> petsc-users at mcs.anl.gov> wrote:
> > > >
> > > > I'm running some larger cases than I have previously with a working
> code, and I'm running into failures I don't see on smaller cases. Failures
> are on 400 cores, ~100M unknowns, 25B non-zero Jacobian entries. The same
> code runs successfully on a half-size case on 200 cores.
> > > >
> > > > 1) The first error output from petsc is "MPI_Allreduce() called in
> different locations". Is this a red herring, suggesting some process failed
> prior to this and processes have diverged?
> > > >
> > > > 2) I don't think I'm running out of memory -- globally at least.
> Slurm output shows e.g.
> > > > Memory Utilized: 459.15 GB (estimated maximum)
> > > > Memory Efficiency: 26.12% of 1.72 TB (175.78 GB/node)
> > > > I did try with and without --with-64-bit-indices.
> > > >
> > > > 3) The debug traces seem to vary; see below. I *think* the failure
> might be happening in the vicinity of a coloring call. I'm using
> MatFDColoring like so:
> > > >
> > > >     ISColoring    iscoloring;
> > > >     MatFDColoring fdcoloring;
> > > >     MatColoring   coloring;
> > > >
> > > >     MatColoringCreate(ctx.JPre, &coloring);
> > > >     MatColoringSetType(coloring, MATCOLORINGGREEDY);
> > > >
> > > >     // convergence stalls badly without this on small cases; unsure why
> > > >     MatColoringSetWeightType(coloring, MAT_COLORING_WEIGHT_LEXICAL);
> > > >
> > > >     // none of these worked:
> > > >     // MatColoringSetType(coloring, MATCOLORINGJP);
> > > >     // MatColoringSetType(coloring, MATCOLORINGSL);
> > > >     // MatColoringSetType(coloring, MATCOLORINGID);
> > > >     MatColoringSetFromOptions(coloring);
> > > >
> > > >     MatColoringApply(coloring, &iscoloring);
> > > >     MatColoringDestroy(&coloring);
> > > >     MatFDColoringCreate(ctx.JPre, iscoloring, &fdcoloring);
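> > > >
> > > >     // The rest of my FD setup, roughly sketched -- MyRHSFunction and
> > > >     // ctx stand in for my actual residual callback and its context:
> > > >     MatFDColoringSetFunction(
> > > >         fdcoloring, (PetscErrorCode(*)(void))MyRHSFunction, &ctx);
> > > >     MatFDColoringSetFromOptions(fdcoloring);
> > > >     MatFDColoringSetUp(ctx.JPre, iscoloring, fdcoloring);
> > > >     ISColoringDestroy(&iscoloring);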
> > > >
> > > > I have had issues in the past with getting a functional coloring
> setup for finite-difference Jacobians, and the above is the only
> configuration I've managed to get working successfully. Have there been any
> significant development changes to that area of code since v3.8.3? I'll try
> upgrading in the meantime and hope for the best.
> > > >
> > > >
> > > >
> > > > Any ideas?
> > > >
> > > >
> > > > Thanks,
> > > > Mark
> > > >
> > > >
> > > > *************************************
> > > >
> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]"
> slurm-3429773.out
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations
> (functions) on different processors
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by
> mlohry Tue Aug  6 06:05:02 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun --with-64-bit-indices
> > > > [0]PETSC ERROR: #1 TSSetMaxSteps() line 2944 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: #2 TSSetMaxSteps() line 2944 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Invalid argument
> > > > [0]PETSC ERROR: Enum value must be same on all processes, argument #
> 2
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by
> mlohry Tue Aug  6 06:05:02 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun --with-64-bit-indices
> > > > [0]PETSC ERROR: #3 TSSetExactFinalTime() line 2250 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > > > [0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or
> the batch system) has told this process to end
> > > > [0]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> > > > [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple
> Mac OS X to find memory corruption errors
> > > > [0]PETSC ERROR: likely location of problem given in stack below
> > > > [0]PETSC ERROR: ---------------------  Stack Frames
> ------------------------------------
> > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> > > > [0]PETSC ERROR:       INSTEAD the line number of the start of the
> function
> > > > [0]PETSC ERROR:       is given.
> > > > [0]PETSC ERROR: [0] PetscCommDuplicate line 130
> /home/mlohry/build/external/petsc/src/sys/objects/tagm.c
> > > > [0]PETSC ERROR: [0] PetscHeaderCreate_Private line 34
> /home/mlohry/build/external/petsc/src/sys/objects/inherit.c
> > > > [0]PETSC ERROR: [0] DMCreate line 36
> /home/mlohry/build/external/petsc/src/dm/interface/dm.c
> > > > [0]PETSC ERROR: [0] DMShellCreate line 983
> /home/mlohry/build/external/petsc/src/dm/impls/shell/dmshell.c
> > > > [0]PETSC ERROR: [0] TSGetDM line 5287
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: [0] TSSetIFunction line 1310
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: [0] TSSetMaxSteps line 2942
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Signal received
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n19 by
> mlohry Tue Aug  6 06:05:02 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun --with-64-bit-indices
> > > > [0]PETSC ERROR: #4 User provided function() line 0 in  unknown file
> > > >
> > > >
> > > > *************************************
> > > >
> > > >
> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]"
> slurm-3429158.out
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by
> mlohry Mon Aug  5 23:58:19 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #1 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: #2 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: #3 MatSetBlockSize() line 7170 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by
> mlohry Mon Aug  5 23:58:19 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #4 VecSetSizes() line 1310 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > > > [0]PETSC ERROR: #5 VecSetSizes() line 1310 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > > > [0]PETSC ERROR: #6 VecCreateMPIWithArray() line 609 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > > [0]PETSC ERROR: #7 MatSetUpMultiply_MPIAIJ() line 111 in
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c
> > > > [0]PETSC ERROR: #8 MatAssemblyEnd_MPIAIJ() line 735 in
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > > [0]PETSC ERROR: #9 MatAssemblyEnd() line 5243 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation
> Violation, probably memory access out of range
> > > > [0]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> > > > [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple
> Mac OS X to find memory corruption errors
> > > > [0]PETSC ERROR: likely location of problem given in stack below
> > > > [0]PETSC ERROR: ---------------------  Stack Frames
> ------------------------------------
> > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> > > > [0]PETSC ERROR:       INSTEAD the line number of the start of the
> function
> > > > [0]PETSC ERROR:       is given.
> > > > [0]PETSC ERROR: [0] PetscSFSetGraphLayout line 497
> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c
> > > > [0]PETSC ERROR: [0] GreedyColoringLocalDistanceTwo_Private line 208
> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c
> > > > [0]PETSC ERROR: [0] MatColoringApply_Greedy line 559
> /home/mlohry/build/external/petsc/src/mat/color/impls/greedy/greedy.c
> > > > [0]PETSC ERROR: [0] MatColoringApply line 357
> /home/mlohry/build/external/petsc/src/mat/color/interface/matcoloring.c
> > > > [0]PETSC ERROR: [0] VecSetSizes line 1308
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > > > [0]PETSC ERROR: [0] VecCreateMPIWithArray line 605
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > > [0]PETSC ERROR: [0] MatSetUpMultiply_MPIAIJ line 24
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mmaij.c
> > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Signal received
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h21c2n1 by
> mlohry Mon Aug  5 23:58:19 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #10 User provided function() line 0 in  unknown file
> > > >
> > > >
> > > >
> > > > *************************
> > > >
> > > >
> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]"
> slurm-3429134.out
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by
> mlohry Mon Aug  5 23:24:23 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #1 PetscSplitOwnership() line 88 in
> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c
> > > > [0]PETSC ERROR: #2 PetscSplitOwnership() line 88 in
> /home/mlohry/build/external/petsc/src/sys/utils/psplit.c
> > > > [0]PETSC ERROR: #3 PetscLayoutSetUp() line 137 in
> /home/mlohry/build/external/petsc/src/vec/is/utils/pmap.c
> > > > [0]PETSC ERROR: #4 VecCreate_MPI_Private() line 489 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > > [0]PETSC ERROR: #5 VecCreate_MPI() line 537 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/pbvec.c
> > > > [0]PETSC ERROR: #6 VecSetType() line 51 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vecreg.c
> > > > [0]PETSC ERROR: #7 VecCreateMPI() line 40 in
> /home/mlohry/build/external/petsc/src/vec/vec/impls/mpi/vmpicr.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Object is in wrong state
> > > > [0]PETSC ERROR: Vec object's type is not set: Argument # 1
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h20c2n1 by
> mlohry Mon Aug  5 23:24:23 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #8 VecGetLocalSize() line 665 in
> /home/mlohry/build/external/petsc/src/vec/vec/interface/vector.c
> > > >
> > > >
> > > >
> > > > **************************************
> > > >
> > > >
> > > >
> > > > mlohry at lancer:/ssd/dev_ssd/cmake-build$ grep "\[0\]"
> slurm-3429102.out
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug  5 22:50:12 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #1 TSSetExactFinalTime() line 2250 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: #2 TSSetExactFinalTime() line 2250 in
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug  5 22:50:12 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #3 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: #4 MatSetBlockSizes() line 7206 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: #5 MatSetBlockSize() line 7170 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Petsc has generated inconsistent data
> > > > [0]PETSC ERROR: MPI_Allreduce() called in different locations (code
> lines) on different processors
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug  5 22:50:12 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #6 MatStashScatterBegin_Ref() line 476 in
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: #7 MatStashScatterBegin_Ref() line 476 in
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: #8 MatStashScatterBegin_Private() line 455 in
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: #9 MatAssemblyBegin_MPIAIJ() line 679 in
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > > [0]PETSC ERROR: #10 MatAssemblyBegin() line 5154 in
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR:
> ------------------------------------------------------------------------
> > > > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation
> Violation, probably memory access out of range
> > > > [0]PETSC ERROR: Try option -start_in_debugger or
> -on_error_attach_debugger
> > > > [0]PETSC ERROR: or see
> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
> > > > [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple
> Mac OS X to find memory corruption errors
> > > > [0]PETSC ERROR: likely location of problem given in stack below
> > > > [0]PETSC ERROR: ---------------------  Stack Frames
> ------------------------------------
> > > > [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
> available,
> > > > [0]PETSC ERROR:       INSTEAD the line number of the start of the
> function
> > > > [0]PETSC ERROR:       is given.
> > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Ref line 137
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: [0] MatStashScatterEnd_Private line 126
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: [0] MatAssemblyEnd_MPIAIJ line 698
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > > [0]PETSC ERROR: [0] MatAssemblyEnd line 5234
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Ref line 473
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: [0] MatStashScatterBegin_Private line 454
> /home/mlohry/build/external/petsc/src/mat/utils/matstash.c
> > > > [0]PETSC ERROR: [0] MatAssemblyBegin_MPIAIJ line 676
> /home/mlohry/build/external/petsc/src/mat/impls/aij/mpi/mpiaij.c
> > > > [0]PETSC ERROR: [0] MatAssemblyBegin line 5143
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: [0] MatSetBlockSizes line 7204
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: [0] MatSetBlockSize line 7167
> /home/mlohry/build/external/petsc/src/mat/interface/matrix.c
> > > > [0]PETSC ERROR: [0] TSSetExactFinalTime line 2248
> /home/mlohry/build/external/petsc/src/ts/interface/ts.c
> > > > [0]PETSC ERROR: --------------------- Error Message
> --------------------------------------------------------------
> > > > [0]PETSC ERROR: Signal received
> > > > [0]PETSC ERROR: See
> http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> > > > [0]PETSC ERROR: Petsc Release Version 3.8.3, Dec, 09, 2017
> > > > [0]PETSC ERROR: maDG on a arch-linux2-c-opt named tiger-h19c1n16 by
> mlohry Mon Aug  5 22:50:12 2019
> > > > [0]PETSC ERROR: Configure options
> PETSC_DIR=/home/mlohry/build/external/petsc PETSC_ARCH=arch-linux2-c-opt
> --with-cc=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigcc
> --with-cxx=/opt/intel/compilers_and_libraries_2019.1.144/linux/mpi/intel64/bin/mpigxx
> --with-fc=0 --with-clanguage=C++ --with-pic=1 --with-debugging=yes
> COPTFLAGS='-O3' CXXOPTFLAGS='-O3' --with-shared-libraries=1
> --download-parmetis --download-metis MAKEFLAGS=$MAKEFLAGS
> --with-mpiexec=/usr/bin/srun
> > > > [0]PETSC ERROR: #11 User provided function() line 0 in  unknown file
> > > >
> > > >
> > > >
> > >
> >
>
>