[petsc-dev] PCTelescope error

Barry Smith bsmith at mcs.anl.gov
Fri Dec 11 22:42:24 CST 2015


  Dave,

   Sorry for the delay in responding; I let life get in the way of PETSc. Don't worry, it won't happen again.

   I ran the example with valgrind and got puzzling messages like

==4111== Invalid write of size 8
==4111==    at 0x1000FF4B9: PetscTrMallocDefault (mtr.c:194)
==4111==    by 0x100F2542B: PCTelescopeSetUp_dmda_repart_coors2d (telescope_dmda.c:148)
==4111==    by 0x100F280BB: PCTelescopeSetUp_dmda_repart_coors (telescope_dmda.c:339)
==4111==    by 0x100F2DBE3: PCTelescopeSetUp_dmda (telescope_dmda.c:691)
==4111==    by 0x100F20AE1: PCSetUp_Telescope (telescope.c:369)
==4111==    by 0x101075B4B: PCSetUp (precon.c:984)
==4111==    by 0x101177047: KSPSetUp (itfunc.c:384)
==4111==    by 0x10000194E: main (in ./ex29)
==4111==  Address 0x104900720 is 416 bytes inside an unallocated block of size 3,209,824 in arena "client"
==4111== 
==4111== Invalid write of size 8
==4111==    at 0x1000FF4CF: PetscTrMallocDefault (mtr.c:196)
==4111==    by 0x100F2542B: PCTelescopeSetUp_dmda_repart_coors2d (telescope_dmda.c:148)
==4111==    by 0x100F280BB: PCTelescopeSetUp_dmda_repart_coors (telescope_dmda.c:339)
==4111==    by 0x100F2DBE3: PCTelescopeSetUp_dmda (telescope_dmda.c:691)
==4111==    by 0x100F20AE1: PCSetUp_Telescope (telescope.c:369)
==4111==    by 0x101075B4B: PCSetUp (precon.c:984)
==4111==    by 0x101177047: KSPSetUp (itfunc.c:384)
==4111==    by 0x10000194E: main (in ./ex29)
==4111==  Address 0x104900728 is 424 bytes inside an unallocated block of size 3,209,824 in arena "client"
==4111== 
==4111== Invalid write of size 4
==4111==    at 0x10021A394: PetscStackCopy (pstack.c:157)
==4111==    by 0x1000FF59E: PetscTrMallocDefault (mtr.c:212)
==4111==    by 0x100F2542B: PCTelescopeSetUp_dmda_repart_coors2d (telescope_dmda.c:148)
==4111==    by 0x100F280BB: PCTelescopeSetUp_dmda_repart_coors (telescope_dmda.c:339)
==4111==    by 0x100F2DBE3: PCTelescopeSetUp_dmda (telescope_dmda.c:691)
==4111==    by 0x100F20AE1: PCSetUp_Telescope (telescope.c:369)
==4111==    by 0x101075B4B: PCSetUp (precon.c:984)
==4111==    by 0x101177047: KSPSetUp (itfunc.c:384)
==4111==    by 0x10000194E: main (in ./ex29)
==4111==  Address 0x104900618 is 152 bytes inside an unallocated block of size 3,209,824 in arena "client"
==4111== 
==4111== Invalid write of size 4
==4111==    at 0x10021A3B9: PetscStackCopy (pstack.c:159)
==4111==    by 0x1000FF59E: PetscTrMallocDefault (mtr.c:212)
==4111==    by 0x100F2542B: PCTelescopeSetUp_dmda_repart_coors2d (telescope_dmda.c:148)
==4111==    by 0x100F280BB: PCTelescopeSetUp_dmda_repart_coors (telescope_dmda.c:339)
==4111==    by 0x100F2DBE3: PCTelescopeSetUp_dmda (telescope_dmda.c:691)
==4111==    by 0x100F20AE1: PCSetUp_Telescope (telescope.c:369)
==4111==    by 0x101075B4B: PCSetUp (precon.c:984)
==4111==    by 0x101177047: KSPSetUp (itfunc.c:384)
==4111==    by 0x10000194E: main (in ./ex29)
==4111==  Address 0x104900718 is 408 bytes inside an unallocated block of size 3,209,824 in arena "client"
==4111== 
==4111== Invalid read of size 4
==4111==    at 0x1000FF60D: PetscTrMallocDefault (mtr.c:214)
==4111==    by 0x100F2542B: PCTelescopeSetUp_dmda_repart_coors2d (telescope_dmda.c:148)
==4111==    by 0x100F280BB: PCTelescopeSetUp_dmda_repart_coors (telescope_dmda.c:339)
==4111==    by 0x100F2DBE3: PCTelescopeSetUp_dmda (telescope_dmda.c:691)
==4111==    by 0x100F20AE1: PCSetUp_Telescope (telescope.c:369)
==4111==    by 0x101075B4B: PCSetUp (precon.c:984)
==4111==    by 0x101177047: KSPSetUp (itfunc.c:384)
==4111==    by 0x10000194E: main (in ./ex29)
==4111==  Address 0x104900718 is 408 bytes inside an unallocated block of size 3,209,824 in arena "client"
==4111== 
==4111== Invalid write of size 4
==4111==    at 0x100F254FC: PCTelescopeSetUp_dmda_repart_coors2d (telescope_dmda.c:154)
==4111==    by 0x100F280BB: PCTelescopeSetUp_dmda_repart_coors (telescope_dmda.c:339)
==4111==    by 0x100F2DBE3: PCTelescopeSetUp_dmda (telescope_dmda.c:691)
==4111==    by 0x100F20AE1: PCSetUp_Telescope (telescope.c:369)
==4111==    by 0x101075B4B: PCSetUp (precon.c:984)
==4111==    by 0x101177047: KSPSetUp (itfunc.c:384)
==4111==    by 0x10000194E: main (in ./ex29)
==4111==  Address 0x104900730 is 432 bytes inside an unallocated block of size 3,209,824 in arena "client"
==4111== 
==4111== Invalid write of size 4
==4111==    at 0x100F2551D: PCTelescopeSetUp_dmda_repart_coors2d (telescope_dmda.c:155)
==4111==    by 0x100F280BB: PCTelescopeSetUp_dmda_repart_coors (telescope_dmda.c:339)
==4111==    by 0x100F2DBE3: PCTelescopeSetUp_dmda (telescope_dmda.c:691)
==4111==    by 0x100F20AE1: PCSetUp_Telescope (telescope.c:369)
==4111==    by 0x101075B4B: PCSetUp (precon.c:984)
==4111==    by 0x101177047: KSPSetUp (itfunc.c:384)
==4111==    by 0x10000194E: main (in ./ex29)
==4111==  Address 0x104900734 is 436 bytes inside an unallocated block of size 3,209,824 in arena "client"
==4111== 
[2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[2]PETSC ERROR: Petsc has generated inconsistent data
[2]PETSC ERROR: c 4160 should equal 2 * Ml 65 * Nl -1
[2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentati


but looking at the code, I saw the obvious bug:

  if (isActiveRank(psubcomm)) {
    ierr = DMDAGetCorners(subdm,&si,&sj,NULL,&ni,&nj,NULL);CHKERRQ(ierr);
    Ml = ni - si;
    Nl = nj - sj;

The bug came from a misunderstanding of what ni and nj mean: DMDAGetCorners() returns the widths of the local patch in them, not the end points.
When I changed the code to

    Ml = ni;
    Nl = nj;

it ran under valgrind without errors. I have pushed the fix to master and next.
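
To illustrate with the numbers from the error message: DMDAGetCorners() returns the starting indices and the widths of the local patch. The sketch below is plain C with no PETSc calls. The Corners2d struct and the helper names are hypothetical, and the layout of the failing rank (a strip starting at sj = 33 that is 65 points wide and 32 points tall) is an assumption inferred from "c 4160 should equal 2 * Ml 65 * Nl -1", not something confirmed by the run.

```c
#include <assert.h>

/* Hypothetical stand-in for what DMDAGetCorners() reports in 2D:
   (si, sj) is the global index of the lower-left corner of the local patch,
   (ni, nj) are the WIDTHS of the patch, not its end indices. */
typedef struct {
  int si, sj; /* starting indices in x and y */
  int ni, nj; /* number of points owned in x and y */
} Corners2d;

/* Original (buggy) computation: treats nj as an end point. */
static int local_height_buggy(Corners2d c) { return c.nj - c.sj; }

/* Corrected computation: nj already is the local extent. */
static int local_height_fixed(Corners2d c) { return c.nj; }

/* Number of coordinate entries (x and y per point) expected on the patch,
   mirroring the "c should equal 2 * Ml * Nl" check from the error message. */
static int expected_coor_entries(int Ml, int Nl)
{
  assert(Ml >= 0 && Nl >= 0); /* a negative extent signals the bug */
  return 2 * Ml * Nl;
}

/* ASSUMED layout of the failing rank: second of two strips of a 65 x 65 grid. */
static Corners2d second_strip(void)
{
  Corners2d c = {0, 33, 65, 32};
  return c;
}
```

With these assumed values the old formula gives Nl = 32 - 33 = -1, while the fixed one gives Nl = 32, and 2 * 65 * 32 = 4160 matches the coordinate count c in the message.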

  Barry


  
> On Dec 8, 2015, at 7:52 AM, Dave May <dave.mayhem23 at gmail.com> wrote:
> 
> Hi Barry,
> 
> I've encountered an issue with PCTelescope which I think is related to your changesets b2566f2 and 8f5db7e, which
> introduce PetscAllreduceBarrierCheck() and MPIU_Allreduce().
> 
> The error message is reported below.
> This message
> 
> [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors
> 
> appears to be produced by PetscAllreduceBarrierCheck(), but I'm not sure why it would get thrown in my use case.
> 
> The same telescope job run here didn't produce an error with the original pull request.
> It seems the error does not occur when the reduction factor equals the original number of MPI ranks in comm world.
> Stupidly I didn't add a test for that particular case, otherwise this issue would have been caught earlier.
> 
> Do you have ideas what the cause/fix might be?
> 
> 
> Thanks,
>   Dave
> 
> 
> /Users/dmay/software/petsc-developments/petscfork/arch-gpu-debug-single/bin/mpiexec -n 4 ./ex29 -ksp_type fgmres -ksp_monitor -da_grid_x 65 -da_grid_y 65 -pc_type telescope  -ksp_view -pc_telescope_reduction_factor 2 -telescope_pc_type mg -telescope_pc_mg_levels 2 -telescope_mg_levels_pc_type jacobi -telescope_pc_mg_galerkin -telescope_mg_levels_pc_type asm
> [2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [2]PETSC ERROR: Petsc has generated inconsistent data
> [2]PETSC ERROR: c 4160 should equal 2 * Ml 65 * Nl -1
> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [2]PETSC ERROR: Petsc Development GIT revision: v3.4-10790-g4c50f03  GIT Date: 2015-12-04 18:13:24 -0600
> [2]PETSC ERROR: ./ex29 on a arch-gpu-debug-single named geop-337.ethz.ch by dmay Tue Dec  8 14:18:58 2015
> [2]PETSC ERROR: Configure options --with-fc=0 --download-mpich=yes --with-opencl --with-viennacl-include=../viennacl-dev --with-viennacl-lib= --with-debugging=yes --with-precision=single
> [2]PETSC ERROR: #1 PCTelescopeSetUp_dmda_repart_coors2d() line 159 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/impls/telescope/telescope_dmda.c
> [0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [0]PETSC ERROR: Petsc has generated inconsistent data
> [0]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors
> [0]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [0]PETSC ERROR: Petsc Development GIT revision: v3.4-10790-g4c50f03  GIT Date: 2015-12-04 18:13:24 -0600
> [0]PETSC ERROR: ./ex29 on a arch-gpu-debug-single named geop-337.ethz.ch by dmay Tue Dec  8 14:18:58 2015
> [0]PETSC ERROR: Configure options --with-fc=0 --download-mpich=yes --with-opencl --with-viennacl-include=../viennacl-dev --with-viennacl-lib= --with-debugging=yes --with-precision=single
> [0]PETSC ERROR: #1 PetscSplitOwnership() line 84 in /Users/dmay/software/petsc-developments/petscfork/src/sys/utils/psplit.c
> [0]PETSC ERROR: #2 PetscSplitOwnership() line 84 in /Users/dmay/software/petsc-developments/petscfork/src/sys/utils/psplit.c
> [0]PETSC ERROR: #3 PetscLayoutSetUp() line 143 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/utils/pmap.c
> [0]PETSC ERROR: [1]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [1]PETSC ERROR: Petsc has generated inconsistent data
> [1]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors
> [1]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [1]PETSC ERROR: Petsc Development GIT revision: v3.4-10790-g4c50f03  GIT Date: 2015-12-04 18:13:24 -0600
> [1]PETSC ERROR: [2]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [2]PETSC ERROR: Petsc has generated inconsistent data
> [2]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors
> [2]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [2]PETSC ERROR: Petsc Development GIT revision: v3.4-10790-g4c50f03  GIT Date: 2015-12-04 18:13:24 -0600
> [2]PETSC ERROR: ./ex29 on a arch-gpu-debug-single named geop-337.ethz.ch by dmay Tue Dec  8 14:18:58 2015
> [2]PETSC ERROR: Configure options --with-fc=0 --download-mpich=yes --with-opencl --with-viennacl-include=../viennacl-dev --with-viennacl-lib= --with-debugging=yes --with-precision=single
> [2]PETSC ERROR: #2 VecSetBlockSize() line 1421 in /Users/dmay/software/petsc-developments/petscfork/src/vec/vec/interface/vector.c
> [2]PETSC ERROR: #3 VecSetBlockSize() line 1421 in /Users/dmay/software/petsc-developments/petscfork/src/vec/vec/interface/vector.c
> [2]PETSC ERROR: #4 DMCreateGlobalVector_DA() line 42 in /Users/dmay/software/petsc-developments/petscfork/src/dm/impls/da/dadist.c
> [2]PETSC ERROR: #5 DMCreateGlobalVector() line 764 in /Users/dmay/software/petsc-developments/petscfork/src/dm/interface/dm.c
> [2]PETSC ERROR: #6 DMGetGlobalVector() line 163 in /Users/dmay/software/petsc-developments/petscfork/src/dm/interface/dmget.c
> [2]PETSC ERROR: #7 PCTelescopeSetUp_dmda_permutation_2d() line 568 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/impls/telescope/telescope_dmda.c
> #4 ISCreateGeneral_Private() line 568 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [0]PETSC ERROR: #5 ISGeneralSetIndices_General() line 683 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [0]PETSC ERROR: #6 ISGeneralSetIndices() line 654 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [0]PETSC ERROR: #7 ISCreateGeneral() line 625 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [0]PETSC ERROR: #8 PCTelescopeSetUp_dmda_repart_coors2d() line 163 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/impls/telescope/telescope_dmda.c
> ./ex29 on a arch-gpu-debug-single named geop-337.ethz.ch by dmay Tue Dec  8 14:18:58 2015
> [1]PETSC ERROR: Configure options --with-fc=0 --download-mpich=yes --with-opencl --with-viennacl-include=../viennacl-dev --with-viennacl-lib= --with-debugging=yes --with-precision=single
> [1]PETSC ERROR: #1 PetscSplitOwnership() line 84 in /Users/dmay/software/petsc-developments/petscfork/src/sys/utils/psplit.c
> [1]PETSC ERROR: #2 PetscSplitOwnership() line 84 in /Users/dmay/software/petsc-developments/petscfork/src/sys/utils/psplit.c
> [2]PETSC ERROR: #8 PCTelescopeSetUp_dmda() line 695 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/impls/telescope/telescope_dmda.c
> [2]PETSC ERROR: #9 PCSetUp_Telescope() line 369 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/impls/telescope/telescope.c
> [2]PETSC ERROR: #10 PCSetUp() line 984 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/interface/precon.c
> [1]PETSC ERROR: #3 PetscLayoutSetUp() line 143 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/utils/pmap.c
> [1]PETSC ERROR: #4 ISCreateGeneral_Private() line 568 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [1]PETSC ERROR: [2]PETSC ERROR: #11 KSPSetUp() line 384 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/ksp/interface/itfunc.c
> [2]PETSC ERROR: #12 main() line 77 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/ksp/examples/tutorials/ex29.c
> [2]PETSC ERROR: PETSc Option Table entries:
> [2]PETSC ERROR: -da_grid_x 65
> [2]PETSC ERROR: #5 ISGeneralSetIndices_General() line 683 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [1]PETSC ERROR: #6 ISGeneralSetIndices() line 654 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [1]PETSC ERROR: #7 ISCreateGeneral() line 625 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [1]PETSC ERROR: #8 PCTelescopeSetUp_dmda_repart_coors2d() line 163 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/impls/telescope/telescope_dmda.c
> -da_grid_y 65
> [2]PETSC ERROR: -ksp_monitor
> [2]PETSC ERROR: -ksp_type fgmres
> [2]PETSC ERROR: -ksp_view
> [2]PETSC ERROR: -pc_telescope_reduction_factor 2
> [2]PETSC ERROR: -pc_type telescope
> [2]PETSC ERROR: -telescope_mg_levels_pc_type asm
> [2]PETSC ERROR: -telescope_pc_mg_galerkin
> [2]PETSC ERROR: -telescope_pc_mg_levels 2
> [2]PETSC ERROR: -telescope_pc_type mg
> [2]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov----------
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2
> [cli_2]: aborting job:
> application called MPI_Abort(MPI_COMM_WORLD, 1) - process 2
> [3]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
> [3]PETSC ERROR: Petsc has generated inconsistent data
> [3]PETSC ERROR: MPI_Allreduce() called in different locations (code lines) on different processors
> [3]PETSC ERROR: See http://www.mcs.anl.gov/petsc/documentation/faq.html for trouble shooting.
> [3]PETSC ERROR: Petsc Development GIT revision: v3.4-10790-g4c50f03  GIT Date: 2015-12-04 18:13:24 -0600
> [3]PETSC ERROR: ./ex29 on a arch-gpu-debug-single named geop-337.ethz.ch by dmay Tue Dec  8 14:18:58 2015
> [3]PETSC ERROR: Configure options --with-fc=0 --download-mpich=yes --with-opencl --with-viennacl-include=../viennacl-dev --with-viennacl-lib= --with-debugging=yes --with-precision=single
> [3]PETSC ERROR: #1 PetscSplitOwnership() line 84 in /Users/dmay/software/petsc-developments/petscfork/src/sys/utils/psplit.c
> [3]PETSC ERROR: #2 PetscSplitOwnership() line 84 in /Users/dmay/software/petsc-developments/petscfork/src/sys/utils/psplit.c
> [3]PETSC ERROR: #3 PetscLayoutSetUp() line 143 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/utils/pmap.c
> [3]PETSC ERROR: #4 ISCreateGeneral_Private() line 568 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [3]PETSC ERROR: #5 ISGeneralSetIndices_General() line 683 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [3]PETSC ERROR: #6 ISGeneralSetIndices() line 654 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [3]PETSC ERROR: #7 ISCreateGeneral() line 625 in /Users/dmay/software/petsc-developments/petscfork/src/vec/is/is/impls/general/general.c
> [3]PETSC ERROR: #8 PCTelescopeSetUp_dmda_repart_coors2d() line 163 in /Users/dmay/software/petsc-developments/petscfork/src/ksp/pc/impls/telescope/telescope_dmda.c
> 
> 



