[petsc-dev] [petsc-maint #89809] 3D DMDAs in petsc-3.2

Ethan Coon ecoon at lanl.gov
Tue Oct 11 17:22:11 CDT 2011


Attached is a bug fix (wasn't sure if it should just get pushed back to
the repo or not).

The /* 2d case */ stuff should only take affect when z-periodic, P=1 and
s>1 (but there was stupidity in the if statement that caused it to get
used in ex38, where P != 1).  All it does is flatten all z-communication
to think it is a stencil size of 1, which keeps the "from" index within
the global-numbering range (i.e. >=0, <M*N*P).  Normally this case would
break -- if s > P, then logically the s'th ghost cell in the z-direction
should loop through the mesh a second time.  There was a check to make
sure that s <= {x,y,z} to make sure that all ghost cells sit on the
nearest neighbor-process.  In the case where P=1, the number of procs in
z is obviously 1, and we can simply copy that value to all ghost cells
in the z-direction.

While it should be the case that the only time a "from" index is out of
bounds for the global numbering is during this special case, I added an
extra flag to make sure we were in the special case, to avoid masking
any future bugs which would otherwise show up in the index set creation.

Ethan






On Sun, 2011-10-09 at 22:54 -0500, Barry Smith wrote:
> Jed,
> 
>        It appears your  some of your many 
> 
> if (s_t >= x*y*z) {s_t = bases[n26] + (i-1)*x_t;} /* 2d case */
> 
> things are breaking the communication with a stencil width of 2
> 
> In src/dm/examples/tests/ex38.c  in petsc-3.2
> 
> barry-smiths-macbook-pro:tests barrysmith$ petscmpiexec -n 2  ./ex38  
> Vector Object: 1 MPI processes
>   type: seq
> 0
> 1
> 2
> 3
> 4
> 5
> 6
> 7
> 8
> 9
> 10
> 11
> 112
> 113
> 114
> 115
> 116
> 117
> 112
> 113
> 114
> 115
> 116
> 117
> 
> the second layer of ghost points uses the first layer instead of getting the second layer.
> 
> Since I don't understand your /* 2d case */ business I leave it to you to fix 
> 
> So either I still have a bug in my test code (always possible) or you've introduced your first bug into PETSc (seems unlikely),
> 
> 
>    Barry
> 
> 
> 
> On Oct 9, 2011, at 8:09 PM, Jed Brown wrote:
> 
> > Ex48 uses periodic in y and z and it works.
> > 
> > On Oct 9, 2011 8:08 PM, "Barry Smith" <bsmith at mcs.anl.gov> wrote:
> > 
> >  Ok 3.1-p8 also fails with the same problems. Likely periodic boundary conditions have never worked in 3d? Yikes this looks bad
> > 
> >  Barry
> > 
> > On Oct 9, 2011, at 3:28 PM, Barry Smith wrote:
> > 
> > >
> > >   I have created a new test example in petsc-3.2 for 3D global to local with periodic boundary conditions and it fails rather completely so whomever introduced all the IXs, IXe stuff into DMSetUp_DA_3d() likely introduced some bugs, or the bugs have always been there and no one has tested it properly.
> > >
> > > For example (with periodicity indicated only in the x direction) one gets:
> > >
> > >
> > > barry-smiths-macbook-pro:tests barrysmith$ petscmpiexec -n 2 ./ex38 -da_view -stencil_width 1 | more
> > > Processor [0] M 2 N 3 P 4 m 1 n 1 p 2 w 1 s 1
> > > X range of indices: 0 2, Y range of indices: 0 3, Z range of indices: 0 2
> > > Processor [1] M 2 N 3 P 4 m 1 n 1 p 2 w 1 s 1
> > > X range of indices: 0 2, Y range of indices: 0 3, Z range of indices: 2 4
> > > Vector Object: 1 MPI processes
> > >  type: seq
> > > 1
> > > 0
> > > 1
> > > 0
> > > 3
> > > 2
> > > 3
> > > 2
> > > 5
> > > 4
> > > 5
> > > 4
> > > 7
> > > 6
> > > 7
> > > 6
> > > 9
> > > 8
> > > 9
> > > 8
> > > 11
> > > 10
> > > 11
> > > 10
> > > -1
> > > 112
> > > 113
> > > -1
> > > -1
> > > 114
> > > 115
> > > -1
> > > -1
> > > 116
> > > 117
> > > -1
> > >
> > > so it is not properly handling the periodicity on the "ghost plane" brought over from the rank 1 process to the rank 0.
> > >
> > > Note this happens also with a stencil width of 1, so it is not just a question of not handling the stencil width correctly.
> > >
> > >  Barry
> > >
> > >
> > > On Oct 9, 2011, at 1:32 PM, Richard Katz wrote:
> > >
> > >>
> > >> Here's how I set up the problem:
> > >>
> > >>   ierr = DMDACreate3d(user->comm,DMDA_BOUNDARY_PERIODIC,DMDA_BOUNDARY_PERIODIC,DMDA_BOUNDARY_NONE,
> > >>                      DMDA_STENCIL_BOX,p->Ni,p->Nj,p->Nk,1,1,PETSC_DECIDE,
> > >>                      (PetscInt)(sizeof(SolVel3D)/sizeof(PetscReal)),2,
> > >>                      PETSC_NULL,PETSC_NULL,PETSC_NULL,&user->dav);CHKERRQ(ierr);
> > >> ierr = DMCreateGlobalVector(user->dav,&user->v);CHKERRQ(ierr);
> > >> ierr = DMCreateGlobalVector(user->daa,&user->an);CHKERRQ(ierr);
> > >> ierr = VecDuplicate(user->v,&user->vo);CHKERRQ(ierr);
> > >> ierr = VecDuplicate(user->v,&user->vm);CHKERRQ(ierr);
> > >> ierr = VecDuplicate(user->v,&user->rv);CHKERRQ(ierr);
> > >> ierr = VecDuplicate(user->an,&user->ano);CHKERRQ(ierr);
> > >> ierr = SNESCreate(user->comm,&user->snes);CHKERRQ(ierr);
> > >> if (D2(p)) { ierr = SNESSetFunction(user->snes,user->rv,FormResidual_v2d,user);CHKERRQ(ierr); }
> > >> else       { ierr = SNESSetFunction(user->snes,user->rv,FormResidual_v3d,user);CHKERRQ(ierr); }
> > >> ierr = DMGetColoring(user->dav,IS_COLORING_GLOBAL,MATAIJ,&iscoloring);
> > >> ierr = DMGetMatrix(user->dav,MATAIJ,&user->Jv);CHKERRQ(ierr);
> > >> ierr = MatFDColoringCreate(user->Jv,iscoloring,&user->mfdcv);
> > >> ierr = MatFDColoringSetFromOptions(user->mfdcv);CHKERRQ(ierr);
> > >> if (D2(p)) { ierr = MatFDColoringSetFunction(user->mfdcv,(PetscErrorCode(*)(void))FormResidual_v2d,user); }
> > >> else       { ierr = MatFDColoringSetFunction(user->mfdcv,(PetscErrorCode(*)(void))FormResidual_v3d,user); }
> > >> ierr = SNESSetJacobian(user->snes,user->Jv,user->Jv,SNESDefaultComputeJacobianColor,user->mfdcv);
> > >> ierr = ISColoringDestroy(&iscoloring);
> > >> ierr = SNESSetOptionsPrefix(user->snes,"PV_");CHKERRQ(ierr);
> > >> ierr = SNESSetFromOptions(user->snes);CHKERRQ(ierr);
> > >>
> > >>
> > >> D2(p) is a macro that returns 1 if the problem is 2D, and 0 if 3D.
> > >>
> > >> Rich
> > >>
> > >>
> > >> On 9 Oct 2011, at 19:30, Barry Smith wrote:
> > >>
> > >>>
> > >>> box stencil or star stencil?
> > >>>
> > >>>
> > >>> On Oct 9, 2011, at 1:26 PM, Richard Katz wrote:
> > >>>
> > >>>>
> > >>>> On 9 Oct 2011, at 19:24, Barry Smith wrote:
> > >>>>
> > >>>>>
> > >>>>> To clarify: With this build valgrind generates no messages at all?
> > >>>>
> > >>>> That's correct.
> > >>>>
> > >>>>
> > >>>> [magmox:test]$ $PETSC_DIR/bin/petscmpiexec -valgrind -n 2 ../porbandsim -nd 3 -alpha 2 -theta 45 -output_file test2
> > >>>> --94613-- run: /usr/bin/dsymutil "../porbandsim"
> > >>>> --94612-- run: /usr/bin/dsymutil "../porbandsim"
> > >>>> --------------porbandsim: Sun Oct  9 17:24:24 2011---------------
> > >>>> Processes: 2, DOFs/Process: 2800
> > >>>> #PETSc Option Table entries:
> > >>>> -PV_snes_converged_reason
> > >>>> -PV_snes_monitor
> > >>>> -alpha 2
> > >>>> -ksp_gmres_restart 150
> > >>>> -nd 3
> > >>>> -output_file test2
> > >>>> -snes_converged_reason
> > >>>> -snes_monitor
> > >>>> -theta 45
> > >>>> #End of PETSc Option Table entries
> > >>>> -----------------------------------------------------------------
> > >>>> PetscBag Object:  ParameterBag params for porosity band simulation
> > >>>> takei = TRUE; Use Takei & Holztman (2009) viscosity (T/F)
> > >>>> hperm = FALSE; Use harmonic mean for computing permeability (T/F)
> > >>>> lambda = 27; Porosity-weakening exponent in shear viscosity, dim'less
> > >>>> R = 0.57735; Ratio compaction length to domain height, dim'less
> > >>>> sigma_sat = 1; Saturation stress-difference for anisotropy, dim'less
> > >>>> n = 3; Exponent in porosity--permeability relationship, dim'less
> > >>>> phi0 = 0.1; Initial porosity, dim'less
> > >>>> phi1 = 0.01; Amplitude of porosity perturbation, dim'less
> > >>>> theta = 45; Fixed anisotropy angle, degrees (-1234567 indicates dynamic)
> > >>>> alpha = 2; Fixed anisotropy magnitude, 0<=alpha<=2 (-1234567 indicates dynamic)
> > >>>> viscratio = 1.66667; Ratio of bulk to shear viscosity at phi=phi0, dim'less
> > >>>> nsig = 1; Exponent in stress--strain rate relationship, dim'less
> > >>>> init_harmonic = FALSE; Use harmonic perturbation as IC (T/F)
> > >>>> input_file = null; Name base for input porosity file
> > >>>> test = FALSE; Just load parameters and then exit
> > >>>> output_file = test2; Name base for output files
> > >>>> output_all = FALSE; Make output after each time-step (T/F)
> > >>>> output_err = TRUE; Make output after failed time-step (T/F)
> > >>>> output_aux = TRUE; Output auxilliary fields (T/F)
> > >>>> converged = TRUE; Indicates whether a time-step converged (NO SET)
> > >>>> nb = 2; Number of buffer rows at top and bottom of domain
> > >>>> ns = 1000000; Maximum number of time-steps
> > >>>> nits = 1; Number of Picard iterations
> > >>>> nplot = 0; Output counter
> > >>>> advect = 1; Advection scheme: 0=FV, 1=Fromm, 2=TVD
> > >>>> time = 0; Time at current step, dim'less
> > >>>> cfl = 0.5; Courant-Friedrichs-Lewy number, dim'less
> > >>>> tmax = 20; Maximimum simulated time, dim'less
> > >>>> tout = 0.05; Output interval, dim'less
> > >>>> nd = 3; Number of spatial dimensions, 2 or 3
> > >>>> ni = 20; Grid points in x-direction (shear-direction, periodic)
> > >>>> nj = 5; Grid points in y-direction (cross-direction, periodic)
> > >>>> nk = 14; Grid points in z-direction, incl buffer points
> > >>>> -------------------------END PARAM REPORT------------------------
> > >>>> Continuation in lambda for initial condition
> > >>>> 0 SNES Function norm 2.512586736188e+01
> > >>>> 1 SNES Function norm 2.149396609776e-03
> > >>>> 2 SNES Function norm 1.863405745330e-08
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Solved with lambda = 1.0 in 205.35 seconds
> > >>>> 0 SNES Function norm 1.874271649438e+00
> > >>>> 1 SNES Function norm 8.642606950729e-06
> > >>>> 2 SNES Function norm 7.636114324949e-11
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Solved with lambda = 2.0 in 206.47 seconds
> > >>>> 0 SNES Function norm 3.748572573117e+00
> > >>>> 1 SNES Function norm 1.823985563954e-05
> > >>>> 2 SNES Function norm 1.446658479082e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Solved with lambda = 4.0 in 207.02 seconds
> > >>>> 0 SNES Function norm 7.498313656718e+00
> > >>>> 1 SNES Function norm 3.949293531885e-05
> > >>>> 2 SNES Function norm 3.092503935042e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Solved with lambda = 8.0 in 207.44 seconds
> > >>>> 0 SNES Function norm 1.500973207738e+01
> > >>>> 1 SNES Function norm 8.256012403101e-05
> > >>>> 2 SNES Function norm 6.728525270322e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Solved with lambda = 16.0 in 207.38 seconds
> > >>>> 0 SNES Function norm 2.066862203903e+01
> > >>>> 1 SNES Function norm 6.201694512543e-05
> > >>>> 2 SNES Function norm 5.537777484572e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Solved with lambda = 27.0 in 207.32 seconds
> > >>>> Generating output file: "test2_0000"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 1.914413986104e+01
> > >>>> 1 SNES Function norm 7.835942994856e-05
> > >>>> 2 SNES Function norm 2.520454050827e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 2.038885321297e+01
> > >>>> 1 SNES Function norm 1.727646593582e-04
> > >>>> 2 SNES Function norm 1.325429214752e-09
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Stepped to time t = 0.050 in 213.4 seconds
> > >>>> Generating output file: "test2_0001"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 1.592319862889e+01
> > >>>> 1 SNES Function norm 5.602653646480e-05
> > >>>> 2 SNES Function norm 2.453571869974e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 1.321599310609e+01
> > >>>> 1 SNES Function norm 1.261023473773e-04
> > >>>> 2 SNES Function norm 1.163959209038e-09
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Stepped to time t = 0.100 in 212.6 seconds
> > >>>> Generating output file: "test2_0002"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 1.422948512067e+01
> > >>>> 1 SNES Function norm 4.242156538283e-05
> > >>>> 2 SNES Function norm 1.625303993154e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 9.675556760630e+00
> > >>>> 1 SNES Function norm 1.362209365650e-04
> > >>>> 2 SNES Function norm 1.223382062021e-09
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Stepped to time t = 0.150 in 212.9 seconds
> > >>>> Generating output file: "test2_0003"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 1.307890574732e+01
> > >>>> 1 SNES Function norm 3.214708350215e-05
> > >>>> 2 SNES Function norm 1.093103013350e-10
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 7.636762082099e+00
> > >>>> 1 SNES Function norm 1.308928801753e-04
> > >>>> 2 SNES Function norm 1.143217709967e-09
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Stepped to time t = 0.200 in 212.8 seconds
> > >>>> Generating output file: "test2_0004"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 1.216229064046e+01
> > >>>> 1 SNES Function norm 2.441284335468e-05
> > >>>> 2 SNES Function norm 7.517551623210e-11
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 6.371107649815e+00
> > >>>> 1 SNES Function norm 1.185031269819e-04
> > >>>> 2 SNES Function norm 1.095566218658e-09
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Stepped to time t = 0.250 in 213.0 seconds
> > >>>> Generating output file: "test2_0005"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 1.136564194223e+01
> > >>>> 1 SNES Function norm 1.865453423897e-05
> > >>>> 2 SNES Function norm 5.239501925142e-11
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 5.520895480607e+00
> > >>>> 1 SNES Function norm 1.025932275448e-04
> > >>>> 2 SNES Function norm 1.071648118984e-09
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Stepped to time t = 0.300 in 213.0 seconds
> > >>>> Generating output file: "test2_0006"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 1.064481900239e+01
> > >>>> 1 SNES Function norm 1.438574893573e-05
> > >>>> 2 SNES Function norm 3.643963189108e-11
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 4.905069034291e+00
> > >>>> 1 SNES Function norm 8.961489700096e-05
> > >>>> 2 SNES Function norm 1.008198626576e-09
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> Stepped to time t = 0.350 in 212.7 seconds
> > >>>> Generating output file: "test2_0007"
> > >>>> -----------------------------------------------------------------
> > >>>> 0 SNES Function norm 9.981300326125e+00
> > >>>> 1 SNES Function norm 1.120717450009e-05
> > >>>> 2 SNES Function norm 2.503394538611e-11
> > >>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>> 0 SNES Function norm 4.428377952757e+00
> > >>>> ^CCtrl-C caught... cleaning up processes
> > >>>>
> > >>>>
> > >>>>
> > >>>>
> > >>>>>
> > >>>>> Barry
> > >>>>>
> > >>>>> On Oct 9, 2011, at 12:19 PM, Richard Katz wrote:
> > >>>>>
> > >>>>>> Hi Barry,
> > >>>>>>
> > >>>>>> I have now run the code with
> > >>>>>>
> > >>>>>> configure_options = [
> > >>>>>> '--download-mpich=1',
> > >>>>>> '--with-cxx=0',
> > >>>>>> '--with-fortran=0',
> > >>>>>> 'PETSC_ARCH=debug_mpich',
> > >>>>>> ]
> > >>>>>>
> > >>>>>> and
> > >>>>>>
> > >>>>>> $PETSC_DIR/bin/petscmpiexec -valgrind -n 2 ../porbandsim -nd 3 -alpha 2 -theta 45 -output_file test2
> > >>>>>>
> > >>>>>> The memory error goes away.
> > >>>>>>
> > >>>>>> The error in the calculated solution does not go away.
> > >>>>>>
> > >>>>>> Thanks,
> > >>>>>> Rich
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>>
> > >>>>>> On 9 Oct 2011, at 17:01, Barry Smith wrote:
> > >>>>>>
> > >>>>>>>
> > >>>>>>> Rich,
> > >>>>>>>
> > >>>>>>> Could you do an identical run (to the valgrind run you made below) but using another PETSC_ARCH that uses --download-mpich
> > >>>>>>>
> > >>>>>>> I am suspecting openmpi because it is a bad read of something INSIDE OpenMPI that was just outside the range of memory allocated BY OpenMPI, it is hard to see how PETSc could induce this particular memory problem.
> > >>>>>>>
> > >>>>>>> Of course, it is also possible there are two bugs, one in OpenMPI and another in PETSc related to the DMDA.
> > >>>>>>>
> > >>>>>>> Barry
> > >>>>>>>
> > >>>>>>> On Oct 9, 2011, at 10:46 AM, Richard Katz wrote:
> > >>>>>>>
> > >>>>>>>> Hi all,
> > >>>>>>>>
> > >>>>>>>> I'm beginning to wonder if there is a bug in the filling of ghost-points for 3-dimensional DMDAs under petsc-3.2. I am working on a finite-volume code that uses a DMDACreate3d() to create several DMDAs, all with a stencil width of two.
> > >>>>>>>>
> > >>>>>>>> * The 2-dimensional mode for the same code runs without problems in parallel.
> > >>>>>>>>
> > >>>>>>>> * The initial SNES solve converges well, with no seg-faults, but when there are processor bounds in the k-direction, the solution is clearly incorrect at/near the boundaries.
> > >>>>>>>>
> > >>>>>>>> * When I run the same code in serial, or in parallel but with no processor boundaries in the k-direction, it works without problems.
> > >>>>>>>>
> > >>>>>>>> * When I run the code with processor boundaries in the k-direction, but zeroing-out the bits of the stencil that are at stencil-radius of 2, the processor-boundary errors don't appear.
> > >>>>>>>>
> > >>>>>>>> * Extending the stencil width to 3 has no effect on the problem.
> > >>>>>>>>
> > >>>>>>>> * There does seem to be a memory-corruption problem, but I'm not sure where it comes from, because it appears very high in the stack.  The following is valgrind output for a run with 2 processors.  The processor boundary is in the k-direction.  This memory error ONLY appears on the first SNES solve, but not on any of the following ones.  Also, the memory error seems to be associated with OpenMPI; it doesn't occur when running with MPICH2
> > >>>>>>>>
> > >>>>>>>> [magmox:test]$ $PETSC_DIR/bin/petscmpiexec -valgrind -n 2 ../porbandsim -malloc off -nd 3 -alpha 2 -theta 45 -output_file test2
> > >>>>>>>>
> > >>>>>>>> [...]
> > >>>>>>>>
> > >>>>>>>> Continuation in lambda for initial condition
> > >>>>>>>> 0 SNES Function norm 2.512586736188e+01
> > >>>>>>>> 1 SNES Function norm 2.149396609776e-03
> > >>>>>>>> ==4576== Invalid read of size 8
> > >>>>>>>> ==4576==    at 0x7FFFFFE00BAC: ???
> > >>>>>>>> ==4576==    by 0x101A34213: ompi_convertor_pack (in /usr/lib/libmpi.0.dylib)
> > >>>>>>>> ==4576==    by 0x10858909E: mca_pml_ob1_send_request_start_copy (in /usr/lib/openmpi/mca_pml_ob1.so)
> > >>>>>>>> ==4576==    by 0x1085840D2: mca_pml_ob1_isend (in /usr/lib/openmpi/mca_pml_ob1.so)
> > >>>>>>>> ==4576==    by 0x1085D11C3: ompi_coll_tuned_allreduce_intra_recursivedoubling (in /usr/lib/openmpi/mca_coll_tuned.so)
> > >>>>>>>> ==4576==    by 0x101A4419B: MPI_Allreduce (in /usr/lib/libmpi.0.dylib)
> > >>>>>>>> ==4576==    by 0x1001D3F3E: VecMDot_MPI (pvec2.c:20)
> > >>>>>>>> ==4576==    by 0x100204669: VecMDot (rvector.c:1144)
> > >>>>>>>> ==4576==    by 0x10097E28B: KSPGMRESClassicalGramSchmidtOrthogonalization (borthog2.c:66)
> > >>>>>>>> ==4576==    by 0x100975B33: GMREScycle (gmres.c:159)
> > >>>>>>>> ==4576==    by 0x10097656E: KSPSolve_GMRES (gmres.c:231)
> > >>>>>>>> ==4576==    by 0x1009A8606: KSPSolve (itfunc.c:423)
> > >>>>>>>> ==4576==    by 0x100A0271A: SNES_KSPSolve (snes.c:3396)
> > >>>>>>>> ==4576==    by 0x100A17F79: SNESSolve_LS (ls.c:190)
> > >>>>>>>> ==4576==    by 0x1009FB635: SNESSolve (snes.c:2676)
> > >>>>>>>> ==4576==    by 0x100008104: InitSolveElliptic (initialize.c:216)
> > >>>>>>>> ==4576==    by 0x10000538A: FormInitialCondition (initialize.c:31)
> > >>>>>>>> ==4576==    by 0x100001410: main (porbandsim.c:28)
> > >>>>>>>> ==4576==  Address 0x10289f778 is 0 bytes after a block of size 136 alloc'd
> > >>>>>>>> ==4576==    at 0x1010B05BF: malloc (vg_replace_malloc.c:266)
> > >>>>>>>> ==4576==    by 0x1085D0F98: ompi_coll_tuned_allreduce_intra_recursivedoubling (in /usr/lib/openmpi/mca_coll_tuned.so)
> > >>>>>>>> ==4576==    by 0x101A4419B: MPI_Allreduce (in /usr/lib/libmpi.0.dylib)
> > >>>>>>>> ==4576==    by 0x1001D3F3E: VecMDot_MPI (pvec2.c:20)
> > >>>>>>>> ==4576==    by 0x100204669: VecMDot (rvector.c:1144)
> > >>>>>>>> ==4576==    by 0x10097E28B: KSPGMRESClassicalGramSchmidtOrthogonalization (borthog2.c:66)
> > >>>>>>>> ==4576==    by 0x100975B33: GMREScycle (gmres.c:159)
> > >>>>>>>> ==4576==    by 0x10097656E: KSPSolve_GMRES (gmres.c:231)
> > >>>>>>>> ==4576==    by 0x1009A8606: KSPSolve (itfunc.c:423)
> > >>>>>>>> ==4576==    by 0x100A0271A: SNES_KSPSolve (snes.c:3396)
> > >>>>>>>> ==4576==    by 0x100A17F79: SNESSolve_LS (ls.c:190)
> > >>>>>>>> ==4576==    by 0x1009FB635: SNESSolve (snes.c:2676)
> > >>>>>>>> ==4576==    by 0x100008104: InitSolveElliptic (initialize.c:216)
> > >>>>>>>> ==4576==    by 0x10000538A: FormInitialCondition (initialize.c:31)
> > >>>>>>>> ==4576==    by 0x100001410: main (porbandsim.c:28)
> > >>>>>>>> ==4576==
> > >>>>>>>> ==4577== Invalid read of size 8
> > >>>>>>>> ==4577==    at 0x7FFFFFE00BAC: ???
> > >>>>>>>> ==4577==    by 0x101A34213: ompi_convertor_pack (in /usr/lib/libmpi.0.dylib)
> > >>>>>>>> ==4577==    by 0x10858909E: mca_pml_ob1_send_request_start_copy (in /usr/lib/openmpi/mca_pml_ob1.so)
> > >>>>>>>> ==4577==    by 0x1085840D2: mca_pml_ob1_isend (in /usr/lib/openmpi/mca_pml_ob1.so)
> > >>>>>>>> ==4577==    by 0x1085D11C3: ompi_coll_tuned_allreduce_intra_recursivedoubling (in /usr/lib/openmpi/mca_coll_tuned.so)
> > >>>>>>>> ==4577==    by 0x101A4419B: MPI_Allreduce (in /usr/lib/libmpi.0.dylib)
> > >>>>>>>> ==4577==    by 0x1001D3F3E: VecMDot_MPI (pvec2.c:20)
> > >>>>>>>> ==4577==    by 0x100204669: VecMDot (rvector.c:1144)
> > >>>>>>>> ==4577==    by 0x10097E28B: KSPGMRESClassicalGramSchmidtOrthogonalization (borthog2.c:66)
> > >>>>>>>> ==4577==    by 0x100975B33: GMREScycle (gmres.c:159)
> > >>>>>>>> ==4577==    by 0x10097656E: KSPSolve_GMRES (gmres.c:231)
> > >>>>>>>> ==4577==    by 0x1009A8606: KSPSolve (itfunc.c:423)
> > >>>>>>>> ==4577==    by 0x100A0271A: SNES_KSPSolve (snes.c:3396)
> > >>>>>>>> ==4577==    by 0x100A17F79: SNESSolve_LS (ls.c:190)
> > >>>>>>>> ==4577==    by 0x1009FB635: SNESSolve (snes.c:2676)
> > >>>>>>>> ==4577==    by 0x100008104: InitSolveElliptic (initialize.c:216)
> > >>>>>>>> ==4577==    by 0x10000538A: FormInitialCondition (initialize.c:31)
> > >>>>>>>> ==4577==    by 0x100001410: main (porbandsim.c:28)
> > >>>>>>>> ==4577==  Address 0x10289f778 is 0 bytes after a block of size 136 alloc'd
> > >>>>>>>> ==4577==    at 0x1010B05BF: malloc (vg_replace_malloc.c:266)
> > >>>>>>>> ==4577==    by 0x1085D0F98: ompi_coll_tuned_allreduce_intra_recursivedoubling (in /usr/lib/openmpi/mca_coll_tuned.so)
> > >>>>>>>> ==4577==    by 0x101A4419B: MPI_Allreduce (in /usr/lib/libmpi.0.dylib)
> > >>>>>>>> ==4577==    by 0x1001D3F3E: VecMDot_MPI (pvec2.c:20)
> > >>>>>>>> ==4577==    by 0x100204669: VecMDot (rvector.c:1144)
> > >>>>>>>> ==4577==    by 0x10097E28B: KSPGMRESClassicalGramSchmidtOrthogonalization (borthog2.c:66)
> > >>>>>>>> ==4577==    by 0x100975B33: GMREScycle (gmres.c:159)
> > >>>>>>>> ==4577==    by 0x10097656E: KSPSolve_GMRES (gmres.c:231)
> > >>>>>>>> ==4577==    by 0x1009A8606: KSPSolve (itfunc.c:423)
> > >>>>>>>> ==4577==    by 0x100A0271A: SNES_KSPSolve (snes.c:3396)
> > >>>>>>>> ==4577==    by 0x100A17F79: SNESSolve_LS (ls.c:190)
> > >>>>>>>> ==4577==    by 0x1009FB635: SNESSolve (snes.c:2676)
> > >>>>>>>> ==4577==    by 0x100008104: InitSolveElliptic (initialize.c:216)
> > >>>>>>>> ==4577==    by 0x10000538A: FormInitialCondition (initialize.c:31)
> > >>>>>>>> ==4577==    by 0x100001410: main (porbandsim.c:28)
> > >>>>>>>> ==4577==
> > >>>>>>>> 2 SNES Function norm 1.863405745330e-08
> > >>>>>>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>>>>>> 0 SNES Function norm 1.874271649438e+00
> > >>>>>>>> 1 SNES Function norm 8.642606950729e-06
> > >>>>>>>> 2 SNES Function norm 7.636114324949e-11
> > >>>>>>>> Nonlinear solve converged due to CONVERGED_FNORM_ABS
> > >>>>>>>>
> > >>>>>>>> [...]
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Any suggestions about how to debug this?  How well-tested are the parallel, 3D DMDA routines?  Should I do some testing on a PETSc example code?
> > >>>>>>>>
> > >>>>>>>> I am running petsc-3.2 with
> > >>>>>>>>
> > >>>>>>>> changeset:   20338:fbd00cf1070e
> > >>>>>>>> tag:         tip
> > >>>>>>>> parent:      20337:589176156279
> > >>>>>>>> parent:      20336:7abcced43e69
> > >>>>>>>> user:        Barry Smith <bsmith at mcs.anl.gov>
> > >>>>>>>> date:        Tue Oct 04 16:45:00 2011 -0500
> > >>>>>>>> summary:     commit after merge
> > >>>>>>>>
> > >>>>>>>> and
> > >>>>>>>>
> > >>>>>>>> configure_options = [
> > >>>>>>>> '--download-fftw=true',
> > >>>>>>>> '--with-cxx=0',
> > >>>>>>>> '--with-debugging=1',
> > >>>>>>>> '--with-fortran=0',
> > >>>>>>>> '--with-mpi-dir=/usr',
> > >>>>>>>> 'PETSC_ARCH=debug',
> > >>>>>>>> ]
> > >>>>>>>>
> > >>>>>>>> or
> > >>>>>>>>
> > >>>>>>>> configure_options = [
> > >>>>>>>> '--download-blacs',
> > >>>>>>>> '--download-fftw=true',
> > >>>>>>>> '--download-mpich=yes',
> > >>>>>>>> '--download-mumps=yes',
> > >>>>>>>> '--download-parmetis',
> > >>>>>>>> '--download-scalapack',
> > >>>>>>>> '--with-cxx=0',
> > >>>>>>>> '--with-debugging=0',
> > >>>>>>>> 'PETSC_ARCH=optimize_mumps',
> > >>>>>>>> ]
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thanks,
> > >>>>>>>> Rich
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>
> > >>>>>>
> > >>>>>> ________________________________
> > >>>>>> Richard Foa Katz
> > >>>>>> Dept Earth Sciences, Univ Oxford
> > >>>>>> http://foalab.earth.ox.ac.uk
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>> ________________________________
> > >>>> Richard Foa Katz
> > >>>> Dept Earth Sciences, Univ Oxford
> > >>>> http://foalab.earth.ox.ac.uk
> > >>>>
> > >>>
> > >>
> > >> ________________________________
> > >> Richard Foa Katz
> > >> Dept Earth Sciences, Univ Oxford
> > >> http://foalab.earth.ox.ac.uk
> > >>
> > >
> > 
> 

-- 
------------------------------------
Ethan Coon
Post-Doctoral Researcher
Applied Mathematics - T-5
Los Alamos National Laboratory
505-665-8289

http://www.ldeo.columbia.edu/~ecoon/
------------------------------------
-------------- next part --------------
# HG changeset patch
# Parent 01d9f4cfccafb34949d04e6f5615c7722b0f0d2d
bug fix in when indices get altered to look like 2D for 3D das with 1 node in the z-direction

diff -r 01d9f4cfccaf src/dm/impls/da/da3.c
--- a/src/dm/impls/da/da3.c	Tue Oct 11 11:35:40 2011 -0500
+++ b/src/dm/impls/da/da3.c	Tue Oct 11 16:13:16 2011 -0600
@@ -197,8 +197,10 @@
   Vec                    local,global;
   VecScatter             ltog,gtol;
   IS                     to,from,ltogis;
+  PetscBool              twod;
   PetscErrorCode         ierr;
 
+
   PetscFunctionBegin;
   if (dof < 1) SETERRQ1(((PetscObject)da)->comm,PETSC_ERR_ARG_OUTOFRANGE,"Must have 1 or more degrees of freedom per node: %D",dof);
   if (s < 0) SETERRQ1(((PetscObject)da)->comm,PETSC_ERR_ARG_OUTOFRANGE,"Stencil width cannot be negative: %D",s);
@@ -335,7 +337,10 @@
   /* note this is different than x- and y-, as we will handle as an important special
    case when p=P=1 and DMDA_BOUNDARY_PERIODIC and s > z.  This is to deal with 2D problems
    in a 3D code.  Additional code for this case is noted with "2d case" comments */
-  if ((z < s) && ((p > 1) || ((P > 1) && bz == DMDA_BOUNDARY_PERIODIC))) {
+  twod = PETSC_FALSE;
+  if (P == 1) {
+    twod = PETSC_TRUE;
+  } else if ((z < s) && ((p > 1) || (bz == DMDA_BOUNDARY_PERIODIC))) {
     SETERRQ2(PETSC_COMM_SELF,PETSC_ERR_ARG_OUTOFRANGE,"Local z-width of domain z %D is smaller than stencil width s %D",z,s);
   }
   zs = 0;
@@ -743,9 +748,7 @@
         y_t = ly[(n0 % (m*n))/m];
         z_t = lz[n0 / (m*n)];
         s_t = bases[n0] + x_t*y_t*z_t - (s_y-i)*x_t - s_x - (s_z-k-1)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n0] + x_t*y_t*z_t - (s_y-i)*x_t - s_x;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n0] + x_t*y_t*z_t - (s_y-i)*x_t - s_x;} /* 2D case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
       if (n1 >= 0) { /* directly below */
@@ -753,9 +756,7 @@
         y_t = ly[(n1 % (m*n))/m];
         z_t = lz[n1 / (m*n)];
         s_t = bases[n1] + x_t*y_t*z_t - (s_y+1-i)*x_t - (s_z-k-1)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n1] + x_t*y_t*z_t - (s_y+1-i)*x_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n1] + x_t*y_t*z_t - (s_y+1-i)*x_t;} /* 2D case */
         for (j=0; j<x_t; j++) { idx[nn++] = s_t++;}
       }
       if (n2 >= 0) { /* right below */
@@ -763,9 +764,7 @@
         y_t = ly[(n2 % (m*n))/m];
         z_t = lz[n2 / (m*n)];
         s_t = bases[n2] + x_t*y_t*z_t - (s_y+1-i)*x_t - (s_z-k-1)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n2] + x_t*y_t*z_t - (s_y+1-i)*x_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n2] + x_t*y_t*z_t - (s_y+1-i)*x_t;} /* 2D case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
     }
@@ -776,9 +775,7 @@
         y_t = y;
         z_t = lz[n3 / (m*n)];
         s_t = bases[n3] + (i+1)*x_t - s_x + x_t*y_t*z_t - (s_z-k)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n3] + (i+1)*x_t - s_x + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n3] + (i+1)*x_t - s_x + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
 
@@ -787,9 +784,7 @@
         y_t = y;
         z_t = lz[n4 / (m*n)];
         s_t = bases[n4] + i*x_t + x_t*y_t*z_t - (s_z-k)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n4] + i*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n4] + i*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
         for (j=0; j<x_t; j++) { idx[nn++] = s_t++;}
       }
 
@@ -798,9 +793,7 @@
         y_t = y;
         z_t = lz[n5 / (m*n)];
         s_t = bases[n5] + i*x_t + x_t*y_t*z_t - (s_z-k)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n5] + i*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n5] + i*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
     }
@@ -811,9 +804,7 @@
         y_t = ly[(n6 % (m*n))/m];
         z_t = lz[n6 / (m*n)];
         s_t = bases[n6] + i*x_t - s_x + x_t*y_t*z_t - (s_z-k)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n6] + i*x_t - s_x + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n6] + i*x_t - s_x + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
       if (n7 >= 0) { /* directly above */
@@ -821,9 +812,7 @@
         y_t = ly[(n7 % (m*n))/m];
         z_t = lz[n7 / (m*n)];
         s_t = bases[n7] + (i-1)*x_t + x_t*y_t*z_t - (s_z-k)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n7] + (i-1)*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n7] + (i-1)*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
         for (j=0; j<x_t; j++) { idx[nn++] = s_t++;}
       }
       if (n8 >= 0) { /* right above */
@@ -831,9 +820,7 @@
         y_t = ly[(n8 % (m*n))/m];
         z_t = lz[n8 / (m*n)];
         s_t = bases[n8] + (i-1)*x_t + x_t*y_t*z_t - (s_z-k)*x_t*y_t;
-#if defined(foo)
-        if (s_t < 0) {s_t = bases[n8] + (i-1)*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
-#endif
+        if (twod && (s_t < 0)) {s_t = bases[n8] + (i-1)*x_t + x_t*y_t*z_t - x_t*y_t;} /* 2D case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
     }
@@ -920,9 +907,7 @@
         y_t = ly[(n18 % (m*n))/m]; 
         /* z_t = lz[n18 / (m*n)]; */
         s_t = bases[n18] - (s_y-i)*x_t -s_x + (k+1)*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n18] - (s_y-i)*x_t -s_x + x_t*y_t;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n18] - (s_y-i)*x_t -s_x + x_t*y_t;} /* 2d case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
       if (n19 >= 0) { /* directly below */
@@ -930,9 +915,7 @@
         y_t = ly[(n19 % (m*n))/m]; 
         /* z_t = lz[n19 / (m*n)]; */
         s_t = bases[n19] - (s_y+1-i)*x_t + (k+1)*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n19] - (s_y+1-i)*x_t + x_t*y_t;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n19] - (s_y+1-i)*x_t + x_t*y_t;} /* 2d case */
         for (j=0; j<x_t; j++) { idx[nn++] = s_t++;}
       }
       if (n20 >= 0) { /* right below */
@@ -940,9 +923,7 @@
         y_t = ly[(n20 % (m*n))/m];
         /* z_t = lz[n20 / (m*n)]; */
         s_t = bases[n20] - (s_y+1-i)*x_t + (k+1)*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n20] - (s_y+1-i)*x_t + x_t*y_t;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n20] - (s_y+1-i)*x_t + x_t*y_t;} /* 2d case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
     }
@@ -953,9 +934,7 @@
         y_t = y;
         /* z_t = lz[n21 / (m*n)]; */
         s_t = bases[n21] + (i+1)*x_t - s_x + k*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n21] + (i+1)*x_t - s_x;}  /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n21] + (i+1)*x_t - s_x;}  /* 2d case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
 
@@ -964,9 +943,7 @@
         y_t = y;
         /* z_t = lz[n22 / (m*n)]; */
         s_t = bases[n22] + i*x_t + k*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n22] + i*x_t;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n22] + i*x_t;} /* 2d case */
         for (j=0; j<x_t; j++) { idx[nn++] = s_t++;}
       }
 
@@ -975,9 +952,7 @@
         y_t = y;
         /* z_t = lz[n23 / (m*n)]; */
         s_t = bases[n23] + i*x_t + k*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n23] + i*x_t;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n23] + i*x_t;} /* 2d case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
     }
@@ -988,9 +963,7 @@
         y_t = ly[(n24 % (m*n))/m]; 
         /* z_t = lz[n24 / (m*n)]; */
         s_t = bases[n24] + i*x_t - s_x + k*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n24] + i*x_t - s_x;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n24] + i*x_t - s_x;} /* 2d case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
       if (n25 >= 0) { /* directly above */
@@ -998,9 +971,7 @@
         y_t = ly[(n25 % (m*n))/m];
         /* z_t = lz[n25 / (m*n)]; */
         s_t = bases[n25] + (i-1)*x_t + k*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n25] + (i-1)*x_t;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n25] + (i-1)*x_t;} /* 2d case */
         for (j=0; j<x_t; j++) { idx[nn++] = s_t++;}
       }
       if (n26 >= 0) { /* right above */
@@ -1008,9 +979,7 @@
         y_t = ly[(n26 % (m*n))/m]; 
         /* z_t = lz[n26 / (m*n)]; */
         s_t = bases[n26] + (i-1)*x_t + k*x_t*y_t;
-#if defined(foo)
-        if (s_t >= x*y*z) {s_t = bases[n26] + (i-1)*x_t;} /* 2d case */
-#endif
+        if (twod && (s_t >= M*N*P)) {s_t = bases[n26] + (i-1)*x_t;} /* 2d case */
         for (j=0; j<s_x; j++) { idx[nn++] = s_t++;}
       }
     }


More information about the petsc-dev mailing list