<div dir="ltr">The PFLOTRAN wiki seems to want us to install a very old version of PETSc:<div><a href="http://documentation.pflotran.org/user_guide/how_to/installation/linux.html#linux-install">http://documentation.pflotran.org/user_guide/how_to/installation/linux.html#linux-install</a><br></div><div>The relevant part is:</div><div>"</div><div><div>Install PETSc</div><div><br></div><div>3.1. Clone petsc and check out the supported version:</div><div><br></div><div>git clone <a href="https://bitbucket.org/petsc/petsc">https://bitbucket.org/petsc/petsc</a> petsc</div><div>cd petsc</div><div>git checkout xsdk-0.2.0</div><div>NOTE:PFLOTRAN currently uses a snapshot of PETSc ‘maint’ (release) branch. The only supported snapshot/version is specified by the changeset-id above. The supported version will change periodically as we need bug fixes or new features and changes will be announced on the mailing lists. The supported version of petsc is used on the buildbot automated testing system.</div></div><div>"</div><div>Doing git checkout xsdk-0.2.0 causes petsc/src/mat/impls/baij/mpi/baijov.c to be replaced with a version that contains the subroutine. </div><div><br></div><div>I'm fairly certain this version is ancient. I've had trouble before with just installing the most recent version of PETSc and then trying to run PFLOTRAN on top of it. I've asked my employer where we stand with supported PETSc versions.</div><div>Also plan to try compiling PFLOTRAN on top of a more recent PETSc, to see if it works this time.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Sat, Nov 25, 2017 at 10:03 PM, Smith, Barry F. <span dir="ltr"><<a href="mailto:bsmith@mcs.anl.gov" target="_blank">bsmith@mcs.anl.gov</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><br>
I cannot find that routine<br>
<br>
~/Src/petsc ((v3.6.4)) arch-basic<br>
$ git grep MatDestroy_MPIBAIJ_<wbr>MatGetSubmatrices<br>
~/Src/petsc ((v3.6.4)) arch-basic<br>
$ git checkout v3.7.5<br>
Previous HEAD position was 401b1b531b... Increase patchlevel to 3.6.4<br>
HEAD is now at b827f1350a... Increase patchlevel to 3.7.5<br>
~/Src/petsc ((v3.7.5)) arch-basic<br>
$ git grep MatDestroy_MPIBAIJ_<wbr>MatGetSubmatrices<br>
~/Src/petsc ((v3.7.5)) arch-basic<br>
$ git checkout v3.7.0<br>
error: pathspec 'v3.7.0' did not match any file(s) known to git.<br>
~/Src/petsc ((v3.7.5)) arch-basic<br>
$ git checkout v3.7<br>
Previous HEAD position was b827f1350a... Increase patchlevel to 3.7.5<br>
HEAD is now at ae618e6989... release: set v3.7 strings<br>
~/Src/petsc ((v3.7)) arch-basic<br>
$ git grep MatDestroy_MPIBAIJ_<wbr>MatGetSubmatrices<br>
~/Src/petsc ((v3.7)) arch-basic<br>
$ git checkout v3.8<br>
Previous HEAD position was ae618e6989... release: set v3.7 strings<br>
HEAD is now at 0e50f9e530... release: set v3.8 strings<br>
~/Src/petsc ((v3.8)) arch-basic<br>
$ git grep MatDestroy_MPIBAIJ_<wbr>MatGetSubmatrices<br>
~/Src/petsc ((v3.8)) arch-basic<br>
<br>
<br>
<br>
Are you using the PETSc git repository and some particular branch or commit in it?<br>
<div class="HOEnZb"><div class="h5"><br>
<br>
> On Nov 25, 2017, at 11:09 AM, Daniel Stone <<a href="mailto:daniel.stone@opengosim.com">daniel.stone@opengosim.com</a>> wrote:<br>
><br>
> Thanks for the quick response.<br>
><br>
> I tried Valgrind. Apart from a couple of other warnings in other parts of my code, now fixed, it shows the same stack I described:<br>
> ==22498== Invalid read of size 4<br>
> ==22498== at 0x55A5BFF: MatDestroy_MPIBAIJ_<wbr>MatGetSubmatrices (baijov.c:609)<br>
> ==22498== by 0x538A206: MatDestroy (matrix.c:1168)<br>
> ==22498== by 0x5F21F2F: PCSetUp_ILU (ilu.c:162)<br>
> ==22498== by 0x604898A: PCSetUp (precon.c:924)<br>
> ==22498== by 0x6189005: KSPSetUp (itfunc.c:379)<br>
> ==22498== by 0x618AB57: KSPSolve (itfunc.c:599)<br>
> ==22498== by 0x5FD4816: PCApply_ASM (asm.c:485)<br>
> ==22498== by 0x604204C: PCApply (precon.c:458)<br>
> ==22498== by 0x6055C76: pcapply_ (preconf.c:223)<br>
> ==22498== by 0x42F500: __cpr_linsolver_MOD_cprapply (cpr_linsolver.F90:419)<br>
> ==22498== by 0x5F42431: ourshellapply (zshellpcf.c:41)<br>
> ==22498== by 0x5F3697A: PCApply_Shell (shellpc.c:115)<br>
> ==22498== by 0x604204C: PCApply (precon.c:458)<br>
> ==22498== by 0x61B74E7: KSP_PCApply (kspimpl.h:251)<br>
> ==22498== by 0x61B83C3: KSPInitialResidual (itres.c:67)<br>
> ==22498== by 0x6104EF9: KSPSolve_BCGS (bcgs.c:44)<br>
> ==22498== by 0x618B77E: KSPSolve (itfunc.c:656)<br>
> ==22498== by 0x62BB02D: SNESSolve_NEWTONLS (ls.c:224)<br>
> ==22498== by 0x6245706: SNESSolve (snes.c:3967)<br>
> ==22498== by 0x6265A58: snessolve_ (zsnesf.c:167)<br>
> ==22498== Address 0x0 is not stack'd, malloc'd or (recently) free'd<br>
> ==22498==<br>
><br>
> PETSc version: this is from include/petscversion.h:<br>
> #define PETSC_VERSION_RELEASE 0<br>
> #define PETSC_VERSION_MAJOR 3<br>
> #define PETSC_VERSION_MINOR 7<br>
> #define PETSC_VERSION_SUBMINOR 5<br>
> #define PETSC_VERSION_PATCH 0<br>
> #define PETSC_RELEASE_DATE "Apr, 25, 2016"<br>
> #define PETSC_VERSION_DATE "unknown"<br>
><br>
> This is the recommended version of PETSc for using with PFLOTRAN:<br>
> <a href="http://documentation.pflotran.org/user_guide/how_to/installation/linux.html#linux-install" rel="noreferrer" target="_blank">http://documentation.pflotran.<wbr>org/user_guide/how_to/<wbr>installation/linux.html#linux-<wbr>install</a><br>
><br>
> Exact debugger output:<br>
> It's a graphical debugger so there isn't much to copy/paste.<br>
> The exact message is:<br>
><br>
> Memory error detected in MatDestroy_MPIBAIJ_<wbr>MatGetSubmatrices (baijov.c:609):<br>
><br>
> null pointer dereference or unaligned memory access.<br>
><br>
> I can provide screenshots if that would help.<br>
><br>
> -ksp_view_pre:<br>
> I tried this, it doesn't seem to give information about the KSPs in question. To be clear, this is<br>
> part of an attempt to implement the two stage CPR-AMG preconditioner in PFLOTRAN, so the<br>
> KSP and PC objects involved are:<br>
><br>
> KSP: linear solver inside a SNES, inside PFLOTRAN (BCGS),<br>
> which has a PC:<br>
> PC: shell, the CPR implementation, which calls two more preconditioners, T1 and T2, in sequence:<br>
> T1: another shell, which calls a KSP (GMRES), which has a PC which is HYPRE BOOMERAMG<br>
> T2: ASM, this is the problematic one.<br>
><br>
> -ksp_view_pre doesn't seem to give us any information about the ASM preconditioner object<br>
> or its ILU sub-KSPs; presumably it crashes before getting there. We do get a lot of output about<br>
> T1, for example:<br>
><br>
> KSP Object: T1 24 MPI processes<br>
> type: gmres<br>
> GMRES: restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement<br>
> GMRES: happy breakdown tolerance 1e-30<br>
> maximum iterations=10000, initial guess is zero<br>
> tolerances: relative=1e-05, absolute=1e-50, divergence=10000.<br>
> left preconditioning<br>
> using DEFAULT norm type for convergence test<br>
> PC Object: 24 MPI processes<br>
> type: hypre<br>
> PC has not been set up so information may be incomplete<br>
> HYPRE BoomerAMG preconditioning<br>
> HYPRE BoomerAMG: Cycle type V<br>
> HYPRE BoomerAMG: Maximum number of levels 25<br>
> HYPRE BoomerAMG: Maximum number of iterations PER hypre call 1<br>
> HYPRE BoomerAMG: Convergence tolerance PER hypre call 0.<br>
> HYPRE BoomerAMG: Threshold for strong coupling 0.25<br>
> HYPRE BoomerAMG: Interpolation truncation factor 0.<br>
> HYPRE BoomerAMG: Interpolation: max elements per row 0<br>
> HYPRE BoomerAMG: Number of levels of aggressive coarsening 0<br>
> HYPRE BoomerAMG: Number of paths for aggressive coarsening 1<br>
> HYPRE BoomerAMG: Maximum row sums 0.9<br>
> HYPRE BoomerAMG: Sweeps down 1<br>
> HYPRE BoomerAMG: Sweeps up 1<br>
> HYPRE BoomerAMG: Sweeps on coarse 1<br>
> HYPRE BoomerAMG: Relax down symmetric-SOR/Jacobi<br>
> HYPRE BoomerAMG: Relax up symmetric-SOR/Jacobi<br>
> HYPRE BoomerAMG: Relax on coarse Gaussian-elimination<br>
> HYPRE BoomerAMG: Relax weight (all) 1.<br>
> HYPRE BoomerAMG: Outer relax weight (all) 1.<br>
> HYPRE BoomerAMG: Using CF-relaxation<br>
> HYPRE BoomerAMG: Not using more complex smoothers.<br>
> HYPRE BoomerAMG: Measure type local<br>
> HYPRE BoomerAMG: Coarsen type Falgout<br>
> HYPRE BoomerAMG: Interpolation type classical<br>
> linear system matrix = precond matrix:<br>
> Mat Object: 24 MPI processes<br>
> type: mpiaij<br>
> rows=1122000, cols=1122000<br>
> total: nonzeros=7780000, allocated nonzeros=7780000<br>
> total number of mallocs used during MatSetValues calls =0<br>
><br>
> Thanks,<br>
><br>
> Daniel Stone<br>
><br>
><br>
> On Fri, Nov 24, 2017 at 4:08 PM, Smith, Barry F. <<a href="mailto:bsmith@mcs.anl.gov">bsmith@mcs.anl.gov</a>> wrote:<br>
><br>
> First run under valgrind. <a href="https://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind" rel="noreferrer" target="_blank">https://www.mcs.anl.gov/petsc/<wbr>documentation/faq.html#<wbr>valgrind</a><br>
><br>
> If that doesn't help send the exact output from the debugger (cut and paste) and the exact version of PETSc you are using.<br>
> Also the output from -ksp_view_pre<br>
><br>
> Barry<br>
><br>
> > On Nov 24, 2017, at 8:03 AM, Daniel Stone <<a href="mailto:daniel.stone@opengosim.com">daniel.stone@opengosim.com</a>> wrote:<br>
> ><br>
> > Hello,<br>
> ><br>
> > I'm getting a memory exception crash every time I try to run the ASM preconditioner in parallel, can anyone help?<br>
> ><br>
> > I'm using a debugger so I can give most of the stack:<br>
> ><br>
> > PCApply_ASM (asm.c:line 485)<br>
> > KSPSolve (itfunc.c:line 599)<br>
> > KSPSetUp (itfunc.c:line 379)<br>
> > PCSetUp (precon.c: 924)<br>
> > PCSetUp_ILU (ilu.c:line 162)<br>
> > MatDestroy (matrix.c:line 1168)<br>
> > MatDestroy_MPIBAIJ_<wbr>MatGetSubMatrices (baijov.c:line 609)<br>
> ><br>
> ><br>
> > The problem line is then in MatDestroy_MPIBAIJ_<wbr>MatGetSubMatrices,<br>
> > in the file baijov.c, line 609:<br>
> ><br>
> > if (!submatj->id) {<br>
> ><br>
> > At this point submatj has no value, address 0x0, and so the attempt to access submatj->id<br>
> > causes the memory error. We can see in the lines just above 609 where submatj is supposed to<br>
> > come from, it should basically be an attribute of C->data, where C is the input matrix.<br>
> ><br>
> > Does anyone have any ideas where to start with getting this to work? I can provide a lot more information<br>
> > from the debugger if needed.<br>
> ><br>
> > Many thanks in advance,<br>
> ><br>
> > Daniel Stone<br>
> ><br>
><br>
><br>
<br>
</div></div></blockquote></div><br></div>
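<div>The tag-by-tag search in the quoted reply could be scripted so that the xsdk-0.2.0 tag is checked alongside the release tags. This is only a minimal sketch, assuming it is run inside a PETSc git clone; the function name find_symbol_tags is hypothetical and is not part of git or PETSc. The thread spells the routine both as MatGetSubmatrices and as MatGetSubMatrices, and git grep is case-sensitive by default, so the sketch passes -i to match either spelling.</div>
<pre>
```shell
# Hypothetical helper (not part of git or PETSc): report which refs contain a symbol.
# Usage: find_symbol_tags SYMBOL REF [REF...]
find_symbol_tags() {
    sym=$1
    shift
    for ref in "$@"; do
        # Names that are not refs in this clone (such as v3.7.0 in the quoted log)
        # are reported instead of being searched.
        if ! git rev-parse -q --verify "${ref}^{commit}" >/dev/null; then
            echo "$ref: no such ref"
        # -q: exit status only, no matching lines printed; -i: ignore case.
        elif git grep -q -i -e "$sym" "$ref"; then
            echo "$ref: found"
        else
            echo "$ref: not found"
        fi
    done
}
```
</pre>
<div>For example, find_symbol_tags MatDestroy_MPIBAIJ_MatGetSubmatrices v3.6.4 v3.7 v3.7.5 v3.8 xsdk-0.2.0 would search each of the tags mentioned in this thread without switching the working tree between them.</div>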