[petsc-users] Question about KSP, and makefile linking MPICH

Smith, Barry F. bsmith at mcs.anl.gov
Thu Apr 11 20:44:54 CDT 2019



> On Apr 11, 2019, at 5:44 PM, Yuyun Yang via petsc-users <petsc-users at mcs.anl.gov> wrote:
> 
> Hello team,
>  
> I’d like to check if it’s ok to use the same ksp object and change its operator (the matrix A) later on in the code to solve a different problem?

   Do you mean call KSPSetOperators() with one matrix and then later call it with a different matrix? This is ok if the two matrices are the same size and have the same parallel layout. But if the matrices are different size, have different parallel layout then you need to destroy the KSP and create a new one or call KSPReset() in between for example

  KSPSetFromOptions(ksp);
  KSPSetOperators(ksp,A,A);
  KSPSolve(ksp,b,x); 
  KSPReset(ksp);
  KSPSetOperators(ksp,B,B);
  KSPSolve(ksp,newb,newx);

>  
> Also, I know I’ve asked this before about linking to MPICH when I call mpirun, instead of using my computer’s default MPI, but I want to check again. The same problem was solved on my cluster by using a different CLINKER (called mpiicc) in the Makefile and a different intel compiler, which will link my compiled code with MPICH. Is there a similar thing I can do on my own computer, instead of having to use a very long path to locate the MPICH I configured with PETSc, and then calling the executable? (I tried making CLINKER = mpiicc on my own computer but that didn’t work.)

    Are you asking how you can avoid something like

      /home/me/petsc/arch-myarch/bin/mpiexec -n 2 ./mycode ?

   You can add /home/me/petsc/arch-myarch/bin to the beginning of your PATH, for example with bash put the following into your ~/.bashrc file

      export PATH=/home/me/petsc/arch-myarch/bin:$PATH
      mpiexec -n 2 ./mycode

>  
> The final question is related to valgrind. I have defined a setupKSP function to do all the solver/pc setup. It seems like there is a problem with memory allocation but I don’t really understand why. This only happens for MUMPSCHOLESKY though (running CG, AMG etc. was fine):
>  
> ==830== Invalid read of size 8
> ==830==    at 0x6977C95: dmumps_ana_o_ (dana_aux.F:2054)
> ==830==    by 0x6913B5A: dmumps_ana_driver_ (dana_driver.F:390)
> ==830==    by 0x68C152C: dmumps_ (dmumps_driver.F:1213)
> ==830==    by 0x68BBE1C: dmumps_f77_ (dmumps_f77.F:267)
> ==830==    by 0x68BA4EB: dmumps_c (mumps_c.c:417)
> ==830==    by 0x5A070D6: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1654)
> ==830==    by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
> ==830==    by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
> ==830==    by 0x62BA574: PCSetUp (precon.c:932)
> ==830==    by 0x640BB29: KSPSetUp (itfunc.c:391)
> ==830==    by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
> ==830==    by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
>  
> ==830==  Address 0xb8149c0 is 0 bytes after a block of size 7,872 alloc'd
>  
> ==830==    at 0x4C2FFC6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> ==830==    by 0x500E7E0: PetscMallocAlign (mal.c:41)
> ==830==    by 0x59F8A16: MatConvertToTriples_seqaij_seqsbaij (mumps.c:402)
> ==830==    by 0x5A06B53: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1618)
> ==830==    by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
> ==830==    by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
> ==830==    by 0x62BA574: PCSetUp (precon.c:932)
> ==830==    by 0x640BB29: KSPSetUp (itfunc.c:391)
> ==830==    by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
> ==830==    by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
> ==830==    by 0x49B809: PressureEq::PressureEq(Domain&) (pressureEq.cpp:62)
> ==830==    by 0x4A88E9: StrikeSlip_LinearElastic_qd::StrikeSlip_LinearElastic_qd(Domain&) (strikeSlip_linearElastic_qd.cpp:57)

   This is curious. The line in the MUMPS code where valgrind detects a problem is 

            K = 1_8
            THEMIN = ZERO
            DO
               IF(THEMIN .NE. ZERO) EXIT
               THEMIN = abs(id%A(K))                               <<<<<<< this line
               K = K+1_8

   So it has a problem accessing id%A(1)  the very first entry in numerical values of the sparse matrix. Meanwhile it states 
> 0 bytes after a block of size 7,872 alloc'd MatConvertToTriples_seqaij_seqsbaij (mumps.c:402)  which is where PETSc allocates
the values passed to MUMPS. So it almost as if MUMPS never allocated any space for id%A(), I can't imagine why that would ever happen (the
problem size is super small so its not like it might have run out of memory)

    What happens if you allow the valgrind to continue? Do you get more valgrind errors?

    What happens if run without valgrind? Does it crash at this point in the code? At some later point? Does it run to completion and seem to 
produce the correct answer? If it crashes, you could run it in the debugger and when it crashes print the value of id, id%A etc and see if they look
reasonable. 

   Barry




>  
> Thank you!
> Yuyun



More information about the petsc-users mailing list