[petsc-users] Question about KSP, and makefile linking MPICH

Yuyun Yang yyang85 at stanford.edu
Sat Apr 13 19:24:47 CDT 2019


I tried running with -ksp_view_mat binary, but I don't see the binaryoutput file in my output or source directories. Is it located somewhere else?

Best regards,
Yuyun

-----Original Message-----
From: Smith, Barry F. <bsmith at mcs.anl.gov> 
Sent: Thursday, April 11, 2019 10:21 PM
To: Yuyun Yang <yyang85 at stanford.edu>
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Question about KSP, and makefile linking MPICH


  Ok, still a little odd. PCSetOperators(), which is called internally by KSPSetOperators(), checks whether the matrix has changed size and generates an error if it has. Similarly, if you set a different matrix than before, it resets the computation of the preconditioner. So, in theory, your situation should never occur.
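
   For example, a sketch of the case that should trigger that error (hypothetical names and sizes):

      Mat A, B;                    /* say A is 10x10 and B is 20x20 */
      KSPSetOperators(ksp,A,A);
      KSPSolve(ksp,b,x);
      KSPSetOperators(ksp,B,B);    /* different size, no KSPReset(): PCSetOperators() should error here */
      KSPSolve(ksp,newb,newx);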

   Barry


> On Apr 12, 2019, at 12:01 AM, Yuyun Yang <yyang85 at stanford.edu> wrote:
> 
> I think this problem arose because I did not reset the ksp for solving 
> a different problem! It's not giving me an error anymore now that I 
> added the reset, so it's all good :)
> 
> Thanks,
> Yuyun
> 
> From: Smith, Barry F. <bsmith at mcs.anl.gov>
> Sent: Thursday, April 11, 2019 9:21:11 PM
> To: Yuyun Yang
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] Question about KSP, and makefile linking 
> MPICH
>  
> 
>    Ahh, I just realized one other thing we can try. Run the program that crashes with -ksp_view_mat binary; this will produce a file called binaryoutput. Send that file to petsc-maint at mcs.anl.gov and we'll see if we can get MUMPS to misbehave with it too.
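> 
>    For example, assuming the crashing executable is called ./mycode, the run would look something like
> 
>       ./mycode -ksp_view_mat binary
> 
>    The binaryoutput file is written in the directory the program is launched from (its current working directory).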
> 
>    Barry
> 
> 
> 
> > On Apr 11, 2019, at 11:17 PM, Yuyun Yang <yyang85 at stanford.edu> wrote:
> > 
> > Thanks Barry for the detailed answers!
> > 
> > Regarding the problem with valgrind, this is the only error produced, and if I allow it to run further, the program breaks: at a later function I get NaN for some of the values being calculated, and I've put an assert there to catch NaN results. I will take a look at it in the debugger. This is just for testing; for bigger problems I won't end up using Cholesky, so it's not really a big issue.
> > 
> > Thanks again for the timely help!
> > Yuyun
> > 
> > From: Smith, Barry F. <bsmith at mcs.anl.gov>
> > Sent: Thursday, April 11, 2019 6:44:54 PM
> > To: Yuyun Yang
> > Cc: petsc-users at mcs.anl.gov
> > Subject: Re: [petsc-users] Question about KSP, and makefile linking 
> > MPICH
> >  
> > 
> > 
> > > On Apr 11, 2019, at 5:44 PM, Yuyun Yang via petsc-users <petsc-users at mcs.anl.gov> wrote:
> > > 
> > > Hello team,
> > >  
> > > I'd like to check whether it's ok to use the same KSP object and change its operator (the matrix A) later on in the code to solve a different problem.
> > 
> >    Do you mean call KSPSetOperators() with one matrix and then later
> > call it with a different matrix? This is ok if the two matrices are
> > the same size and have the same parallel layout. But if the matrices
> > are a different size or have a different parallel layout, then you
> > need to destroy the KSP and create a new one, or call KSPReset() in
> > between, for example:
> > 
> >   KSPSetFromOptions(ksp);
> >   KSPSetOperators(ksp,A,A);
> >   KSPSolve(ksp,b,x); 
> >   KSPReset(ksp);
> >   KSPSetOperators(ksp,B,B);
> >   KSPSolve(ksp,newb,newx);
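> >
> >    A complete standalone sketch of the same pattern (hypothetical
> > diagonal test matrices, error checking omitted for brevity):
> >
> >    #include <petscksp.h>
> >
> >    /* Build a simple diagonal test matrix of size n (illustration only) */
> >    static void BuildDiag(PetscInt n,Mat *M)
> >    {
> >      PetscInt i,Istart,Iend;
> >
> >      MatCreate(PETSC_COMM_WORLD,M);
> >      MatSetSizes(*M,PETSC_DECIDE,PETSC_DECIDE,n,n);
> >      MatSetFromOptions(*M);
> >      MatSetUp(*M);
> >      MatGetOwnershipRange(*M,&Istart,&Iend);
> >      for (i=Istart; i<Iend; i++) MatSetValue(*M,i,i,2.0,INSERT_VALUES);
> >      MatAssemblyBegin(*M,MAT_FINAL_ASSEMBLY);
> >      MatAssemblyEnd(*M,MAT_FINAL_ASSEMBLY);
> >    }
> >
> >    int main(int argc,char **argv)
> >    {
> >      KSP ksp;
> >      Mat A,B;
> >      Vec x,b,newx,newb;
> >
> >      PetscInitialize(&argc,&argv,NULL,NULL);
> >
> >      /* first problem: size 10 */
> >      BuildDiag(10,&A);
> >      MatCreateVecs(A,&x,&b);
> >      VecSet(b,1.0);
> >
> >      KSPCreate(PETSC_COMM_WORLD,&ksp);
> >      KSPSetFromOptions(ksp);
> >      KSPSetOperators(ksp,A,A);
> >      KSPSolve(ksp,b,x);
> >
> >      /* second problem has a different size, so reset the KSP before reusing it */
> >      KSPReset(ksp);
> >      BuildDiag(20,&B);
> >      MatCreateVecs(B,&newx,&newb);
> >      VecSet(newb,1.0);
> >      KSPSetOperators(ksp,B,B);
> >      KSPSolve(ksp,newb,newx);
> >
> >      KSPDestroy(&ksp);
> >      MatDestroy(&A); MatDestroy(&B);
> >      VecDestroy(&x); VecDestroy(&b);
> >      VecDestroy(&newx); VecDestroy(&newb);
> >      PetscFinalize();
> >      return 0;
> >    }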
> > 
> > >  
> > > Also, I know I’ve asked this before about linking to MPICH when I 
> > > call mpirun, instead of using my computer’s default MPI, but I 
> > > want to check again. The same problem was solved on my cluster by 
> > > using a different CLINKER (called mpiicc) in the Makefile and a 
> > > different intel compiler, which will link my compiled code with 
> > > MPICH. Is there a similar thing I can do on my own computer, 
> > > instead of having to use a very long path to locate the MPICH I 
> > > configured with PETSc, and then calling the executable? (I tried 
> > > making CLINKER = mpiicc on my own computer but that didn’t work.)
> > 
> >     Are you asking how you can avoid something like
> > 
> >       /home/me/petsc/arch-myarch/bin/mpiexec -n 2 ./mycode ?
> > 
> >    You can add /home/me/petsc/arch-myarch/bin to the beginning of
> > your PATH; for example, with bash, put the following into your
> > ~/.bashrc file:
> >
> >       export PATH=/home/me/petsc/arch-myarch/bin:$PATH
> >
> > and then simply run
> >
> >       mpiexec -n 2 ./mycode
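> >
> >    To confirm which mpiexec will be picked up after that, you can run, for example,
> >
> >       which mpiexec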
> > 
> > >  
> > > The final question is related to valgrind. I have defined a setupKSP function to do all the solver/PC setup. It seems like there is a problem with memory allocation, but I don't really understand why. This only happens with MUMPS Cholesky, though (running CG, AMG, etc. was fine):
> > >  
> > > ==830== Invalid read of size 8
> > > ==830==    at 0x6977C95: dmumps_ana_o_ (dana_aux.F:2054)
> > > ==830==    by 0x6913B5A: dmumps_ana_driver_ (dana_driver.F:390)
> > > ==830==    by 0x68C152C: dmumps_ (dmumps_driver.F:1213)
> > > ==830==    by 0x68BBE1C: dmumps_f77_ (dmumps_f77.F:267)
> > > ==830==    by 0x68BA4EB: dmumps_c (mumps_c.c:417)
> > > ==830==    by 0x5A070D6: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1654)
> > > ==830==    by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
> > > ==830==    by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
> > > ==830==    by 0x62BA574: PCSetUp (precon.c:932)
> > > ==830==    by 0x640BB29: KSPSetUp (itfunc.c:391)
> > > ==830==    by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
> > > ==830==    by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
> > >  
> > > ==830==  Address 0xb8149c0 is 0 bytes after a block of size 7,872 alloc'd
> > >  
> > > ==830==    at 0x4C2FFC6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
> > > ==830==    by 0x500E7E0: PetscMallocAlign (mal.c:41)
> > > ==830==    by 0x59F8A16: MatConvertToTriples_seqaij_seqsbaij (mumps.c:402)
> > > ==830==    by 0x5A06B53: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1618)
> > > ==830==    by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
> > > ==830==    by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
> > > ==830==    by 0x62BA574: PCSetUp (precon.c:932)
> > > ==830==    by 0x640BB29: KSPSetUp (itfunc.c:391)
> > > ==830==    by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
> > > ==830==    by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
> > > ==830==    by 0x49B809: PressureEq::PressureEq(Domain&) (pressureEq.cpp:62)
> > > ==830==    by 0x4A88E9: StrikeSlip_LinearElastic_qd::StrikeSlip_LinearElastic_qd(Domain&) (strikeSlip_linearElastic_qd.cpp:57)
> > 
> >    This is curious. The line in the MUMPS code where valgrind 
> > detects a problem is
> > 
> >             K = 1_8
> >             THEMIN = ZERO
> >             DO
> >                IF(THEMIN .NE. ZERO) EXIT
> >                THEMIN = abs(id%A(K))                               <<<<<<< this line
> >                K = K+1_8
> > 
> >    So it has a problem accessing id%A(1), the very first entry in the
> > numerical values of the sparse matrix. Meanwhile it states
> >
> > > 0 bytes after a block of size 7,872 alloc'd
> > > MatConvertToTriples_seqaij_seqsbaij (mumps.c:402)
> >
> > which is where PETSc allocates the values passed to MUMPS. So it is
> > almost as if MUMPS never allocated any space for id%A(); I can't
> > imagine why that would ever happen (the problem size is super small,
> > so it's not as if it ran out of memory).
> > 
> >     What happens if you allow valgrind to continue? Do you get more valgrind errors?
> >
> >     What happens if you run without valgrind? Does it crash at this
> > point in the code? At some later point? Does it run to completion
> > and seem to produce the correct answer? If it crashes, you could run
> > it in the debugger and, when it crashes, print the values of id,
> > id%A, etc. and see if they look reasonable.
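> >
> >    For example, with gdb and a debug build (a sketch; the exact frame
> > number and the syntax for printing Fortran derived-type components
> > depend on your debugger):
> >
> >       gdb ./mycode
> >       (gdb) run <your usual arguments>
> >       (gdb) backtrace                  <-- after the crash
> >       (gdb) frame <n>                  <-- select the dmumps_ana_o_ frame
> >       (gdb) print id%A(1)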
> > 
> >    Barry
> > 
> > 
> > 
> > 
> > >  
> > > Thank you!
> > > Yuyun


