[petsc-users] Question about KSP, and makefile linking MPICH
Smith, Barry F.
bsmith at mcs.anl.gov
Sat Apr 13 21:07:45 CDT 2019
It will be in the directory where the program is run.
Perhaps you are not calling KSPSetFromOptions()? This is where it is checked.
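A minimal sketch of the intended call order (ksp, A, b, x here are illustrative placeholder names, not taken from your code):

  KSPSetFromOptions(ksp);   /* this is where -ksp_view_mat binary is picked up */
  KSPSetOperators(ksp,A,A);
  KSPSolve(ksp,b,x);        /* the matrix is then written to binaryoutput in the directory the program is run from */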
Barry
> On Apr 13, 2019, at 7:24 PM, Yuyun Yang <yyang85 at stanford.edu> wrote:
>
> I tried doing -ksp_view_mat binary, but I don't see the binaryoutput file being produced in my output or source file directories. Is it located somewhere else?
>
> Best regards,
> Yuyun
>
> -----Original Message-----
> From: Smith, Barry F. <bsmith at mcs.anl.gov>
> Sent: Thursday, April 11, 2019 10:21 PM
> To: Yuyun Yang <yyang85 at stanford.edu>
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] Question about KSP, and makefile linking MPICH
>
>
> Ok, still a little odd. PCSetOperators(), which is called internally by KSPSetOperators(), checks whether the matrix has changed size and generates an error if it has. Similarly, if you set a different matrix than before, it resets the computation of the preconditioner. So, in theory, your situation should never occur.
>
> Barry
>
>
>> On Apr 12, 2019, at 12:01 AM, Yuyun Yang <yyang85 at stanford.edu> wrote:
>>
>> I think this problem arose because I did not reset the ksp for solving
>> a different problem! It's not giving me an error anymore now that I
>> added the reset, so it's all good :)
>>
>> Thanks,
>> Yuyun
>>
>> From: Smith, Barry F. <bsmith at mcs.anl.gov>
>> Sent: Thursday, April 11, 2019 9:21:11 PM
>> To: Yuyun Yang
>> Cc: petsc-users at mcs.anl.gov
>> Subject: Re: [petsc-users] Question about KSP, and makefile linking MPICH
>>
>>
>> Ahh, I just realized one other thing we can try. Run the program that crashes with -ksp_view_mat binary; this will produce a file called binaryoutput. Send that file to petsc-maint at mcs.anl.gov and we'll see if we can get MUMPS to misbehave with it also.
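>>
>> A sketch of doing the same thing programmatically, in case the command-line option is inconvenient (the Mat name A is a placeholder, and "binaryoutput" is just the file name used here to match the option's default):
>>
>>   PetscViewer viewer;
>>   PetscViewerBinaryOpen(PETSC_COMM_WORLD,"binaryoutput",FILE_MODE_WRITE,&viewer);
>>   MatView(A,viewer);            /* writes the matrix in PETSc binary format */
>>   PetscViewerDestroy(&viewer);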
>>
>> Barry
>>
>>
>>
>>> On Apr 11, 2019, at 11:17 PM, Yuyun Yang <yyang85 at stanford.edu> wrote:
>>>
>>> Thanks Barry for the detailed answers!
>>>
>>> Regarding the problem with valgrind, this is the only error produced, and if I allow it to run further, the program breaks (at a later function I get NaN for some of the values being calculated, and I've added an assert to catch NaN results). I will take a look at it in the debugger. This is for testing; for bigger problems I won't end up using Cholesky, so it's not really a big issue.
>>>
>>> Thanks again for the timely help!
>>> Yuyun
>>>
>>> From: Smith, Barry F. <bsmith at mcs.anl.gov>
>>> Sent: Thursday, April 11, 2019 6:44:54 PM
>>> To: Yuyun Yang
>>> Cc: petsc-users at mcs.anl.gov
>>> Subject: Re: [petsc-users] Question about KSP, and makefile linking MPICH
>>>
>>>
>>>
>>>> On Apr 11, 2019, at 5:44 PM, Yuyun Yang via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>
>>>> Hello team,
>>>>
>>>> I’d like to check: is it ok to use the same KSP object and change its operator (the matrix A) later in the code to solve a different problem?
>>>
>>> Do you mean call KSPSetOperators() with one matrix and then later
>>> call it with a different matrix? This is ok if the two matrices are
>>> the same size and have the same parallel layout. But if the matrices
>>> are a different size or have a different parallel layout, then you
>>> need to destroy the KSP and create a new one, or call KSPReset() in
>>> between, for example:
>>>
>>> KSPSetFromOptions(ksp);
>>> KSPSetOperators(ksp,A,A);    /* first system */
>>> KSPSolve(ksp,b,x);
>>> KSPReset(ksp);               /* releases data tied to A's size and layout */
>>> KSPSetOperators(ksp,B,B);    /* second system, possibly a different size */
>>> KSPSolve(ksp,newb,newx);
>>>
>>>>
>>>> Also, I know I’ve asked before about linking to MPICH when I call
>>>> mpirun, instead of using my computer’s default MPI, but I want to
>>>> check again. The same problem was solved on my cluster by using a
>>>> different CLINKER (mpiicc) in the makefile and a different Intel
>>>> compiler, which links my compiled code with MPICH. Is there
>>>> something similar I can do on my own computer, instead of having
>>>> to use a very long path to locate the MPICH I configured with
>>>> PETSc and then calling the executable? (I tried setting
>>>> CLINKER = mpiicc on my own computer, but that didn’t work.)
>>>
>>> Are you asking how you can avoid something like
>>>
>>> /home/me/petsc/arch-myarch/bin/mpiexec -n 2 ./mycode ?
>>>
>>> You can add /home/me/petsc/arch-myarch/bin to the beginning of
>>> your PATH; for example, with bash, put the export line below into
>>> your ~/.bashrc file, after which you can simply run mpiexec:
>>>
>>> export PATH=/home/me/petsc/arch-myarch/bin:$PATH
>>> mpiexec -n 2 ./mycode
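>>>
>>> If you want to double-check at run time which MPI library the executable actually linked against, a small sketch (MPI_Get_library_version is a standard MPI-3 routine; the surrounding code is illustrative, not from your program):
>>>
>>>   char version[MPI_MAX_LIBRARY_VERSION_STRING];
>>>   int  len;
>>>   MPI_Get_library_version(version,&len);
>>>   PetscPrintf(PETSC_COMM_WORLD,"MPI library: %s\n",version);   /* MPICH reports a string beginning with "MPICH" */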
>>>
>>>>
>>>> The final question is related to valgrind. I have defined a setupKSP function to do all the solver/PC setup. It seems like there is a problem with memory allocation, but I don't really understand why. This only happens with MUMPS Cholesky, though (running CG, AMG, etc. was fine):
>>>>
>>>> ==830== Invalid read of size 8
>>>> ==830== at 0x6977C95: dmumps_ana_o_ (dana_aux.F:2054)
>>>> ==830== by 0x6913B5A: dmumps_ana_driver_ (dana_driver.F:390)
>>>> ==830== by 0x68C152C: dmumps_ (dmumps_driver.F:1213)
>>>> ==830== by 0x68BBE1C: dmumps_f77_ (dmumps_f77.F:267)
>>>> ==830== by 0x68BA4EB: dmumps_c (mumps_c.c:417)
>>>> ==830== by 0x5A070D6: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1654)
>>>> ==830== by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
>>>> ==830== by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
>>>> ==830== by 0x62BA574: PCSetUp (precon.c:932)
>>>> ==830== by 0x640BB29: KSPSetUp (itfunc.c:391)
>>>> ==830== by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
>>>> ==830== by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
>>>>
>>>> ==830== Address 0xb8149c0 is 0 bytes after a block of size 7,872 alloc'd
>>>>
>>>> ==830== at 0x4C2FFC6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==830== by 0x500E7E0: PetscMallocAlign (mal.c:41)
>>>> ==830== by 0x59F8A16: MatConvertToTriples_seqaij_seqsbaij (mumps.c:402)
>>>> ==830== by 0x5A06B53: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1618)
>>>> ==830== by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
>>>> ==830== by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
>>>> ==830== by 0x62BA574: PCSetUp (precon.c:932)
>>>> ==830== by 0x640BB29: KSPSetUp (itfunc.c:391)
>>>> ==830== by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
>>>> ==830== by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
>>>> ==830== by 0x49B809: PressureEq::PressureEq(Domain&) (pressureEq.cpp:62)
>>>> ==830== by 0x4A88E9: StrikeSlip_LinearElastic_qd::StrikeSlip_LinearElastic_qd(Domain&) (strikeSlip_linearElastic_qd.cpp:57)
>>>
>>> This is curious. The line in the MUMPS code where valgrind
>>> detects a problem is
>>>
>>> K = 1_8
>>> THEMIN = ZERO
>>> DO
>>> IF(THEMIN .NE. ZERO) EXIT
>>> THEMIN = abs(id%A(K)) <<<<<<< this line
>>> K = K+1_8
>>>
>>> So it has a problem accessing id%A(1), the very first entry in the
>>> numerical values of the sparse matrix. Meanwhile it states "0 bytes
>>> after a block of size 7,872 alloc'd" at
>>> MatConvertToTriples_seqaij_seqsbaij (mumps.c:402), which is where
>>> PETSc allocates the values passed to MUMPS. So it is almost as if
>>> MUMPS never allocated any space for id%A(); I can't imagine why
>>> that would ever happen (the problem size is super small, so it's
>>> not like it might have run out of memory).
>>>
>>> What happens if you allow valgrind to continue? Do you get more valgrind errors?
>>>
>>> What happens if you run without valgrind? Does it crash at this
>>> point in the code? At some later point? Does it run to completion
>>> and seem to produce the correct answer? If it crashes, you could run it in the debugger and, when it crashes, print the values of id, id%A, etc., and see if they look reasonable.
>>>
>>> Barry
>>>
>>>
>>>
>>>
>>>>
>>>> Thank you!
>>>> Yuyun
>