[petsc-users] Question about KSP, and makefile linking MPICH
Smith, Barry F.
bsmith at mcs.anl.gov
Sat Apr 13 21:07:45 CDT 2019
It will be in the directory where the program is run.
Perhaps you are not calling KSPSetFromOptions()? This is where it is checked.
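A minimal sketch of the intended call order (ksp, A, b, x here are illustrative placeholder names, not taken from your code):

  KSPSetFromOptions(ksp);   /* this is where -ksp_view_mat binary is picked up */
  KSPSetOperators(ksp,A,A);
  KSPSolve(ksp,b,x);        /* the matrix is then written to binaryoutput in the directory the program is run from */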
Barry
> On Apr 13, 2019, at 7:24 PM, Yuyun Yang <yyang85 at stanford.edu> wrote:
>
> I tried doing -ksp_view_mat binary, but I don't see the binaryoutput file being produced in my output or source file directories. Is it located somewhere else?
>
> Best regards,
> Yuyun
>
> -----Original Message-----
> From: Smith, Barry F. <bsmith at mcs.anl.gov>
> Sent: Thursday, April 11, 2019 10:21 PM
> To: Yuyun Yang <yyang85 at stanford.edu>
> Cc: petsc-users at mcs.anl.gov
> Subject: Re: [petsc-users] Question about KSP, and makefile linking MPICH
>
>
> Ok, still a little odd. PCSetOperators(), which is called internally by KSPSetOperators(), checks whether the matrix has changed size and generates an error if it has. Similarly, if you set a different matrix than before, it resets the computation of the preconditioner. So, in theory, your situation should never occur.
>
> Barry
>
>
>> On Apr 12, 2019, at 12:01 AM, Yuyun Yang <yyang85 at stanford.edu> wrote:
>>
>> I think this problem arose because I did not reset the ksp for solving
>> a different problem! It's not giving me an error anymore now that I
>> added the reset, so it's all good :)
>>
>> Thanks,
>> Yuyun
>>
>> From: Smith, Barry F. <bsmith at mcs.anl.gov>
>> Sent: Thursday, April 11, 2019 9:21:11 PM
>> To: Yuyun Yang
>> Cc: petsc-users at mcs.anl.gov
>> Subject: Re: [petsc-users] Question about KSP, and makefile linking MPICH
>>
>>
>> Ahh, I just realized one other thing we can try. Run the program that crashes with -ksp_view_mat binary; this will produce a file called binaryoutput. Send that file to petsc-maint at mcs.anl.gov and we'll see if we can get MUMPS to misbehave with it also.
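>>
>> A sketch of doing the same thing programmatically, in case the command-line option is inconvenient (the Mat name A is a placeholder, and "binaryoutput" is just the file name used here to match the option's default):
>>
>>   PetscViewer viewer;
>>   PetscViewerBinaryOpen(PETSC_COMM_WORLD,"binaryoutput",FILE_MODE_WRITE,&viewer);
>>   MatView(A,viewer);            /* writes the matrix in PETSc binary format */
>>   PetscViewerDestroy(&viewer);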
>>
>> Barry
>>
>>
>>
>>> On Apr 11, 2019, at 11:17 PM, Yuyun Yang <yyang85 at stanford.edu> wrote:
>>>
>>> Thanks Barry for the detailed answers!
>>>
>>> Regarding the problem with valgrind, this is the only error produced, and if I allow it to run further, the program breaks (at a later function I get NaN for some of the values being calculated, and I've added an assert to catch NaN results). I will take a look at it in the debugger. This is for testing; for bigger problems I won't end up using Cholesky, so it's not really a big issue.
>>>
>>> Thanks again for the timely help!
>>> Yuyun
>>>
>>> From: Smith, Barry F. <bsmith at mcs.anl.gov>
>>> Sent: Thursday, April 11, 2019 6:44:54 PM
>>> To: Yuyun Yang
>>> Cc: petsc-users at mcs.anl.gov
>>> Subject: Re: [petsc-users] Question about KSP, and makefile linking MPICH
>>>
>>>
>>>
>>>> On Apr 11, 2019, at 5:44 PM, Yuyun Yang via petsc-users <petsc-users at mcs.anl.gov> wrote:
>>>>
>>>> Hello team,
>>>>
>>>> I’d like to check: is it ok to use the same KSP object and change its operator (the matrix A) later in the code to solve a different problem?
>>>
>>> Do you mean call KSPSetOperators() with one matrix and then later
>>> call it with a different matrix? This is ok if the two matrices are
>>> the same size and have the same parallel layout. But if the matrices
>>> are a different size or have a different parallel layout, then you
>>> need to destroy the KSP and create a new one, or call KSPReset() in
>>> between, for example:
>>>
>>> KSPSetFromOptions(ksp);
>>> KSPSetOperators(ksp,A,A);    /* first system */
>>> KSPSolve(ksp,b,x);
>>> KSPReset(ksp);               /* releases data tied to A's size and layout */
>>> KSPSetOperators(ksp,B,B);    /* second system, possibly a different size */
>>> KSPSolve(ksp,newb,newx);
>>>
>>>>
>>>> Also, I know I’ve asked before about linking to MPICH when I call
>>>> mpirun, instead of using my computer’s default MPI, but I want to
>>>> check again. The same problem was solved on my cluster by using a
>>>> different CLINKER (mpiicc) in the makefile and a different Intel
>>>> compiler, which links my compiled code with MPICH. Is there
>>>> something similar I can do on my own computer, instead of having
>>>> to use a very long path to locate the MPICH I configured with
>>>> PETSc and then calling the executable? (I tried setting
>>>> CLINKER = mpiicc on my own computer, but that didn’t work.)
>>>
>>> Are you asking how you can avoid something like
>>>
>>> /home/me/petsc/arch-myarch/bin/mpiexec -n 2 ./mycode ?
>>>
>>> You can add /home/me/petsc/arch-myarch/bin to the beginning of
>>> your PATH; for example, with bash, put the export line below into
>>> your ~/.bashrc file, after which you can simply run mpiexec:
>>>
>>> export PATH=/home/me/petsc/arch-myarch/bin:$PATH
>>> mpiexec -n 2 ./mycode
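>>>
>>> If you want to double-check at run time which MPI library the executable actually linked against, a small sketch (MPI_Get_library_version is a standard MPI-3 routine; the surrounding code is illustrative, not from your program):
>>>
>>>   char version[MPI_MAX_LIBRARY_VERSION_STRING];
>>>   int  len;
>>>   MPI_Get_library_version(version,&len);
>>>   PetscPrintf(PETSC_COMM_WORLD,"MPI library: %s\n",version);   /* MPICH reports a string beginning with "MPICH" */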
>>>
>>>>
>>>> The final question is related to valgrind. I have defined a setupKSP function to do all the solver/PC setup. It seems like there is a problem with memory allocation, but I don't really understand why. This only happens with MUMPS Cholesky, though (running CG, AMG, etc. was fine):
>>>>
>>>> ==830== Invalid read of size 8
>>>> ==830== at 0x6977C95: dmumps_ana_o_ (dana_aux.F:2054)
>>>> ==830== by 0x6913B5A: dmumps_ana_driver_ (dana_driver.F:390)
>>>> ==830== by 0x68C152C: dmumps_ (dmumps_driver.F:1213)
>>>> ==830== by 0x68BBE1C: dmumps_f77_ (dmumps_f77.F:267)
>>>> ==830== by 0x68BA4EB: dmumps_c (mumps_c.c:417)
>>>> ==830== by 0x5A070D6: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1654)
>>>> ==830== by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
>>>> ==830== by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
>>>> ==830== by 0x62BA574: PCSetUp (precon.c:932)
>>>> ==830== by 0x640BB29: KSPSetUp (itfunc.c:391)
>>>> ==830== by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
>>>> ==830== by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
>>>>
>>>> ==830== Address 0xb8149c0 is 0 bytes after a block of size 7,872 alloc'd
>>>>
>>>> ==830== at 0x4C2FFC6: memalign (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
>>>> ==830== by 0x500E7E0: PetscMallocAlign (mal.c:41)
>>>> ==830== by 0x59F8A16: MatConvertToTriples_seqaij_seqsbaij (mumps.c:402)
>>>> ==830== by 0x5A06B53: MatCholeskyFactorSymbolic_MUMPS (mumps.c:1618)
>>>> ==830== by 0x54071F2: MatCholeskyFactorSymbolic (matrix.c:3179)
>>>> ==830== by 0x614AFE9: PCSetUp_Cholesky (cholesky.c:88)
>>>> ==830== by 0x62BA574: PCSetUp (precon.c:932)
>>>> ==830== by 0x640BB29: KSPSetUp (itfunc.c:391)
>>>> ==830== by 0x4A1192: PressureEq::setupKSP(_p_KSP*&, _p_PC*&, _p_Mat*&) (pressureEq.cpp:834)
>>>> ==830== by 0x4A1258: PressureEq::computeInitialSteadyStatePressure(Domain&) (pressureEq.cpp:862)
>>>> ==830== by 0x49B809: PressureEq::PressureEq(Domain&) (pressureEq.cpp:62)
>>>> ==830== by 0x4A88E9: StrikeSlip_LinearElastic_qd::StrikeSlip_LinearElastic_qd(Domain&) (strikeSlip_linearElastic_qd.cpp:57)
>>>
>>> This is curious. The line in the MUMPS code where valgrind
>>> detects a problem is
>>>
>>> K = 1_8
>>> THEMIN = ZERO
>>> DO
>>> IF(THEMIN .NE. ZERO) EXIT
>>> THEMIN = abs(id%A(K)) <<<<<<< this line
>>> K = K+1_8
>>>
>>> So it has a problem accessing id%A(1), the very first entry in the
>>> numerical values of the sparse matrix. Meanwhile it states "0 bytes
>>> after a block of size 7,872 alloc'd" at
>>> MatConvertToTriples_seqaij_seqsbaij (mumps.c:402), which is where
>>> PETSc allocates the values passed to MUMPS. So it is almost as if
>>> MUMPS never allocated any space for id%A(); I can't imagine why
>>> that would ever happen (the problem size is super small, so it's
>>> not like it might have run out of memory).
>>>
>>> What happens if you allow valgrind to continue? Do you get more valgrind errors?
>>>
>>> What happens if you run without valgrind? Does it crash at this
>>> point in the code? At some later point? Does it run to completion
>>> and seem to produce the correct answer? If it crashes, you could run it in the debugger and, when it crashes, print the values of id, id%A, etc., and see if they look reasonable.
>>>
>>> Barry
>>>
>>>
>>>
>>>
>>>>
>>>> Thank you!
>>>> Yuyun
>