[petsc-users] KSPSetUp with PETSc/MUMPS

Barry Smith bsmith at mcs.anl.gov
Thu May 26 12:04:52 CDT 2016


  Hong needs to run with this matrix and add appropriate error checks to the matrix routines so that they detect "incomplete" matrices and, most likely, simply error out.

   Barry

> On May 26, 2016, at 11:23 AM, Satish Balay <balay at mcs.anl.gov> wrote:
> 
> Mat Object: 1 MPI processes
>  type: mpiaij
> row 0: (0, 0.)  (1, 0.486111) 
> row 1: (0, 0.486111)  (1, 0.) 
> row 2: (2, 0.)  (3, 0.486111) 
> row 3: (4, 0.486111)  (5, -0.486111) 
> row 4:
> row 5:
> 
> The matrix created is funny (empty rows at the end) - so perhaps it's
> exposing bugs in the Mat code? [Is that a valid matrix for this code?]
> 
> ==21091== Use of uninitialised value of size 8
> ==21091==    at 0x57CA16B: MatGetRowIJ_SeqAIJ_Inode_Symmetric (inode.c:101)
> ==21091==    by 0x57CBA1C: MatGetRowIJ_SeqAIJ_Inode (inode.c:241)
> ==21091==    by 0x537C0B5: MatGetRowIJ (matrix.c:7274)
> ==21091==    by 0x53072FD: MatGetOrdering_ND (spnd.c:18)
> ==21091==    by 0x530BC39: MatGetOrdering (sorder.c:260)
> ==21091==    by 0x530A72D: MatGetOrdering (sorder.c:202)
> ==21091==    by 0x5DDD764: PCSetUp_LU (lu.c:124)
> ==21091==    by 0x5EBFE60: PCSetUp (precon.c:968)
> ==21091==    by 0x5FDA1B3: KSPSetUp (itfunc.c:390)
> ==21091==    by 0x601C17D: kspsetup_ (itfuncf.c:252)
> ==21091==    by 0x4028B9: MAIN__ (ex1f.F90:104)
> ==21091==    by 0x403535: main (ex1f.F90:185)
> 
> 
> This goes away if I add:
> 
>   call PCFactorSetMatOrderingType(pc,MATORDERINGNATURAL,ierr)
> 
> And then there is also:
> 
> ==21275== Invalid read of size 8
> ==21275==    at 0x584DE93: MatGetBrowsOfAoCols_MPIAIJ (mpiaij.c:4734)
> ==21275==    by 0x58970A8: MatMatMultSymbolic_MPIAIJ_MPIAIJ_nonscalable (mpimatmatmult.c:198)
> ==21275==    by 0x5894A54: MatMatMult_MPIAIJ_MPIAIJ (mpimatmatmult.c:34)
> ==21275==    by 0x539664E: MatMatMult (matrix.c:9510)
> ==21275==    by 0x53B3201: matmatmult_ (matrixf.c:1157)
> ==21275==    by 0x402FC9: MAIN__ (ex1f.F90:149)
> ==21275==    by 0x4035B9: main (ex1f.F90:186)
> ==21275==  Address 0xa3d20f0 is 0 bytes after a block of size 48 alloc'd
> ==21275==    at 0x4C2DF93: memalign (vg_replace_malloc.c:858)
> ==21275==    by 0x4FDE05E: PetscMallocAlign (mal.c:28)
> ==21275==    by 0x5240240: VecScatterCreate (vscat.c:1220)
> ==21275==    by 0x5857708: MatSetUpMultiply_MPIAIJ (mmaij.c:116)
> ==21275==    by 0x581C31E: MatAssemblyEnd_MPIAIJ (mpiaij.c:747)
> ==21275==    by 0x53680F2: MatAssemblyEnd (matrix.c:5187)
> ==21275==    by 0x53B24D2: matassemblyend_ (matrixf.c:926)
> ==21275==    by 0x40262C: MAIN__ (ex1f.F90:60)
> ==21275==    by 0x4035B9: main (ex1f.F90:186)
> 
> 
> Satish
> 
> -----------
> 
> $ diff build_nullbasis_petsc_mumps.F90 ex1f.F90
> 3,7c3
> < #include <petsc/finclude/petscsys.h>
> < #include "petsc/finclude/petscvec.h"
> < #include "petsc/finclude/petscmat.h"
> < #include "petsc/finclude/petscpc.h"
> < #include "petsc/finclude/petscksp.h"
> ---
>> #include "petsc/finclude/petsc.h"
> 40,41c36,37
> <    call PetscViewerBinaryOpen(PETSC_COMM_WORLD, "mat_c_bin.txt", 0, viewer, ierr) 
> <    call MatLoad(mat_c, viewer)
> ---
>>   call PetscViewerBinaryOpen(PETSC_COMM_WORLD, "mat_c_bin.txt", FILE_MODE_READ, viewer, ierr) 
>>   call MatLoad(mat_c, viewer,ierr)
> 75a72
>>   call PCFactorSetMatOrderingType(pc,MATORDERINGNATURAL,ierr)
> 150c147
> <    call MatConvert(x, MATMPIAIJ, MAT_REUSE_MATRIX, x, ierr)
> ---
>>   call MatConvert(x, MATMPIAIJ, MAT_INPLACE_MATRIX, x, ierr)
> 
> 
> On Thu, 26 May 2016, Matthew Knepley wrote:
> 
>> Usually this means you have an uninitialized variable that is causing you
>> to overwrite memory. Fortran is so lax about checking this that it is one
>> reason to switch to C.
>> 
>>  Thanks,
>> 
>>    Matt
>> 
>> On Thu, May 26, 2016 at 1:46 AM, Constantin Nguyen Van <
>> constantin.nguyen.van at openmailbox.org> wrote:
>> 
>>> Thanks for all your answers.
>>> I'm sorry for the syntax mistake in MatLoad; it was introduced afterwards.
>>> 
>>> I recompiled PETSc with --with-debugging=yes and ran my code again.
>>> Now I also see this strange behaviour: when I run the code without
>>> valgrind on one process, I get this error message:
>>> 
>>> BEGIN PROC           0
>>> ITERATION           1
>>> ECHO 1
>>> ECHO 2
>>> INFOG(28):           2
>>> BASIS OK           0
>>> END PROC             0
>>> BEGIN PROC           0
>>> ITERATION           2
>>> ECHO 1
>>> ECHO 2
>>> INFOG(28):           2
>>> BASIS OK           0
>>> END PROC             0
>>> BEGIN PROC           0
>>> ITERATION           3
>>> ECHO 1
>>> [0]PETSC ERROR:
>>> ------------------------------------------------------------------------
>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation,
>>> probably memory access out of range
>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
>>> [0]PETSC ERROR: or see
>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS
>>> X to find memory corruption errors
>>> [0]PETSC ERROR: likely location of problem given in stack below
>>> [0]PETSC ERROR: ---------------------  Stack Frames
>>> ------------------------------------
>>> [0]PETSC ERROR: Note: The EXACT line numbers in the stack are not
>>> available,
>>> [0]PETSC ERROR:       INSTEAD the line number of the start of the function
>>> [0]PETSC ERROR:       is given.
>>> [0]PETSC ERROR: [0] MatGetRowIJ_SeqAIJ_Inode_Symmetric line 69
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/impls/aij/seq/inode.c
>>> [0]PETSC ERROR: [0] MatGetRowIJ_SeqAIJ_Inode line 235
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/impls/aij/seq/inode.c
>>> [0]PETSC ERROR: [0] MatGetRowIJ line 7099
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/interface/matrix.c
>>> [0]PETSC ERROR: [0] MatGetOrdering_ND line 17
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/order/spnd.c
>>> [0]PETSC ERROR: [0] MatGetOrdering line 185
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/order/sorder.c
>>> [0]PETSC ERROR: [0] MatGetOrdering line 185
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/mat/order/sorder.c
>>> [0]PETSC ERROR: [0] PCSetUp_LU line 99
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/ksp/pc/impls/factor/lu/lu.c
>>> [0]PETSC ERROR: [0] PCSetUp line 945
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/ksp/pc/interface/precon.c
>>> [0]PETSC ERROR: [0] KSPSetUp line 247
>>> /home/j10077/librairie/petsc-mumps/petsc-3.6.4/src/ksp/ksp/interface/itfunc.c
>>> 
>>> But when I run it with valgrind, it works fine.
>>> 
>>> On 2016-05-25 20:04, Barry Smith wrote:
>>> 
>>>> First, run it with valgrind:
>>>> http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind
>>>> 
>>>> On May 25, 2016, at 2:35 AM, Constantin Nguyen Van
>>>> <constantin.nguyen.van at openmailbox.org> wrote:
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I'm a new user of PETSc and I am trying to use it with MUMPS
>>>>> to compute a null-space basis.
>>>>> I wrote a code that computes the same null-space basis 4 times. It
>>>>> works well when I run it on several processes, but with only one
>>>>> process I get an error on the 2nd iteration when KSPSetUp is
>>>>> called. Furthermore, when PETSc is built with debugging
>>>>> (--with-debugging=yes), it works fine with one or several processes.
>>>>> Do you have any idea why it doesn't work with one process
>>>>> and no debugging?
>>>>> 
>>>>> Thanks.
>>>>> Constantin.
>>>>> 
>>>>> PS: The code and the files required to run it are attached.
>>>>> 
>>>> 
>> 
>> 


