[petsc-users] pastix solver break at pastix_checkMatrix

Gong Ding gdiso at ustc.edu
Thu Dec 30 22:39:52 CST 2010


Dear Barry
As I mentioned, it only works for the first nonlinear iteration and breaks on the second.
I think the MatConvertToCSC function should be reconsidered, i.e. optimized for nonlinear solvers, which
call pastix many times.
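
A minimal sketch of what a repeat-safe conversion could look like, in plain C. This is not the PETSc or PaStiX code: csr_to_csc and csc_is_valid are hypothetical helper names, and the 1-based (Fortran-style) index option is an assumption about what the solver expects. The idea is to build fresh CSC arrays on every call and verify them before they are handed to pastix_checkMatrix, so an out-of-range index is caught before it turns into an invalid read.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not a PETSc/PaStiX function): returns 1 if the CSC
   arrays of an n x n matrix with nnz entries are structurally consistent. */
int csc_is_valid(int n, int nnz, const int *colptr, const int *rows, int base)
{
    int j, k;
    if (n < 0 || colptr[0] != base || colptr[n] - base != nnz) return 0;
    for (j = 0; j < n; j++) {
        if (colptr[j + 1] < colptr[j]) return 0;        /* must be nondecreasing */
        for (k = colptr[j] - base; k < colptr[j + 1] - base; k++) {
            if (rows[k] < base || rows[k] >= n + base) return 0; /* row out of range */
        }
    }
    return 1;
}

/* Hypothetical helper: convert a 0-based n x n CSR matrix into newly
   allocated CSC arrays using index base 0 or 1. The caller frees the three
   output arrays. Returns 0 on success, 1 if an allocation fails. */
int csr_to_csc(int n, const int *rowptr, const int *cols, const double *vals,
               int base, int **colptr_out, int **rows_out, double **vals_out)
{
    int i, k, nnz = rowptr[n];
    size_t cap = (size_t)(nnz > 0 ? nnz : 1);
    int *colptr = calloc((size_t)n + 1, sizeof *colptr);
    int *next   = malloc(((size_t)n + 1) * sizeof *next);
    int *rows   = malloc(cap * sizeof *rows);
    double *v   = malloc(cap * sizeof *v);

    assert(base == 0 || base == 1);
    if (!colptr || !next || !rows || !v) {
        free(colptr); free(next); free(rows); free(v);
        return 1;
    }
    for (k = 0; k < nnz; k++) colptr[cols[k] + 1]++;     /* entries per column */
    for (i = 0; i < n; i++) colptr[i + 1] += colptr[i];  /* prefix sum */
    for (i = 0; i < n; i++) next[i] = colptr[i];         /* next free slot per column */
    for (i = 0; i < n; i++) {
        for (k = rowptr[i]; k < rowptr[i + 1]; k++) {
            int dst = next[cols[k]]++;
            rows[dst] = i + base;
            v[dst] = vals[k];
        }
    }
    for (i = 0; i <= n; i++) colptr[i] += base;          /* shift to requested base */
    free(next);
    *colptr_out = colptr; *rows_out = rows; *vals_out = v;
    return 0;
}
```

Calling csr_to_csc at each nonlinear iteration and then csc_is_valid on the result keeps the arrays independent of anything the solver may have modified during the previous factorization. Whether that is the actual cause of the crash here is not established.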

Gong Ding


>  Well, that was obviously left in by mistake. Remove that line, recompile in that directory, and it should run OK.

>   Barry


On Dec 30, 2010, at 9:49 PM, Gong Ding wrote:

> Dear  Barry,
> First, the patched file has an evident problem.
> 
>   PetscScalar *tmpvalues;
>   PetscInt    *tmprows,*tmpcolptr;
>    tmpvalues = malloc(nnz*sizeof(PetscScalar));
>    tmprows   = malloc(nnz*sizeof(PetscInt));
>    tmpcolptr = malloc((*n+1)*sizeof(PetscInt));
> 
>    ierr = PetscMalloc3(nnz,PetscScalar,&tmpvalues,nnz,PetscInt,&tmprows,(*n+1),PetscInt,&tmpcolptr);CHKERRQ(ierr);  <-- this line allocates memory again.
> 
> After commenting out the line above, pastix works for the first nonlinear iteration. However, it breaks at the second iteration. valgrind reported:
> 
> DDM Solver Level 1 init...
> Using PaStiX linear solver...
> Compute equilibrium
> its    | Eq(V) | | Eq(n) | | Eq(p) | | Eq(T) | |Eq(Tn)|  |Eq(Tp)|  |delta x|
> -----------------------------------------------------------------------------
>  0     2.50e-06  2.34e-03  3.12e-04  0.00e+00* 0.00e+00* 0.00e+00* 0.00e+00*
> Check : ordering                OK
> Check : Graph Symmetry
>         Correction
>        Add 4090 null terms
>                OK
> Check : Sort CSC                OK
>  1     2.06e-05  7.29e-04  1.03e-04  0.00e+00* 0.00e+00* 0.00e+00* 3.85e-01
> Check : ordering                OK
> Check : Graph Symmetry
> ==1416== Thread 1:
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BC3186: csc_checksym (csc_utils.c:321)
> ==1416==    by 0x1B4E7E3: pastix_checkMatrix (pastix.c:3915)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8aeb1a8 is not stack'd, malloc'd or (recently) free'd
> ==1416==
> 
>         Correction
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BD1147: correct2 (cscsymcsc.c:77)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8aeb1a8 is not stack'd, malloc'd or (recently) free'd
> ==1416==
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BD10D7: correct2 (cscsymcsc.c:67)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416==    at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416==    by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==    by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BD10EB: correct2 (cscsymcsc.c:72)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416==    at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416==    by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==    by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BD1102: correct2 (cscsymcsc.c:75)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8d369dc is 4 bytes before a block of size 4,216 alloc'd
> ==1416==    at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416==    by 0x150812B: MatConvertToCSC (pastix.c:169)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==    by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BD116D: correct2 (cscsymcsc.c:88)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416==    at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416==    by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==    by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BD1179: correct2 (cscsymcsc.c:88)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8c1b55c is 4 bytes before a block of size 4,212 alloc'd
> ==1416==    at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416==    by 0x1BD1029: correct2 (cscsymcsc.c:53)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416==    at 0x1BD1186: correct2 (cscsymcsc.c:88)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416==    at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416==    by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==    by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid write of size 4
> ==1416==    at 0x1BD1192: correct2 (cscsymcsc.c:88)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==    by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416==  Address 0x8c1b55c is 4 bytes before a block of size 4,212 alloc'd
> ==1416==    at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416==    by 0x1BD1029: correct2 (cscsymcsc.c:53)
> ==1416==    by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416==    by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416==    by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416==    by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416==    by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416==    by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416==    by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416==    by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416==    by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416==    by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [0]PETSC ERROR: likely location of problem given in stack below 


