[petsc-users] pastix solver break at pastix_checkMatrix
Gong Ding
gdiso at ustc.edu
Thu Dec 30 22:39:52 CST 2010
Dear Barry
As I mentioned, it only works for the first nonlinear iteration.
It breaks on the second!
I guess the MatConvertToCSC function should be reconsidered, i.e. optimized for a nonlinear solver, which will
call PaStiX many times.
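
One way to read the suggestion above is to keep the CSC work arrays alive between factorizations and grow them only when the matrix needs more room. The following is a minimal sketch in plain C under that assumption. It is not the PETSc or PaStiX code: CscCache, csc_cache_reserve and csc_cache_free are hypothetical names, and double and int stand in for PetscScalar and PetscInt. It only illustrates buffer reuse and does not by itself explain the invalid reads reported below.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cache for CSC work arrays that outlives a single call.
   Must start zero-initialized, e.g. CscCache c = {0}; */
typedef struct {
    double *values;      /* nonzero values */
    int    *rows;        /* row index of each nonzero */
    int    *colptr;      /* column start offsets, n + 1 entries */
    size_t  nnz_cap;     /* entries available in values and rows */
    size_t  colptr_cap;  /* entries available in colptr */
} CscCache;

/* Make room for an n-column matrix with nnz nonzeros.
   Existing buffers are reused when they are already large enough.
   Returns 0 on success, -1 if an allocation fails (old buffers stay valid). */
static int csc_cache_reserve(CscCache *c, size_t n, size_t nnz)
{
    assert(c != NULL);
    if (nnz > c->nnz_cap) {
        double *v = realloc(c->values, nnz * sizeof *v);
        if (!v) return -1;
        c->values = v;
        int *r = realloc(c->rows, nnz * sizeof *r);
        if (!r) return -1;
        c->rows = r;
        c->nnz_cap = nnz;
    }
    if (n + 1 > c->colptr_cap) {
        int *p = realloc(c->colptr, (n + 1) * sizeof *p);
        if (!p) return -1;
        c->colptr = p;
        c->colptr_cap = n + 1;
    }
    return 0;
}

/* Release everything and return the cache to its empty state. */
static void csc_cache_free(CscCache *c)
{
    free(c->values);
    free(c->rows);
    free(c->colptr);
    *c = (CscCache){0};
}
```

With this layout, a caller would invoke csc_cache_reserve once per nonlinear iteration before filling the arrays, and csc_cache_free once when the solver is destroyed.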
Gong Ding
> Well, that obviously was erroneously left in. Remove that line, recompile in that directory, and it should run OK.
> Barry
On Dec 30, 2010, at 9:49 PM, Gong Ding wrote:
> Dear Barry,
> First, the patched file has an evident problem.
>
> PetscScalar *tmpvalues;
> PetscInt *tmprows,*tmpcolptr;
> tmpvalues = malloc(nnz*sizeof(PetscScalar));
> tmprows = malloc(nnz*sizeof(PetscInt));
> tmpcolptr = malloc((*n+1)*sizeof(PetscInt));
>
> ierr = PetscMalloc3(nnz,PetscScalar,&tmpvalues,nnz,PetscInt,&tmprows,(*n+1),PetscInt,&tmpcolptr);CHKERRQ(ierr); <-- this line allocates memory again.
>
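
The snippet above allocates the same three arrays twice, so the second call overwrites the pointers and the first blocks can no longer be freed. As a rough illustration of allocating each array exactly once, here is a minimal sketch in plain C. It is not the patched PETSc source: CscWork, csc_work_alloc and csc_work_free are hypothetical names, and double and int stand in for PetscScalar and PetscInt.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical container for the three work arrays from the snippet above. */
typedef struct {
    double *values;  /* nnz entries */
    int    *rows;    /* nnz entries */
    int    *colptr;  /* n + 1 entries */
} CscWork;

/* Allocate each array exactly once. Assumes nnz > 0.
   Returns 0 on success, -1 if any allocation fails (nothing is leaked). */
static int csc_work_alloc(CscWork *w, size_t n, size_t nnz)
{
    assert(w != NULL);
    w->values = malloc(nnz * sizeof *w->values);
    w->rows   = malloc(nnz * sizeof *w->rows);
    w->colptr = malloc((n + 1) * sizeof *w->colptr);
    if (!w->values || !w->rows || !w->colptr) {
        /* free(NULL) is a no-op, so partial failures are handled too */
        free(w->values);
        free(w->rows);
        free(w->colptr);
        w->values = NULL;
        w->rows   = NULL;
        w->colptr = NULL;
        return -1;
    }
    return 0;
}

/* Release the arrays and clear the pointers so a double free is harmless. */
static void csc_work_free(CscWork *w)
{
    free(w->values);
    free(w->rows);
    free(w->colptr);
    w->values = NULL;
    w->rows   = NULL;
    w->colptr = NULL;
}
```

Pairing every csc_work_alloc with one csc_work_free keeps the number of live blocks constant across repeated calls.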
> After commenting out the line above, PaStiX works for the first nonlinear iteration. However, it breaks at the second iteration. valgrind reported:
>
> DDM Solver Level 1 init...
> Using PaStiX linear solver...
> Compute equilibrium
> its | Eq(V) | | Eq(n) | | Eq(p) | | Eq(T) | |Eq(Tn)| |Eq(Tp)| |delta x|
> -----------------------------------------------------------------------------
> 0 2.50e-06 2.34e-03 3.12e-04 0.00e+00* 0.00e+00* 0.00e+00* 0.00e+00*
> Check : ordering OK
> Check : Graph Symmetry
> Correction
> Add 4090 null terms
> OK
> Check : Sort CSC OK
> 1 2.06e-05 7.29e-04 1.03e-04 0.00e+00* 0.00e+00* 0.00e+00* 3.85e-01
> Check : ordering OK
> Check : Graph Symmetry
> ==1416== Thread 1:
> ==1416== Invalid read of size 4
> ==1416== at 0x1BC3186: csc_checksym (csc_utils.c:321)
> ==1416== by 0x1B4E7E3: pastix_checkMatrix (pastix.c:3915)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8aeb1a8 is not stack'd, malloc'd or (recently) free'd
> ==1416==
>
> Correction
> ==1416== Invalid read of size 4
> ==1416== at 0x1BD1147: correct2 (cscsymcsc.c:77)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8aeb1a8 is not stack'd, malloc'd or (recently) free'd
> ==1416==
> ==1416== Invalid read of size 4
> ==1416== at 0x1BD10D7: correct2 (cscsymcsc.c:67)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416== at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416== by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416== at 0x1BD10EB: correct2 (cscsymcsc.c:72)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416== at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416== by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416== at 0x1BD1102: correct2 (cscsymcsc.c:75)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8d369dc is 4 bytes before a block of size 4,216 alloc'd
> ==1416== at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416== by 0x150812B: MatConvertToCSC (pastix.c:169)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416== at 0x1BD116D: correct2 (cscsymcsc.c:88)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416== at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416== by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416== at 0x1BD1179: correct2 (cscsymcsc.c:88)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8c1b55c is 4 bytes before a block of size 4,212 alloc'd
> ==1416== at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416== by 0x1BD1029: correct2 (cscsymcsc.c:53)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==
> ==1416== Invalid read of size 4
> ==1416== at 0x1BD1186: correct2 (cscsymcsc.c:88)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8ae7574 is 0 bytes after a block of size 68,180 alloc'd
> ==1416== at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416== by 0x15080FF: MatConvertToCSC (pastix.c:168)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== by 0x10BAB53: FVM_NonlinearSolver::sens_solve() (fvm_nonlinear_solver.cc:824)
> ==1416==
> ==1416== Invalid write of size 4
> ==1416== at 0x1BD1192: correct2 (cscsymcsc.c:88)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416== by 0x17BCF6C: SNESSolve (snes.c:2255)
> ==1416== Address 0x8c1b55c is 4 bytes before a block of size 4,212 alloc'd
> ==1416== at 0x4A061EF: malloc (vg_replace_malloc.c:236)
> ==1416== by 0x1BD1029: correct2 (cscsymcsc.c:53)
> ==1416== by 0x1B4E8EA: pastix_checkMatrix (pastix.c:3930)
> ==1416== by 0x1508661: MatConvertToCSC (pastix.c:185)
> ==1416== by 0x150AA2E: MatFactorNumeric_PaStiX (pastix.c:396)
> ==1416== by 0x139C90B: MatLUFactorNumeric (matrix.c:2587)
> ==1416== by 0x16AF49A: PCSetUp_LU (lu.c:158)
> ==1416== by 0x1AA0136: PCSetUp (precon.c:795)
> ==1416== by 0x16FECC4: KSPSetUp (itfunc.c:237)
> ==1416== by 0x16FFF1E: KSPSolve (itfunc.c:353)
> ==1416== by 0x17C3061: SNES_KSPSolve (snes.c:2944)
> ==1416== by 0x17D33DE: SNESSolve_LS (ls.c:191)
> ==1416==
> [0]PETSC ERROR: ------------------------------------------------------------------------
> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range
> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger
> [0]PETSC ERROR: or see http://www.mcs.anl.gov/petsc/petsc-as/documentation/troubleshooting.html#Signal
> [0]PETSC ERROR: or try http://valgrind.org on GNU/linux and Apple Mac OS X to find memory corruption errors
> [0]PETSC ERROR: likely location of problem given in stack below