On Thu, Jan 26, 2012 at 11:13 PM, Xuefei (Rebecca) Yuan <xyuan@lbl.gov> wrote:
<div style="word-wrap:break-word"><div><span style="text-indent:0px;letter-spacing:normal;font-variant:normal;text-align:auto;font-style:normal;font-weight:normal;line-height:normal;border-collapse:separate;text-transform:none;font-size:medium;white-space:normal;font-family:Helvetica;word-spacing:0px"><span style="text-indent:0px;letter-spacing:normal;font-variant:normal;font-style:normal;font-weight:normal;line-height:normal;border-collapse:separate;text-transform:none;font-size:medium;white-space:normal;font-family:Helvetica;word-spacing:0px"><div style="word-wrap:break-word">
Here is another error message if running on local mac:</div></span></span>
>
> *************petsc-Dev = yes*****************
> *********************************************
> ******* start solving for time = 0.10000 at time step = 1******
> ******* start solving for time = 0.10000 at time step = 1******
>   0 SNES Function norm 2.452320964164e-02
>   0 SNES Function norm 2.452320964164e-02
> Matrix Object: 1 MPI processes
>   type: seqaij
>   rows=16384, cols=16384
>   total: nonzeros=831552, allocated nonzeros=1577536
>   total number of mallocs used during MatSetValues calls =0
>     using I-node routines: found 4096 nodes, limit used is 5
> Matrix Object: 1 MPI processes
>   type: seqaij
>   rows=16384, cols=16384
>   total: nonzeros=831552, allocated nonzeros=1577536
>   total number of mallocs used during MatSetValues calls =0
>     using I-node routines: found 4096 nodes, limit used is 5
>  Runtime parameters:
>    Objective type: Unknown!
>    Coarsening type: Unknown!
>    Initial partitioning type: Unknown!
>    Refinement type: Unknown!
>    Number of balancing constraints: 1
>    Number of refinement iterations: 1606408608
>    Random number seed: 1606408644
>    Number of separators: 48992256
>    Compress graph prior to ordering: Yes
>    Detect & order connected components separately: Yes
>    Prunning factor for high degree vertices: 0.100000
>    Allowed maximum load imbalance: 1.001
>
> Input Error: Incorrect objective type.
>  nbrpool statistics
>         nbrpoolsize:       0   nbrpoolcpos:       0
>         nbrpoolreallocs:       0

Can you run that through valgrind? It looks like it might be prior memory corruption.

   Matt
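For reference, a typical way to do that with the MPICH built by this configure looks something like the line below; the executable name is taken from the log, and the remaining options are just whatever the run normally uses:

    mpiexec -n 2 valgrind --tool=memcheck -q --num-callers=20 --log-file=valgrind.log.%p ./twcartffxmhd.exe <usual options>

Each process then writes its own valgrind.log.<pid>, which should show any invalid reads or writes that happen before the SEGV.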
<div style="word-wrap:break-word"><div><div>[0]PETSC ERROR: ------------------------------------------------------------------------</div></div><div><div>[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range</div>
<div>[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger</div><div>[0]PETSC ERROR: or see <a href="http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC</a> ERROR: or try <a href="http://valgrind.org" target="_blank">http://valgrind.org</a> on GNU/linux and Apple Mac OS X to find memory corruption errors</div>
<div>[0]PETSC ERROR: likely location of problem given in stack below</div><div>[0]PETSC ERROR: --------------------- Stack Frames ------------------------------------</div><div>[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,</div>
<div>[0]PETSC ERROR: INSTEAD the line number of the start of the function</div><div>[0]PETSC ERROR: is given.</div><div>[0]PETSC ERROR: [0] MatLUFactorNumeric_SuperLU_DIST line 284 /Users/xyuan/Software_macbook/petsc-dev/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c</div>
<div>[0]PETSC ERROR: [0] MatLUFactorNumeric line 2871 /Users/xyuan/Software_macbook/petsc-dev/src/mat/interface/matrix.c</div></div><div><div>[0]PETSC ERROR: [0] PCSetUp_LU line 108 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/impls/factor/lu/lu.c</div>
<div>[0]PETSC ERROR: [0] PCSetUp line 810 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/interface/precon.c</div><div>[0]PETSC ERROR: [0] KSPSetUp line 184 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c</div>
<div>[0]PETSC ERROR: [0] KSPSolve line 334 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c</div><div>[0]PETSC ERROR: [0] SNES_KSPSolve line 3874 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c</div>
<div>[0]PETSC ERROR: [0] SNESSolve_LS line 593 /Users/xyuan/Software_macbook/petsc-dev/src/snes/impls/ls/ls.c</div><div>[0]PETSC ERROR: [0] SNESSolve line 3061 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c</div>
<div>[0]PETSC ERROR: [0] DMMGSolveSNES line 538 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damgsnes.c</div><div>[0]PETSC ERROR: [0] DMMGSolve line 303 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damg.c</div>
<div>[0]PETSC ERROR: [0] Solve line 374 twcartffxmhd.c</div><div>[0]PETSC ERROR: --------------------- Error Message ------------------------------------</div></div><div><div>[0]PETSC ERROR: Signal received!</div><div>[0]PETSC ERROR: ------------------------------------------------------------------------</div>
<div>[0]PETSC ERROR: Petsc Development HG revision: 905af3a7d7cdee7d0b744502bace1d74dc34b204 HG Date: Sun Jan 22 16:10:04 2012 -0700</div><div>[0]PETSC ERROR: See docs/changes/index.html for recent updates.</div><div>[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.</div>
<div>[0]PETSC ERROR: See docs/index.html for manual pages.</div><div>[0]PETSC ERROR: ------------------------------------------------------------------------</div><div>[0]PETSC ERROR: ./twcartffxmhd.exe on a arch-osx- named DOE6897708.local by xyuan Thu Jan 26 21:09:47 2012</div>
<div>[0]PETSC ERROR: Libraries linked from /Users/xyuan/Software_macbook/petsc-dev/arch-osx-10.6-c-pkgs-opt-debug/lib</div><div>[0]PETSC ERROR: Configure run at Mon Jan 23 10:21:17 2012</div></div><div><div>[0]PETSC ERROR: Configure options --with-cc="gcc -m64" --with-fc="gfortran -m64" --with-cxx=g++ --with-debugging=1 -download-f-blas-lapack=1 --download-mpich=1 --download-plapack=1 --download-parmetis=1 --download-metis=1 --download-triangle=1 --download-spooles=1 --download-superlu=1 --download-superlu_dist=/Users/xyuan/Software_macbook/superlu_dist_3.0.tar.gz --download-blacs=1 --download-scalapack=1 --download-mumps=1 --download-hdf5=1 --download-sundials=1 --download-prometheus=1 --download-umfpack=1 --download-chaco=1 --download-spai=1 --download-ptscotch=1 --download-pastix=1 --download-prometheus=1 --download-cmake=1 PETSC_ARCH=arch-osx-10.6-c-pkgs-opt-debug</div>
<div>[0]PETSC ERROR: ------------------------------------------------------------------------</div><div>[0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file</div><div>application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</div>
<div>[unset]: aborting job:</div><div>application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</div><div> Runtime parameters:</div><div> Objective type: Unknown!</div><div> Coarsening type: Unknown!</div><div> Initial partitioning type: Unknown!</div>
<div> Refinement type: Unknown!</div><div> Number of balancing constraints: 1</div><div> Number of refinement iterations: 1606408608</div></div><div><div> Random number seed: 1606408644</div><div> Number of separators: 48992256</div>
<div> Compress graph prior to ordering: Yes</div><div> Detect & order connected components separately: Yes</div><div> Prunning factor for high degree vertices: 0.100000</div><div> Allowed maximum load imbalance: 1.001 </div>
<div><br></div><div>Input Error: Incorrect objective type.</div><div> nbrpool statistics</div><div> nbrpoolsize: 0 nbrpoolcpos: 0</div><div> nbrpoolreallocs: 0</div><div><br></div>
<div>[0]PETSC ERROR: ------------------------------------------------------------------------</div><div>[0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range</div><div>[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger</div>
<div>[0]PETSC ERROR: or see <a href="http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind[0]PETSC</a> ERROR: or try <a href="http://valgrind.org" target="_blank">http://valgrind.org</a> on GNU/linux and Apple Mac OS X to find memory corruption errors</div>
<div>[0]PETSC ERROR: likely location of problem given in stack below</div><div>[0]PETSC ERROR: --------------------- Stack Frames ------------------------------------</div></div><div><div>[0]PETSC ERROR: Note: The EXACT line numbers in the stack are not available,</div>
<div>[0]PETSC ERROR: INSTEAD the line number of the start of the function</div><div>[0]PETSC ERROR: is given.</div><div>[0]PETSC ERROR: [0] MatLUFactorNumeric_SuperLU_DIST line 284 /Users/xyuan/Software_macbook/petsc-dev/src/mat/impls/aij/mpi/superlu_dist/superlu_dist.c</div>
<div>[0]PETSC ERROR: [0] MatLUFactorNumeric line 2871 /Users/xyuan/Software_macbook/petsc-dev/src/mat/interface/matrix.c</div><div>[0]PETSC ERROR: [0] PCSetUp_LU line 108 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/impls/factor/lu/lu.c</div>
<div>[0]PETSC ERROR: [0] PCSetUp line 810 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/pc/interface/precon.c</div><div>[0]PETSC ERROR: [0] KSPSetUp line 184 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c</div>
<div>[0]PETSC ERROR: [0] KSPSolve line 334 /Users/xyuan/Software_macbook/petsc-dev/src/ksp/ksp/interface/itfunc.c</div><div>[0]PETSC ERROR: [0] SNES_KSPSolve line 3874 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c</div>
<div>[0]PETSC ERROR: [0] SNESSolve_LS line 593 /Users/xyuan/Software_macbook/petsc-dev/src/snes/impls/ls/ls.c</div><div>[0]PETSC ERROR: [0] SNESSolve line 3061 /Users/xyuan/Software_macbook/petsc-dev/src/snes/interface/snes.c</div>
<div>[0]PETSC ERROR: [0] DMMGSolveSNES line 538 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damgsnes.c</div></div><div><div>[0]PETSC ERROR: [0] DMMGSolve line 303 /Users/xyuan/Software_macbook/petsc-dev/src/snes/utils/damg.c</div>
<div>[0]PETSC ERROR: [0] Solve line 374 twcartffxmhd.c</div><div>[0]PETSC ERROR: --------------------- Error Message ------------------------------------</div><div>[0]PETSC ERROR: Signal received!</div><div>[0]PETSC ERROR: ------------------------------------------------------------------------</div>
<div>[0]PETSC ERROR: Petsc Development HG revision: 905af3a7d7cdee7d0b744502bace1d74dc34b204 HG Date: Sun Jan 22 16:10:04 2012 -0700</div><div>[0]PETSC ERROR: See docs/changes/index.html for recent updates.</div><div>[0]PETSC ERROR: See docs/faq.html for hints about trouble shooting.</div>
<div>[0]PETSC ERROR: See docs/index.html for manual pages.</div><div>[0]PETSC ERROR: ------------------------------------------------------------------------</div><div>[0]PETSC ERROR: ./twcartffxmhd.exe on a arch-osx- named DOE6897708.local by xyuan Thu Jan 26 21:09:47 2012</div>
<div>[0]PETSC ERROR: Libraries linked from /Users/xyuan/Software_macbook/petsc-dev/arch-osx-10.6-c-pkgs-opt-debug/lib</div><div>[0]PETSC ERROR: Configure run at Mon Jan 23 10:21:17 2012</div></div><div><div>[0]PETSC ERROR: Configure options --with-cc="gcc -m64" --with-fc="gfortran -m64" --with-cxx=g++ --with-debugging=1 -download-f-blas-lapack=1 --download-mpich=1 --download-plapack=1 --download-parmetis=1 --download-metis=1 --download-triangle=1 --download-spooles=1 --download-superlu=1 --download-superlu_dist=/Users/xyuan/Software_macbook/superlu_dist_3.0.tar.gz --download-blacs=1 --download-scalapack=1 --download-mumps=1 --download-hdf5=1 --download-sundials=1 --download-prometheus=1 --download-umfpack=1 --download-chaco=1 --download-spai=1 --download-ptscotch=1 --download-pastix=1 --download-prometheus=1 --download-cmake=1 PETSC_ARCH=arch-osx-10.6-c-pkgs-opt-debug</div>
<div>[0]PETSC ERROR: ------------------------------------------------------------------------</div><div>[0]PETSC ERROR: User provided function() line 0 in unknown directory unknown file</div><div>application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</div>
<div>[unset]: aborting job:</div><div>application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</div></div><div><br></div><div>R</div><div><br></div><div><br></div><div><br></div><div><br></div>
> On Jan 26, 2012, at 9:05 PM, Xuefei (Rebecca) Yuan wrote:
>
>> Hello Mark,
>>
>> Actually I have tried those options for a sequential run, where I need to use superlu to get the condition number of some matrix.
>>
>> In the dev version,
>>
>>   ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr);
>>   ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);
>>
>> and in the options file,
>>
>>   -dm_preallocate_only
>>
>> is added.
>>
>> This is totally fine when np=1; however, when I use multiple processors, some memory corruption happens.
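A minimal debugging sketch for this kind of parallel-only breakage (not part of the original code; it just reuses the names from the snippet above): making any entry that falls outside the preallocated pattern an error, instead of a silent reallocation, reports the offending rank and row directly.

    ierr = DMCreateMatrix(DMMGGetDM(dmmg), MATAIJ, &jacobian);CHKERRQ(ierr);
    /* Error out at the first MatSetValues entry that was not preallocated,
       rather than mallocing new space on the fly during assembly. */
    ierr = MatSetOption(jacobian, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_TRUE);CHKERRQ(ierr);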
>>
>> For example, the number of true nonzeros for the 65536-size matrix is 1,470,802. The output (&) below is for np=1 with the following PETSc-related options:
>>
>>   -dm_preallocate_only
>>   -snes_ksp_ew true
>>   -snes_monitor
>>   -snes_max_it 1
>>   -ksp_view
>>   -mat_view_info
>>   -ksp_type preonly
>>   -pc_type lu
>>   -pc_factor_mat_solver_package superlu
>>   -mat_superlu_conditionnumber
>>   -mat_superlu_printstat
>>
>> However, when np=2, the number of nonzeros changes to 3,366,976 with the following PETSc-related options; (*) is the corresponding output.
>>
>>   -dm_preallocate_only
>>   -snes_ksp_ew true
>>   -snes_monitor
>>   -ksp_view
>>   -mat_view_info
>>   -ksp_type preonly
>>   -pc_type lu
>>   -pc_factor_mat_solver_package superlu_dist
>>
>> -----------------------------
>> (&)
>>
>> *************petsc-Dev = yes*****************
>> *********************************************
>> ******* start solving for time = 1.00000 at time step = 1******
>>   0 SNES Function norm 1.242539468950e-02
>> Matrix Object: 1 MPI processes
>>   type: seqaij
>>   rows=65536, cols=65536
>>   total: nonzeros=1470802, allocated nonzeros=2334720
>>   total number of mallocs used during MatSetValues calls =0
>>     not using I-node routines
>>   Recip. condition number = 4.345658e-07
>> MatLUFactorNumeric_SuperLU():
>> Factor time  = 42.45
>> Factor flops = 7.374620e+10    Mflops = 1737.25
>> Solve time = 0.00
>> Number of memory expansions: 3
>>  No of nonzeros in factor L = 32491856
>>  No of nonzeros in factor U = 39390974
>>  No of nonzeros in L+U = 71817294
>>  L\U MB 741.397  total MB needed 756.339
>> Matrix Object: 1 MPI processes
>>   type: seqaij
>>   rows=65536, cols=65536
>>   package used to perform factorization: superlu
>>   total: nonzeros=0, allocated nonzeros=0
>>   total number of mallocs used during MatSetValues calls =0
>>     SuperLU run parameters:
>>       Equil: NO
>>       ColPerm: 3
>>       IterRefine: 0
>>       SymmetricMode: NO
>>       DiagPivotThresh: 1
>>       PivotGrowth: NO
>>       ConditionNumber: YES
>>       RowPerm: 0
>>       ReplaceTinyPivot: NO
>>       PrintStat: YES
>>       lwork: 0
>> MatSolve__SuperLU():
>> Factor time  = 42.45
>> Factor flops = 7.374620e+10    Mflops = 1737.25
>> Solve time = 0.59
>> Solve flops = 1.436365e+08    Mflops = 243.45
>> Number of memory expansions: 3
>>   1 SNES Function norm 2.645145585949e-04
>>
>> -----------------------------------------------
>> (*)
>>
>> *************petsc-Dev = yes*****************
>> *********************************************
>> ******* start solving for time = 1.00000 at time step = 1******
>>   0 SNES Function norm 1.242539468950e-02
>> Matrix Object: 2 MPI processes
>>   type: mpiaij
>>   rows=65536, cols=65536
>>   total: nonzeros=3366976, allocated nonzeros=6431296
>>   total number of mallocs used during MatSetValues calls =0
>>   Matrix Object: 2 MPI processes
>>     type: mpiaij
>>     rows=65536, cols=65536
>>     total: nonzeros=3366976, allocated nonzeros=3366976
>>     total number of mallocs used during MatSetValues calls =0
>>       using I-node (on process 0) routines: found 8192 nodes, limit used is 5
>> Input Error: Incorrect objective type.
>> Input Error: Incorrect objective type.
>> At column 0, pivotL() encounters zero diagonal at line 708 in file symbfact.c
>> At column 0, pivotL() encounters zero diagonal at line 708 in file symbfact.c
>>
>> Moreover, when I use valgrind with --leak-check=yes --track-origins=yes, there are 441 errors from 219 contexts reported in PetscInitialize(), before SNESSolve() is even called. Is this normal for dev?
>>
>> Thanks very much!
>>
>> Best regards,
>>
>> Rebecca
>>
>> On Jan 26, 2012, at 5:06 PM, Jed Brown wrote:
<br><blockquote type="cite"><div class="gmail_quote">On Thu, Jan 26, 2012 at 19:00, Mark F. Adams <span dir="ltr"><<a href="mailto:mark.adams@columbia.edu" target="_blank">mark.adams@columbia.edu</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I'm guessing that PETSc recently changed and now filters out 0.0 in MatSetValues ... is this true?<br></blockquote><div><br></div><div>Did the option MAT_IGNORE_ZERO_ENTRIES get set somehow?</div></div>
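For context, this is the call that turns that behavior on (the same call shown in Rebecca's message above); once it is set on an AIJ matrix, explicit 0.0 values passed to MatSetValues no longer create locations in the nonzero structure, so the assembled pattern, and hence what a direct solver such as SuperLU or SuperLU_DIST factors, can differ from a run without it:

    /* From here on, values equal to 0.0 handed to MatSetValues are ignored
       and create no (structurally zero) locations in the matrix. */
    ierr = MatSetOption(jacobian, MAT_IGNORE_ZERO_ENTRIES, PETSC_TRUE);CHKERRQ(ierr);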

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener