<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">Mahir:<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="SV" link="blue" vlink="purple"><div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"><u></u> <u></u></span></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I have been using PETSC_COMM_WORLD.</span></p></div></div></blockquote><div> </div><div>What do you get by running a petsc example, e.g.,</div><div>petsc/src/ksp/ksp/examples/tutorials<br></div><div>mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view</div><div><br></div><div><div>KSP Object: 2 MPI processes</div><div> type: gmres</div></div><div>...</div><div><br></div><div>Hong</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div lang="SV" link="blue" vlink="purple"><div>
<p class="MsoNormal"><u></u> <u></u></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Hong [mailto:<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>]
<br>
<b>Sent:</b> den 5 augusti 2015 17:11<br>
<b>To:</b> Ülker-Kaustell, Mahir<br>
<b>Cc:</b> Hong; Xiaoye S. Li; PETSc users list<br>
<b>Subject:</b> Re: [petsc-users] SuperLU MPI-problem<u></u><u></u></span></p>
<p class="MsoNormal"><u></u> <u></u></p>
Mahir:

As you noticed, you ran the code in serial mode, not in parallel.
Check the input communicator in your code, e.g., which communicator do you pass to KSPCreate(comm,&ksp)?

I have added an error flag to the superlu_dist interface (released version). When a user gives '-mat_superlu_dist_parsymbfact' in serial mode, the option is now ignored with a warning.

Hong
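
For reference, a minimal sketch of the communicator point above, assuming a standard PETSc 3.6-style program (the 100x100 diagonal test matrix is just a placeholder). Creating the matrix and the KSP on PETSC_COMM_WORLD gives one solver spread over all ranks started by mpiexec; PETSC_COMM_SELF would instead give every rank its own serial solver, which shows up as "1 MPI processes" in -ksp_view:

#include <petscksp.h>

int main(int argc, char **argv)
{
  KSP         ksp;
  Mat         A;
  Vec         b, x;
  PetscInt    i, rstart, rend, n = 100;
  PetscMPIInt size;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_Comm_size(PETSC_COMM_WORLD, &size);
  PetscPrintf(PETSC_COMM_WORLD, "Running on %d MPI processes\n", size);

  /* Small diagonal test matrix, distributed over PETSC_COMM_WORLD */
  MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 1, NULL, 0, NULL, &A);
  MatGetOwnershipRange(A, &rstart, &rend);
  for (i = rstart; i < rend; i++) MatSetValue(A, i, i, 2.0, INSERT_VALUES);
  MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);

  MatCreateVecs(A, &x, &b);
  VecSet(b, 1.0);

  /* Parallel solver; KSPCreate(PETSC_COMM_SELF,&ksp) would give a serial solver per rank */
  KSPCreate(PETSC_COMM_WORLD, &ksp);
  KSPSetOperators(ksp, A, A);
  KSPSetFromOptions(ksp);  /* honours -ksp_type, -pc_type, -pc_factor_mat_solver_package, -ksp_view, ... */
  KSPSolve(ksp, b, x);

  KSPDestroy(&ksp);
  MatDestroy(&A);
  VecDestroy(&b);
  VecDestroy(&x);
  PetscFinalize();
  return 0;
}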

Hong,

If I set parsymbfact:
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">$ mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput DISTRIBUTED -mat_superlu_dist_parsymbfact -ksp_view</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">Invalid ISPEC at line 484 in file get_perm_c.c</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">Invalid ISPEC at line 484 in file get_perm_c.c</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">-------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">Primary job terminated normally, but 1 process returned</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">a non-zero exit code.. Per user-direction, the job has been aborted.</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">-------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">--------------------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">mpiexec detected that one or more processes exited with non-zero status, thus causing</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">the job to be terminated. The first process to do so was:</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> </span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> </span>
<span style="font-size:9.0pt;font-family:"Lucida Console"">Process name: [[63679,1],0]</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span style="font-size:9.0pt;font-family:"Lucida Console""> Exit code: 255</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span style="font-size:9.0pt;font-family:"Lucida Console"">--------------------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d">Since the program does not finish the call to KSPSolve(), we do not get any information about the KSP from –ksp_view.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d">If I do not set it, I get a serial run even if I specify –n 2:</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -ksp_view</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d">…</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">KSP Object: 1 MPI processes</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> type: preonly</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> maximum iterations=10000, initial guess is zero</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> tolerances: relative=1e-05, absolute=1e-50, divergence=10000</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> left preconditioning</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> using NONE norm type for convergence test</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">PC Object: 1 MPI processes</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> type: lu</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> LU: out-of-place factorization</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> tolerance for zero pivot 2.22045e-14</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> matrix ordering: nd</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> factor fill ratio given 0, needed 0</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Factored matrix follows:</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Mat Object: 1 MPI processes</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> type: seqaij</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> rows=954, cols=954</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> package used to perform factorization: superlu_dist</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> total: nonzeros=0, allocated nonzeros=0</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> total number of mallocs used during MatSetValues calls =0</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> SuperLU_DIST run parameters:</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Process grid nprow 1 x npcol 1</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Equilibrate matrix TRUE</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Matrix input mode 0</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Replace tiny pivots TRUE</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Use iterative refinement FALSE</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Processors in row 1 col partition 1</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Row permutation LargeDiag</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Column permutation METIS_AT_PLUS_A</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Parallel symbolic factorization FALSE</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Repeated factorization SamePattern_SameRowPerm</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> linear system matrix = precond matrix:</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> Mat Object: 1 MPI processes</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> type: seqaij</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> rows=954, cols=954</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> total: nonzeros=34223, allocated nonzeros=34223</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> total number of mallocs used during MatSetValues calls =0</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console""> using I-node routines: found 668 nodes, limit used is 5</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d">I am running PETSc via Cygwin on a windows machine.
</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d">When I installed PETSc the tests with different numbers of processes ran well.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d">Mahir</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Hong
[mailto:<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>]
<br>
<b>Sent:</b> den 3 augusti 2015 19:06<br>
<b>To:</b> Ülker-Kaustell, Mahir<br>
<b>Cc:</b> Hong; Xiaoye S. Li; PETSc users list<br>
<b>Subject:</b> Re: [petsc-users] SuperLU MPI-problem</span><u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
Mahir,

> I have not used …parsymbfact in sequential runs or set matinput=GLOBAL for parallel runs.
>
> If I use 2 processors, the program runs if I use -mat_superlu_dist_parsymbfact=1:
> mpiexec -n 2 ./solve -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact=1

The incorrect option '-mat_superlu_dist_parsymbfact=1' is not accepted, so your code simply runs without parsymbfact.

Please run it with '-ksp_view' and see which 'SuperLU_DIST run parameters:' are being used, e.g.
petsc/src/ksp/ksp/examples/tutorials (maint)
$ mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_parsymbfact=1 -ksp_view

...
            SuperLU_DIST run parameters:
              Process grid nprow 2 x npcol 1
              Equilibrate matrix TRUE
              Matrix input mode 1
              Replace tiny pivots TRUE
              Use iterative refinement FALSE
              Processors in row 2 col partition 1
              Row permutation LargeDiag
              Column permutation METIS_AT_PLUS_A
              Parallel symbolic factorization FALSE
              Repeated factorization SamePattern_SameRowPerm

I do not understand why your code uses matrix input mode = global.

Hong

From: Hong [mailto:hzhang@mcs.anl.gov]
Sent: 3 August 2015 16:46
To: Xiaoye S. Li
Cc: Ülker-Kaustell, Mahir; Hong; PETSc users list
Subject: Re: [petsc-users] SuperLU MPI-problem

Mahir,

Sherry found the culprit. I can reproduce it:
petsc/src/ksp/ksp/examples/tutorials
mpiexec -n 2 ./ex2 -pc_type lu -pc_factor_mat_solver_package superlu_dist -mat_superlu_dist_matinput GLOBAL -mat_superlu_dist_parsymbfact

Invalid ISPEC at line 484 in file get_perm_c.c
Invalid ISPEC at line 484 in file get_perm_c.c
-------------------------------------------------------
Primary job terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
...

The PETSc-superlu_dist interface sets matinput=DISTRIBUTED as the default when using more than one process.
Did you either use '-mat_superlu_dist_parsymbfact' for a sequential run or set matinput=GLOBAL for a parallel run?

I'll add an error flag for these use cases.

Hong

<p class="MsoNormal">On Mon, Aug 3, 2015 at 9:17 AM, Xiaoye S. Li <<a href="mailto:xsli@lbl.gov" target="_blank">xsli@lbl.gov</a>> wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif"">I think I know the problem. Since zdistribute.c is called, I guess you are using the global (replicated) matrix input interface,
pzgssvx_ABglobal(). This interface does not allow you to use parallel symbolic factorization (since matrix is centralized).</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> </span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif"">That's why you get the following error:</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:13.5pt;font-family:"Courier New"">Invalid ISPEC at line 484 in file get_perm_c.c</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> </span><u></u><u></u></p>
</div>
<p class="MsoNormal">You need to use distributed matrix input interface pzgssvx() (without ABglobal)<br>
<br>
Sherry<u></u><u></u></p>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> </span><u></u><u></u></p>
</div>
</div>
<div>
<div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
<div>
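From the PETSc side, the distributed input path is what the runtime option shown earlier in this thread selects (-mat_superlu_dist_matinput DISTRIBUTED). As a sketch only, and not the fix that eventually resolved the thread, the same options could be hard-wired in the code so they cannot be forgotten on the command line; PetscOptionsSetValue() is used here with its PETSc 3.6 calling sequence:

#include <petscksp.h>

/* Pre-set SuperLU_DIST-related runtime options in code.
   The option names are the ones used elsewhere in this thread. */
static PetscErrorCode SetSuperLUDistOptions(void)
{
  PetscErrorCode ierr;

  ierr = PetscOptionsSetValue("-pc_type", "lu");CHKERRQ(ierr);
  ierr = PetscOptionsSetValue("-pc_factor_mat_solver_package", "superlu_dist");CHKERRQ(ierr);
  /* Distributed matrix input, i.e. the pzgssvx() path rather than pzgssvx_ABglobal() */
  ierr = PetscOptionsSetValue("-mat_superlu_dist_matinput", "DISTRIBUTED");CHKERRQ(ierr);
  return 0;
}

Calling this after PetscInitialize() and before KSPSetFromOptions() overrides any value given on the command line for the same options.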
<p class="MsoNormal">On Mon, Aug 3, 2015 at 5:02 AM,
<a href="mailto:Mahir.Ulker-Kaustell@tyrens.se" target="_blank">Mahir.Ulker-Kaustell@tyrens.se</a> <<a href="mailto:Mahir.Ulker-Kaustell@tyrens.se" target="_blank">Mahir.Ulker-Kaustell@tyrens.se</a>> wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Hong and Sherry,</span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">I have rebuilt PETSc with SuperLU 4.1. Unfortunately, the problem remains:</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">If I use
</span><span lang="EN-US" style="font-size:9.0pt;font-family:"Courier New"">-mat_superlu_dist_parsymbfact</span><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">, the program crashes with:
</span><span lang="EN-US" style="font-size:9.0pt;font-family:"Courier New"">Invalid ISPEC at line 484 in file get_perm_c.c</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">If I use
</span><span lang="EN-US" style="font-size:9.0pt;font-family:"Courier New"">-mat_superlu_dist_parsymbfact=1</span><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> or leave this flag out, the program crashes with:
</span><span lang="EN-US" style="font-size:9.0pt;font-family:"Courier New"">Calloc fails for SPA dense[]. at line 438 in file zdistribute.c</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Courier New""> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">Mahir</span><u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Hong
[mailto:<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>]
<br>
<b>Sent:</b> den 30 juli 2015 02:58<br>
<b>To:</b> Ülker-Kaustell, Mahir<br>
<b>Cc:</b> Xiaoye Li; PETSc users list</span><u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><br>
<b>Subject:</b> Fwd: [petsc-users] SuperLU MPI-problem<u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
Mahir,

Sherry fixed several bugs in superlu_dist-v4.1.
The current petsc-release interfaces with superlu_dist-v4.0.
We do not know whether the reported issue (attached below) has been resolved or not. If not, can you test it with the latest superlu_dist-v4.1?

Here is how to do it:
1. download superlu_dist v4.1
2. remove the existing PETSC_ARCH directory, then configure petsc with
   '--download-superlu_dist=superlu_dist_4.1.tar.gz'
3. build petsc

Let us know if the issue remains.

Hong

---------- Forwarded message ----------
From: Xiaoye S. Li <xsli@lbl.gov>
Date: Wed, Jul 29, 2015 at 2:24 PM
Subject: Fwd: [petsc-users] SuperLU MPI-problem
To: Hong Zhang <hzhang@mcs.anl.gov>

Hong,

I am cleaning out my mailbox and saw this unresolved issue. I am not sure whether the new fix to the parallel symbolic factorization solves the problem. What bothers me is that he is getting the following error:

Invalid ISPEC at line 484 in file get_perm_c.c

This has nothing to do with my bug fix.

Shall we ask him to try the new version, or try to get his matrix?

Sherry

---------- Forwarded message ----------
From: Mahir.Ulker-Kaustell@tyrens.se <Mahir.Ulker-Kaustell@tyrens.se>
Date: Wed, Jul 22, 2015 at 1:32 PM
Subject: RE: [petsc-users] SuperLU MPI-problem
To: Hong <hzhang@mcs.anl.gov>, "Xiaoye S. Li" <xsli@lbl.gov>
Cc: petsc-users <petsc-users@mcs.anl.gov>

The 1000 was just a conservative guess. The number of non-zeros per row is in the tens in general, but certain constraints lead to non-diagonal streaks in the sparsity pattern.
Is it the reordering of the matrix that is killing me here? How can I set options.ColPerm?

<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">If i use -mat_superlu_dist_parsymbfact the program crashes with</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">Invalid ISPEC at line 484 in file get_perm_c.c</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">-------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">Primary job terminated normally, but 1 process returned</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">a non-zero exit code.. Per user-direction, the job has been aborted.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">-------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: ------------------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process
to end</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: or see
<a href="http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind" target="_blank">
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind</a></span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: or try
<a href="http://valgrind.org" target="_blank">http://valgrind.org</a> on GNU/linux and Apple Mac OS X to find memory corruption errors</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: to get more information on the crash.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Signal received</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: See
<a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html</a> for trouble shooting.</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:23 2015</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc
--with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1 --with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist
--download-fftw</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: #1 User provided function() line 0 in unknown file</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[unset]: aborting job:</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: ------------------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d">If i use -mat_superlu_dist_parsymbfact=1 the program crashes (somewhat later) with</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">Malloc fails for Lnzval_bc_ptr[*][] at line 626 in file zdistribute.c</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">col block 3006 -------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">Primary job terminated normally, but 1 process returned</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">a non-zero exit code.. Per user-direction, the job has been aborted.</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">-------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">col block 1924 [0]PETSC ERROR: ------------------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Caught signal number 15 Terminate: Some process (or the batch system) has told this process to end</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: or see
<a href="http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind" target="_blank">
http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind</a></span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: or try
<a href="http://valgrind.org" target="_blank">http://valgrind.org</a> on GNU/linux and Apple Mac OS X to find memory corruption errors</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: configure using --with-debugging=yes, recompile, link, and run</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: to get more information on the crash.</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Signal received</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: See
<a href="http://www.mcs.anl.gov/petsc/documentation/faq.html" target="_blank">http://www.mcs.anl.gov/petsc/documentation/faq.html</a> for trouble shooting.</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Petsc Release Version 3.6.0, Jun, 09, 2015</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: ./solve on a cygwin-complex-nodebug named CZC5202SM2 by muk Wed Jul 22 21:59:58 2015</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: Configure options PETSC_DIR=/packages/petsc-3.6.0 PETSC_ARCH=cygwin-complex-nodebug --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-fortran-kernels=1
--with-scalar-type=complex --download-fblaspack --download-mpich --download-scalapack --download-mumps --download-metis --download-parmetis --download-superlu --download-superlu_dist --download-fftw</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: #1 User provided function() line 0 in unknown file</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">[unset]: aborting job:</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span lang="EN-US" style="font-size:9.0pt;font-family:"Lucida Console"">application called MPI_Abort(MPI_COMM_WORLD, 59) - process 0</span><u></u><u></u></p>
<p class="MsoNormal" style="text-autospace:none">
<span style="font-size:9.0pt;font-family:"Lucida Console"">[0]PETSC ERROR: ------------------------------------------------------------------------</span><u></u><u></u></p>
<p class="MsoNormal"><span lang="EN-US" style="font-size:11.0pt;font-family:"Calibri","sans-serif";color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"><span style="color:#1f497d">/Mahir</span><u></u><u></u></p>
<p class="MsoNormal"><span style="color:#1f497d"> </span><u></u><u></u></p>
<p class="MsoNormal"> <u></u><u></u></p>
<p class="MsoNormal"><b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif"">From:</span></b><span lang="EN-US" style="font-size:10.0pt;font-family:"Tahoma","sans-serif""> Hong
[mailto:<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>]
</span><u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><b>Sent:</b> den 22 juli 2015 21:34<br>
<b>To:</b> Xiaoye S. Li<br>
<b>Cc:</b> Ülker-Kaustell, Mahir; petsc-users<u></u><u></u></p>
</div>
</div>
<div>
<div>
<div>
<div>
<p class="MsoNormal"><br>
<b>Subject:</b> Re: [petsc-users] SuperLU MPI-problem<u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span lang="EN-US"> </span><u></u><u></u></p>
In the Petsc/superlu_dist interface, we set the default

options.ParSymbFact = NO;

When the user raises the flag "-mat_superlu_dist_parsymbfact", we set

options.ParSymbFact = YES;
options.ColPerm     = PARMETIS;  /* in v2.2, PARMETIS is forced for ParSymbFact regardless of user ordering setting */

We do not change anything else.

Hong

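For orientation, a small standalone sketch of what those two settings mean at the SuperLU_DIST level. The struct and function names (superlu_options_t, set_default_options_dist()) are as in SuperLU_DIST 4.x and should be checked against the installed version; this only prints the fields and does not run a factorization:

#include <stdio.h>
#include <superlu_ddefs.h>  /* double-precision SuperLU_DIST header; 4.x naming assumed */

int main(void)
{
  superlu_options_t options;

  /* Defaults, the starting point of the PETSc interface */
  set_default_options_dist(&options);
  printf("default:     ParSymbFact=%d ColPerm=%d\n", (int)options.ParSymbFact, (int)options.ColPerm);

  /* What -mat_superlu_dist_parsymbfact switches on, per the message above */
  options.ParSymbFact = YES;
  options.ColPerm     = PARMETIS;  /* parallel symbolic factorization requires a ParMETIS ordering */
  printf("parsymbfact: ParSymbFact=%d ColPerm=%d\n", (int)options.ParSymbFact, (int)options.ColPerm);
  return 0;
}
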
<p class="MsoNormal">On Wed, Jul 22, 2015 at 2:19 PM, Xiaoye S. Li <<a href="mailto:xsli@lbl.gov" target="_blank">xsli@lbl.gov</a>> wrote:<u></u><u></u></p>
<div>
<div>
<p class="MsoNormal"><span style="font-size:13.5pt;font-family:"Arial","sans-serif"">I am trying to understand your problem. You said you are solving
</span>Naviers equation (elastodynamics) in the frequency domain, using finite element discretization. I wonder why you have about 1000 nonzeros per row. Usually in many PDE discretized matrices, the number of nonzeros per row is in the tens (even for 3D
problems), not in the thousands. So, your matrix is quite a bit denser than many sparse matrices we deal with.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">The number of nonzeros in the L and U factors is much more than that in original matrix A -- typically we see 10-20x fill ratio for 2D, or can be as bad as 50-100x fill ratio for
3D. But since your matrix starts much denser (i.e., the underlying graph has many connections), it may not lend to any good ordering strategy to preserve sparsity of L and U; that is, the L and U fill ratio may be large. <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">I don't understand why you get the following error when you use <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">‘-mat_superlu_dist_parsymbfact’.<u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"><span style="font-size:13.5pt;font-family:"Lucida Console"">Invalid ISPEC at line 484 in file get_perm_c.c</span><u></u><u></u></p>
</div>
<div>
<p class="MsoNormal"> <u></u><u></u></p>
</div>
<div>
<p class="MsoNormal">Perhaps Hong Zhang knows; she built the SuperLU_DIST interface for PETSc. <u></u><u></u></p>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> </span><u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif"">Hong -- in order to use parallel symbolic factorization, is it sufficient to specify only </span><u></u><u></u></p>
</div>
<p class="MsoNormal">‘-mat_superlu_dist_parsymbfact’<u></u><u></u></p>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> ? (the default is to use sequential symbolic factorization.)</span><u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> </span><u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> </span><u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif"">Sherry</span><u></u><u></u></p>
</div>
</div>
<div>
<div>
<p class="MsoNormal"><span style="font-family:"Arial","sans-serif""> </span><u></u><u></u></p>
</div>
</div>
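To make the fill-ratio remark concrete, a back-of-the-envelope estimate; the input numbers below are illustrative placeholders, not Mahir's actual problem size, and 16 bytes per entry assumes complex double scalars (as in the --with-scalar-type=complex configure line shown in the error logs above):

#include <stdio.h>

/* Rough memory estimate for sparse LU factors:
   bytes ~ fill_ratio * nnz(A) * bytes_per_nonzero (values only; indices add more). */
int main(void)
{
  double n            = 1.0e6;  /* number of unknowns (placeholder) */
  double nnz_per_row  = 50.0;   /* "in the tens" per row (placeholder) */
  double fill_ratio   = 50.0;   /* 10-20x typical for 2D, 50-100x for 3D */
  double bytes_per_nz = 16.0;   /* one complex double value */

  double nnz_A  = n * nnz_per_row;
  double nnz_LU = fill_ratio * nnz_A;
  printf("nnz(A)   ~ %.3g\n", nnz_A);
  printf("nnz(L+U) ~ %.3g  =>  ~%.1f GB for the factor values alone\n",
         nnz_LU, nnz_LU * bytes_per_nz / 1e9);
  return 0;
}
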
On Wed, Jul 22, 2015 at 9:11 AM, Mahir.Ulker-Kaustell@tyrens.se <Mahir.Ulker-Kaustell@tyrens.se> wrote:

Thank you for your reply.

As you have probably figured out already, I am not a computational scientist. I am a researcher in civil engineering (railways for high-speed traffic), trying to produce some, from my perspective, fairly large parametric studies based on finite element discretizations.

I am working in a Windows environment and have installed PETSc through Cygwin.
Apparently, there is no support for Valgrind in this OS.

If I have understood you correctly, the memory issues are related to SuperLU and, given my background, there is not much I can do. Is this correct?

Best regards,
Mahir

______________________________________________
Mahir Ülker-Kaustell, Kompetenssamordnare, Brokonstruktör, Tekn. Dr, Tyréns AB
010 452 30 82, Mahir.Ulker-Kaustell@tyrens.se
______________________________________________

-----Original Message-----
From: Barry Smith [mailto:bsmith@mcs.anl.gov]
Sent: 22 July 2015 02:57
To: Ülker-Kaustell, Mahir
Cc: Xiaoye S. Li; petsc-users
Subject: Re: [petsc-users] SuperLU MPI-problem

Run the program under valgrind, http://www.mcs.anl.gov/petsc/documentation/faq.html#valgrind . When I use the option -mat_superlu_dist_parsymbfact I get many scary memory problems, some involving for example ddist_psymbtonum (pdsymbfact_distdata.c:1332).

Note that I consider it unacceptable for running programs to EVER use uninitialized values; until these are all cleaned up I won't trust any runs like this.

Barry

==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x10274C436: MPI_Allgatherv (allgatherv.c:1053)<br>
==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285)<br>
==42050== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a stack allocation<br>
==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96)<br>
==42050==<br>
==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x102851C61: MPIR_Allgatherv_intra (allgatherv.c:651)<br>
==42050== by 0x102853EC7: MPIR_Allgatherv (allgatherv.c:903)<br>
==42050== by 0x102853F84: MPIR_Allgatherv_impl (allgatherv.c:944)<br>
==42050== by 0x10274CA41: MPI_Allgatherv (allgatherv.c:1107)<br>
==42050== by 0x101557F60: get_perm_c_parmetis (get_perm_c_parmetis.c:285)<br>
==42050== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a stack allocation<br>
==42050== at 0x10155751B: get_perm_c_parmetis (get_perm_c_parmetis.c:96)<br>
==42050==<br>
==42049== Syscall param writev(vector[...]) points to uninitialised byte(s)<br>
==42049== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib)<br>
==42049== by 0x10296A0DC: MPL_large_writev (mplsock.c:32)<br>
==42049== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)<br>
==42049== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)<br>
==42049== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)<br>
==42049== by 0x102939531: MPID_Isend (mpid_isend.c:138)<br>
==42049== by 0x10277656E: MPI_Isend (isend.c:125)<br>
==42049== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)<br>
==42049== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)<br>
==42049== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)<br>
==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)<br>
==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)<br>
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)<br>
==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)<br>
==42049== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42049== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42049== Address 0x105edff70 is 1,424 bytes inside a block of size 752,720 alloc'd<br>
==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42049== by 0x1020EB90C: gk_malloc (memory.c:147)<br>
==42049== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)<br>
==42049== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)<br>
==42049== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)<br>
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)<br>
==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)<br>
==42049== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42049== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42049== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42049== by 0x100001B3C: main (in ./ex19)<br>
==42049== Uninitialised value was created by a heap allocation<br>
==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42049== by 0x1020EB90C: gk_malloc (memory.c:147)<br>
==42049== by 0x10211C50B: libmetis__imalloc (gklib.c:24)<br>
==42049== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)<br>
==42049== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)<br>
==42049== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)<br>
==42049== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)<br>
==42049== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)<br>
==42049== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42049== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42049== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42049== by 0x100001B3C: main (in ./ex19)<br>
==42049==<br>
==42048== Syscall param writev(vector[...]) points to uninitialised byte(s)<br>
==42048== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib)<br>
==42048== by 0x10296A0DC: MPL_large_writev (mplsock.c:32)<br>
==42048== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)<br>
==42048== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)<br>
==42048== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)<br>
==42048== by 0x102939531: MPID_Isend (mpid_isend.c:138)<br>
==42048== by 0x10277656E: MPI_Isend (isend.c:125)<br>
==42048== by 0x102088B66: libparmetis__gkMPI_Isend (gkmpi.c:63)<br>
==42048== by 0x10208140F: libparmetis__CommInterfaceData (comm.c:298)<br>
==42048== by 0x1020A8758: libparmetis__CompactGraph (ometis.c:553)<br>
==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)<br>
==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)<br>
==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)<br>
==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)<br>
==42048== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== Address 0x10597a860 is 1,408 bytes inside a block of size 752,720 alloc'd<br>
==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42048== by 0x1020EB90C: gk_malloc (memory.c:147)<br>
==42048== by 0x1020EAA28: gk_mcoreCreate (mcore.c:28)<br>
==42048== by 0x1020BA5CF: libparmetis__AllocateWSpace (wspace.c:23)<br>
==42048== by 0x1020A6E84: ParMETIS_V32_NodeND (ometis.c:98)<br>
==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)<br>
==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)<br>
==42048== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048== Uninitialised value was created by a heap allocation<br>
==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42048== by 0x1020EB90C: gk_malloc (memory.c:147)<br>
==42048== by 0x10211C50B: libmetis__imalloc (gklib.c:24)<br>
==42048== by 0x1020A8566: libparmetis__CompactGraph (ometis.c:519)<br>
==42048== by 0x1020A77BB: libparmetis__MultilevelOrder (ometis.c:225)<br>
==42048== by 0x1020A7493: ParMETIS_V32_NodeND (ometis.c:151)<br>
==42048== by 0x1020A6AFB: ParMETIS_V3_NodeND (ometis.c:34)<br>
==42048== by 0x101557CFC: get_perm_c_parmetis (get_perm_c_parmetis.c:241)<br>
==42048== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048==<br>
==42048== Syscall param write(buf) points to uninitialised byte(s)<br>
==42048== at 0x102DA1C22: write (in /usr/lib/system/libsystem_kernel.dylib)<br>
==42048== by 0x10295F5BD: MPIDU_Sock_write (sock_immed.i:525)<br>
==42048== by 0x102944839: MPIDI_CH3_iStartMsg (ch3_istartmsg.c:86)<br>
==42048== by 0x102933B80: MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:257)<br>
==42048== by 0x10293ADBA: MPID_Send (mpid_send.c:130)<br>
==42048== by 0x10277A1FA: MPI_Send (send.c:127)<br>
==42048== by 0x10155802F: get_perm_c_parmetis (get_perm_c_parmetis.c:299)<br>
==42048== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048== Address 0x104810704 is on thread 1's stack<br>
==42048== in frame #3, created by MPIDI_CH3_EagerContigShortSend (ch3u_eager.c:218)<br>
==42048== Uninitialised value was created by a heap allocation<br>
==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42048== by 0x101557AB9: get_perm_c_parmetis (get_perm_c_parmetis.c:185)<br>
==42048== by 0x101501192: pdgssvx (pdgssvx.c:934)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048==<br>
==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x102744CB8: MPI_Alltoallv (alltoallv.c:480)<br>
==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)<br>
==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)<br>
==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a stack allocation<br>
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)<br>
==42050==<br>
==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x102744E43: MPI_Alltoallv (alltoallv.c:490)<br>
==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)<br>
==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)<br>
==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a stack allocation<br>
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)<br>
==42050==<br>
==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x102744EBF: MPI_Alltoallv (alltoallv.c:497)<br>
==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)<br>
==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)<br>
==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a stack allocation<br>
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)<br>
==42050==<br>
==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x1027450B1: MPI_Alltoallv (alltoallv.c:512)<br>
==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)<br>
==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)<br>
==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a stack allocation<br>
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)<br>
==42050==<br>
==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x10283FB06: MPIR_Alltoallv_intra (alltoallv.c:92)<br>
==42050== by 0x1028407B6: MPIR_Alltoallv (alltoallv.c:343)<br>
==42050== by 0x102840884: MPIR_Alltoallv_impl (alltoallv.c:380)<br>
==42050== by 0x10274541B: MPI_Alltoallv (alltoallv.c:531)<br>
==42050== by 0x101510B3E: dist_symbLU (pdsymbfact_distdata.c:539)<br>
==42050== by 0x10150A5C6: ddist_psymbtonum (pdsymbfact_distdata.c:1275)<br>
==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a stack allocation<br>
==42050== at 0x10150E4C4: dist_symbLU (pdsymbfact_distdata.c:96)<br>
==42050==<br>
==42050== Syscall param writev(vector[...]) points to uninitialised byte(s)<br>
==42050== at 0x102DA1C3A: writev (in /usr/lib/system/libsystem_kernel.dylib)<br>
==42050== by 0x10296A0DC: MPL_large_writev (mplsock.c:32)<br>
==42050== by 0x10295F6AD: MPIDU_Sock_writev (sock_immed.i:610)<br>
==42050== by 0x102943FCA: MPIDI_CH3_iSendv (ch3_isendv.c:84)<br>
==42050== by 0x102934361: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:556)<br>
==42050== by 0x102939531: MPID_Isend (mpid_isend.c:138)<br>
==42050== by 0x10277656E: MPI_Isend (isend.c:125)<br>
==42050== by 0x101524C41: pdgstrf2_trsm (pdgstrf2.c:201)<br>
==42050== by 0x10151ECBF: pdgstrf (pdgstrf.c:1082)<br>
==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Address 0x1060144d0 is 1,168 bytes inside a block of size 131,072 alloc'd<br>
==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)<br>
==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735)<br>
==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a heap allocation<br>
==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42050== by 0x1014FD7AD: doubleMalloc_dist (dmemory.c:145)<br>
==42050== by 0x10151DA7D: pdgstrf (pdgstrf.c:735)<br>
==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050==<br>
==42048== Conditional jump or move depends on uninitialised value(s)<br>
==42048== at 0x10151F141: pdgstrf (pdgstrf.c:1139)<br>
==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048== Uninitialised value was created by a heap allocation<br>
==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)<br>
==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048==<br>
==42049== Conditional jump or move depends on uninitialised value(s)<br>
==42049== at 0x10151F141: pdgstrf (pdgstrf.c:1139)<br>
==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42049== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42049== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42049== by 0x100001B3C: main (in ./ex19)<br>
==42049== Uninitialised value was created by a heap allocation<br>
==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)<br>
==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42049== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42049== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42049== by 0x100001B3C: main (in ./ex19)<br>
==42049==<br>
==42048== Conditional jump or move depends on uninitialised value(s)<br>
==42048== at 0x101520054: pdgstrf (pdgstrf.c:1429)<br>
==42048== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048== Uninitialised value was created by a heap allocation<br>
==42048== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42048== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42048== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)<br>
==42048== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42048== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42048== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42048== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42048== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42048== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42048== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42048== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42048== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42048== by 0x100001B3C: main (in ./ex19)<br>
==42048==<br>
==42049== Conditional jump or move depends on uninitialised value(s)<br>
==42049== at 0x101520054: pdgstrf (pdgstrf.c:1429)<br>
==42049== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42049== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42049== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42049== by 0x100001B3C: main (in ./ex19)<br>
==42049== Uninitialised value was created by a heap allocation<br>
==42049== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42049== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42049== by 0x10150ABE2: ddist_psymbtonum (pdsymbfact_distdata.c:1332)<br>
==42049== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42049== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42049== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42049== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42049== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42049== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42049== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42049== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42049== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42049== by 0x100001B3C: main (in ./ex19)<br>
==42049==<br>
==42050== Conditional jump or move depends on uninitialised value(s)<br>
==42050== at 0x10151FDE6: pdgstrf (pdgstrf.c:1382)<br>
==42050== by 0x1015019A5: pdgssvx (pdgssvx.c:1069)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050== Uninitialised value was created by a heap allocation<br>
==42050== at 0x1000183B1: malloc (vg_replace_malloc.c:303)<br>
==42050== by 0x10153B704: superlu_malloc_dist (memory.c:108)<br>
==42050== by 0x10150B241: ddist_psymbtonum (pdsymbfact_distdata.c:1389)<br>
==42050== by 0x1015018C2: pdgssvx (pdgssvx.c:1057)<br>
==42050== by 0x1009CFE7A: MatLUFactorNumeric_SuperLU_DIST (superlu_dist.c:414)<br>
==42050== by 0x10046CC5C: MatLUFactorNumeric (matrix.c:2946)<br>
==42050== by 0x100F09F2C: PCSetUp_LU (lu.c:152)<br>
==42050== by 0x100FF9036: PCSetUp (precon.c:982)<br>
==42050== by 0x1010F54EB: KSPSetUp (itfunc.c:332)<br>
==42050== by 0x1010F7985: KSPSolve (itfunc.c:546)<br>
==42050== by 0x10125541E: SNESSolve_NEWTONLS (ls.c:233)<br>
==42050== by 0x1011C49B7: SNESSolve (snes.c:3906)<br>
==42050== by 0x100001B3C: main (in ./ex19)<br>
==42050==<br>
<br>
<br>
> On Jul 20, 2015, at 12:03 PM, <a href="mailto:Mahir.Ulker-Kaustell@tyrens.se" target="_blank">
Mahir.Ulker-Kaustell@tyrens.se</a> wrote:<br>
><br>
> Ok. So I have been creating the full factorization on each process. That gives me some hope!<br>
><br>
> I followed your suggestion and tried to use the runtime option ‘-mat_superlu_dist_parsymbfact’.<br>
> However, now the program crashes with:<br>
><br>
> Invalid ISPEC at line 484 in file get_perm_c.c<br>
><br>
> And so on…<br>
><br>
> According to the SuperLU manual, I should give the option either YES or NO; however, -mat_superlu_dist_parsymbfact YES makes the program crash in the same way as above.<br>
> Also, I can’t find any reference to -mat_superlu_dist_parsymbfact in the PETSc documentation.<br>
><br>
> Mahir<br>
><br>
> Mahir Ülker-Kaustell, Kompetenssamordnare, Brokonstruktör, Tekn. Dr, Tyréns AB<br>
> 010 452 30 82, <a href="mailto:Mahir.Ulker-Kaustell@tyrens.se" target="_blank">
Mahir.Ulker-Kaustell@tyrens.se</a><br>
><br>
> From: Xiaoye S. Li [mailto:<a href="mailto:xsli@lbl.gov" target="_blank">xsli@lbl.gov</a>]<br>
> Sent: den 20 juli 2015 18:12<br>
> To: Ülker-Kaustell, Mahir<br>
> Cc: Hong; petsc-users<br>
> Subject: Re: [petsc-users] SuperLU MPI-problem<br>
><br>
> The default SuperLU_DIST setting is to use serial symbolic factorization. Therefore, what matters is how much memory you have per MPI task.<br>
><br>
> The code failed to malloc memory during redistribution of matrix A to the {L\U} data structure (using the result of the serial symbolic factorization).<br>
><br>
> You can use parallel symbolic factorization via the runtime option '-mat_superlu_dist_parsymbfact'.<br>
><br>
> Sherry Li<br>
><br>
><br>
> On Mon, Jul 20, 2015 at 8:59 AM, <a href="mailto:Mahir.Ulker-Kaustell@tyrens.se" target="_blank">
Mahir.Ulker-Kaustell@tyrens.se</a> <<a href="mailto:Mahir.Ulker-Kaustell@tyrens.se" target="_blank">Mahir.Ulker-Kaustell@tyrens.se</a>> wrote:<br>
> Hong:<br>
><br>
> Previous experience with this equation has shown that it is very difficult to solve iteratively; hence the use of a direct solver.<br>
><br>
> The large test problem I am trying to solve has slightly less than 10^6 degrees of freedom. The matrices are derived from finite elements so they are sparse.<br>
> The machine I am working on has 128 GB of RAM. I have estimated the memory needed at less than 20 GB, so if the solver needs twice or even three times as much, it should still work well. Or have I completely misunderstood something here?<br>
><br>
> Mahir<br>
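A rough, purely illustrative check of that estimate (the fill figure below is an assumption, not data from this thread): with about 10^6 unknowns and complex double-precision scalars (16 bytes each), a hypothetical fill of 500 nonzeros per row in each triangular factor gives<br>
memory for L and U ≈ 2 × 10^6 × 500 × 16 bytes ≈ 16 GB,<br>
before ParMETIS and SuperLU_DIST working storage is counted. Whether 20 GB is realistic therefore depends almost entirely on the fill-in produced by the chosen ordering.<br>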
><br>
><br>
><br>
> From: Hong [mailto:<a href="mailto:hzhang@mcs.anl.gov" target="_blank">hzhang@mcs.anl.gov</a>]<br>
> Sent: den 20 juli 2015 17:39<br>
> To: Ülker-Kaustell, Mahir<br>
> Cc: petsc-users<br>
> Subject: Re: [petsc-users] SuperLU MPI-problem<br>
><br>
> Mahir:<br>
> Direct solvers consume a large amount of memory. I suggest trying the following:<br>
><br>
> 1. A sparse iterative solver if [-omega^2M + K] is not too ill-conditioned. You may test it using the small matrix.<br>
><br>
> 2. Incrementally increase your matrix sizes and try different matrix orderings.<br>
> Do you get a memory crash in the first symbolic factorization?<br>
> In your case, the matrix data structure stays the same when omega changes, so you only need to do the symbolic factorization once and reuse it (a sketch follows below, after this message).<br>
><br>
> 3. Use a machine that gives larger memory.<br>
><br>
> Hong<br>
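Regarding point 2 above, here is a minimal sketch (not from this thread) of a frequency loop that reuses one symbolic factorization. It assumes a complex-scalar PETSc build and a hypothetical user routine assembleSystemMatrix() that refills A = -omega^2*M + K in place without changing the nonzero pattern, so only the numeric LU factorization is redone for each omega:<br>
<br>
#include <petscksp.h><br>
<br>
/* Hypothetical user routine (assumption): refills A = -omega^2*M + K in place,<br>
   keeping the nonzero pattern of A fixed between calls. */<br>
extern PetscErrorCode assembleSystemMatrix(Mat A, Mat M, Mat K, PetscScalar omega);<br>
<br>
PetscErrorCode SolveAllFrequencies(Mat A, Mat M, Mat K, Vec F, Vec u,<br>
                                   const PetscScalar *omega, PetscInt nfreq)<br>
{<br>
  KSP            ksp;<br>
  PetscInt       i;<br>
  PetscErrorCode ierr;<br>
<br>
  PetscFunctionBeginUser;<br>
  ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);<br>
  ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);<br>
  /* e.g. -ksp_type preonly -pc_type lu -pc_factor_mat_solver_package superlu_dist */<br>
  ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);<br>
<br>
  for (i = 0; i < nfreq; i++) {<br>
    ierr = assembleSystemMatrix(A, M, K, omega[i]);CHKERRQ(ierr);<br>
    /* Same Mat object, same nonzero pattern: the symbolic factorization is<br>
       kept and only the numeric factorization is repeated inside KSPSolve. */<br>
    ierr = KSPSolve(ksp, F, u);CHKERRQ(ierr);<br>
  }<br>
  ierr = KSPDestroy(&ksp);CHKERRQ(ierr);<br>
  PetscFunctionReturn(0);<br>
}<br>
<br>
With -ksp_type preonly, each KSPSolve then amounts to one forward/backward substitution with the factors after the numeric refactorization.<br>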
><br>
> Dear Petsc-Users,<br>
><br>
> I am trying to use PETSc to solve a set of linear equations arising from Navier's equation (elastodynamics) in the frequency domain.<br>
> The frequency dependency of the problem requires that the system<br>
><br>
> [-omega^2M + K]u = F<br>
><br>
> where M and K are constant, square, positive definite matrices (mass and stiffness, respectively), is solved for each frequency omega of interest.<br>
> K is a complex matrix, including material damping.<br>
><br>
> I have written a PETSc program which solves this problem for a small (1000 degrees of freedom) test problem on one or several processors, but it keeps crashing when I try it on my full scale (in the order of 10^6 degrees of freedom) problem.<br>
><br>
> The program crashes at KSPSetUp(), and from what I can see in the error messages, it appears to consume too much memory.<br>
><br>
> I would guess that similar problems have occurred on this mailing list, so I am hoping that someone can push me in the right direction…<br>
><br>
> Mahir<u></u><u></u></p>