<div class="gmail_quote">On Sun, Aug 21, 2011 at 16:45, Алексей Рязанов <span dir="ltr"><<a href="mailto:ram@ibrae.ac.ru">ram@ibrae.ac.ru</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex;">
Hello!<div><br></div><div>Could you please help me solve a performance problem?</div><div>I have two programs. </div><div><br></div><div>In the first, I solve one system with one method and one preconditioner and record some performance numbers. </div>
<div>I run it 9 times, once for each of 9 different preconditioners.</div>
<div><br></div><div>In the second, I solve the same system with the same method but with the 9 preconditioners consecutively, one after another. </div><div>I run it once and collect the same performance information.</div><div>In the second case the results are 2-5 times worse, depending on the method used.</div>
<div><br></div><div>Each KSPSolve call is placed in its own logging stage, of course, so I can compare times, flops, messages, and so on.</div><div>I can see the difference but can't explain or eliminate it.</div><div><br></div><div>
For example, for -ksp_type cgs -pc_type asm -sub_pc_type jacobi -sub_ksp_type preonly:</div><div>
<pre>
Summary of Stages:       ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                            Avg     %Total     Avg     %Total  counts    %Total     Avg     %Total      counts    %Total
one stage from 2nd:      5.5145e+00  14.9%  1.2336e+09  13.2%  2.494e+03  22.4%  3.230e+03  22.2%       2.250e+03  18.5%
the one stage from 1st:  2.7541e+00  93.1%  1.2336e+09  99.8%  2.508e+03  99.3%  1.470e+04  97.4%       2.266e+03  88.0%
</pre>
</div><div>
<div><br></div>
<div><br></div><div>The two programs are essentially equivalent except for how the preconditioners are defined and how many times KSPSolve is called. </div><div>They use the same matrices, assemble them the same way, and use the same right-hand sides and convergence monitors.</div>
<div>In fact, the second was derived from the first.</div><div><br></div><div>In the first I call KSPSetFromOptions(KSP); and then set the -ksp_type, -pc_type, -sub_pc_type, and -sub_ksp_type options on the command line.</div><div><br></div>
<div>In the second, for a non-block PC I use: </div></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><blockquote style="margin:0 0 0 40px;border:none;padding:0px">
<div><div><font face="'courier new', monospace"> KSPGetPC(dKSP, &dPC);</font></div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px">
<div><div><font face="'courier new', monospace"> PCSetType(dPC, PCJACOBI); </font></div></div></blockquote></blockquote><div><div>and for block PC:</div></div><blockquote style="margin:0 0 0 40px;border:none;padding:0px">
<blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> PCSetType(dPC, PCASM);</font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> KSPSetUp(dKSP);</font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> PCSetUp(dPC);</font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> PCASMGetSubKSP(dPC, &n_local, &first_local, &ASMSubKSP);</font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> for (i=0; i<n_local; i++) </font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> {</font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> KSPGetPC(ASMSubKSP[i], &(SubPC[i]));</font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> PCSetType(SubPC[i], PCJACOBI); </font></div>
</div></div></blockquote><blockquote style="margin:0 0 0 40px;border:none;padding:0px"><div><div><div><font face="'courier new', monospace"> } </font></div>
</div></div></blockquote></blockquote><div>
<div><br></div><div>I'm sure there is a mistake somewhere, because the first program, which compares the Jacobi and ASM-Jacobi preconditioners on my problem with the same KSP, tells me that ASM-Jacobi is better, while the second shows the opposite.</div>
</div></blockquote><div><br></div><div>This could be a preload issue. You can use the PreLoadBegin()/PreLoadEnd() macros if you like, or otherwise solve a system first to make sure everything has been loaded. If the results are still confusing, run with -ksp_view -log_summary and send the output.</div>
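<div><br></div><div>The "solve a system first" idea can be sketched in plain C, independent of PETSc. Everything named here (solve_once, setup_count, time_after_warmup) is a hypothetical stand-in, not a PETSc call: solve_once mimics a solver whose first call pays a one-time cost, and the harness runs one untimed call before starting the clock.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define TABLE_SIZE 1024

/* Hypothetical stand-in: counts how often the one-time setup ran. */
static int setup_count = 0;
static double table[TABLE_SIZE];

/* Hypothetical "solve": the first call fills a table (one-time cost,
 * analogous to code and data being loaded); every call then sums it. */
static double solve_once(void)
{
    static int initialized = 0;
    if (!initialized) {
        for (size_t i = 0; i < TABLE_SIZE; i++)
            table[i] = 1.0 / (double)(i + 1);
        initialized = 1;
        setup_count++;
    }
    double sum = 0.0;
    for (size_t i = 0; i < TABLE_SIZE; i++)
        sum += table[i];
    return sum;
}

/* Run one untimed warm-up solve, then return the average CPU seconds
 * per solve over nruns timed solves. The last result goes to *result. */
static double time_after_warmup(int nruns, double *result)
{
    assert(nruns > 0 && result != NULL);

    (void)solve_once();                 /* warm-up, excluded from timing */
    int setups_before = setup_count;

    clock_t t0 = clock();
    for (int k = 0; k < nruns; k++)
        *result = solve_once();
    clock_t t1 = clock();

    /* No one-time setup may have happened inside the timed region. */
    assert(setup_count == setups_before);
    return (double)(t1 - t0) / CLOCKS_PER_SEC / nruns;
}
```

<div>Calling time_after_warmup(3, &amp;r) times three solves after the first-call cost has already been paid, so that cost does not land in whichever stage happens to run first.</div>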
<div><br></div><div>There is no reason for ASM-Jacobi (with -sub_ksp_type preonly, which is default) to be better than Jacobi since it does the same algorithm with more communication.</div></div>
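<div><br></div><div>A small plain-C sketch of why the two should apply the same operator. This is not PETSc code and all function names are hypothetical. It assumes dense storage and non-overlapping blocks, which is simpler than what PCASM does by default, so treat it only as an illustration: a Jacobi sweep on each diagonal block reads the same diagonal entries as a global Jacobi sweep.</div>

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Point Jacobi on a dense n-by-n row-major matrix: z[i] = r[i] / A[i][i]. */
static void jacobi_apply(const double *A, size_t n, const double *r, double *z)
{
    for (size_t i = 0; i < n; i++) {
        assert(A[i * n + i] != 0.0);
        z[i] = r[i] / A[i * n + i];
    }
}

/* Non-overlapping block preconditioner whose sub-solver is point Jacobi.
 * block_start has nblocks + 1 entries; block b owns rows
 * block_start[b] .. block_start[b + 1] - 1. */
static void block_sub_jacobi_apply(const double *A, size_t n,
                                   const size_t *block_start, size_t nblocks,
                                   const double *r, double *z)
{
    for (size_t b = 0; b < nblocks; b++) {
        size_t lo = block_start[b], hi = block_start[b + 1];
        assert(lo <= hi && hi <= n);
        size_t m = hi - lo;
        if (m == 0)
            continue;

        double *Ab = malloc(m * m * sizeof *Ab);
        double *rb = malloc(m * sizeof *rb);
        double *zb = malloc(m * sizeof *zb);
        assert(Ab && rb && zb);

        /* Gather the diagonal block and the matching part of r. */
        for (size_t i = 0; i < m; i++) {
            rb[i] = r[lo + i];
            for (size_t j = 0; j < m; j++)
                Ab[i * m + j] = A[(lo + i) * n + (lo + j)];
        }

        jacobi_apply(Ab, m, rb, zb);    /* local "sub_pc_type jacobi" */

        /* Scatter the local result back into the global vector. */
        for (size_t i = 0; i < m; i++)
            z[lo + i] = zb[i];

        free(Ab);
        free(rb);
        free(zb);
    }
}

/* Returns 1 if both preconditioners give identical output on a small
 * tridiagonal example split into two blocks, 0 otherwise. */
static int demo_preconditioners_agree(void)
{
    const double A[16] = {  4, -1,  0,  0,
                           -1,  4, -1,  0,
                            0, -1,  4, -1,
                            0,  0, -1,  4 };
    const double r[4] = { 1, 2, 3, 4 };
    const size_t block_start[3] = { 0, 2, 4 };
    double z_jacobi[4], z_block[4];

    jacobi_apply(A, 4, r, z_jacobi);
    block_sub_jacobi_apply(A, 4, block_start, 2, r, z_block);

    for (size_t i = 0; i < 4; i++)
        if (z_jacobi[i] != z_block[i])
            return 0;
    return 1;
}
```

<div>In this simplified setting the block version only adds gather and scatter work on top of the same divisions, which mirrors the point above that the extra cost is in data movement rather than in a different algorithm.</div>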