<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<p>CG/GCR: I accidentally left gcr in the batch file for the 1000
process run. It is a leftover from when I was experimenting with
different methods. The performance of the two is quite similar, though.</p>
<p>I use the following setup for the ksp object and the vectors:</p>
<blockquote><font size="-1">ierr=PetscInitialize(&argc,
&argv, (char*)0, (char*)0);CHKERRQ(ierr);<br>
<br>
ierr=KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);<br>
<br>
ierr=DMDACreate3d(PETSC_COMM_WORLD,
DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED,<br>
DMDA_STENCIL_STAR, g_Nx, g_Ny, g_Nz, dims[0],
dims[1], dims[2], 1, 1, l_Nx, l_Ny, l_Nz,
&da);CHKERRQ(ierr);<br>
ierr=DMSetFromOptions(da);CHKERRQ(ierr);<br>
ierr=DMSetUp(da);CHKERRQ(ierr);<br>
ierr=KSPSetDM(ksp, da);CHKERRQ(ierr);<br>
<br>
ierr=DMCreateGlobalVector(da, &b);CHKERRQ(ierr);<br>
<br>
ierr=VecDuplicate(b, &x);CHKERRQ(ierr);<br>
<br>
ierr=DMCreateLocalVector(da, &l_x);CHKERRQ(ierr);<br>
ierr=VecSet(x,0);CHKERRQ(ierr);<br>
ierr=VecSet(b,0);CHKERRQ(ierr);</font><br>
</blockquote>
<p>In the 125 process case, the arrays l_Nx, l_Ny and l_Nz each have 5
entries, all equal to 30, so every rank owns a 30x30x30 block.
VecGetLocalSize() returns 27000 on every rank. Is there something I
didn't consider?</p>
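<p>A minimal plain-C sketch (no PETSc calls) of the arithmetic behind
these numbers, assuming dof = 1 as in the DMDACreate3d call above. The
helper names sum_extents, local_size and is_uniform are hypothetical and
only serve the illustration. According to the DMDACreate3d documentation,
the entries of each ownership array must sum to the global size in that
direction, which for five entries of 30 would imply a global extent of
150 per direction. That value is an inference, not a number taken from
the run.</p>

```c
#include <assert.h>

/* Sum of the per-rank extents along one direction. For DMDACreate3d this
   sum is expected to equal the global extent in that direction. */
int sum_extents(const int *l, int nprocs)
{
    assert(nprocs > 0);
    int total = 0;
    for (int i = 0; i < nprocs; ++i) {
        total += l[i];
    }
    return total;
}

/* Number of grid points owned by the rank at position (i, j, k) of the
   process grid, assuming one degree of freedom per grid point. */
long local_size(const int *lx, const int *ly, const int *lz,
                int i, int j, int k)
{
    return (long)lx[i] * (long)ly[j] * (long)lz[k];
}

/* Returns 1 if every rank of a px x py x pz process grid owns the same
   number of grid points, 0 otherwise. */
int is_uniform(const int *lx, const int *ly, const int *lz,
               int px, int py, int pz)
{
    long ref = local_size(lx, ly, lz, 0, 0, 0);
    for (int i = 0; i < px; ++i) {
        for (int j = 0; j < py; ++j) {
            for (int k = 0; k < pz; ++k) {
                if (local_size(lx, ly, lz, i, j, k) != ref) {
                    return 0;
                }
            }
        }
    }
    return 1;
}
```

<p>With five entries of 30 in each array, sum_extents gives 150,
local_size gives 27000 for every rank, and is_uniform returns 1. This
only shows that the grid points are split evenly. It says nothing about
how evenly the time spent per rank is distributed.</p>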
<p>Michael<br>
</p>
<div class="moz-cite-prefix">On 24 May 2018, at 09:39, Lawrence
Mitchell wrote:<br>
</div>
<blockquote type="cite"
cite="mid:AC2E6823-8AB2-46C4-810B-2CC496B11A11@imperial.ac.uk">
<pre wrap="">
</pre>
<blockquote type="cite">
<pre wrap="">On 24 May 2018, at 06:24, Michael Becker <a class="moz-txt-link-rfc2396E" href="mailto:Michael.Becker@physik.uni-giessen.de">&lt;Michael.Becker@physik.uni-giessen.de&gt;</a> wrote:
Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran for 1000 timesteps, which means KSPSolve() was called equally often. I introduced two new logging stages: one for the first solve and the final setup, and one for the remaining solves.
</pre>
</blockquote>
<pre wrap="">
The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run. Is this deliberate?
125 proc:
-gamg_est_ksp_type cg
-ksp_norm_type unpreconditioned
-ksp_type cg
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_norm_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_sor_its 1
-mg_levels_pc_type sor
-pc_gamg_type classical
-pc_type gamg
1000 proc:
-gamg_est_ksp_type cg
-ksp_norm_type unpreconditioned
-ksp_type gcr
-log_view
-mg_levels_esteig_ksp_max_it 10
-mg_levels_esteig_ksp_type cg
-mg_levels_ksp_max_it 1
-mg_levels_ksp_norm_type none
-mg_levels_ksp_type richardson
-mg_levels_pc_sor_its 1
-mg_levels_pc_type sor
-pc_gamg_type classical
-pc_type gamg
That aside, it looks like you have quite a bit of load imbalance. For example, in the smoother, where you're doing MatSOR, you have:

125 proc:
         Calls   Max/Min calls   Time (s)     Max/Min time
MatSOR   47808   1.0             6.8888e+01   1.7

1000 proc:
MatSOR   41400   1.0             6.3412e+01   1.6

VecScatters show similar behaviour.
How is your problem distributed across the processes?
Cheers,
Lawrence
</pre>
</blockquote>
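<p>For reference, the Max/Min column quoted above is the ratio of the
largest to the smallest value over all processes. A minimal sketch of
that computation in plain C follows. The function name max_min_ratio is
hypothetical, and returning 0.0 for an empty array or a non-positive
minimum is just a convention of this sketch, not PETSc behaviour.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Ratio of the largest to the smallest entry of t[0..n-1].
   Returns 0.0 if the array is empty or the smallest entry is not
   positive (a convention chosen for this sketch only). */
double max_min_ratio(const double *t, size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    double lo = t[0];
    double hi = t[0];
    for (size_t i = 1; i < n; ++i) {
        if (t[i] < lo) {
            lo = t[i];
        }
        if (t[i] > hi) {
            hi = t[i];
        }
    }
    return (lo > 0.0) ? hi / lo : 0.0;
}
```

<p>A ratio of 1.0 means every process spent the same time in the event.
A ratio of 1.7 for MatSOR means the slowest process took 1.7 times as
long as the fastest one, even though the call counts are balanced.</p>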
<br>
</body>
</html>