<html>

  <head>

    <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

  </head>

  <body text="#000000" bgcolor="#FFFFFF">

    <p>CG/GCR: I accidentally left gcr in the batch file for the

      1000-process run. It is a leftover from when I was experimenting

      with the different methods. The performance is quite similar, though.</p>

    <p>I use the following setup for the ksp object and the vectors:</p>

    <blockquote><font size="-1">ierr=PetscInitialize(&argc,

        &argv, (char*)0, (char*)0);CHKERRQ(ierr);<br>

        <br>

        ierr=KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);<br>

        <br>

        ierr=DMDACreate3d(PETSC_COMM_WORLD,</font><font size="-1"><font

          size="-1">DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED,DM_BOUNDARY_GHOSTED</font>,<br>

                     DMDA_STENCIL_STAR, g_Nx, g_Ny, g_Nz, dims[0],

        dims[1], dims[2], 1, 1, l_Nx, l_Ny, l_Nz,

        &da);CHKERRQ(ierr);<br>

        ierr=DMSetFromOptions(da);CHKERRQ(ierr);<br>

        ierr=DMSetUp(da);CHKERRQ(ierr);<br>

        ierr=KSPSetDM(ksp, da);CHKERRQ(ierr);<br>

          <br>

        ierr=DMCreateGlobalVector(da, &b);CHKERRQ(ierr);<br>

        <br>

        ierr=VecDuplicate(b, &x);CHKERRQ(ierr);<br>

        <br>

        ierr=DMCreateLocalVector(da, &l_x);CHKERRQ(ierr);<br>

        ierr=VecSet(x,0);CHKERRQ(ierr);<br>

        ierr=VecSet(b,0);CHKERRQ(ierr);</font><br>

    </blockquote>

    <p>For the 125-process case the arrays l_Nx, l_Ny, l_Nz each have

      length 5 and every element has the value 30. VecGetLocalSize()

      returns 27000 on every rank. Is there something I didn't consider?</p>
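    <p>A quick way to sanity-check such ownership arrays outside of PETSc
      is sketched below. This is only an illustration in plain C: the
      helper names sum_extents, is_uniform and local_size are hypothetical
      and not part of PETSc, and the sketch assumes one degree of freedom
      per grid point, as in the DMDACreate3d() call above.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-process extents along one axis. For a consistent layout
   this should equal the global grid size along that axis. */
long sum_extents(const int *extents, size_t nprocs)
{
    long total = 0;
    for (size_t i = 0; i < nprocs; ++i)
        total += extents[i];
    return total;
}

/* Return 1 if every process owns the same extent along this axis
   (an even split), otherwise 0. */
int is_uniform(const int *extents, size_t nprocs)
{
    assert(nprocs > 0);
    for (size_t i = 1; i < nprocs; ++i)
        if (extents[i] != extents[0])
            return 0;
    return 1;
}

/* Number of grid points owned by the process at position (i, j, k)
   in the process grid, assuming one degree of freedom per point. */
long local_size(const int *lx, const int *ly, const int *lz,
                size_t i, size_t j, size_t k)
{
    return (long)lx[i] * ly[j] * lz[k];
}
```

    <p>With five entries of 30 per axis, sum_extents() gives 150 points
      per axis, is_uniform() returns 1, and local_size() gives
      30 * 30 * 30 = 27000 for every process, which matches the value
      reported by VecGetLocalSize().</p>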

    <p>Michael<br>

    </p>

    <p><br>

    </p>

    <br>

    <div class="moz-cite-prefix">On 24.05.2018 at 09:39, Lawrence

      Mitchell wrote:<br>

    </div>

    <blockquote type="cite"

      cite="mid:AC2E6823-8AB2-46C4-810B-2CC496B11A11@imperial.ac.uk">

      <pre wrap="">

</pre>

      <blockquote type="cite">

        <pre wrap="">On 24 May 2018, at 06:24, Michael Becker <a class="moz-txt-link-rfc2396E" href="mailto:Michael.Becker@physik.uni-giessen.de"><Michael.Becker@physik.uni-giessen.de></a> wrote:

Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, so KSPSolve() was called 1000 times. I introduced two new logging stages: one for the first solve and the final setup, and one for the remaining solves.

</pre>

      </blockquote>

      <pre wrap="">

The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run.  Is this deliberate?

125 proc:

-gamg_est_ksp_type cg

-ksp_norm_type unpreconditioned

-ksp_type cg

-log_view

-mg_levels_esteig_ksp_max_it 10

-mg_levels_esteig_ksp_type cg

-mg_levels_ksp_max_it 1

-mg_levels_ksp_norm_type none

-mg_levels_ksp_type richardson

-mg_levels_pc_sor_its 1

-mg_levels_pc_type sor

-pc_gamg_type classical

-pc_type gamg

1000 proc:

-gamg_est_ksp_type cg

-ksp_norm_type unpreconditioned

-ksp_type gcr

-log_view

-mg_levels_esteig_ksp_max_it 10

-mg_levels_esteig_ksp_type cg

-mg_levels_ksp_max_it 1

-mg_levels_ksp_norm_type none

-mg_levels_ksp_type richardson

-mg_levels_pc_sor_its 1

-mg_levels_pc_type sor

-pc_gamg_type classical

-pc_type gamg

That aside, it looks like you have quite a bit of load imbalance.  e.g. in the smoother, where you're doing MatSOR, you have:

125 proc:

                   Calls  Ratio  Time (s)    Max/Min time

MatSOR             47808  1.0    6.8888e+01  1.7

1000 proc:

                   Calls  Ratio  Time (s)    Max/Min time

MatSOR             41400  1.0    6.3412e+01  1.6

VecScatters show similar behaviour.

How is your problem distributed across the processes?

Cheers,

Lawrence

</pre>

    </blockquote>
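    <p>For reference, the Max/Min time figure quoted above is the time
      of the slowest rank divided by the time of the fastest rank, so 1.7
      means the slowest rank spent 70% longer in MatSOR than the fastest
      one. PETSc computes this itself in -log_view. The plain C sketch
      below only illustrates the arithmetic; the function name
      max_min_ratio and the idea of collecting per-rank times into an
      array are assumptions made for the example.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Ratio of the largest to the smallest per-rank time.
   1.0 means perfectly balanced; larger values mean more imbalance.
   Assumes n > 0 and that all times are strictly positive. */
double max_min_ratio(const double *times, size_t n)
{
    assert(n > 0);
    double lo = times[0];
    double hi = times[0];
    for (size_t i = 1; i < n; ++i) {
        if (times[i] < lo) lo = times[i];
        if (times[i] > hi) hi = times[i];
    }
    assert(lo > 0.0);
    return hi / lo;
}
```

    <p>A ratio close to 1.0 for an event would indicate that all ranks
      spend about the same time in it.</p>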

    <br>

  </body>

</html>