[petsc-users] Poor weak scaling when solving successive linear systems

Michael Becker Michael.Becker at physik.uni-giessen.de
Thu May 24 04:10:28 CDT 2018


CG/GCR: I accidentally kept GCR in the batch file for the 1000-process run; 
it is left over from when I was experimenting with the different methods. 
The performance is quite similar either way, though.

I use the following setup for the KSP object and the vectors:

    ierr=PetscInitialize(&argc, &argv, (char*)0, (char*)0);CHKERRQ(ierr);

    ierr=KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);

    /* 3D DMDA: g_Nx x g_Ny x g_Nz grid points on a dims[0] x dims[1] x dims[2]
       process grid; dof=1, stencil width=1; l_Nx, l_Ny, l_Nz hold the number of
       grid points owned per process along each direction */
    ierr=DMDACreate3d(PETSC_COMM_WORLD,
                      DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED, DM_BOUNDARY_GHOSTED,
                      DMDA_STENCIL_STAR, g_Nx, g_Ny, g_Nz,
                      dims[0], dims[1], dims[2], 1, 1,
                      l_Nx, l_Ny, l_Nz, &da);CHKERRQ(ierr);
    ierr=DMSetFromOptions(da);CHKERRQ(ierr);
    ierr=DMSetUp(da);CHKERRQ(ierr);
    ierr=KSPSetDM(ksp, da);CHKERRQ(ierr);

    ierr=DMCreateGlobalVector(da, &b);CHKERRQ(ierr);
    ierr=VecDuplicate(b, &x);CHKERRQ(ierr);
    ierr=DMCreateLocalVector(da, &l_x);CHKERRQ(ierr);
    ierr=VecSet(x,0);CHKERRQ(ierr);
    ierr=VecSet(b,0);CHKERRQ(ierr);

For the 125-process case, the arrays l_Nx, l_Ny, l_Nz each have length 5 and 
every element has the value 30, so each rank owns a 30x30x30 block and 
VecGetLocalSize() returns 27000 on every rank. Is there something I didn't 
consider?
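
As a sanity check, the per-rank ownership can also be printed directly. This is 
only a minimal sketch (it reuses the da and b from the setup above; the 
PetscSynchronizedPrintf output is purely illustrative):

    PetscMPIInt rank;
    PetscInt    xs, ys, zs, xm, ym, zm, nlocal;

    ierr=MPI_Comm_rank(PETSC_COMM_WORLD, &rank);CHKERRQ(ierr);
    /* corners and extent of this rank's owned subdomain (no ghost points) */
    ierr=DMDAGetCorners(da, &xs, &ys, &zs, &xm, &ym, &zm);CHKERRQ(ierr);
    ierr=VecGetLocalSize(b, &nlocal);CHKERRQ(ierr);
    ierr=PetscSynchronizedPrintf(PETSC_COMM_WORLD,
                 "[%d] owns %D x %D x %D points, local vector size %D\n",
                 rank, xm, ym, zm, nlocal);CHKERRQ(ierr);
    ierr=PetscSynchronizedFlush(PETSC_COMM_WORLD, PETSC_STDOUT);CHKERRQ(ierr);

Since DMSetFromOptions() is called, running with -dm_view should likewise print 
the DMDA layout (process grid and ownership ranges).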

Michael



On 24.05.2018 at 09:39, Lawrence Mitchell wrote:
>
>> On 24 May 2018, at 06:24, Michael Becker <Michael.Becker at physik.uni-giessen.de> wrote:
>>
>> Could you have a look at the attached log_view files and tell me if something is particularly odd? The system size per processor is 30^3 and the simulation ran over 1000 timesteps, which means KSPSolve() was called equally often. I introduced two new logging stages: one for the first solve and the final setup, and one for the remaining solves.
> The two attached logs use CG for the 125 proc run, but gcr for the 1000 proc run.  Is this deliberate?
>
> 125 proc:
>
> -gamg_est_ksp_type cg
> -ksp_norm_type unpreconditioned
> -ksp_type cg
> -log_view
> -mg_levels_esteig_ksp_max_it 10
> -mg_levels_esteig_ksp_type cg
> -mg_levels_ksp_max_it 1
> -mg_levels_ksp_norm_type none
> -mg_levels_ksp_type richardson
> -mg_levels_pc_sor_its 1
> -mg_levels_pc_type sor
> -pc_gamg_type classical
> -pc_type gamg
>
> 1000 proc:
>
> -gamg_est_ksp_type cg
> -ksp_norm_type unpreconditioned
> -ksp_type gcr
> -log_view
> -mg_levels_esteig_ksp_max_it 10
> -mg_levels_esteig_ksp_type cg
> -mg_levels_ksp_max_it 1
> -mg_levels_ksp_norm_type none
> -mg_levels_ksp_type richardson
> -mg_levels_pc_sor_its 1
> -mg_levels_pc_type sor
> -pc_gamg_type classical
> -pc_type gamg
>
>
> That aside, it looks like you have quite a bit of load imbalance, e.g. in the smoother, where you're doing MatSOR, you have:
>
> 125 proc:
>                     Calls   Calls max/min   Max time (s)   Time max/min
> MatSOR              47808             1.0     6.8888e+01            1.7
>
> 1000 proc:
>
> MatSOR              41400             1.0     6.3412e+01            1.6
>
> VecScatters show similar behaviour.
>
> How is your problem distributed across the processes?
>
> Cheers,
>
> Lawrence
>
