[petsc-users] Correlation between da_refine and pg_mg_levels

Justin Chang jychang48 at gmail.com
Sun Apr 2 11:54:02 CDT 2017


It was somewhat arbitrary. I want to conduct a performance-spectrum
(dofs/sec) study where at least 1k processors are used on various HPC
machines (and hopefully one more case with 10k procs). Assuming all
available cores on each compute node are used (which I know is not the
greatest idea here), 1032 Ivy Bridge cores on Edison (43 nodes at 24
cores/node) is the closest match to Cori's 1024 Haswell cores (32 nodes at
32 cores/node).

How do I determine the shape of the DMDA? I am guessing the number of MPI
processes needs to be compatible with it?
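
For reference, here is a minimal standalone sketch I could use to check what
decomposition PETSc would pick (not part of ex48; the 80 x 80 x 9 sizes,
DM_BOUNDARY_NONE boundaries, and single dof are placeholder assumptions, and
error checking is omitted). It lets PETSc choose the process grid with
PETSC_DECIDE and reports the m x n x p shape it selects for the current
number of ranks:

#include <petscdmda.h>

int main(int argc, char **argv)
{
  DM          da;
  PetscInt    M, N, P, m, n, p;
  PetscMPIInt size;

  PetscInitialize(&argc, &argv, NULL, NULL);
  MPI_Comm_size(PETSC_COMM_WORLD, &size);
  /* Same global sizes as the -M 80 -N 80 -P 9 run below; PETSC_DECIDE lets
     PETSc choose the process grid m x n x p. */
  DMDACreate3d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, DM_BOUNDARY_NONE,
               DM_BOUNDARY_NONE, DMDA_STENCIL_BOX, 80, 80, 9,
               PETSC_DECIDE, PETSC_DECIDE, PETSC_DECIDE,
               1, 1, NULL, NULL, NULL, &da);
  DMSetUp(da);
  DMDAGetInfo(da, NULL, &M, &N, &P, &m, &n, &p,
              NULL, NULL, NULL, NULL, NULL, NULL);
  PetscPrintf(PETSC_COMM_WORLD,
              "global %D x %D x %D on %d ranks -> process grid %D x %D x %D\n",
              M, N, P, size, m, n, p);
  DMDestroy(&da);
  PetscFinalize();
  return 0;
}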

Thanks,
Justin

On Sun, Apr 2, 2017 at 11:29 AM, Jed Brown <jed at jedbrown.org> wrote:

> Justin Chang <jychang48 at gmail.com> writes:
>
> > Thanks guys,
> >
> > So I want to run SNES ex48 across 1032 processes on Edison,
>
> How did you decide on 1032 processes?  What shape did the DMDA produce?
> Of course this should work, but we didn't explicitly test that in the
> paper since we were running on BG/P.
>
>   https://github.com/jedbrown/tme-ice/tree/master/shaheen/b
>
> > but I keep getting segmentation violations. These are the parameters I
> > am trying:
> >
> > srun -n 1032 -c 2 ./ex48 -M 80 -N 80 -P 9 -da_refine 1 -pc_type mg
> > -thi_mat_type baij -mg_coarse_pc_type gamg
> >
> > The above works perfectly fine if I use 96 processes. I also tried to use
> > a finer coarse mesh on 1032 processes, but the error persists.
> >
> > Any ideas why this is happening? What are the ideal parameters to use if I
> > want to use 1k+ cores?
> >
> > Thanks,
> > Justin
> >
> > On Fri, Mar 31, 2017 at 12:47 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
> >
> >>
> >> > On Mar 31, 2017, at 10:00 AM, Jed Brown <jed at jedbrown.org> wrote:
> >> >
> >> > Justin Chang <jychang48 at gmail.com> writes:
> >> >
> >> >> Yeah based on my experiments it seems setting pc_mg_levels to
> >> >> $DAREFINE + 1 has decent performance.
> >> >>
> >> >> 1) is there ever a case where you'd want $MGLEVELS <= $DAREFINE? In
> >> >> some of the PETSc tutorial slides (e.g.,
> >> >> http://www.mcs.anl.gov/petsc/documentation/tutorials/TutorialCEMRACS2016.pdf
> >> >> on slide 203/227) they say to use $MGLEVELS = 4 and $DAREFINE = 5, but
> >> >> when I ran this, it was almost twice as slow as if $MGLEVELS >= $DAREFINE
> >> >
> >> > Smaller coarse grids are generally more scalable -- when the problem
> >> > data is distributed, multigrid is a good solution algorithm.  But if
> >> > multigrid stops being effective because it is not preserving sufficient
> >> > coarse grid accuracy (e.g., for transport-dominated problems in
> >> > complicated domains) then you might want to stop early and use a more
> >> > robust method (like direct solves).
> >>
> >> Basically for symmetric positive definite operators you can make the
> >> coarse problem as small as you like (even 1 point) in theory. For
> >> indefinite and non-symmetric problems the theory says the "coarse grid
> >> must be sufficiently fine" (loosely speaking, the coarse grid has to
> >> resolve the eigenmodes for the eigenvalues to the left of x = 0).
> >>
> >> https://www.jstor.org/stable/2158375?seq=1#page_scan_tab_contents
> >>
> >>
> >>
>
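
P.S. Regarding the $MGLEVELS = $DAREFINE + 1 relation quoted above: the way I
read it, -da_refine k gives k+1 grids (the -M/-N/-P DMDA plus k refinements),
so setting -pc_mg_levels to k+1 lets multigrid use every one of them. Below is
a rough sketch of that bookkeeping, assuming a refinement factor of 2 and
non-periodic directions (where a refined DMDA has 2*m - 1 points per
direction; periodic directions double instead, so adjust for the actual
boundary types):

#include <petscsys.h>

/* Non-periodic refinement rule; periodic directions would use 2*m instead. */
static PetscInt Refine(PetscInt m) { return 2*m - 1; }

int main(int argc, char **argv)
{
  PetscInt M = 80, N = 80, P = 9, nrefine = 1, i;

  PetscInitialize(&argc, &argv, NULL, NULL);
  /* Read the same -M/-N/-P/-da_refine options used on the ex48 command line. */
  PetscOptionsGetInt(NULL, NULL, "-M", &M, NULL);
  PetscOptionsGetInt(NULL, NULL, "-N", &N, NULL);
  PetscOptionsGetInt(NULL, NULL, "-P", &P, NULL);
  PetscOptionsGetInt(NULL, NULL, "-da_refine", &nrefine, NULL);
  for (i = 0; i <= nrefine; i++) {
    PetscPrintf(PETSC_COMM_WORLD, "grid %D: %D x %D x %D\n", i, M, N, P);
    if (i < nrefine) { M = Refine(M); N = Refine(N); P = Refine(P); }
  }
  PetscPrintf(PETSC_COMM_WORLD,
              "-pc_mg_levels %D would use all of these grids\n", nrefine + 1);
  PetscFinalize();
  return 0;
}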

