[petsc-dev] no petsc on Edison
Mark Adams
mfadams at lbl.gov
Tue Feb 28 08:04:16 CST 2017
Yes, this looks great. Good to get some data on this. Hard problem.
On Mon, Feb 27, 2017 at 3:31 PM, Barry Smith <bsmith at mcs.anl.gov> wrote:
>
> Hong,
>
> Very nice, thanks!
>
> Barry
>
> > On Feb 27, 2017, at 2:06 PM, Hong <hzhang at mcs.anl.gov> wrote:
> >
> > Mark,
> > I implemented scalable MatPtAP and compared three implementations
> > using ex56.c on the ALCF Cetus machine (this machine has small
> > memory, 1GB/core):
> > - nonscalable PtAP: use an array of length PN to do dense axpy
> > - scalable PtAP: do sparse axpy without use of PN array
> > - hypre PtAP.
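
The difference between the two PETSc variants listed above (dense axpy into an array of length PN versus sparse axpy without it) can be sketched in plain C. This is only an illustration under stated assumptions: `dense_axpy` and `sparse_axpy` are hypothetical helper names, not PETSc routines, and the sketch assumes rows are stored as sorted column-index/value pairs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a PETSc routine.
   Dense axpy: work += alpha * row, where work has one slot per global
   column (length N), so its memory grows with N regardless of sparsity. */
static void dense_axpy(double *work, size_t N, double alpha,
                       const int *cols, const double *vals, size_t nnz)
{
  for (size_t k = 0; k < nnz; k++) {
    assert(cols[k] >= 0 && (size_t)cols[k] < N);
    work[cols[k]] += alpha * vals[k];
  }
}

/* Hypothetical helper, not a PETSc routine.
   Sparse axpy: out = acc + alpha * row, merging two rows whose column
   indices are sorted ascending. The caller provides out buffers with
   room for acc_nnz + nnz entries. Returns the number of entries written. */
static size_t sparse_axpy(const int *acc_cols, const double *acc_vals, size_t acc_nnz,
                          double alpha,
                          const int *cols, const double *vals, size_t nnz,
                          int *out_cols, double *out_vals)
{
  size_t i = 0, j = 0, n = 0;
  while (i < acc_nnz || j < nnz) {
    if (j == nnz || (i < acc_nnz && acc_cols[i] < cols[j])) {
      /* column present only in the accumulator */
      out_cols[n] = acc_cols[i]; out_vals[n] = acc_vals[i]; i++;
    } else if (i == acc_nnz || cols[j] < acc_cols[i]) {
      /* column present only in the incoming row */
      out_cols[n] = cols[j]; out_vals[n] = alpha * vals[j]; j++;
    } else {
      /* column present in both: add the contributions */
      out_cols[n] = cols[j]; out_vals[n] = acc_vals[i] + alpha * vals[j]; i++; j++;
    }
    n++;
  }
  return n;
}
```

The dense version does one indexed update per nonzero but needs the full-width work array; the sparse version needs memory proportional only to the nonzeros, at the price of a merge with branches per entry.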
> >
> > The results are attached. Summary:
> > - nonscalable PtAP is 2x faster than scalable, 8x faster than hypre PtAP
> > - scalable PtAP is 4x faster than hypre PtAP
> > - hypre uses less memory (see job.ne399.n63.np1000.sh)
> >
> > Based on the above observations, I set the default PtAP algorithm to
> > 'nonscalable'.
> > When PN > the local estimated number of nonzeros of C=PtAP, the default
> > switches to 'scalable'.
> > The user can override the default.
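
The default-selection rule described above can be written as a small standalone C function. This is a sketch of the stated rule only, not the code that was merged: the enum, the function name `choose_ptap_alg`, and the way a user override is passed in are all hypothetical, and the hypre variant is left out because the message describes it only as a user-selectable alternative.

```c
#include <stdbool.h>

/* Hypothetical names for illustration; not PETSc identifiers. */
typedef enum { PTAP_NONSCALABLE, PTAP_SCALABLE } PtAPAlg;

/* Rule as stated in the message:
   - an explicit user choice always wins;
   - otherwise default to nonscalable, unless PN exceeds the local
     estimated number of nonzeros of C = PtAP, in which case use scalable. */
static PtAPAlg choose_ptap_alg(bool user_set, PtAPAlg user_choice,
                               long long PN, long long local_nnz_est)
{
  if (user_set) return user_choice;
  return (PN > local_nnz_est) ? PTAP_SCALABLE : PTAP_NONSCALABLE;
}
```

The comparison reflects the memory trade-off: once the length-PN work array would be larger than the local part of the product itself, the dense approach no longer pays for its storage.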
> >
> > For the case of np=8000, ne=599 (see job.ne599.n500.np8000.sh), I get
> >   MatPtAP           3.6224e+01  (nonscalable for small mats, scalable for larger ones)
> >   scalable MatPtAP  4.6129e+01
> >   hypre             1.9389e+02
> >
> > I'm merging this work to next, then to master soon.
> >
> > Hong
> >
> > On Wed, Feb 8, 2017 at 7:25 PM, Mark Adams <mfadams at lbl.gov> wrote:
> > Thanks Hong, that sounds great.
> >
> > I am wary of silent optimizations like the one you suggest, but 2x is big
> > and failure is very bad. So I would vote for your suggestion.
> >
> > ex56 is an elasticity problem. It would be nice, now that you have this
> > experimental setup in place, to compare with hypre on a 3D scalar problem.
> > Hypre might not spend much effort optimizing for block matrices. 3x better
> > than hypre seems large to me. I have to suspect that hypre does not
> > exploit blocked matrices as much as we do.
> >
> > Thanks again,
> >
> >
> >
> > On Wed, Feb 8, 2017 at 12:07 PM, Hong <hzhang at mcs.anl.gov> wrote:
> > I conducted tests on MatMatMult() and MatPtAP() using
> > petsc/src/ksp/ksp/examples/tutorials/ex56.c (gamg) on an 8-core machine
> > (petsc machine). The output file is attached.
> >
> > Summary:
> > 1) non-scalable MatMatMult() for mpiaij format is 2x faster than the
> > scalable version. The major difference between the two is dense-axpy vs.
> > sparse-axpy.
> >
> > Currently, we set non-scalable as the default, which causes problems when
> > running large problems.
> > How about setting default as
> > - non-scalable for small to medium size matrices
> > - scalable for larger ones, e.g.
> >
> > +  ierr = PetscOptionsEList("-matmatmult_via","Algorithmic approach","MatMatMult",algTypes,nalg,algTypes[1],&alg,&flg);
> >
> > +  if (!flg) { /* set default algorithm based on B->cmap->N */
> > +    PetscMPIInt size;
> > +    ierr = MPI_Comm_size(comm,&size);CHKERRQ(ierr);
> > +    if ((PetscReal)(B->cmap->N)/size > 100000.0) alg = 0; /* scalable algorithm */
> > +  }
> >
> > i.e., if the user does NOT pick an algorithm and the average number of
> > columns per process exceeds 100k, use the scalable implementation;
> > otherwise, use the non-scalable version.
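
The threshold test in the quoted patch can be tried out without PETSc or MPI. The sketch below is a standalone restatement under assumptions: `default_matmatmult_alg` is a hypothetical name, `N` stands in for `B->cmap->N`, `size` for the communicator size, and index 1 is taken to be the non-scalable algorithm because the patch passes `algTypes[1]` as the default and the message describes the fallback as non-scalable.

```c
/* Hypothetical standalone restatement of the default choice in the patch.
   Return value follows the patch's indices:
     0 = scalable, 1 = non-scalable (inferred from algTypes[1] being the default). */
static int default_matmatmult_alg(long long N, int size)
{
  int alg = 1;                                   /* non-scalable by default */
  if ((double)N / size > 100000.0) alg = 0;      /* average columns per process > 100k */
  return alg;
}
```

For example, 800000 global columns on 8 processes gives exactly 100000 columns per process, which is not strictly greater than the threshold, so the non-scalable default is kept; 800008 columns on 8 processes tips it to the scalable algorithm.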
> >
> > 2) We do NOT have scalable implementation for MatPtAP() yet.
> > We have a non-scalable PtAP and an interface to Hypre's PtAP. Comparing
> > the two, PETSc's MatPtAP() is approx. 3x faster than Hypre's.
> >
> > I'm writing a scalable MatPtAP() now.
> >
> > Hong
> >
> > On Thu, Feb 2, 2017 at 2:54 PM, Stefano Zampini <stefano.zampini at gmail.com> wrote:
> >
> >
> > On 02 Feb 2017 23:43, "Mark Adams" <mfadams at lbl.gov> wrote:
> >
> >
> > On Thu, Feb 2, 2017 at 12:02 PM, Stefano Zampini <stefano.zampini at gmail.com> wrote:
> > Mark,
> >
> > I saw your configuration has hypre. If you could run with master, you
> > may try -matptap_via hypre.
> >
> > This is worth trying. Does this even work with GAMG?
> >
> > Yes, it should work, except that the block sizes, if any, are not
> > propagated to the resulting matrix. I can add it if you need it.
> >
> >
> >
> > Treb: try hypre anyway. It has its own RAP code.
> >
> >
> > With that option, you will use hypre's RAP with MATAIJ
> >
> >
> > It uses BoomerAMGBuildCoarseOperator directly with the AIJ matrices.
> >
> > Stefano
> >
> >> On Feb 2, 2017, at 7:28 PM, Mark Adams <mfadams at lbl.gov> wrote:
> >>
> >>
> >>
> >> On Thu, Feb 2, 2017 at 11:13 AM, Hong <hzhang at mcs.anl.gov> wrote:
> >> Mark:
> >> Try '-matmatmult_via scalable' first. If this works, should we set it
> >> as default?
> >>
> >> If it is robust I would say yes, unless it is noticeably slower (say
> >> >20%) on small-scale problems.
> >>
> >
> >
> >
> >
> >
> >
> > <out_ex56_cetus_short>
>
>