[petsc-dev] development version for MATAIJMKL type mat
Bakytzhan Kallemov
bkallemov at lbl.gov
Tue Oct 10 15:06:37 CDT 2017
On 10/10/2017 12:47 PM, Mark Adams wrote:
>
>
>
> What are you comparing? Are you using, say, 32 MPI processes and
> 2 threads or 16 MPI processes and 4 threads? How are you
> controlling the number of OpenMP threads, via the OpenMP environment
> variable? What parts of the time in the code are you comparing?
> You should just run with -log_view and compare the times for PCApply
> and PCSetUp() between, say, 64 MPI processes/1 thread and 32 MPI
> processes/2 threads, and send us the output for those two cases.
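
For reference, a sketch of those two runs, assuming a SLURM srun
launcher on a KNL node, one hardware thread per core, and a
placeholder executable name:

    # 64 MPI processes, 1 OpenMP thread each
    OMP_NUM_THREADS=1 srun -n 64 -c 1 ./app -log_view > log_64x1.txt

    # 32 MPI processes, 2 OpenMP threads each
    OMP_NUM_THREADS=2 srun -n 32 -c 2 ./app -log_view > log_32x2.txt

Then compare the PCSetUp and PCApply rows of the two -log_view
summaries.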
>
>
> These folks don't use many MPI processes. I'm not sure what the
> optimal configuration is with Chombo-Crunch when using all of Cori.
>
> Baky: how many MPI processes per socket are you aiming for on Cori-KNL?
Right now I am testing it on a single KNL node, going from flat 64+1
down to 2+32 for comparison.
But as you can see from the plot in the previous mail, we have a sweet
spot at the 16+4 point; we then scale that configuration accordingly
when running with 8k nodes.
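
A sketch of how such a sweep might be scripted, assuming a SLURM srun
launcher, a 64-core KNL node, one hardware thread per core, and a
placeholder executable name:

    for n in 64 32 16 8 4 2; do
      t=$((64 / n))
      OMP_NUM_THREADS=$t srun -n $n -c $t ./app -log_view > log_${n}x${t}.txt
    done

Each log then carries the PCSetUp/PCApply timings for one MPI+threads
point.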
>
> >
> > It seems that it made no difference, so perhaps I am doing
> > something wrong or my build is not configured right.
> >
> > Do you have any example that makes use of threads when running
> > hybrid and shows an advantage?
>
> There is no reason to think that using threads on KNL is
> faster than just using MPI processes. Despite what the NERSC/LBL
> web pages may say, just because a website says something doesn't
> make it true.
>
>
> >
> > I'd like to test it and make sure that my libs are configured
> > correctly, before starting to investigate it further.
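
One way to check that before digging further, assuming the PETSc build
was configured against MKL and an application (or any PETSc example)
that calls MatSetFromOptions(); the executable name is a placeholder:

    ./app -mat_type aijmkl -ksp_view -log_view

If -ksp_view reports the system matrix type as seqaijmkl (or mpiaijmkl
in parallel), the MATAIJMKL kernels are in use, and the MatMult row of
-log_view shows their cost.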
> >
> >
> > Thanks,
> >
> > Baky
> >
> >
>
>