[petsc-dev] development version for MATAIJMKL type mat
Bakytzhan Kallemov
bkallemov at lbl.gov
Tue Oct 10 15:06:37 CDT 2017
On 10/10/2017 12:47 PM, Mark Adams wrote:
>
>
>
> What are you comparing? Are you using, say, 32 MPI processes and
> 2 threads or 16 MPI processes and 4 threads? How are you
> controlling the number of OpenMP threads, via the OpenMP environment
> variable? What parts of the time in the code are you comparing?
> You should just run with -log_view and compare the times for PCApply
> and PCSetUp() between, say, 64 MPI processes/1 thread and 32 MPI
> processes/2 threads, and send us the output for those two cases.
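
For reference, a sketch of those two runs, assuming a SLURM srun
launcher on a KNL node, one hardware thread per core, and a
placeholder executable name:

    # 64 MPI processes, 1 OpenMP thread each
    OMP_NUM_THREADS=1 srun -n 64 -c 1 ./app -log_view > log_64x1.txt

    # 32 MPI processes, 2 OpenMP threads each
    OMP_NUM_THREADS=2 srun -n 32 -c 2 ./app -log_view > log_32x2.txt

Then compare the PCSetUp and PCApply rows of the two -log_view
summaries.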
>
>
> These folks don't use many MPI processes. I'm not sure what the
> optimal configuration is with Chombo-Crunch when using all of Cori.
>
> Baky: how many MPI processes per socket are you aiming for on Cori-KNL?
Right now I am testing it on a single KNL node, going from flat 64+1
down to 2+32 for comparison.
But as you can see from the plot in the previous mail, we have a sweet
spot at the 16+4 point; we then scale that configuration accordingly
when running with 8k nodes.
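
A sketch of how such a sweep might be scripted, assuming a SLURM srun
launcher, a 64-core KNL node, one hardware thread per core, and a
placeholder executable name:

    for n in 64 32 16 8 4 2; do
      t=$((64 / n))
      OMP_NUM_THREADS=$t srun -n $n -c $t ./app -log_view > log_${n}x${t}.txt
    done

Each log then carries the PCSetUp/PCApply timings for one MPI+threads
point.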
>
> >
> > It seems that it made no difference, so perhaps I am doing
> > something wrong or my build is not configured right.
> >
> > Do you have any example that makes use of threads when running
> > hybrid and shows an advantage?
>
> There is no reason to think that using threads on KNL is
> faster than just using MPI processes. Despite what the NERSC/LBL
> web pages may say, just because a website says something doesn't
> make it true.
>
>
> >
> > I'd like to test it and make sure that my libs are configured
> > correctly, before starting to investigate it further.
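
One way to check that before digging further, assuming the PETSc build
was configured against MKL and an application (or any PETSc example)
that calls MatSetFromOptions(); the executable name is a placeholder:

    ./app -mat_type aijmkl -ksp_view -log_view

If -ksp_view reports the system matrix type as seqaijmkl (or mpiaijmkl
in parallel), the MATAIJMKL kernels are in use, and the MatMult row of
-log_view shows their cost.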
> >
> >
> > Thanks,
> >
> > Baky
> >
> >
>
>