[petsc-users] Some routines are very expensive, such as DMPlexGetSupport and DMPlexPointLocalRef.
李季
leejearl at 126.com
Mon Dec 5 04:34:29 CST 2016
Hi Matt:
Thank you for your kind reply. I became aware of this problem through a test case. I simulate the lid-driven cavity
with my code on a 100x100 2D grid. I use the routine DMPlexReconstructGradientsFVM to compute
the gradients and limiters. The limiter I use is PETSCLIMITERMINMOD. I marched 1000
steps, and the run time was much higher than I expected. I then called DMPlexReconstructGradientsFVM
in a loop 1000 times, and it took nearly 170 seconds. I have browsed the source of DMPlexReconstructGradientsFVM.
The arithmetic is very clean, so I think the cost comes from the large number of function calls, such as
VecGetArray, DMPlexGetSupport and DMPlexPointLocalRef.
As a further test, I rewrote DMPlexReconstructGradientsFVM and named my version DMPlexReconstructGradientsFVM_1.
When I call DMPlexReconstructGradientsFVM_1 in a loop 1000 times, the time drops to 30 seconds. The change in my
code is that I call these functions once, outside the loop, and pass the resulting data into DMPlexReconstructGradientsFVM_1. The
program flow is as follows:
VecGetArray
DMPlexGetSupport
DMPlexPointLocalRef
...
for (i = 0; i < 1000; ++i)
{
    /* Here data is what I extracted from the DMPlex above using VecGetArray etc. */
    DMPlexReconstructGradientsFVM_1(data, ....)
}
The code using DMPlexReconstructGradientsFVM looks like this:
for (i = 0; i < 1000; ++i)
{
    function DMPlexReconstructGradientsFVM
    {
        VecGetArray
        DMPlexGetSupport
        DMPlexPointLocalRef
        ...
    }
}
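The structural difference between the two flows can be shown in a small standalone C sketch. It does not use PETSc: ToyMesh, toy_values, toy_face_cells and the two jump_sum_* functions are hypothetical names made up for this illustration, and the accessors only stand in for calls such as VecGetArray and DMPlexGetSupport. The sketch shows that both variants compute the same result while the first goes through the accessors on every call. It makes no claim about how much time that saves in the real routine.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for mesh data; this is not a PETSc type. */
typedef struct {
    int     num_faces;
    int    *face_cells;   /* two cells per face: face_cells[2*f], face_cells[2*f+1] */
    double *cell_values;  /* one value per cell */
    long    lookups;      /* counts accessor calls, for illustration only */
} ToyMesh;

/* Accessor standing in for a per-point lookup such as DMPlexGetSupport. */
static const int *toy_face_cells(ToyMesh *m, int f)
{
    assert(f >= 0 && f < m->num_faces);
    m->lookups++;
    return &m->face_cells[2 * f];
}

/* Accessor standing in for an array fetch such as VecGetArray. */
static const double *toy_values(ToyMesh *m)
{
    m->lookups++;
    return m->cell_values;
}

/* Variant 1: every call goes through the accessors again. */
static double jump_sum_lookup_inside(ToyMesh *m)
{
    const double *x = toy_values(m);
    double sum = 0.0;
    for (int f = 0; f < m->num_faces; ++f) {
        const int *c = toy_face_cells(m, f);
        sum += x[c[1]] - x[c[0]];
    }
    return sum;
}

/* Variant 2: the caller fetched the raw arrays once and passes them in. */
static double jump_sum_prefetched(int num_faces, const int *face_cells,
                                  const double *x)
{
    double sum = 0.0;
    for (int f = 0; f < num_faces; ++f)
        sum += x[face_cells[2 * f + 1]] - x[face_cells[2 * f]];
    return sum;
}
```

Reusing fetched pointers like this is only safe while the underlying data is neither reallocated nor modified behind the caller's back. In PETSc, each VecGetArray also has to be paired with a VecRestoreArray.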
Compared with DMPlexReconstructGradientsFVM_1, DMPlexReconstructGradientsFVM makes far more function calls,
which makes it very expensive. So I am writing to ask whether there are compiler options I can use to
reduce the time cost.
Thanks.
leejearl
At 2016-12-04 21:34:49, "Matthew Knepley" <knepley at gmail.com> wrote:
On Sun, Dec 4, 2016 at 1:58 AM, leejearl <leejearl at 126.com> wrote:
Hi, all PETSc developers:
Thank you for your great work. I have built my FVM code on PETSc.
It works well, and the results are beautiful. But I found that some
functions, such as DMPlexGetSupport and DMPlexPointLocalRef, are very expensive.
I can believe that some parts are expensive, but I think it is probably something other than
GetSupport() and PointLocalRef(). Let's look at the code. First, GetSupport() is just two pointer lookups
https://bitbucket.org/petsc/petsc/src/8191f1e31285033beeebf70760bc9786361aefca/src/dm/impls/plex/plex.c?at=master&fileviewer=file-view-default#plex.c-1502
and for PointLocalRef() it's one lookup and some arithmetic
https://bitbucket.org/petsc/petsc/src/8191f1e31285033beeebf70760bc9786361aefca/src/dm/impls/plex/plexpoint.c?at=master&fileviewer=file-view-default#plexpoint.c-105
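As a rough picture of that access pattern, here is a minimal standalone C sketch. It is a simplified model and not PETSc's implementation: ToyMesh, toy_get_support and toy_point_ref are hypothetical names, and the flat offset arrays are an assumption made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for support storage: all support lists are
 * concatenated in one array, indexed through a per-point offset table. */
typedef struct {
    int        num_points;
    const int *offsets;   /* length num_points + 1; offsets[p] starts p's list */
    const int *supports;  /* concatenated support lists */
} ToyMesh;

/* Two lookups: read the offset of point p, then form the address of
 * its list inside the concatenated array. No copying, no searching. */
static const int *toy_get_support(const ToyMesh *m, int p, int *size)
{
    assert(p >= 0 && p < m->num_points);
    const int off = m->offsets[p];
    *size = m->offsets[p + 1] - off;
    return &m->supports[off];
}

/* One lookup plus pointer arithmetic: read the data offset of point p
 * and return the address of its first value in the local array. */
static double *toy_point_ref(const int *dof_offsets, int p, double *array)
{
    return &array[dof_offsets[p]];
}
```

In a model like this, each call is a handful of loads and an addition, which is why one would expect the per-call cost to be very small.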
I have benchmark code that runs these, and they should definitely take < 1e-7s, and maybe
10-100 times less. You can look at Plex test ex9 to see some of it.
What is taking a lot of time?
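One way to approach that question is to time each phase of the computation on its own. The following is a minimal sketch in standard C, assuming coarse CPU-time measurement with clock() is good enough. seconds_per_call, phase_fn and count_call are hypothetical names for this illustration and are not part of PETSc.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Signature of a phase to be timed; ctx carries whatever data it needs. */
typedef void (*phase_fn)(void *ctx);

/* Run fn(ctx) reps times and return the average CPU time per call in
 * seconds. clock() has limited resolution, so use a large reps value
 * when the phase is short. */
static double seconds_per_call(phase_fn fn, void *ctx, long reps)
{
    assert(fn != NULL && reps > 0);
    const clock_t t0 = clock();
    for (long i = 0; i < reps; ++i)
        fn(ctx);
    const clock_t t1 = clock();
    return (double)(t1 - t0) / (double)CLOCKS_PER_SEC / (double)reps;
}

/* Trivial example phase: counts how many times it was invoked. */
static void count_call(void *ctx)
{
    ++*(long *)ctx;
}
```

Wrapping the array fetches, the per-face lookups and the arithmetic in separate phase functions would show which of them dominates. PETSc's built-in profiling (the -log_view option in recent versions) reports similar per-event timings without hand-written timers.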
Thanks,
Matt
It costs a lot of time when such routines are involved. Is there any way to reduce
the time cost and improve the efficiency of the executable?
Thanks
leejearl
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener