[petsc-users] Some routines are very expensive, such as DMPlexGetSupport and DMPlexPointLocalRef.

Mon Dec 5 04:34:29 CST 2016

Hi Matt:
    Thank you for your kind reply. I became aware of this problem from my test case. I simulate the lid-driven cavity
with the code, and the grid is a 100x100 2D domain. I use the routine DMPlexReconstructGradientsFVM to compute
the gradients and limiters. The limiter I use in the code is PETSCLIMITERMINMOD. I marched 1000
steps, and the time cost was much higher than I expected. Then I called the function DMPlexReconstructGradientsFVM
in a loop 1000 times, and it took nearly 170 seconds. I have browsed the code of the routine DMPlexReconstructGradientsFVM.
The arithmetic is very clean, so I think the cost comes from the large number of function calls, such as
VecGetArray, DMPlexGetSupport and DMPlexPointLocalRef.

I ran a further test: I rewrote DMPlexReconstructGradientsFVM myself and named the new version DMPlexReconstructGradientsFVM_1.
When I call DMPlexReconstructGradientsFVM_1 in a loop 1000 times, the time drops to 30 seconds. The change in my
own code is that I call these functions once, outside the loop, and then pass the extracted data into DMPlexReconstructGradientsFVM_1. The
program flow is as follows:
VecGetArray
DMPlexGetSupport
DMPlexPointLocalRef
...
for (i = 0; i < 1000; ++i)
{
    /* data is what I extracted from the DMPlex using VecGetArray etc. */
    DMPlexReconstructGradientsFVM_1(data, ....)
}
The code using DMPlexReconstructGradientsFVM looks like this:
for (i = 0; i < 1000; ++i)
{
    function DMPlexReconstructGradientsFVM
    {
        VecGetArray
        DMPlexGetSupport
        DMPlexPointLocalRef
        ...
    }
}
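
The two flows above can be sketched in plain C. This is a toy stand-in, not PETSc code: ToyMesh, toy_get_support, jump_sum_lookup and jump_sum_cached are hypothetical names, with toy_get_support playing the role of a support query such as DMPlexGetSupport. Both variants compute the same sum; they differ only in where the connectivity lookup happens.

```c
#include <assert.h>

/* Hypothetical stand-in for a mesh: each face stores the two cells it separates. */
typedef struct {
    int        nfaces;
    const int *support;   /* 2 * nfaces entries: the cells on either side of each face */
} ToyMesh;

/* Stand-in for a support query: a bounds check plus one pointer offset. */
static void toy_get_support(const ToyMesh *m, int face, const int **cells)
{
    assert(face >= 0 && face < m->nfaces);
    *cells = m->support + 2 * face;
}

/* Variant A: query the mesh inside the kernel, for every face, on every call. */
double jump_sum_lookup(const ToyMesh *m, const double *u)
{
    double sum = 0.0;
    for (int f = 0; f < m->nfaces; ++f) {
        const int *c;
        toy_get_support(m, f, &c);
        sum += u[c[1]] - u[c[0]];
    }
    return sum;
}

/* Variant B: the caller extracts the connectivity once and passes it in. */
double jump_sum_cached(int nfaces, const int *support, const double *u)
{
    double sum = 0.0;
    for (int f = 0; f < nfaces; ++f)
        sum += u[support[2 * f + 1]] - u[support[2 * f]];
    return sum;
}
```

Timing both variants in a loop on the same data would show how much of the cost is really due to the lookups, as opposed to other work inside the routine.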

Compared with DMPlexReconstructGradientsFVM_1, DMPlexReconstructGradientsFVM makes many more function calls,
and I think this is what makes it so expensive. So I am writing to ask whether I can use some compiler options to
reduce the time cost.
    Thanks.

leejearl

At 2016-12-04 21:34:49, "Matthew Knepley" <knepley at gmail.com> wrote:

On Sun, Dec 4, 2016 at 1:58 AM, leejearl <leejearl at 126.com> wrote:

Hi, all PETSc developers:

    Thank you for your great work. I have deployed my FVM code based on PETSc.
It works well, and the results are beautiful. But I found a problem: some of the
functions, such as DMPlexGetSupport and DMPlexPointLocalRef, are very expensive.

I can believe that some parts are expensive, but I think it is probably something other than
GetSupport() and PointLocalRef(). Let's look at the code. First, support is just two pointer lookups

  https://bitbucket.org/petsc/petsc/src/8191f1e31285033beeebf70760bc9786361aefca/src/dm/impls/plex/plex.c?at=master&fileviewer=file-view-default#plex.c-1502

and for PointLocalRef() it's one lookup and some arithmetic

  https://bitbucket.org/petsc/petsc/src/8191f1e31285033beeebf70760bc9786361aefca/src/dm/impls/plex/plexpoint.c?at=master&fileviewer=file-view-default#plexpoint.c-105

I have benchmark code that runs these, and they should definitely take < 1e-7s, and maybe
10-100 times less. You can look at Plex test ex9 to see some of it.
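
One way to check a per-call figure like this is a small timing loop. The following is a sketch in plain C, not the PETSc benchmark mentioned above; toy_support and seconds_per_lookup are hypothetical names, and clock() measures CPU time only coarsely, so many repetitions are needed.

```c
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for a cheap lookup: one pointer offset. */
static inline const int *toy_support(const int *support, int face)
{
    return support + 2 * face;
}

/* Average CPU seconds per lookup over `reps` sweeps of `nfaces` faces. */
double seconds_per_lookup(const int *support, int nfaces, int reps)
{
    assert(nfaces > 0 && reps > 0);
    volatile long sink = 0;   /* keeps the compiler from deleting the loop */
    clock_t t0 = clock();
    for (int r = 0; r < reps; ++r)
        for (int f = 0; f < nfaces; ++f)
            sink += toy_support(support, f)[0];
    clock_t t1 = clock();
    return (double)(t1 - t0) / CLOCKS_PER_SEC / ((double)reps * nfaces);
}
```

For scale, the numbers in this thread give 170 s / 1000 = 0.17 s per call of DMPlexReconstructGradientsFVM.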

What is taking a lot of time?

  Thanks,

     Matt

It costs a lot of time when such routines are involved. Is there any method one can use to reduce
the time cost and improve the efficiency of the executable?
     Thanks
leejearl
--

What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener