MatAssemblyBegin/End

Barry Smith bsmith at mcs.anl.gov
Thu Jun 19 12:49:17 CDT 2008


   Overlapping communication and computation is largely a myth. The CPU is
still involved in packing and unpacking each message (and thus is
unavailable for computation during that time). I expect exactly the
behavior you have seen.
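
For concreteness, a minimal sketch of the split-phase assembly pattern
under discussion; the matrix A and the local_work() routine are
placeholders, not something from this thread:

  #include <petscmat.h>

  PetscErrorCode AssembleWithOverlap(Mat A)
  {
    PetscErrorCode ierr;

    /* ... all MatSetValues() calls happen before this point ... */

    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);

    /* Intended overlap: unrelated local computation would go here.
       Because the CPU still packs and unpacks the off-process entries,
       little of the assembly cost is actually hidden. */
    /* local_work(); */

    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    return 0;
  }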

    Barry

The only way I know to reduce the time in these stages is
1) make sure that the time each process spends creating the matrix
   entries and calling MatSetValues() is evenly balanced between
   processes, and
2) make sure that "most" matrix entries are created on the process where
   they will eventually live, so that less data must be moved in the
   MatAssemblyBegin/End() calls (see the sketch below).

   These are both pretty hard to tune perfectly. I would just live with
the 27 seconds out of 10 minutes.
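
As an illustration of point 2), here is a minimal sketch, with placeholder
entry values, of using MatGetOwnershipRange() so that each process only
inserts into the rows it owns:

  #include <petscmat.h>

  PetscErrorCode FillOwnedRows(Mat A)
  {
    PetscErrorCode ierr;
    PetscInt       rstart, rend, i;
    PetscScalar    v = 1.0;   /* placeholder entry value */

    /* Rows [rstart, rend) live on this process */
    ierr = MatGetOwnershipRange(A, &rstart, &rend); CHKERRQ(ierr);

    for (i = rstart; i < rend; i++) {
      /* Insert only into locally owned rows, so nothing has to be
         shipped to another process during assembly. */
      ierr = MatSetValues(A, 1, &i, 1, &i, &v, INSERT_VALUES); CHKERRQ(ierr);
    }

    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); CHKERRQ(ierr);
    return 0;
  }
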
On Jun 19, 2008, at 11:12 AM, Andrew Colombi wrote:

> I'm trying to overlap as much as possible MatAssembly with other
> computations, and I'm finding a confusing result.  If I follow my
> AssemblyBegin with an immediate AssemblyEnd it takes 31 seconds to
> assemble the matrix.  If I interleave a 10 minute computation between
> AssemblyBegin and AssemblyEnd I find that executing only AssemblyEnd
> still takes 27 seconds.  So it takes 31 seconds to complete the entire
> transaction, or after 10 minutes of compute I still find myself stuck
> with 27 seconds of wait time.
>
> Now clearly, from the standpoint of optimization, 27 seconds in the
> presence of 10 minute computations is not something to waste brain
> cycles on. Nevertheless, I'm always curious about discrepancies
> between the world in my head and the actual world ;-) Here are some
> points that may be of interest:
>
> * I'm using a debug compile of PETSc.  I wouldn't guess this makes
> much difference as long as BLAS and LAPACK are optimized.
> * One node does not participate in the computation; instead it acts as
> a work queue, doling out work whenever a "worker" processor becomes
> available.  As such, the "server" node makes a lot of calls to
> MPI_Iprobe (a sketch of this pattern follows below).  Could this be
> interfering with PETSc's background use of MPI?
>
> Thanks,
> -Andrew
>
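
For reference, a minimal, hypothetical sketch of the MPI_Iprobe work-queue
pattern described above; the tags, message contents, and work bookkeeping
are placeholders, not taken from this thread:

  #include <mpi.h>

  #define TAG_REQUEST 1
  #define TAG_WORK    2

  /* The "server" rank polls for requests from workers and hands out
     work-item indices; -1 tells a worker there is no more work. */
  void ServeWork(int nworkers, int nitems)
  {
    int item = 0, done = 0;
    while (done < nworkers) {
      int        flag;
      MPI_Status status;

      /* Non-blocking check for a worker asking for more work */
      MPI_Iprobe(MPI_ANY_SOURCE, TAG_REQUEST, MPI_COMM_WORLD, &flag, &status);
      if (flag) {
        int dummy, next;
        MPI_Recv(&dummy, 1, MPI_INT, status.MPI_SOURCE, TAG_REQUEST,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
        next = (item < nitems) ? item++ : -1;
        if (next < 0) done++;
        MPI_Send(&next, 1, MPI_INT, status.MPI_SOURCE, TAG_WORK,
                 MPI_COMM_WORLD);
      }
    }
  }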



