Time for MatAssembly

Satish Balay balay at mcs.anl.gov
Tue May 19 11:47:39 CDT 2009

On Tue, 19 May 2009, tribur at vision.ee.ethz.ch wrote:

> Distinguished PETSc experts,
> Assume processor k has defined N entries of a parallel matrix using
> MatSetValues. Half of the entries are in matrix rows belonging to this
> processor, but the other half are in rows owned by other processors.
> My question:
> Does MatAssemblyBegin+MatAssemblyEnd take longer if the rows containing
> the second half of the entries all belong to one single other
> processor, e.g. processor k+1, or if these rows are distributed across
> several, say 4, other processors? Is there a significant difference?

There will be a difference, but it will depend on the network/MPI
behavior: it is a single large one-to-one message vs. multiple small
all-to-all messages.

On the PETSc side, you should make sure enough memory is allocated for
the communication buffers (the "stash") that hold the off-process
entries. If the default is too small, multiple malloc/copy cycles can
slow assembly down.
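To make the pattern concrete, here is a minimal sketch (not from the
original thread) of the situation the question describes: each rank sets
entries both in its own rows and in rows owned by another rank. The
off-process entries are buffered in the stash and communicated during
MatAssemblyBegin/MatAssemblyEnd. The matrix size and values are
illustrative assumptions, and the sketch uses the modern PetscCall()
error-checking style.

```c
/* Sketch: parallel matrix assembly with off-process entries. */
#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat         A;
  PetscInt    rstart, rend, i, n = 100; /* illustrative global size */
  PetscScalar one = 1.0;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
  PetscCall(MatSetSizes(A, PETSC_DECIDE, PETSC_DECIDE, n, n));
  PetscCall(MatSetFromOptions(A));
  PetscCall(MatSetUp(A));

  PetscCall(MatGetOwnershipRange(A, &rstart, &rend));
  for (i = rstart; i < rend; i++) {
    /* local entry: row i is owned by this rank */
    PetscCall(MatSetValue(A, i, i, one, INSERT_VALUES));
    /* off-process entry: row (i+1)%n may be owned by the next rank;
       it is held in the stash and sent during assembly */
    PetscCall(MatSetValue(A, (i + 1) % n, i, one, INSERT_VALUES));
  }

  /* all stashed off-process entries are communicated here */
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}
```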

Run with '-info' and look for "stash". The number of mallocs reported
there should be 0 for efficient matrix assembly. [The stash size can be
set with the command-line option -matstash_initial_size.]
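For example (the executable name, process count, and stash size below
are illustrative assumptions), one could check the stash statistics and
then preallocate the stash:

```shell
# run with -info and filter for the stash statistics;
# the reported "mallocs" count should be 0
mpiexec -n 4 ./myapp -info 2>&1 | grep -i stash

# preallocate the stash so no mallocs occur during assembly
# (10000 is an illustrative size, not a recommendation)
mpiexec -n 4 ./myapp -matstash_initial_size 10000
```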


More information about the petsc-users mailing list