[petsc-dev] MatAssembly, debug, and compile flags

Barry Smith bsmith at mcs.anl.gov
Mon Mar 6 09:34:53 CST 2017


   I don't think the lack of --with-debugging=no is important here, though the user should still use --with-debugging=no for production runs.

   I think the reason for the "funny" numbers is that MatAssemblyBegin and MatAssemblyEnd in this case have explicit synchronization points, so some processes wait for the others to reach the synchronization point. It therefore looks like some processes spend a lot of time in the assembly routines when they really do not; they are just waiting.
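
   To make the waiting effect concrete, here is a toy model in plain C (no PETSc or MPI involved). The function names, the arrival times, and the per-rank overhead are all made up for illustration; the point is only that the time charged to a synchronizing routine is mostly "time until the last process arrives", so one late process is enough to produce a huge max/min ratio.

```c
#include <stddef.h>

/* Toy model: each rank arrives at a synchronization point at arrival[i].
 * Nobody can leave before the last rank has arrived, so the time rank i
 * spends inside the routine is a small fixed overhead plus its wait. */
void time_in_sync_routine(const double *arrival, size_t n,
                          double overhead, double *time_in_routine)
{
  double last = arrival[0];
  for (size_t i = 1; i < n; ++i) {
    if (arrival[i] > last) last = arrival[i];
  }
  for (size_t i = 0; i < n; ++i) {
    time_in_routine[i] = overhead + (last - arrival[i]);
  }
}

/* Ratio of the largest to the smallest entry of t (all entries must be
 * positive), analogous to a max/min time ratio over processes. */
double max_min_ratio(const double *t, size_t n)
{
  double lo = t[0], hi = t[0];
  for (size_t i = 1; i < n; ++i) {
    if (t[i] < lo) lo = t[i];
    if (t[i] > hi) hi = t[i];
  }
  return hi / lo;
}
```

   With, say, three ranks arriving at t = 1.0 and a fourth at t = 2.25, the three early ranks are each charged about 1.25 s for the routine and the late one almost nothing, even though none of them did any real work there.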

   You can remove the synchronization point by calling

    MatSetOption(mat, MAT_NO_OFF_PROC_ENTRIES, PETSC_TRUE);

   before calling MatMPIAIJSetPreallocationCSR().

   Barry

> On Mar 6, 2017, at 8:59 AM, Pierre Jolivet <Pierre.Jolivet at enseeiht.fr> wrote:
> 
> Hello,
> I have an application with a matrix with lots of nonzero entries (that are perfectly load balanced between processes and rows).
> An end user is currently using a PETSc library compiled with the following flags (among others):
> --CFLAGS=-O2 --COPTFLAGS=-O3 --CXXFLAGS="-O2 -std=c++11" --CXXOPTFLAGS=-O3 --FFLAGS=-O2 --FOPTFLAGS=-O3
> Notice the lack of --with-debugging=no.
> The matrix is assembled using MatMPIAIJSetPreallocationCSR, and we end up with something like this in the -log_view output:
> MatAssemblyBegin       2 1.0 1.2520e+002602.1 0.00e+00 0.0 0.0e+00 0.0e+00 8.0e+00  0  0  0  0  2   0  0  0  0  2     0
> MatAssemblyEnd         2 1.0 4.5104e+01 1.0 0.00e+00 0.0 8.2e+05 3.2e+04 4.6e+01 40  0 14  4  9  40  0 14  4  9     0
> 
> For reference, here is what the matrix looks like (keep in mind it is well balanced):
>  Mat Object:   640 MPI processes
>    type: mpiaij
>    rows=10682560, cols=10682560
>    total: nonzeros=51691212800, allocated nonzeros=51691212800
>    total number of mallocs used during MatSetValues calls =0
>      not using I-node (on process 0) routines
> 
> Are MatAssemblyBegin/MatAssemblyEnd highly sensitive to the --with-debugging option on x86 even though the corresponding code is compiled with -O2? In other words, should I tell the user to have their PETSc library recompiled, or would you recommend that I use another routine for assembling such a matrix?
> 
> Thanks,
> Pierre



