[petsc-users] large PetscCommDuplicate overhead
Matthew Overholt
overholt at capesim.com
Wed Oct 5 14:30:25 CDT 2016
Hi Petsc-Users,
I am trying to understand an issue where PetscCommDuplicate() calls are
taking an increasing percentage of the run time as I run a fixed-size problem
on more processes.
I am using the FEM to solve the steady-state heat transfer equation (K.x =
q) with a direct solver applied through the PC (KSPPREONLY + PCLU), namely MUMPS.
I am running on the NERSC Cray XC30, which has two Xeons per node with 12
cores each, and profiling the code using CrayPat sampling.
On a typical problem (1E+6 finite elements), running on a single node:
- for 2 cores (1 on each Xeon), about 1% of the time is spent in
PetscCommDuplicate() (on process 1; on the root it is less), and (for
reference) 9% of the total time is in MUMPS.
- for 8 cores (4 on each Xeon), over 6% of the time is spent in
PetscCommDuplicate() (on every process except the root, where it is <1%),
and 9-10% of the total time is in MUMPS.
What is the large PetscCommDuplicate() time connected to? An increasing
number of messages (tags)? Would making fewer MatSetValues() and
VecSetValues() calls (with longer messages) alleviate this?
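For example, instead of one MatSetValues() call per matrix row and one
VecSetValues() call per element, I could insert a whole element block at
once. A rough sketch of what I mean (nElemRows, elemIdx, elemKe and elemQe
are placeholder names, not the actual variables in the code):

// Insert the dense element stiffness block (nElemRows x nElemRows, row-major
// in elemKe) and the element load contributions with one call each.
ierr = MatSetValues(K,nElemRows,elemIdx,nElemRows,elemIdx,elemKe,ADD_VALUES);
ierr = VecSetValues(q,nElemRows,elemIdx,elemQe,ADD_VALUES);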
For reference, the PETSc calling sequence in the code is as follows.
// Create the solution and RHS vectors
ierr = VecCreate(petscData->mpicomm,&mesh->hpx);
ierr = PetscObjectSetName((PetscObject) mesh->hpx, "Solution");
ierr = VecSetSizes(mesh->hpx,mesh->lxN,mesh->neqns); // size = # of equations; distribution to match mesh
ierr = VecSetFromOptions(mesh->hpx); // allow run time options
ierr = VecDuplicate(mesh->hpx,&q); // create the RHS vector
// Create the stiffness matrix
ierr = MatCreate(petscData->mpicomm,&K);
ierr = MatSetSizes(K,mesh->lxN,mesh->lxN,mesh->neqns,mesh->neqns);
ierr = MatSetType(K,MATAIJ); // default sparse type
// Do preallocation
ierr = MatMPIAIJSetPreallocation(K,d_nz,NULL,o_nz,NULL);
ierr = MatSeqAIJSetPreallocation(K,d_nz,NULL);
ierr = MatSetOption(K,MAT_NEW_NONZERO_ALLOCATION_ERR,PETSC_FALSE);
ierr = MatSetUp(K);
// Create and set up the KSP context as a Preconditioner Only (Direct) Solution
ierr = KSPCreate(petscData->mpicomm,&ksp);
ierr = KSPSetOperators(ksp,K,K);
ierr = KSPSetType(ksp,KSPPREONLY);
// Set the temperature vector
ierr = VecSet(mesh->hpx,mesh->Tmin);
// Set the default PC method as MUMPS
ierr = KSPGetPC(ksp,&pc); // extract the preconditioner
ierr = PCSetType(pc,PCLU); // set pc options
ierr = PCFactorSetMatSolverPackage(pc,MATSOLVERMUMPS);
ierr = KSPSetFromOptions(ksp);
// Set the values for the K matrix and q vector
// which involves a lot of these calls
ierr = MatSetValues(K,mrows,idxm,ncols,idxn,pKe,ADD_VALUES); // 1 call per matrix row (equation)
ierr = VecSetValues(q,nqe,ixn,pqe,ADD_VALUES); // 1 call per element
ierr = VecAssemblyBegin(q);
ierr = MatAssemblyBegin(K,MAT_FINAL_ASSEMBLY);
ierr = VecAssemblyEnd(q);
ierr = MatAssemblyEnd(K,MAT_FINAL_ASSEMBLY);
// Solve //////////////////////////////////////
ierr = KSPSolve(ksp,q,mesh->hpx);
...
*Note that the code evenly divides the finite elements over the total number
of processes, and I am using ghosting of the FE vertex vector (roughly as
sketched below) to handle the vertices that are needed on more than one process.
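(For reference, the ghosted vertex vector is created along these lines;
nLocalVerts, nGhostVerts and the ghostIdx[] array of global indices for
off-process vertices are placeholders for the actual mesh data structures.)

// Create a ghosted vector with nLocalVerts owned entries plus nGhostVerts
// ghost entries whose global indices are listed in ghostIdx[].
ierr = VecCreateGhost(petscData->mpicomm,nLocalVerts,PETSC_DECIDE,nGhostVerts,ghostIdx,&vtxVec);
// Refresh the ghost values from the owning processes after assembly.
ierr = VecGhostUpdateBegin(vtxVec,INSERT_VALUES,SCATTER_FORWARD);
ierr = VecGhostUpdateEnd(vtxVec,INSERT_VALUES,SCATTER_FORWARD);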
Thanks in advance for your help,
Matt Overholt
CapeSym, Inc.