[petsc-users] About parallel performance

Thu May 29 13:23:53 CDT 2014

Hello,

I implemented a PETSc parallel linear solver in a program. The implementation is basically the same as /src/ksp/ksp/examples/tutorials/ex2.c, i.e., I preallocated the MatMPIAIJ matrix and let PETSc partition it through MatGetOwnershipRange. However, a few tests show that the parallel solver is always a little slower than the serial solver (I have excluded the CPU time for matrix generation).

For the serial run I used PCILU as the preconditioner; for the parallel run, I used ASM with ILU(0) on each subblock (-sub_pc_type ilu -sub_ksp_type preonly -ksp_type bcgs -pc_type asm). The number of unknowns is around 200,000.

I used -log_summary to print the performance summaries, which are attached (log_summary_p1 for the serial run and log_summary_p2 for the run with 2 processes). It seems that KSPSolve accounts for less than 20% of Global %T.
My questions are:

1. What is the bottleneck of the parallel run according to the summary?
2. Do you have any suggestions to improve the parallel performance?

Thanks a lot for your suggestions!

Regards,
Qin
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_summary_p1.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140529/51163eed/attachment-0002.txt>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: log_summary_p2.txt
URL: <http://lists.mcs.anl.gov/pipermail/petsc-users/attachments/20140529/51163eed/attachment-0003.txt>