[petsc-users] [petsc-maint] Assistance Needed with PETSc KSPSolve Performance Issue
Matthew Knepley
knepley at gmail.com
Fri Jun 14 09:29:23 CDT 2024
PETSc itself only takes 47% of the runtime. I am not sure what is happening
for the other half. For the PETSc half, it is all in the solve:
KSPSolve 20 1.0 5.3323e+03 1.0 1.01e+14 1.0 0.0e+00 0.0e+00 0.0e+00 47 100 0 0 0 47 100 0 0 0 18943
About 2/3 of that is matrix operations (I don't know where you are using LU)
MatMult 19960 1.0 2.1336e+03 1.0 8.78e+13 1.0 0.0e+00 0.0e+00 0.0e+00 19 87 0 0 0 19 87 0 0 0 41163
MatMultAdd 152320 1.0 8.4854e+02 1.0 3.60e+13 1.0 0.0e+00 0.0e+00 0.0e+00 7 35 0 0 0 7 35 0 0 0 42442
MatSolve 6600 1.0 9.0724e+02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 8 0 0 0 0 8 0 0 0 0 0
and 1/3 is vector operations for orthogonalization in GMRES:
KSPGMRESOrthog 3290 1.0 1.2390e+03 1.0 8.77e+12 1.0 0.0e+00 0.0e+00 0.0e+00 11 9 0 0 0 11 9 0 0 0 7082
VecMAXPY 13220 1.0 1.7894e+03 1.0 9.02e+12 1.0 0.0e+00 0.0e+00 0.0e+00 16 9 0 0 0 16 9 0 0 0 5040
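As a rough cross-check of the shares quoted above, here is a minimal Python sketch that divides each event's wall-clock time by the KSPSolve time. The times are copied from the log lines in this message; the names `events`, `share_of_solve`, `shares`, and `solve_over_matmult` are illustrative and not part of PETSc. Logged events can nest inside one another, so the shares are not expected to add up to 100%.

```python
# Wall-clock seconds copied from the -log_view lines quoted in this message.
ksp_solve = 5.3323e+03
events = {
    "MatMult": 2.1336e+03,
    "MatMultAdd": 8.4854e+02,
    "MatSolve": 9.0724e+02,
    "KSPGMRESOrthog": 1.2390e+03,
    "VecMAXPY": 1.7894e+03,
}


def share_of_solve(seconds, total=ksp_solve):
    """Fraction of the KSPSolve time spent in one logged event."""
    return seconds / total


shares = {name: share_of_solve(t) for name, t in events.items()}
for name, frac in shares.items():
    print(f"{name:15s} {100 * frac:5.1f}% of KSPSolve")

# How much larger the whole solve is than the logged MatMult time alone.
solve_over_matmult = ksp_solve / events["MatMult"]
print(f"KSPSolve / MatMult = {solve_over_matmult:.2f}")
```

With these numbers MatMult is about 40% of KSPSolve and VecMAXPY about 34%, so the solve takes roughly 2.5 times the logged MatMult time. That is the same kind of gap the original question describes.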
The flop rates do not look crazy, but I do not know what kind of hardware
you are running on.
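The rates in the last column of the log lines can be reproduced from the flop counts and times on the same lines. The following is a small sketch assuming a single MPI process, as in this run, so that the rate is simply flops divided by seconds. The dictionary `log` and the helper `mflops` are illustrative names, not PETSc API.

```python
# (flops, seconds, Mflop/s as printed) copied from the -log_view lines
# quoted in this message. MatSolve is left out because it logs zero flops.
log = {
    "KSPSolve": (1.01e+14, 5.3323e+03, 18943),
    "MatMult": (8.78e+13, 2.1336e+03, 41163),
    "MatMultAdd": (3.60e+13, 8.4854e+02, 42442),
    "KSPGMRESOrthog": (8.77e+12, 1.2390e+03, 7082),
    "VecMAXPY": (9.02e+12, 1.7894e+03, 5040),
}


def mflops(flops, seconds):
    """Flop rate in Mflop/s for one event on a single process."""
    return flops / seconds / 1e6


for name, (flops, seconds, printed) in log.items():
    rate = mflops(flops, seconds)
    print(f"{name:15s} computed {rate:9.0f} Mflop/s, log shows {printed}")
```

The recomputed values differ slightly from the printed ones because the log shows the flop counts to only three significant digits. The matrix products run at roughly 41 to 42 Gflop/s, while the vector operations run at 5 to 7 Gflop/s.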
Thanks,
Matt
On Fri, Jun 14, 2024 at 1:20 AM Yongzhong Li <yongzhong.li at mail.utoronto.ca>
wrote:
> Thanks, I have attached the results without using any KSPGuess. At low
> frequency, the iteration steps are quite close to the one with KSPGuess,
> specifically
>
> KSPGuess Object: 1 MPI process
>
> type: fischer
>
> Model 1, size 200
>
> However, I found that at higher frequency the number of iteration steps is
> significantly higher than with KSPGuess. I have attached both sets of
> results for your reference.
>
> Moreover, could I ask why the run without the KSPGuess options can be used
> as a baseline comparison? What are we comparing here? How does it relate
> to the performance issue/bottleneck I found? “*I have noticed that the
> time taken by KSPSolve is almost two times greater than the CPU time
> for matrix-vector product multiplied by the number of iteration steps*”
>
> Thank you!
> Yongzhong
>
>
>
> *From: *Barry Smith <bsmith at petsc.dev>
> *Date: *Thursday, June 13, 2024 at 2:14 PM
> *To: *Yongzhong Li <yongzhong.li at mail.utoronto.ca>
> *Cc: *petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>,
> petsc-maint at mcs.anl.gov <petsc-maint at mcs.anl.gov>, Piero Triverio <
> piero.triverio at utoronto.ca>
> *Subject: *Re: [petsc-maint] Assistance Needed with PETSc KSPSolve
> Performance Issue
>
>
>
> Can you please run the same thing without the KSPGuess option(s) for a
> baseline comparison?
>
>
>
> Thanks
>
>
>
> Barry
>
>
>
> On Jun 13, 2024, at 1:27 PM, Yongzhong Li <yongzhong.li at mail.utoronto.ca>
> wrote:
>
>
>
> Hi Matt,
>
> I have rerun the program with the options you provided. The output from
> the KSP solve and the final PETSc log are stored in the attached .txt
> file for your reference.
>
> Thanks!
> Yongzhong
>
>
>
> *From: *Matthew Knepley <knepley at gmail.com>
> *Date: *Wednesday, June 12, 2024 at 6:46 PM
> *To: *Yongzhong Li <yongzhong.li at mail.utoronto.ca>
> *Cc: *petsc-users at mcs.anl.gov <petsc-users at mcs.anl.gov>,
> petsc-maint at mcs.anl.gov <petsc-maint at mcs.anl.gov>, Piero Triverio <
> piero.triverio at utoronto.ca>
> *Subject: *Re: [petsc-maint] Assistance Needed with PETSc KSPSolve
> Performance Issue
>
> On Wed, Jun 12, 2024 at 6:36 PM Yongzhong Li <
> yongzhong.li at mail.utoronto.ca> wrote:
>
> Dear PETSc developers,
>
> I hope this email finds you well.
>
> I am currently working on a project using PETSc and have encountered a
> performance issue with the KSPSolve function. Specifically, *I have
> noticed that the time taken by KSPSolve is almost two times greater
> than the CPU time for the matrix-vector product multiplied by the number
> of iteration steps*. I use C++ chrono to record CPU time.
>
> For context, I am using a shell system matrix A. Despite my efforts to
> parallelize the matrix-vector product (Ax), the overall solve time
> remains higher than the per-iteration matrix-vector product time would
> suggest when multiple threads are used. Here are a few details of my setup:
>
> - *Matrix Type*: Shell system matrix
> - *Preconditioner*: Shell PC
> - *Parallel Environment*: Using Intel MKL as PETSc’s BLAS/LAPACK
> library, multithreading is enabled
>
> I have considered several potential reasons, such as preconditioner setup,
> additional solver operations, and the inherent overhead of using a shell
> system matrix. *However, since KSPSolve is a high-level API, I have been
> unable to pinpoint the exact cause of the increased solve time.*
>
> Have you observed the same issue? Could you please share some advice
> on how to diagnose and address this performance discrepancy? Any
> insights or recommendations you could offer would be greatly appreciated.
>
>
>
> For any performance question like this, we need to see the output of your
> code run with
>
>
>
> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view
>
>
>
> Thanks,
>
>
>
> Matt
>
>
>
> Thank you for your time and assistance.
>
> Best regards,
>
> Yongzhong
>
> -----------------------------------------------------------
>
> *Yongzhong Li*
>
> PhD student | Electromagnetics Group
>
> Department of Electrical & Computer Engineering
>
> University of Toronto
>
> https://urldefense.us/v3/__http://www.modelics.org__;!!G_uCfscf7eWS!fTDOqOTfYZs4FVyI7NuFX2IPcFNkDKfw0tBwg7sqK1df_HIGAzkpZHNBcWjz96Mfb2isyStipMBB1awwc73f$
> <https://urldefense.us/v3/__http://www.modelics.org__;!!G_uCfscf7eWS!cuLttMJEcegaqu461Bt4QLsO4fASfLM5vjRbtyNhWJQiInbjgNwkGNdkFE1ebSbFjOUatYB0-jd2yQWMWzqkDFFjwMvNl3ZKAr8$>
>
>
>
>
>
>
> --
>
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
>
>
> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fTDOqOTfYZs4FVyI7NuFX2IPcFNkDKfw0tBwg7sqK1df_HIGAzkpZHNBcWjz96Mfb2isyStipMBB1W3-CeTd$
> <https://urldefense.us/v3/__http://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!djGfJnEhNJROfsMsBJy5u_KoRKbug55xZ64oHKUFnH2cWku_Th1hwt4TDdoMd8pWYVDzJeqJslMNZwpO3y0Et94d31qkNOuenGA$>
>
> <ksp_petsc_log.txt>
>
>
>
--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener
https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fTDOqOTfYZs4FVyI7NuFX2IPcFNkDKfw0tBwg7sqK1df_HIGAzkpZHNBcWjz96Mfb2isyStipMBB1W3-CeTd$ <https://urldefense.us/v3/__http://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fTDOqOTfYZs4FVyI7NuFX2IPcFNkDKfw0tBwg7sqK1df_HIGAzkpZHNBcWjz96Mfb2isyStipMBB1Z_tX2fz$ >