<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8"></head><body><pre style="font-family: sans-serif; font-size: 100%; white-space: pre-wrap; word-wrap: break-word">Pierre,
I've attached the dumps of the matrix + RHS for something of about 3k x 1k.
Regarding the weird divergence behaviour, I tried again at home but I still get the same results.
I am running a rolling release distribution on both machines, but I would think that really shouldn't matter for the divergence behaviour.
Is there some kind of option in PETSc to get more information about the breakdown from my side?
Best regards,
Marco
----- Original Message -----
>> From: Pierre Jolivet &lt;pierre@joliv.et&gt;
>> To: Marco Seiz &lt;marco@kit.ac.jp&gt;
>> Cc: petsc-users@mcs.anl.gov
>> Date: 2024-05-07 18:12:18
>> Subject: Re: [petsc-users] Reasons for breakdown in preconditioned LSQR
>>
>>
>> > On 7 May 2024, at 9:10 AM, Marco Seiz &lt;marco@kit.ac.jp&gt; wrote:
>> >
>> > Thanks for the quick response!
>> >
>> > On 07.05.24 14:24, Pierre Jolivet wrote:
>> >>
>> >>
>> >>> On 7 May 2024, at 7:04 AM, Marco Seiz &lt;marco@kit.ac.jp&gt; wrote:
>> >>>
>> >>> Hello,
>> >>>
>> >>> something a bit different from my last question, since that didn't
>> >>> progress so well:
>> >>> I have a related model which generally produces a rectangular matrix A,
>> >>> so I am using LSQR to solve the system.
>> >>> The matrix A has two nonzeros (1, -1) per row, with A^T A being similar
>> >>> to a finite difference Poisson matrix if the rows were permuted randomly.
>> >>> The problem is singular in that the solution is only specified up to a
>> >>> constant from the matrix, with my target solution being a weighted zero
>> >>> average one, which I can handle by adding a nullspace to my matrix.
>> >>> However, I'd also like to pin (potentially many) DOFs in the future so I
>> >>> also tried pinning a single value, and afterwards subtracting the
>> >>> average from the KSP solution.
>> >>> This leads to the KSP *sometimes* diverging when I use a preconditioner;
>> >>> the target size of the matrix will be something like ([1,20] N) x N,
>> >>> with N ~ [2, 1e6] so for the higher end I will require a preconditioner
>> >>> for reasonable execution time.
>> >>>
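The structure described above (two nonzeros, 1 and -1, per row; solution fixed only up to a constant unless a DOF is pinned) can be sketched in plain Python. This is only an illustration under stated assumptions: the contact list and the helper names normal_matrix and rank are made up here and are not taken from lsdiv.py or from the attached data.

```python
# Illustrative sketch only: the contact list and helper names are hypothetical.

def normal_matrix(contacts, n):
    """Dense A^T A for a matrix A with one row per contact (i, j):
    +1 in column i, -1 in column j, zeros elsewhere."""
    ata = [[0.0] * n for _ in range(n)]
    for i, j in contacts:
        ata[i][i] += 1.0
        ata[j][j] += 1.0
        ata[i][j] -= 1.0
        ata[j][i] -= 1.0
    return ata


def rank(mat, tol=1e-12):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in mat]
    n_rows, n_cols = len(m), len(m[0])
    r = 0
    for c in range(n_cols):
        if r == n_rows:
            break
        p = max(range(r, n_rows), key=lambda k: abs(m[k][c]))
        if abs(m[p][c]) > tol:
            m[r], m[p] = m[p], m[r]
            for k in range(r + 1, n_rows):
                f = m[k][c] / m[r][c]
                for j in range(c, n_cols):
                    m[k][j] -= f * m[r][j]
            r += 1
    return r


n = 4
contacts = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]  # made-up connected contact graph

ata = normal_matrix(contacts, n)
row_sums = [sum(row) for row in ata]   # all zero: constants are in the null space
rank_free = rank(ata)                  # n - 1 for a connected contact graph

# Pinning DOF 0 appends the row e_0 to A, which adds e_0 e_0^T to A^T A.
pinned = [row[:] for row in ata]
pinned[0][0] += 1.0
rank_pinned = rank(pinned)             # full rank n

print(row_sums, rank_free, rank_pinned)
```

The sketch only shows why the unpinned problem needs a nullspace and why a single pinned value removes the singularity in exact arithmetic. It says nothing about why a particular preconditioned LSQR run breaks down.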
>> >>> For a smaller example system, I set up my application to dump the input
>> >>> to the KSP when it breaks down and I've attached a simple python script
>> >>> + data using petsc4py to demonstrate the divergence for those specific
>> >>> systems.
>> >>> With `python3 lsdiv.py -pc_type lu -ksp_converged_reason` that
>> >>> particular system shows breakdown, but if I remove the pinned DOF and
>> >>> add the nullspace (pass -usens) it converges. I did try different PCs
>> >>> but they tend to break down at different steps, e.g. `python3 lsdiv.py
>> >>> -usenormal -qrdiv -pc_type qr -ksp_converged_reason` shows the breakdown
>> >>> for PCQR when I use MatCreateNormal for creating the PC mat, but
>> >>> interestingly it doesn't break down when I explicitly form A^T A (don't
>> >>> pass -usenormal).
>> >>
>> >> What version are you using? All those commands are returning
>> >> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> >> So I cannot reproduce any breakdown, but there have been recent changes to KSPLSQR.
>> > For those tests I've been using PETSc 3.20.5 (last githash was
>> > 4b82c11ab5d ).
>> > I pulled the latest version from gitlab ( 6b3135e3cbe ) and compiled it,
>> > but I had to drop --download-suitesparse=1 from my earlier config due to
>> > errors.
>> > Should I write a separate mail about this?
>> >
>> > The LU example still behaves the same for me (`python3 lsdiv.py -pc_type
>> > lu -ksp_converged_reason` gives DIVERGED_BREAKDOWN, `python3 lsdiv.py
>> > -usens -pc_type lu -ksp_converged_reason` gives CONVERGED_RTOL_NORMAL)
>> > but the QR example fails since I had to remove suitesparse.
>> > petsc4py.__version__ reports 3.21.1 and if I rebuild my application,
>> > then `ldd app` gives me `libpetsc.so.3.21 =>
>> > /opt/petsc/linux-c-opt/lib/libpetsc.so.3.21` so it should be using the
>> > newly built one.
>> > The application then still eventually yields a DIVERGED_BREAKDOWN.
>> > I don't have a ~/.petscrc and PETSC_OPTIONS is unset, so if we are on
>> > the same version and there's still a discrepancy it is quite weird.
>>
>> Quite weird indeed…
>> $ python3 lsdiv.py -pc_type lu -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> $ python3 lsdiv.py -usens -pc_type lu -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> $ python3 lsdiv.py -pc_type qr -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> $ python3 lsdiv.py -usens -pc_type qr -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>>
>> >>> For the moment I can work by adding the nullspace but eventually the
>> >>> need for pinning DOFs will resurface, so I'd like to ask where the
>> >>> breakdown is coming from. What causes the breakdowns? Is that a generic
>> >>> problem occurring when adding (dof_i = val) rows to least-squares
>> >>> systems which prevents these preconditioners from being robust? If so,
>> >>> what preconditioners could be robust?
>> >>> I did a minimal sweep of the available PCs by going over the possible
>> >>> inputs of -pc_type for my application while pinning one DOF. Excepting
>> >>> unavailable PCs (not compiled for, other setup missing, ...) and those
>> >>> that did break down, I am left with ( hmg jacobi mat none pbjacobi sor
>> >>> svd ).
>> >> It’s unlikely any of these preconditioners will scale (or even converge) for problems with up to 1E6 unknowns.
>> >> I could help you set up <a href="https://epubs.siam.org/doi/abs/10.1137/21M1434891">https://epubs.siam.org/doi/abs/10.1137/21M1434891</a> if you are willing to share a larger example (the current Mats are extremely tiny).
>> > Yes, that would be great. About how large of a matrix do you need? I can
>> > probably quickly get something non-artificial up to O(N) ~ 1e3,
>>
>> That’s big enough.
>> If you’re in luck, AMG on the normal equations won’t behave too badly, but I’ll try some more robust (in theory) methods nonetheless.
>>
>> Thanks,
>> Pierre
>>
>> > bigger
>> > matrices will take some time since I purposefully ignored MPI previously.
>> > The matrix basically describes the contacts between particles which are
>> > resolved on a uniform grid, so the main memory hog isn't the matrix but
>> > rather resolving the particles.
>> > I should mention that the matrix changes over the course of the
>> > simulation but stays constant for many solves, i.e. hundreds to
>> > thousands of solves with variable RHS between periods of contact
>> > formation/loss.
>> >
>> >>
>> >> Thanks,
>> >> Pierre
>> >>>
>> >>>
>> >>> Best regards,
>> >>> Marco
>> >>>
>> >>> &lt;lsdiv.zip&gt;
>> >>
>> >>
>> > Best regards,
>> > Marco
>>
>>
>> </pre></body></html>