[petsc-users] Reasons for breakdown in preconditioned LSQR
Mark Adams
mfadams at lbl.gov
Tue May 7 07:27:50 CDT 2024
"A^T A being similar to a finite difference Poisson matrix if the rows were
permuted randomly."
Normal equations are a good option in general for rectangular systems, and we
have Poisson solvers for an A^T A like that; a rough sketch is below.
I'm not sure what you mean by "permuted randomly." A random permutation of
the matrix can kill performance, but not the math.
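For instance, a minimal petsc4py sketch of that approach (illustrative only: it
assumes A is an assembled rectangular AIJ Mat and b, x are compatible Vecs; it
explicitly forms A^T A and hands it to LSQR as the preconditioning matrix, with
GAMG as the Poisson-type solver):

    from petsc4py import PETSc

    # A: rectangular AIJ Mat (m x n), b: Vec of length m, x: Vec of length n (assumed)
    N = A.transposeMatMult(A)                # explicitly assemble A^T A (Poisson-like here)

    ksp = PETSc.KSP().create(A.getComm())
    ksp.setType(PETSc.KSP.Type.LSQR)
    ksp.setOperators(A, N)                   # solve with A, build the preconditioner from A^T A
    ksp.getPC().setType(PETSc.PC.Type.GAMG)  # algebraic multigrid on the normal equations
    ksp.setFromOptions()
    ksp.solve(b, x)

The same setup can be driven from the command line with
`-ksp_type lsqr -pc_type gamg` once the preconditioning matrix is in place.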
Mark
On Tue, May 7, 2024 at 8:03 AM Matthew Knepley <knepley at gmail.com> wrote:
> On Tue, May 7, 2024 at 5:12 AM Pierre Jolivet <pierre at joliv.et> wrote:
>
>> On 7 May 2024, at 9:10 AM, Marco Seiz <marco at kit.ac.jp> wrote:
>>
>> Thanks for the quick response!
>>
>> On 07.05.24 14:24, Pierre Jolivet wrote:
>>
>>
>>
>> On 7 May 2024, at 7:04 AM, Marco Seiz <marco at kit.ac.jp> wrote:
>>
>> Hello,
>>
>> something a bit different from my last question, since that didn't
>> progress so well:
>> I have a related model which generally produces a rectangular matrix A,
>> so I am using LSQR to solve the system.
>> The matrix A has two nonzeros (1, -1) per row, with A^T A being similar
>> to a finite difference Poisson matrix if the rows were permuted randomly.
>> The problem is singular in that the matrix only determines the solution up
>> to an additive constant; my target solution is the one with a weighted zero
>> average, which I can handle by attaching a nullspace to my matrix (see the
>> sketch below).
>> However, I'd also like to pin (potentially many) DOFs in the future, so I
>> also tried pinning a single value and afterwards subtracting the average
>> from the KSP solution.
>> This leads to the KSP *sometimes* diverging when I use a preconditioner;
>> the target size of the matrix will be something like ([1,20] N) x N, with
>> N ~ [2, 1e6], so for the higher end I will require a preconditioner for
>> reasonable execution time.
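A minimal petsc4py sketch of those two workarounds, under the assumption that
the nullspace in question is the constant vector attached to the operator; w
(the weights), ones (a vector of all ones), and the already-assembled ksp, A,
b, x are illustrative names:

    from petsc4py import PETSc

    # Workaround 1: attach the constant-vector nullspace of A.
    nsp = PETSc.NullSpace().create(constant=True, comm=A.getComm())
    A.setNullSpace(nsp)

    # Workaround 2: pin a DOF instead, then shift the solution so that its
    # weighted average is zero (w and ones are assumed Vecs).
    ksp.solve(b, x)
    shift = x.dot(w) / w.sum()   # weighted average of the computed solution
    x.axpy(-shift, ones)         # x <- x - shift * ones, zero weighted average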
>>
>> For a smaller example system, I set up my application to dump the input
>> to the KSP when it breaks down and I've attached a simple python script
>> + data using petsc4py to demonstrate the divergence for those specific
>> systems.
>> With `python3 lsdiv.py -pc_type lu -ksp_converged_reason` that
>> particular system shows breakdown, but if I remove the pinned DOF and
>> add the nullspace (pass -usens) it converges. I did try different PCs
>> but they tend to break down at different steps, e.g. `python3 lsdiv.py
>> -usenormal -qrdiv -pc_type qr -ksp_converged_reason` shows the breakdown
>> for PCQR when I use MatCreateNormal for creating the PC mat, but
>> interestingly it doesn't break down when I explicitly form A^T A (don't
>> pass -usenormal).
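For reference, the two ways of building the preconditioning matrix that
-usenormal appears to toggle between would look roughly like this in petsc4py
(a sketch; Mat.createNormal is assumed to be the petsc4py wrapper for
MatCreateNormal, as used in the script):

    from petsc4py import PETSc

    # Implicit normal operator (MATNORMAL): applies A^T (A x) without assembling the product.
    N_implicit = PETSc.Mat().createNormal(A)

    # Explicit product: assembles A^T A as an ordinary AIJ matrix.
    N_explicit = A.transposeMatMult(A)

    # The preconditioner is built from whichever Mat is passed as the second operator.
    ksp.setOperators(A, N_implicit)   # or ksp.setOperators(A, N_explicit)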
>>
>>
>> What version are you using? All those commands are returning
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> So I cannot reproduce any breakdown, but there have been recent changes
>> to KSPLSQR.
>>
>> For those tests I've been using PETSc 3.20.5 (last git hash was
>> 4b82c11ab5d).
>> I pulled the latest version from GitLab (6b3135e3cbe) and compiled it,
>> but I had to drop --download-suitesparse=1 from my earlier config due to
>> errors.
>> Should I write a separate mail about this?
>>
>> The LU example still behaves the same for me (`python3 lsdiv.py -pc_type
>> lu -ksp_converged_reason` gives DIVERGED_BREAKDOWN, `python3 lsdiv.py
>> -usens -pc_type lu -ksp_converged_reason` gives CONVERGED_RTOL_NORMAL)
>> but the QR example fails since I had to remove suitesparse.
>> petsc4py.__version__ reports 3.21.1 and if I rebuild my application,
>> then `ldd app` gives me
>> `libpetsc.so.3.21 => /opt/petsc/linux-c-opt/lib/libpetsc.so.3.21`
>> so it should be using the newly built one.
>> The application then still eventually yields a DIVERGED_BREAKDOWN.
>> I don't have a ~/.petscrc and PETSC_OPTIONS is unset, so if we are on
>> the same version and there's still a discrepancy it is quite weird.
>>
>>
>> Quite weird indeed…
>> $ python3 lsdiv.py -pc_type lu -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> $ python3 lsdiv.py -usens -pc_type lu -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> $ python3 lsdiv.py -pc_type qr -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>> $ python3 lsdiv.py -usens -pc_type qr -ksp_converged_reason
>> Linear solve converged due to CONVERGED_RTOL_NORMAL iterations 1
>>
>> For the moment I can work by adding the nullspace, but eventually the
>> need for pinning DOFs will resurface, so I'd like to ask where the
>> breakdown is coming from. What causes the breakdowns? Is it a generic
>> problem that occurs when adding (dof_i = val) rows to least-squares
>> systems (sketched below) and that prevents these preconditioners from
>> being robust? If so, which preconditioners could be robust?
>> I did a minimal sweep of the available PCs by going over the possible
>> inputs of -pc_type for my application while pinning one DOF. Excluding
>> unavailable PCs (not compiled in, other setup missing, ...) and those
>> that did break down, I am left with (hmg, jacobi, mat, none, pbjacobi,
>> sor, svd).
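For concreteness, a pinned row of the kind mentioned above would be assembled
roughly like this (a sketch; m, i_pin, and val are illustrative names for the
extra row index, the pinned DOF, and its value):

    # A: PETSc Mat allocated as (m + 1) x n, b: Vec of length m + 1 (assumed
    # from the surrounding assembly). The extra row enforces x[i_pin] ~= val
    # in the least-squares sense.
    A.setValue(m, i_pin, 1.0)
    b.setValue(m, val)
    A.assemble()
    b.assemble()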
>>
>> It’s unlikely any of these preconditioners will scale (or even converge)
>> for problems with up to 1E6 unknowns.
>> I could help you set up https://epubs.siam.org/doi/abs/10.1137/21M1434891
>> if you are willing to share a larger example (the current Mat are extremely
>> tiny).
>>
>> Yes, that would be great. About how large of a matrix do you need? I can
>> probably quickly get something non-artificial up to O(N) ~ 1e3,
>>
>>
>> That’s big enough.
>> If you’re in luck, AMG on the normal equations won’t behave too badly,
>> but I’ll try some more robust (in theory) methods nonetheless.
>>
>
> We have also had good luck forming and factoring a block diagonal
> approximation of the normal equations. It depends on your problem. What is
> this problem?
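One cheap way to experiment with that idea from the command line, under the
assumption that an explicit A^T A is supplied as the preconditioning matrix and
reading "block diagonal" as the parallel-decomposition blocks of PCBJACOBI:
`-ksp_type lsqr -pc_type bjacobi -sub_pc_type lu -ksp_converged_reason`. A
problem-specific blocking may of course be closer to what is meant here.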
>
> Thanks,
>
> Matt
>
>
>> Thanks,
>> Pierre
>>
>> bigger
>> matrices will take some time since I purposefully ignored MPI previously.
>> The matrix basically describes the contacts between particles, which are
>> resolved on a uniform grid, so the main memory hog isn't the matrix but
>> rather resolving the particles.
>> I should mention that the matrix changes over the course of the
>> simulation but stays constant for many solves, i.e. hundreds to
>> thousands of solves with a variable RHS between periods of contact
>> formation/loss.
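Given that, the KSP/PC setup can be amortized over those hundreds to thousands
of solves; a sketch of the reuse pattern (ksp, A, N, x as set up earlier,
b_list an illustrative sequence of right-hand sides, A_new/N_new the updated
operators):

    # Set the operators once per contact-formation/loss event; PETSc only
    # rebuilds the preconditioner when the operators are changed.
    ksp.setOperators(A, N)
    for b in b_list:
        ksp.solve(b, x)      # these solves all reuse the same PC setup

    # When the contacts (and hence the matrix) change:
    ksp.setOperators(A_new, N_new)   # the next solve triggers a fresh PC setup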
>>
>>
>> Thanks,
>> Pierre
>>
>>
>>
>> Best regards,
>> Marco
>>
>> <lsdiv.zip>
>>
>>
>>
>> Best regards,
>> Marco
>>
>>
>>
>
> --
> What most experimenters take for granted before they begin their
> experiments is infinitely more interesting than any results to which their
> experiments lead.
> -- Norbert Wiener
>
> https://www.cse.buffalo.edu/~knepley/
>