From numbersixvs at gmail.com Wed Sep 1 03:42:37 2021
From: numbersixvs at gmail.com (Viktor Nazdrachev)
Date: Wed, 1 Sep 2021 11:42:37 +0300
Subject: [petsc-users] Slow convergence while parallel computations.
Message-ID:

Dear all,

I have a 3D elasticity problem with heterogeneous properties. There is an unstructured grid with aspect ratios varying from 4 to 25. Zero Dirichlet BCs are imposed on the bottom face of the mesh. Also, Neumann (traction) BCs are imposed on the side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M DOFs).

The best performance and memory usage for a single MPI process was obtained with the HPDDM (BFBCG) solver and bjacobi + ICC(1) in the subdomains as the preconditioner; it took 1 m 45 s and 5.0 GB of RAM. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This is because the number of iterations required to achieve the same tolerance increases significantly.

I've also tried the PCGAMG (agg) preconditioner with an ICC(1) sub-preconditioner. For a single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the near-nullspace was attached using the MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This reduced the calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is a peak memory usage of 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.

Are there ways to avoid the degradation of the convergence rate for the bjacobi preconditioner in parallel mode? Does it make sense to use hierarchical or nested Krylov methods with a local GMRES solver (sub_ksp_type gmres) and some sub-preconditioner (for example, sub_pc_type bjacobi)?

Is this peak memory usage expected for the GAMG preconditioner? Is there any way to reduce it?

What advice would you give to improve the convergence rate with multiple MPI processes, but keep memory consumption reasonable?

Kind regards,

Viktor Nazdrachev
R&D senior researcher
Geosteering Technologies LLC
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
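The rigid-body near-nullspace mentioned above is the key piece of information PCGAMG needs for elasticity. As a point of reference only (this is not Viktor's actual code), a minimal sketch of how it is typically attached is shown below; the function name AttachRigidBodyModes and the vector coords are placeholders, and coords is assumed to hold the nodal coordinates with the same parallel layout as the solution vector.

    #include <petscmat.h>

    /* Attach the rigid-body modes (near-nullspace) of a 3D elasticity operator
       to the system matrix so that -pc_type gamg can use them when building
       its coarse spaces. */
    static PetscErrorCode AttachRigidBodyModes(Mat A, Vec coords)
    {
      MatNullSpace   nearnull;
      PetscErrorCode ierr;

      PetscFunctionBeginUser;
      ierr = VecSetBlockSize(coords, 3);CHKERRQ(ierr);                     /* 3 displacement dofs per node; no-op if already set to 3 */
      ierr = MatNullSpaceCreateRigidBody(coords, &nearnull);CHKERRQ(ierr); /* 6 rigid-body modes in 3D */
      ierr = MatSetNearNullSpace(A, nearnull);CHKERRQ(ierr);
      ierr = MatNullSpaceDestroy(&nearnull);CHKERRQ(ierr);                 /* the matrix keeps its own reference */
      PetscFunctionReturn(0);
    }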
From pierre at joliv.et Wed Sep 1 04:01:26 2021
From: pierre at joliv.et (Pierre Jolivet)
Date: Wed, 1 Sep 2021 11:01:26 +0200
Subject: [petsc-users] Slow convergence while parallel computations.
In-Reply-To:
References:
Message-ID: <7EFBB20A-CB8A-47BA-BDD8-4E0BD43BBC31@joliv.et>

Dear Viktor,

> On 1 Sep 2021, at 10:42 AM, Viktor Nazdrachev wrote:
>
> Dear all,
>
> I have a 3D elasticity problem with heterogeneous properties. There is an unstructured grid with aspect ratios varying from 4 to 25. Zero Dirichlet BCs are imposed on the bottom face of the mesh. Also, Neumann (traction) BCs are imposed on the side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M DOFs).
>
> The best performance and memory usage for a single MPI process was obtained with the HPDDM (BFBCG) solver

Block Krylov solvers are (most often) only useful if you have multiple right-hand sides, e.g., in the context of elasticity, multiple loadings. Is that really the case? If not, you may as well stick to "standard" CG instead of the breakdown-free block (BFB) variant.

> and bjacobi + ICC(1) in the subdomains as the preconditioner; it took 1 m 45 s and 5.0 GB of RAM. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This is because the number of iterations required to achieve the same tolerance increases significantly.
>
> I've also tried the PCGAMG (agg) preconditioner with an ICC(1) sub-preconditioner. For a single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the near-nullspace was attached using the MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This reduced the calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is a peak memory usage of 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.

I'm surprised that GAMG is converging so slowly. What do you mean by "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse level solver? How many iterations are required to reach convergence? Could you please maybe run the solver with -ksp_view -log_view and send us the output?
Most of the default parameters of GAMG should be good enough for 3D elasticity, provided that your MatNullSpace is correct. One parameter that may need some adjustments though is the aggregation threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] range, that's what I always use for elasticity problems).

Thanks,
Pierre

> Are there ways to avoid the degradation of the convergence rate for the bjacobi preconditioner in parallel mode? Does it make sense to use hierarchical or nested Krylov methods with a local GMRES solver (sub_ksp_type gmres) and some sub-preconditioner (for example, sub_pc_type bjacobi)?
>
> Is this peak memory usage expected for the GAMG preconditioner? Is there any way to reduce it?
>
> What advice would you give to improve the convergence rate with multiple MPI processes, but keep memory consumption reasonable?
>
> Kind regards,
>
> Viktor Nazdrachev
> R&D senior researcher
> Geosteering Technologies LLC
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
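To make the diagnostics and the tuning knob mentioned above concrete, a command line along the following lines could be used (the executable name is a placeholder, and 0.01 is just a starting value inside the suggested range):

    # inspect the solver configuration and where the time goes
    mpiexec -n 4 ./elasticity_app -ksp_type cg -pc_type gamg \
        -ksp_converged_reason -ksp_view -log_view

    # experiment with the aggregation threshold
    mpiexec -n 4 ./elasticity_app -ksp_type cg -pc_type gamg \
        -pc_gamg_threshold 0.01 -ksp_monitor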
From wence at gmx.li Wed Sep 1 04:02:42 2021
From: wence at gmx.li (Lawrence Mitchell)
Date: Wed, 1 Sep 2021 10:02:42 +0100
Subject: [petsc-users] Slow convergence while parallel computations.
In-Reply-To:
References:
Message-ID:

> On 1 Sep 2021, at 09:42, Viktor Nazdrachev wrote:
>
> I have a 3D elasticity problem with heterogeneous properties.

What does your coefficient variation look like? How large is the contrast?

> There is an unstructured grid with aspect ratios varying from 4 to 25. Zero Dirichlet BCs are imposed on the bottom face of the mesh. Also, Neumann (traction) BCs are imposed on the side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M DOFs).
>
> The best performance and memory usage for a single MPI process was obtained with the HPDDM (BFBCG) solver and bjacobi + ICC(1) in the subdomains as the preconditioner; it took 1 m 45 s and 5.0 GB of RAM. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This is because the number of iterations required to achieve the same tolerance increases significantly.

How many iterations do you have in serial (and then in parallel)?

> I've also tried the PCGAMG (agg) preconditioner with an ICC(1) sub-preconditioner. For a single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the near-nullspace was attached using the MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This reduced the calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is a peak memory usage of 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.

Does the number of iterates increase in parallel? Again, how many iterations do you have?

> Are there ways to avoid the degradation of the convergence rate for the bjacobi preconditioner in parallel mode? Does it make sense to use hierarchical or nested Krylov methods with a local GMRES solver (sub_ksp_type gmres) and some sub-preconditioner (for example, sub_pc_type bjacobi)?

bjacobi is only a one-level method, so you would not expect a process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning.

If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the GenEO scheme offered by PCHPDDM.

> Is this peak memory usage expected for the GAMG preconditioner? Is there any way to reduce it?

I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG? This will provide some output that more expert GAMG users can interpret.

Lawrence
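As a concrete illustration of those last two suggestions (the executable name is again a placeholder, and -pc_type ml assumes PETSc was configured with --download-ml):

    # try the ML smoothed-aggregation preconditioner instead of GAMG
    mpiexec -n 4 ./elasticity_app -ksp_type cg -pc_type ml

    # collect the GAMG-related -info output for the list to interpret
    mpiexec -n 4 ./elasticity_app -ksp_type cg -pc_type gamg -info | grep GAMG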
From mfadams at lbl.gov Wed Sep 1 06:49:40 2021
From: mfadams at lbl.gov (Mark Adams)
Date: Wed, 1 Sep 2021 07:49:40 -0400
Subject: [petsc-users] Slow convergence while parallel computations.
In-Reply-To:
References:
Message-ID:

As far as GAMG:

* Pierre is right, start with the defaults. AMG does take tuning. 2D and 3D are very different, among other things. You can run with '-info :pc', which is very noisy, and grep on "GAMG" and send me the result. (Oh, Lawrence recommended this, just send it.)
- ICC is not good because it has to scale the diagonal to avoid negative pivots (even for SPD matrices that are not M-matrices, at least). This is probably a problem.
- As Lawrence indicates, jumps in coefficients can be hard for generic AMG.
- And yes, -pc_gamg_threshold is an important parameter for homogeneous problems and can be additionally important for inhomogeneous problems to get the AMG method to "see" your jumps.
* The memory problems are from squaring the graph, among other things, which you usually need to do for elasticity unless you have high order elements, maybe.
* You can try PCBDDC; DD methods are nice for elasticity.
* You can try hypre. Good solver, but 3D elasticity is not its strength.
* As far as poor scaling: you have large subdomains, I assume the load balancing is decent, and the network is not crazy. This might be a lot of setup cost. Run with -log_view and look at KSPSolve and MatPtAP...
- The solver will call the setup (MatPtAP), if it has not been done yet, so that it gets folded in. You can call KSPSetUp() before KSPSolve() to get the timings separated. If you are reusing the solver (e.g., not full Newton) then the setup gets amortized.

Mark

On Wed, Sep 1, 2021 at 5:02 AM Lawrence Mitchell wrote:

> > On 1 Sep 2021, at 09:42, Viktor Nazdrachev wrote:
> >
> > I have a 3D elasticity problem with heterogeneous properties.
>
> What does your coefficient variation look like? How large is the contrast?
>
> > There is an unstructured grid with aspect ratios varying from 4 to 25. Zero Dirichlet BCs are imposed on the bottom face of the mesh. Also, Neumann (traction) BCs are imposed on the side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M DOFs).
> >
> > The best performance and memory usage for a single MPI process was obtained with the HPDDM (BFBCG) solver and bjacobi + ICC(1) in the subdomains as the preconditioner; it took 1 m 45 s and 5.0 GB of RAM. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This is because the number of iterations required to achieve the same tolerance increases significantly.
>
> How many iterations do you have in serial (and then in parallel)?
>
> > I've also tried the PCGAMG (agg) preconditioner with an ICC(1) sub-preconditioner. For a single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the near-nullspace was attached using the MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This reduced the calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is a peak memory usage of 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.
>
> Does the number of iterates increase in parallel? Again, how many iterations do you have?
>
> > Are there ways to avoid the degradation of the convergence rate for the bjacobi preconditioner in parallel mode? Does it make sense to use hierarchical or nested Krylov methods with a local GMRES solver (sub_ksp_type gmres) and some sub-preconditioner (for example, sub_pc_type bjacobi)?
>
> bjacobi is only a one-level method, so you would not expect a process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning.
>
> If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the GenEO scheme offered by PCHPDDM.
>
> > Is this peak memory usage expected for the GAMG preconditioner? Is there any way to reduce it?
>
> I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG? This will provide some output that more expert GAMG users can interpret.
>
> Lawrence
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From sam.guo at cd-adapco.com Wed Sep 1 13:49:27 2021
From: sam.guo at cd-adapco.com (Sam Guo)
Date: Wed, 1 Sep 2021 11:49:27 -0700
Subject: [petsc-users] PETSc 3.15.3 compiling error
In-Reply-To: <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov>
References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov>
Message-ID:

fc should not be required, since I link PETSc with a pre-compiled MUMPS. In fact, --with-mumps-include, --with-mumps-lib and --with-mumps-serial should not be required either, since my own CMake defines -DPETSC_HAVE_MUMPS and links my pre-compiled MUMPS.

I am able to make it work using PETSc 3.11.3. Attached please find the PETSc 3.11.3 configure.log.
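(For reference, the configure-based route being suggested in this thread, using those same options against a pre-built MUMPS, would look roughly like the sketch below; the compilers, the paths and the exact MUMPS library list are placeholders that depend on the local installation, and Fortran is left enabled because MUMPS itself is a Fortran package.)

    ./configure --with-cc=gcc --with-cxx=g++ --with-fc=gfortran \
      --with-mumps-include=/path/to/mumps/include \
      --with-mumps-lib="-L/path/to/mumps/lib -ldmumps -lmumps_common -lpord" \
      --with-mumps-serial=1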
On Tue, Aug 31, 2021 at 4:47 PM Satish Balay wrote: > > ******************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > ------------------------------------------------------------------------------- > Package mumps requested requires Fortran but compiler turned off. > > ******************************************************************************* > > i.e remove '--with-fc=0' and rerun configure. > > Satish > > On Tue, 31 Aug 2021, Sam Guo wrote: > > > Attached please find the latest configure.log. > > > > grep MUMPS_VERSION > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > MUMPS_VERSION > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > MUMPS_VERSION "5.2.1" > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > MUMPS_VERSION_MAX_LEN > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > MUMPS_VERSION_MAX_LEN 30 > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > MUMPS_VERSION > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > MUMPS_VERSION "5.2.1" > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > MUMPS_VERSION_MAX_LEN > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > MUMPS_VERSION_MAX_LEN 30 > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > MUMPS_VERSION > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > MUMPS_VERSION "5.2.1" > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > MUMPS_VERSION_MAX_LEN > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > MUMPS_VERSION_MAX_LEN 30 > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > MUMPS_VERSION > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > MUMPS_VERSION "5.2.1" > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > MUMPS_VERSION_MAX_LEN > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > MUMPS_VERSION_MAX_LEN 30 > > > 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay wrote: > > > > > Also - what do you have for: > > > > > > grep MUMPS_VERSION > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > > > > Satish > > > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > > > > > > > please resend the logs > > > > > > > > Satish > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > Same compiling error with --with-mumps-serial=1. > > > > > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay > > > wrote: > > > > > > > > > > > Use the additional option: -with-mumps-serial > > > > > > > > > > > > Satish > > > > > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > > > > Attached please find the configure.log. I use my own CMake. I > have > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo > > > > wrote: > > > > > > > > > > > > > > > I use pre-installed > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > balay at mcs.anl.gov> > > > > > > wrote: > > > > > > > > > > > > > > > >> > > > > > > > >> Are you using --download-mumps or pre-installed mumps? If > using > > > > > > > >> pre-installed - try --download-mumps. > > > > > > > >> > > > > > > > >> If you still have issues - send us configure.log and > make.log > > > from the > > > > > > > >> failed build. > > > > > > > >> > > > > > > > >> Satish > > > > > > > >> > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > >> > > > > > > > >> > Dear PETSc dev team, > > > > > > > >> > I am compiling petsc 3.15.3 and got following compiling > > > error > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: > > > missing > > > > > > binary > > > > > > > >> > operator before token "(" > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > > > > > > > >> > Any idea what I did wrong? > > > > > > > >> > > > > > > > > >> > Thanks, > > > > > > > >> > Sam > > > > > > > >> > > > > > > > > >> > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: text/x-log Size: 1074595 bytes Desc: not available URL: From balay at mcs.anl.gov Wed Sep 1 14:00:24 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 1 Sep 2021 14:00:24 -0500 (CDT) Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: mumps is a fortran package - so best to specify fc. Any specific reason for needing to force '--with-fc=0'? The attached configure.log is not using mumps. Satish On Wed, 1 Sep 2021, Sam Guo wrote: > fc should not be required since I link PETSc with pre-compiled MUMPS. In > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should not > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my > pre-compiled MUMPS. > > I am able to make it work using PETSc 3.11.3. Attached please find the > cPETSc 3.11.3 onfigure.log PETSc. 
> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay wrote: > > > > > ******************************************************************************* > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > > details): > > > > ------------------------------------------------------------------------------- > > Package mumps requested requires Fortran but compiler turned off. > > > > ******************************************************************************* > > > > i.e remove '--with-fc=0' and rerun configure. > > > > Satish > > > > On Tue, 31 Aug 2021, Sam Guo wrote: > > > > > Attached please find the latest configure.log. > > > > > > grep MUMPS_VERSION > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > > MUMPS_VERSION > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > > MUMPS_VERSION "5.2.1" > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > > MUMPS_VERSION_MAX_LEN > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > > MUMPS_VERSION_MAX_LEN 30 > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > > MUMPS_VERSION > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > > MUMPS_VERSION "5.2.1" > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > > MUMPS_VERSION_MAX_LEN > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > > MUMPS_VERSION_MAX_LEN 30 > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > > MUMPS_VERSION > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > > MUMPS_VERSION "5.2.1" > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > > MUMPS_VERSION_MAX_LEN > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > > MUMPS_VERSION_MAX_LEN 30 > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > > MUMPS_VERSION > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > > MUMPS_VERSION "5.2.1" > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > > MUMPS_VERSION_MAX_LEN > > > > > 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > > MUMPS_VERSION_MAX_LEN 30 > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay wrote: > > > > > > > Also - what do you have for: > > > > > > > > grep MUMPS_VERSION > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > > > > > > Satish > > > > > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > > > > > > > > > please resend the logs > > > > > > > > > > Satish > > > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > > Same compiling error with --with-mumps-serial=1. > > > > > > > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay > > > > wrote: > > > > > > > > > > > > > Use the additional option: -with-mumps-serial > > > > > > > > > > > > > > Satish > > > > > > > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > > > > > > Attached please find the configure.log. I use my own CMake. I > > have > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo > > > > > > wrote: > > > > > > > > > > > > > > > > > I use pre-installed > > > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > > balay at mcs.anl.gov> > > > > > > > wrote: > > > > > > > > > > > > > > > > > >> > > > > > > > > >> Are you using --download-mumps or pre-installed mumps? If > > using > > > > > > > > >> pre-installed - try --download-mumps. > > > > > > > > >> > > > > > > > > >> If you still have issues - send us configure.log and > > make.log > > > > from the > > > > > > > > >> failed build. > > > > > > > > >> > > > > > > > > >> Satish > > > > > > > > >> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > >> > > > > > > > > >> > Dear PETSc dev team, > > > > > > > > >> > I am compiling petsc 3.15.3 and got following compiling > > > > error > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: > > > > missing > > > > > > > binary > > > > > > > > >> > operator before token "(" > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > > > > > > > > >> > Any idea what I did wrong? 
> > > > > > > > >> > > > > > > > > > >> > Thanks, > > > > > > > > >> > Sam > > > > > > > > >> > > > > > > > > > >> > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From sam.guo at cd-adapco.com Wed Sep 1 14:12:57 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Wed, 1 Sep 2021 12:12:57 -0700 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: I believe I am using MUMPS since I have done following (1) defined -DPETSC_HAVE_MUMPS, (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c (3) link my pre-compiled MUMPS, and (4) specifies following PETSc options checkError(EPSGetST(eps, &st)); checkError(STSetType(st, STSINVERT)); //if(useShellMatrix) checkError(STSetMatMode(st, ST_MATMODE_SHELL)); checkError(STGetKSP(st, &ksp)); checkError(KSPSetOperators(ksp, A, A)); checkError(KSPSetType(ksp, KSPPREONLY)); checkError(KSPGetPC(ksp, &pc)); checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); checkError(PCSetType(pc, PCCHOLESKY)); checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); checkError(PCFactorSetUpMatSolverType(pc)); checkError(PetscOptionsSetValue(NULL, "-mat_mumps_icntl_13","1")); Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got the PETSc error saying that MUMPS is required. On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: > mumps is a fortran package - so best to specify fc. Any specific reason > for needing to force '--with-fc=0'? > > The attached configure.log is not using mumps. > > Satish > > On Wed, 1 Sep 2021, Sam Guo wrote: > > > fc should not be required since I link PETSc with pre-compiled MUMPS. In > > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should > not > > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my > > pre-compiled MUMPS. > > > > I am able to make it work using PETSc 3.11.3. Attached please find the > > cPETSc 3.11.3 onfigure.log PETSc. > > > > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay wrote: > > > > > > > > > ******************************************************************************* > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > for > > > details): > > > > > > > ------------------------------------------------------------------------------- > > > Package mumps requested requires Fortran but compiler turned off. > > > > > > > ******************************************************************************* > > > > > > i.e remove '--with-fc=0' and rerun configure. > > > > > > Satish > > > > > > On Tue, 31 Aug 2021, Sam Guo wrote: > > > > > > > Attached please find the latest configure.log. 
> > > > > > > > grep MUMPS_VERSION > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > > > MUMPS_VERSION > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > > > MUMPS_VERSION "5.2.1" > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > > > MUMPS_VERSION > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > > > MUMPS_VERSION "5.2.1" > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > > > MUMPS_VERSION > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > > > MUMPS_VERSION "5.2.1" > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > > > MUMPS_VERSION > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > > > MUMPS_VERSION "5.2.1" > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay > wrote: > > > > > > > > > Also - what do you have for: > > > > > > > > > > grep MUMPS_VERSION > > > > > > > > > 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > > > > > > > > Satish > > > > > > > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > > > > > > > > > > > please resend the logs > > > > > > > > > > > > Satish > > > > > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > > > > Same compiling error with --with-mumps-serial=1. > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < > balay at mcs.anl.gov> > > > > > wrote: > > > > > > > > > > > > > > > Use the additional option: -with-mumps-serial > > > > > > > > > > > > > > > > Satish > > > > > > > > > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > > > > > > > > Attached please find the configure.log. I use my own > CMake. I > > > have > > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > > > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < > sam.guo at cd-adapco.com > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > I use pre-installed > > > > > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > > > balay at mcs.anl.gov> > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > >> Are you using --download-mumps or pre-installed mumps? > If > > > using > > > > > > > > > >> pre-installed - try --download-mumps. > > > > > > > > > >> > > > > > > > > > >> If you still have issues - send us configure.log and > > > make.log > > > > > from the > > > > > > > > > >> failed build. > > > > > > > > > >> > > > > > > > > > >> Satish > > > > > > > > > >> > > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > >> > > > > > > > > > >> > Dear PETSc dev team, > > > > > > > > > >> > I am compiling petsc 3.15.3 and got following > compiling > > > > > error > > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: > error: > > > > > missing > > > > > > > > binary > > > > > > > > > >> > operator before token "(" > > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > > > > > > > > > >> > Any idea what I did wrong? > > > > > > > > > >> > > > > > > > > > > >> > Thanks, > > > > > > > > > >> > Sam > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Sep 1 14:19:48 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 1 Sep 2021 14:19:48 -0500 (CDT) Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: I'm not sure why you would want to do this - instead of following the recommended installation instructions. If your process works - thats great! you can use it! But why start this e-mail thread? 
Satish On Wed, 1 Sep 2021, Sam Guo wrote: > I believe I am using MUMPS since I have done following > (1) defined -DPETSC_HAVE_MUMPS, > (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > (3) link my pre-compiled MUMPS, and > (4) specifies following PETSc options > checkError(EPSGetST(eps, &st)); > checkError(STSetType(st, STSINVERT)); > //if(useShellMatrix) checkError(STSetMatMode(st, ST_MATMODE_SHELL)); > checkError(STGetKSP(st, &ksp)); > checkError(KSPSetOperators(ksp, A, A)); > checkError(KSPSetType(ksp, KSPPREONLY)); > checkError(KSPGetPC(ksp, &pc)); > checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); > checkError(PCSetType(pc, PCCHOLESKY)); > checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); > checkError(PCFactorSetUpMatSolverType(pc)); > checkError(PetscOptionsSetValue(NULL, "-mat_mumps_icntl_13","1")); > > Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got > the PETSc error saying that MUMPS is required. > > On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: > > > mumps is a fortran package - so best to specify fc. Any specific reason > > for needing to force '--with-fc=0'? > > > > The attached configure.log is not using mumps. > > > > Satish > > > > On Wed, 1 Sep 2021, Sam Guo wrote: > > > > > fc should not be required since I link PETSc with pre-compiled MUMPS. In > > > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should > > not > > > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my > > > pre-compiled MUMPS. > > > > > > I am able to make it work using PETSc 3.11.3. Attached please find the > > > cPETSc 3.11.3 onfigure.log PETSc. > > > > > > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay wrote: > > > > > > > > > > > > > ******************************************************************************* > > > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log > > for > > > > details): > > > > > > > > > > ------------------------------------------------------------------------------- > > > > Package mumps requested requires Fortran but compiler turned off. > > > > > > > > > > ******************************************************************************* > > > > > > > > i.e remove '--with-fc=0' and rerun configure. > > > > > > > > Satish > > > > > > > > On Tue, 31 Aug 2021, Sam Guo wrote: > > > > > > > > > Attached please find the latest configure.log. 
> > > > > > > > > > grep MUMPS_VERSION > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > > > > MUMPS_VERSION > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > > > > MUMPS_VERSION "5.2.1" > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > > > > MUMPS_VERSION > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > > > > MUMPS_VERSION "5.2.1" > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > > > > MUMPS_VERSION > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > > > > MUMPS_VERSION "5.2.1" > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > > > > MUMPS_VERSION > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > > > > MUMPS_VERSION "5.2.1" > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > > > > MUMPS_VERSION_MAX_LEN > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > > > > MUMPS_VERSION_MAX_LEN 30 > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > > > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > > > > > > > > > On Mon, Aug 30, 2021 at 
9:47 PM Satish Balay > > wrote: > > > > > > > > > > > Also - what do you have for: > > > > > > > > > > > > grep MUMPS_VERSION > > > > > > > > > > > > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > > > > > > > > > > > Satish > > > > > > > > > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > > > > > > > > > > > > > please resend the logs > > > > > > > > > > > > > > Satish > > > > > > > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > > > > > > Same compiling error with --with-mumps-serial=1. > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < > > balay at mcs.anl.gov> > > > > > > wrote: > > > > > > > > > > > > > > > > > Use the additional option: -with-mumps-serial > > > > > > > > > > > > > > > > > > Satish > > > > > > > > > > > > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > > > > > > > > > > Attached please find the configure.log. I use my own > > CMake. I > > > > have > > > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > > > > > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < > > sam.guo at cd-adapco.com > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > I use pre-installed > > > > > > > > > > > > > > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > > > > balay at mcs.anl.gov> > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > >> Are you using --download-mumps or pre-installed mumps? > > If > > > > using > > > > > > > > > > >> pre-installed - try --download-mumps. > > > > > > > > > > >> > > > > > > > > > > >> If you still have issues - send us configure.log and > > > > make.log > > > > > > from the > > > > > > > > > > >> failed build. > > > > > > > > > > >> > > > > > > > > > > >> Satish > > > > > > > > > > >> > > > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > > > > > > > > > > >> > > > > > > > > > > >> > Dear PETSc dev team, > > > > > > > > > > >> > I am compiling petsc 3.15.3 and got following > > compiling > > > > > > error > > > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: > > error: > > > > > > missing > > > > > > > > > binary > > > > > > > > > > >> > operator before token "(" > > > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > > > > > > > > > > >> > Any idea what I did wrong? > > > > > > > > > > >> > > > > > > > > > > > >> > Thanks, > > > > > > > > > > >> > Sam > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From sam.guo at cd-adapco.com Wed Sep 1 14:19:53 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Wed, 1 Sep 2021 12:19:53 -0700 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: If we go back to the original compiling error, "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary operator before token "(" 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. 
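That error is the standard preprocessor complaint when a function-like macro used inside #if is not defined: the unknown identifier is replaced by 0 during evaluation, and the "(5,3,0)" that follows it no longer parses. The PETSC_PKG_MUMPS_VERSION_GE macro is generated by PETSc's configure (in include/petscpkg_version.h, if I remember correctly) only when configure is told about MUMPS, which is why compiling mumps.c against a configuration that does not know about MUMPS fails this way. A stand-alone illustration (none of this is taken from the PETSc sources):

    /* When the generated version macro is missing, the #if below reproduces
       "error: missing binary operator before token '('".  Uncommenting the
       stand-in definition, as configure would normally provide it, makes the
       file preprocess cleanly. */
    /* #define PETSC_PKG_MUMPS_VERSION_GE(MAJOR, MINOR, SUBMINOR) 1 */

    #if PETSC_PKG_MUMPS_VERSION_GE(5, 3, 0)
    /* code path for MUMPS >= 5.3.0 */
    #endif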
On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > I believe I am using MUMPS since I have done following > (1) defined -DPETSC_HAVE_MUMPS, > (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > (3) link my pre-compiled MUMPS, and > (4) specifies following PETSc options > checkError(EPSGetST(eps, &st)); > checkError(STSetType(st, STSINVERT)); > //if(useShellMatrix) checkError(STSetMatMode(st, > ST_MATMODE_SHELL)); > checkError(STGetKSP(st, &ksp)); > checkError(KSPSetOperators(ksp, A, A)); > checkError(KSPSetType(ksp, KSPPREONLY)); > checkError(KSPGetPC(ksp, &pc)); > checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); > checkError(PCSetType(pc, PCCHOLESKY)); > checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); > checkError(PCFactorSetUpMatSolverType(pc)); > checkError(PetscOptionsSetValue(NULL, "-mat_mumps_icntl_13","1")); > > Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got > the PETSc error saying that MUMPS is required. > > On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: > >> mumps is a fortran package - so best to specify fc. Any specific reason >> for needing to force '--with-fc=0'? >> >> The attached configure.log is not using mumps. >> >> Satish >> >> On Wed, 1 Sep 2021, Sam Guo wrote: >> >> > fc should not be required since I link PETSc with pre-compiled MUMPS. In >> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should >> not >> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my >> > pre-compiled MUMPS. >> > >> > I am able to make it work using PETSc 3.11.3. Attached please find the >> > cPETSc 3.11.3 onfigure.log PETSc. >> > >> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay wrote: >> > >> > > >> > > >> ******************************************************************************* >> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log >> for >> > > details): >> > > >> > > >> ------------------------------------------------------------------------------- >> > > Package mumps requested requires Fortran but compiler turned off. >> > > >> > > >> ******************************************************************************* >> > > >> > > i.e remove '--with-fc=0' and rerun configure. >> > > >> > > Satish >> > > >> > > On Tue, 31 Aug 2021, Sam Guo wrote: >> > > >> > > > Attached please find the latest configure.log. 
>> > > > >> > > > grep MUMPS_VERSION >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >> > > > MUMPS_VERSION >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >> > > > MUMPS_VERSION "5.2.1" >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >> > > > MUMPS_VERSION_MAX_LEN >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >> > > > MUMPS_VERSION_MAX_LEN 30 >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >> > > > MUMPS_VERSION >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >> > > > MUMPS_VERSION "5.2.1" >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >> > > > MUMPS_VERSION_MAX_LEN >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >> > > > MUMPS_VERSION_MAX_LEN 30 >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >> > > > MUMPS_VERSION >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >> > > > MUMPS_VERSION "5.2.1" >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >> > > > MUMPS_VERSION_MAX_LEN >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >> > > > MUMPS_VERSION_MAX_LEN 30 >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >> > > > MUMPS_VERSION >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >> > > > MUMPS_VERSION "5.2.1" >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >> > > > MUMPS_VERSION_MAX_LEN >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >> > > > MUMPS_VERSION_MAX_LEN 30 >> > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > > > >> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay >> wrote: >> > > > >> > > > > Also - what do you have for: >> > > > 
> >> > > > > grep MUMPS_VERSION >> > > > > >> > > >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >> > > > > >> > > > > Satish >> > > > > >> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >> > > > > >> > > > > > please resend the logs >> > > > > > >> > > > > > Satish >> > > > > > >> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >> > > > > > >> > > > > > > Same compiling error with --with-mumps-serial=1. >> > > > > > > >> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >> balay at mcs.anl.gov> >> > > > > wrote: >> > > > > > > >> > > > > > > > Use the additional option: -with-mumps-serial >> > > > > > > > >> > > > > > > > Satish >> > > > > > > > >> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >> > > > > > > > >> > > > > > > > > Attached please find the configure.log. I use my own >> CMake. I >> > > have >> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >> > > > > > > > > >> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >> sam.guo at cd-adapco.com >> > > > >> > > > > wrote: >> > > > > > > > > >> > > > > > > > > > I use pre-installed >> > > > > > > > > > >> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >> > > balay at mcs.anl.gov> >> > > > > > > > wrote: >> > > > > > > > > > >> > > > > > > > > >> >> > > > > > > > > >> Are you using --download-mumps or pre-installed mumps? >> If >> > > using >> > > > > > > > > >> pre-installed - try --download-mumps. >> > > > > > > > > >> >> > > > > > > > > >> If you still have issues - send us configure.log and >> > > make.log >> > > > > from the >> > > > > > > > > >> failed build. >> > > > > > > > > >> >> > > > > > > > > >> Satish >> > > > > > > > > >> >> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >> > > > > > > > > >> >> > > > > > > > > >> > Dear PETSc dev team, >> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following >> compiling >> > > > > error >> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >> error: >> > > > > missing >> > > > > > > > binary >> > > > > > > > > >> > operator before token "(" >> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >> > > > > > > > > >> > Any idea what I did wrong? >> > > > > > > > > >> > >> > > > > > > > > >> > Thanks, >> > > > > > > > > >> > Sam >> > > > > > > > > >> > >> > > > > > > > > >> >> > > > > > > > > >> >> > > > > > > > > >> > > > > > > > >> > > > > > > > >> > > > > > > >> > > > > > >> > > > > >> > > > > >> > > > >> > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Wed Sep 1 14:22:29 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Wed, 1 Sep 2021 12:22:29 -0700 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: My process only works for PTESc 3.11.3, not 3.15.3 and that's why I started this email thread. On Wed, Sep 1, 2021 at 12:19 PM Sam Guo wrote: > If we go back to the original compiling error, > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary > operator before token "(" > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. 
> > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > >> I believe I am using MUMPS since I have done following >> (1) defined -DPETSC_HAVE_MUMPS, >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >> (3) link my pre-compiled MUMPS, and >> (4) specifies following PETSc options >> checkError(EPSGetST(eps, &st)); >> checkError(STSetType(st, STSINVERT)); >> //if(useShellMatrix) checkError(STSetMatMode(st, >> ST_MATMODE_SHELL)); >> checkError(STGetKSP(st, &ksp)); >> checkError(KSPSetOperators(ksp, A, A)); >> checkError(KSPSetType(ksp, KSPPREONLY)); >> checkError(KSPGetPC(ksp, &pc)); >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >> checkError(PCSetType(pc, PCCHOLESKY)); >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >> checkError(PCFactorSetUpMatSolverType(pc)); >> checkError(PetscOptionsSetValue(NULL, "-mat_mumps_icntl_13","1")); >> >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got >> the PETSc error saying that MUMPS is required. >> >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: >> >>> mumps is a fortran package - so best to specify fc. Any specific reason >>> for needing to force '--with-fc=0'? >>> >>> The attached configure.log is not using mumps. >>> >>> Satish >>> >>> On Wed, 1 Sep 2021, Sam Guo wrote: >>> >>> > fc should not be required since I link PETSc with pre-compiled MUMPS. >>> In >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should >>> not >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my >>> > pre-compiled MUMPS. >>> > >>> > I am able to make it work using PETSc 3.11.3. Attached please find the >>> > cPETSc 3.11.3 onfigure.log PETSc. >>> > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >>> wrote: >>> > >>> > > >>> > > >>> ******************************************************************************* >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >>> configure.log for >>> > > details): >>> > > >>> > > >>> ------------------------------------------------------------------------------- >>> > > Package mumps requested requires Fortran but compiler turned off. >>> > > >>> > > >>> ******************************************************************************* >>> > > >>> > > i.e remove '--with-fc=0' and rerun configure. >>> > > >>> > > Satish >>> > > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >>> > > >>> > > > Attached please find the latest configure.log. 
>>> > > > >>> > > > grep MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > > On Mon, Aug 30, 2021 at 
9:47 PM Satish Balay >>> wrote: >>> > > > >>> > > > > Also - what do you have for: >>> > > > > >>> > > > > grep MUMPS_VERSION >>> > > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>> > > > > >>> > > > > Satish >>> > > > > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >>> > > > > >>> > > > > > please resend the logs >>> > > > > > >>> > > > > > Satish >>> > > > > > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>> > > > > > >>> > > > > > > Same compiling error with --with-mumps-serial=1. >>> > > > > > > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >>> balay at mcs.anl.gov> >>> > > > > wrote: >>> > > > > > > >>> > > > > > > > Use the additional option: -with-mumps-serial >>> > > > > > > > >>> > > > > > > > Satish >>> > > > > > > > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>> > > > > > > > >>> > > > > > > > > Attached please find the configure.log. I use my own >>> CMake. I >>> > > have >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >>> > > > > > > > > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >>> sam.guo at cd-adapco.com >>> > > > >>> > > > > wrote: >>> > > > > > > > > >>> > > > > > > > > > I use pre-installed >>> > > > > > > > > > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >>> > > balay at mcs.anl.gov> >>> > > > > > > > wrote: >>> > > > > > > > > > >>> > > > > > > > > >> >>> > > > > > > > > >> Are you using --download-mumps or pre-installed >>> mumps? If >>> > > using >>> > > > > > > > > >> pre-installed - try --download-mumps. >>> > > > > > > > > >> >>> > > > > > > > > >> If you still have issues - send us configure.log and >>> > > make.log >>> > > > > from the >>> > > > > > > > > >> failed build. >>> > > > > > > > > >> >>> > > > > > > > > >> Satish >>> > > > > > > > > >> >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >>> > > > > > > > > >> >>> > > > > > > > > >> > Dear PETSc dev team, >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following >>> compiling >>> > > > > error >>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >>> error: >>> > > > > missing >>> > > > > > > > binary >>> > > > > > > > > >> > operator before token "(" >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >>> > > > > > > > > >> > Any idea what I did wrong? >>> > > > > > > > > >> > >>> > > > > > > > > >> > Thanks, >>> > > > > > > > > >> > Sam >>> > > > > > > > > >> > >>> > > > > > > > > >> >>> > > > > > > > > >> >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > > >>> > > > >>> > > >>> > > >>> > >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Sep 1 14:26:52 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 1 Sep 2021 14:26:52 -0500 (CDT) Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: <6db0cfdc-5250-6cf4-350-84bd08b4ec@mcs.anl.gov> Well - then you refuse to follow our installation instructions. If you have your own hakey way of installing things - you can spend time debugging your process - and fixing things. [can't expect us to fix problems that your process creates. 
Just because it worked before for you doesn't mean its a petsc issue that we should put effort into debugging and fixing] Satish On Wed, 1 Sep 2021, Sam Guo wrote: > My process only works for PTESc 3.11.3, not 3.15.3 and that's why I started > this email thread. > > On Wed, Sep 1, 2021 at 12:19 PM Sam Guo wrote: > > > If we go back to the original compiling error, > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary > > operator before token "(" > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. > > > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > > > >> I believe I am using MUMPS since I have done following > >> (1) defined -DPETSC_HAVE_MUMPS, > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > >> (3) link my pre-compiled MUMPS, and > >> (4) specifies following PETSc options > >> checkError(EPSGetST(eps, &st)); > >> checkError(STSetType(st, STSINVERT)); > >> //if(useShellMatrix) checkError(STSetMatMode(st, > >> ST_MATMODE_SHELL)); > >> checkError(STGetKSP(st, &ksp)); > >> checkError(KSPSetOperators(ksp, A, A)); > >> checkError(KSPSetType(ksp, KSPPREONLY)); > >> checkError(KSPGetPC(ksp, &pc)); > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); > >> checkError(PCSetType(pc, PCCHOLESKY)); > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); > >> checkError(PCFactorSetUpMatSolverType(pc)); > >> checkError(PetscOptionsSetValue(NULL, "-mat_mumps_icntl_13","1")); > >> > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got > >> the PETSc error saying that MUMPS is required. > >> > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: > >> > >>> mumps is a fortran package - so best to specify fc. Any specific reason > >>> for needing to force '--with-fc=0'? > >>> > >>> The attached configure.log is not using mumps. > >>> > >>> Satish > >>> > >>> On Wed, 1 Sep 2021, Sam Guo wrote: > >>> > >>> > fc should not be required since I link PETSc with pre-compiled MUMPS. > >>> In > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should > >>> not > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my > >>> > pre-compiled MUMPS. > >>> > > >>> > I am able to make it work using PETSc 3.11.3. Attached please find the > >>> > cPETSc 3.11.3 onfigure.log PETSc. > >>> > > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay > >>> wrote: > >>> > > >>> > > > >>> > > > >>> ******************************************************************************* > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see > >>> configure.log for > >>> > > details): > >>> > > > >>> > > > >>> ------------------------------------------------------------------------------- > >>> > > Package mumps requested requires Fortran but compiler turned off. > >>> > > > >>> > > > >>> ******************************************************************************* > >>> > > > >>> > > i.e remove '--with-fc=0' and rerun configure. > >>> > > > >>> > > Satish > >>> > > > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: > >>> > > > >>> > > > Attached please find the latest configure.log. 
> >>> > > > > >>> > > > grep MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay > >>> wrote: > >>> > > > > >>> > > > > Also - what do you have for: > >>> > > > > > >>> > > > > grep MUMPS_VERSION > >>> > > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > >>> > > > > > >>> > > > > Satish > >>> > > > > > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > >>> > > > > > >>> > > > > > please resend the logs > >>> > > > > > > >>> > > > > > Satish > >>> > > > > > > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > >>> > > > > > > >>> > > > > > > Same compiling error with --with-mumps-serial=1. > >>> > > > > > > > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < > >>> balay at mcs.anl.gov> > >>> > > > > wrote: > >>> > > > > > > > >>> > > > > > > > Use the additional option: -with-mumps-serial > >>> > > > > > > > > >>> > > > > > > > Satish > >>> > > > > > > > > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > >>> > > > > > > > > >>> > > > > > > > > Attached please find the configure.log. I use my own > >>> CMake. I > >>> > > have > >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > >>> > > > > > > > > > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < > >>> sam.guo at cd-adapco.com > >>> > > > > >>> > > > > wrote: > >>> > > > > > > > > > >>> > > > > > > > > > I use pre-installed > >>> > > > > > > > > > > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > >>> > > balay at mcs.anl.gov> > >>> > > > > > > > wrote: > >>> > > > > > > > > > > >>> > > > > > > > > >> > >>> > > > > > > > > >> Are you using --download-mumps or pre-installed > >>> mumps? If > >>> > > using > >>> > > > > > > > > >> pre-installed - try --download-mumps. > >>> > > > > > > > > >> > >>> > > > > > > > > >> If you still have issues - send us configure.log and > >>> > > make.log > >>> > > > > from the > >>> > > > > > > > > >> failed build. > >>> > > > > > > > > >> > >>> > > > > > > > > >> Satish > >>> > > > > > > > > >> > >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > >>> > > > > > > > > >> > >>> > > > > > > > > >> > Dear PETSc dev team, > >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following > >>> compiling > >>> > > > > error > >>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: > >>> error: > >>> > > > > missing > >>> > > > > > > > binary > >>> > > > > > > > > >> > operator before token "(" > >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > >>> > > > > > > > > >> > Any idea what I did wrong? > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > Thanks, > >>> > > > > > > > > >> > Sam > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > >>> > > > > > > > > >> > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > > >>> > > >>> > >>> > From sam.guo at cd-adapco.com Wed Sep 1 14:26:48 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Wed, 1 Sep 2021 12:26:48 -0700 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: For PETSc 3.15.3, if I don't include mat/impls/aij/mpi/mumps/mumps.c, I have no compiling error. 
But I need it for using MUMPS. It is a compiling error rather than linking error. On Wed, Sep 1, 2021 at 12:22 PM Sam Guo wrote: > My process only works for PTESc 3.11.3, not 3.15.3 and that's why I > started this email thread. > > On Wed, Sep 1, 2021 at 12:19 PM Sam Guo wrote: > >> If we go back to the original compiling error, >> "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary >> operator before token "(" >> 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" >> I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. >> >> On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: >> >>> I believe I am using MUMPS since I have done following >>> (1) defined -DPETSC_HAVE_MUMPS, >>> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >>> (3) link my pre-compiled MUMPS, and >>> (4) specifies following PETSc options >>> checkError(EPSGetST(eps, &st)); >>> checkError(STSetType(st, STSINVERT)); >>> //if(useShellMatrix) checkError(STSetMatMode(st, >>> ST_MATMODE_SHELL)); >>> checkError(STGetKSP(st, &ksp)); >>> checkError(KSPSetOperators(ksp, A, A)); >>> checkError(KSPSetType(ksp, KSPPREONLY)); >>> checkError(KSPGetPC(ksp, &pc)); >>> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >>> checkError(PCSetType(pc, PCCHOLESKY)); >>> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >>> checkError(PCFactorSetUpMatSolverType(pc)); >>> checkError(PetscOptionsSetValue(NULL, >>> "-mat_mumps_icntl_13","1")); >>> >>> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got >>> the PETSc error saying that MUMPS is required. >>> >>> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: >>> >>>> mumps is a fortran package - so best to specify fc. Any specific reason >>>> for needing to force '--with-fc=0'? >>>> >>>> The attached configure.log is not using mumps. >>>> >>>> Satish >>>> >>>> On Wed, 1 Sep 2021, Sam Guo wrote: >>>> >>>> > fc should not be required since I link PETSc with pre-compiled MUMPS. >>>> In >>>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial >>>> should not >>>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my >>>> > pre-compiled MUMPS. >>>> > >>>> > I am able to make it work using PETSc 3.11.3. Attached please find the >>>> > cPETSc 3.11.3 onfigure.log PETSc. >>>> > >>>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >>>> wrote: >>>> > >>>> > > >>>> > > >>>> ******************************************************************************* >>>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >>>> configure.log for >>>> > > details): >>>> > > >>>> > > >>>> ------------------------------------------------------------------------------- >>>> > > Package mumps requested requires Fortran but compiler turned off. >>>> > > >>>> > > >>>> ******************************************************************************* >>>> > > >>>> > > i.e remove '--with-fc=0' and rerun configure. >>>> > > >>>> > > Satish >>>> > > >>>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >>>> > > >>>> > > > Attached please find the latest configure.log. 
>>>> > > > >>>> > > > grep MUMPS_VERSION >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>>> > > > MUMPS_VERSION >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>>> > > > MUMPS_VERSION "5.2.1" >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>>> > > > MUMPS_VERSION_MAX_LEN >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>>> > > > MUMPS_VERSION >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>>> > > > MUMPS_VERSION "5.2.1" >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>>> > > > MUMPS_VERSION_MAX_LEN >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>>> > > > MUMPS_VERSION >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>>> > > > MUMPS_VERSION "5.2.1" >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>>> > > > MUMPS_VERSION_MAX_LEN >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>>> > > > MUMPS_VERSION >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>>> > > > MUMPS_VERSION "5.2.1" >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>>> > > > MUMPS_VERSION_MAX_LEN >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >>>> > > > char 
version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > > > >>>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay >>>> wrote: >>>> > > > >>>> > > > > Also - what do you have for: >>>> > > > > >>>> > > > > grep MUMPS_VERSION >>>> > > > > >>>> > > >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>>> > > > > >>>> > > > > Satish >>>> > > > > >>>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >>>> > > > > >>>> > > > > > please resend the logs >>>> > > > > > >>>> > > > > > Satish >>>> > > > > > >>>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>>> > > > > > >>>> > > > > > > Same compiling error with --with-mumps-serial=1. >>>> > > > > > > >>>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >>>> balay at mcs.anl.gov> >>>> > > > > wrote: >>>> > > > > > > >>>> > > > > > > > Use the additional option: -with-mumps-serial >>>> > > > > > > > >>>> > > > > > > > Satish >>>> > > > > > > > >>>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>>> > > > > > > > >>>> > > > > > > > > Attached please find the configure.log. I use my own >>>> CMake. I >>>> > > have >>>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >>>> > > > > > > > > >>>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >>>> sam.guo at cd-adapco.com >>>> > > > >>>> > > > > wrote: >>>> > > > > > > > > >>>> > > > > > > > > > I use pre-installed >>>> > > > > > > > > > >>>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >>>> > > balay at mcs.anl.gov> >>>> > > > > > > > wrote: >>>> > > > > > > > > > >>>> > > > > > > > > >> >>>> > > > > > > > > >> Are you using --download-mumps or pre-installed >>>> mumps? If >>>> > > using >>>> > > > > > > > > >> pre-installed - try --download-mumps. >>>> > > > > > > > > >> >>>> > > > > > > > > >> If you still have issues - send us configure.log and >>>> > > make.log >>>> > > > > from the >>>> > > > > > > > > >> failed build. >>>> > > > > > > > > >> >>>> > > > > > > > > >> Satish >>>> > > > > > > > > >> >>>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >>>> > > > > > > > > >> >>>> > > > > > > > > >> > Dear PETSc dev team, >>>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following >>>> compiling >>>> > > > > error >>>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >>>> error: >>>> > > > > missing >>>> > > > > > > > binary >>>> > > > > > > > > >> > operator before token "(" >>>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >>>> > > > > > > > > >> > Any idea what I did wrong? >>>> > > > > > > > > >> > >>>> > > > > > > > > >> > Thanks, >>>> > > > > > > > > >> > Sam >>>> > > > > > > > > >> > >>>> > > > > > > > > >> >>>> > > > > > > > > >> >>>> > > > > > > > > >>>> > > > > > > > >>>> > > > > > > > >>>> > > > > > > >>>> > > > > > >>>> > > > > >>>> > > > > >>>> > > > >>>> > > >>>> > > >>>> > >>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 1 14:30:46 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 1 Sep 2021 15:30:46 -0400 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: On Wed, Sep 1, 2021 at 3:27 PM Sam Guo wrote: > For PETSc 3.15.3, if I don't include mat/impls/aij/mpi/mumps/mumps.c, I > have no compiling error. But I need it for using MUMPS. It is a compiling > error rather than linking error. 
> I will ask a different way: Can you run configure, but point it at your MUMPS installation? --with-mumps-dir= Thanks, Matt > On Wed, Sep 1, 2021 at 12:22 PM Sam Guo wrote: > >> My process only works for PTESc 3.11.3, not 3.15.3 and that's why I >> started this email thread. >> >> On Wed, Sep 1, 2021 at 12:19 PM Sam Guo wrote: >> >>> If we go back to the original compiling error, >>> "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary >>> operator before token "(" >>> 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" >>> I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. >>> >>> On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: >>> >>>> I believe I am using MUMPS since I have done following >>>> (1) defined -DPETSC_HAVE_MUMPS, >>>> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >>>> (3) link my pre-compiled MUMPS, and >>>> (4) specifies following PETSc options >>>> checkError(EPSGetST(eps, &st)); >>>> checkError(STSetType(st, STSINVERT)); >>>> //if(useShellMatrix) checkError(STSetMatMode(st, >>>> ST_MATMODE_SHELL)); >>>> checkError(STGetKSP(st, &ksp)); >>>> checkError(KSPSetOperators(ksp, A, A)); >>>> checkError(KSPSetType(ksp, KSPPREONLY)); >>>> checkError(KSPGetPC(ksp, &pc)); >>>> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >>>> checkError(PCSetType(pc, PCCHOLESKY)); >>>> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >>>> checkError(PCFactorSetUpMatSolverType(pc)); >>>> checkError(PetscOptionsSetValue(NULL, >>>> "-mat_mumps_icntl_13","1")); >>>> >>>> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I >>>> got the PETSc error saying that MUMPS is required. >>>> >>>> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: >>>> >>>>> mumps is a fortran package - so best to specify fc. Any specific >>>>> reason for needing to force '--with-fc=0'? >>>>> >>>>> The attached configure.log is not using mumps. >>>>> >>>>> Satish >>>>> >>>>> On Wed, 1 Sep 2021, Sam Guo wrote: >>>>> >>>>> > fc should not be required since I link PETSc with pre-compiled >>>>> MUMPS. In >>>>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial >>>>> should not >>>>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links >>>>> my >>>>> > pre-compiled MUMPS. >>>>> > >>>>> > I am able to make it work using PETSc 3.11.3. Attached please find >>>>> the >>>>> > cPETSc 3.11.3 onfigure.log PETSc. >>>>> > >>>>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >>>>> wrote: >>>>> > >>>>> > > >>>>> > > >>>>> ******************************************************************************* >>>>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >>>>> configure.log for >>>>> > > details): >>>>> > > >>>>> > > >>>>> ------------------------------------------------------------------------------- >>>>> > > Package mumps requested requires Fortran but compiler turned off. >>>>> > > >>>>> > > >>>>> ******************************************************************************* >>>>> > > >>>>> > > i.e remove '--with-fc=0' and rerun configure. >>>>> > > >>>>> > > Satish >>>>> > > >>>>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >>>>> > > >>>>> > > > Attached please find the latest configure.log. 
>>>>> > > > >>>>> > > > grep MUMPS_VERSION >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>>>> > > > MUMPS_VERSION "5.2.1" >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION_MAX_LEN >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>>>> > > > MUMPS_VERSION "5.2.1" >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION_MAX_LEN >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>>>> > > > MUMPS_VERSION "5.2.1" >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION_MAX_LEN >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>>>> > > > MUMPS_VERSION "5.2.1" >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>>>> > > > MUMPS_VERSION_MAX_LEN >>>>> > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>>>> > > > MUMPS_VERSION_MAX_LEN 30 >>>>> > > > >>>>> > > >>>>> 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>>> > > > >>>>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay >>>>> wrote: >>>>> > > > >>>>> > > > > Also - what do you have for: >>>>> > > > > >>>>> > > > > grep MUMPS_VERSION >>>>> > > > > >>>>> > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>>>> > > > > >>>>> > > > > Satish >>>>> > > > > >>>>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >>>>> > > > > >>>>> > > > > > please resend the logs >>>>> > > > > > >>>>> > > > > > Satish >>>>> > > > > > >>>>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>>>> > > > > > >>>>> > > > > > > Same compiling error with --with-mumps-serial=1. >>>>> > > > > > > >>>>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >>>>> balay at mcs.anl.gov> >>>>> > > > > wrote: >>>>> > > > > > > >>>>> > > > > > > > Use the additional option: -with-mumps-serial >>>>> > > > > > > > >>>>> > > > > > > > Satish >>>>> > > > > > > > >>>>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>>>> > > > > > > > >>>>> > > > > > > > > Attached please find the configure.log. I use my own >>>>> CMake. I >>>>> > > have >>>>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >>>>> > > > > > > > > >>>>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >>>>> sam.guo at cd-adapco.com >>>>> > > > >>>>> > > > > wrote: >>>>> > > > > > > > > >>>>> > > > > > > > > > I use pre-installed >>>>> > > > > > > > > > >>>>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >>>>> > > balay at mcs.anl.gov> >>>>> > > > > > > > wrote: >>>>> > > > > > > > > > >>>>> > > > > > > > > >> >>>>> > > > > > > > > >> Are you using --download-mumps or pre-installed >>>>> mumps? If >>>>> > > using >>>>> > > > > > > > > >> pre-installed - try --download-mumps. >>>>> > > > > > > > > >> >>>>> > > > > > > > > >> If you still have issues - send us configure.log and >>>>> > > make.log >>>>> > > > > from the >>>>> > > > > > > > > >> failed build. >>>>> > > > > > > > > >> >>>>> > > > > > > > > >> Satish >>>>> > > > > > > > > >> >>>>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >>>>> > > > > > > > > >> >>>>> > > > > > > > > >> > Dear PETSc dev team, >>>>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following >>>>> compiling >>>>> > > > > error >>>>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >>>>> error: >>>>> > > > > missing >>>>> > > > > > > > binary >>>>> > > > > > > > > >> > operator before token "(" >>>>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >>>>> > > > > > > > > >> > Any idea what I did wrong? >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> > Thanks, >>>>> > > > > > > > > >> > Sam >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> >>>>> > > > > > > > > >> >>>>> > > > > > > > > >>>>> > > > > > > > >>>>> > > > > > > > >>>>> > > > > > > >>>>> > > > > > >>>>> > > > > >>>>> > > > > >>>>> > > > >>>>> > > >>>>> > > >>>>> > >>>>> >>>>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From balay at mcs.anl.gov Wed Sep 1 14:34:17 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 1 Sep 2021 14:34:17 -0500 (CDT) Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: <66d7c45-413a-d122-5fbc-636340c13b5c@mcs.anl.gov> On Wed, 1 Sep 2021, Matthew Knepley wrote: > On Wed, Sep 1, 2021 at 3:27 PM Sam Guo wrote: > > > For PETSc 3.15.3, if I don't include mat/impls/aij/mpi/mumps/mumps.c, I > > have no compiling error. But I need it for using MUMPS. It is a compiling > > error rather than linking error. > > > > I will ask a different way: > > Can you run configure, but point it at your MUMPS installation? > > --with-mumps-dir= It won't overcome this issue: >>> Package mumps requested requires Fortran but compiler turned off. <<< Satish > > Thanks, > > Matt > > > > On Wed, Sep 1, 2021 at 12:22 PM Sam Guo wrote: > > > >> My process only works for PTESc 3.11.3, not 3.15.3 and that's why I > >> started this email thread. > >> > >> On Wed, Sep 1, 2021 at 12:19 PM Sam Guo wrote: > >> > >>> If we go back to the original compiling error, > >>> "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary > >>> operator before token "(" > >>> 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" > >>> I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. > >>> > >>> On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > >>> > >>>> I believe I am using MUMPS since I have done following > >>>> (1) defined -DPETSC_HAVE_MUMPS, > >>>> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > >>>> (3) link my pre-compiled MUMPS, and > >>>> (4) specifies following PETSc options > >>>> checkError(EPSGetST(eps, &st)); > >>>> checkError(STSetType(st, STSINVERT)); > >>>> //if(useShellMatrix) checkError(STSetMatMode(st, > >>>> ST_MATMODE_SHELL)); > >>>> checkError(STGetKSP(st, &ksp)); > >>>> checkError(KSPSetOperators(ksp, A, A)); > >>>> checkError(KSPSetType(ksp, KSPPREONLY)); > >>>> checkError(KSPGetPC(ksp, &pc)); > >>>> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); > >>>> checkError(PCSetType(pc, PCCHOLESKY)); > >>>> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); > >>>> checkError(PCFactorSetUpMatSolverType(pc)); > >>>> checkError(PetscOptionsSetValue(NULL, > >>>> "-mat_mumps_icntl_13","1")); > >>>> > >>>> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I > >>>> got the PETSc error saying that MUMPS is required. > >>>> > >>>> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: > >>>> > >>>>> mumps is a fortran package - so best to specify fc. Any specific > >>>>> reason for needing to force '--with-fc=0'? > >>>>> > >>>>> The attached configure.log is not using mumps. > >>>>> > >>>>> Satish > >>>>> > >>>>> On Wed, 1 Sep 2021, Sam Guo wrote: > >>>>> > >>>>> > fc should not be required since I link PETSc with pre-compiled > >>>>> MUMPS. In > >>>>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial > >>>>> should not > >>>>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links > >>>>> my > >>>>> > pre-compiled MUMPS. > >>>>> > > >>>>> > I am able to make it work using PETSc 3.11.3. Attached please find > >>>>> the > >>>>> > cPETSc 3.11.3 onfigure.log PETSc. 
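Taken together, the two suggestions quoted above amount to rerunning PETSc's configure with a Fortran compiler enabled and pointing it at the existing MUMPS tree, so that configure itself generates PETSC_HAVE_MUMPS and the petscpkg_version.h entries instead of relying on the hand-maintained CMake defines. A command of roughly the following shape is the supported route; the compiler names are placeholders, and the MUMPS path is taken from the grep output earlier in the thread on the assumption that lib/ sits next to include/ under that directory:

./configure --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpif90 \
    --with-mumps-dir=/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4

If that layout is not what --with-mumps-dir expects, the finer-grained --with-mumps-include/--with-mumps-lib options (plus --with-mumps-serial=1 for a sequential MUMPS build) already mentioned in the thread do the same job, and --download-mumps remains the fallback that sidesteps the pre-installed copy entirely.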
> >>>>> > > >>>>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay > >>>>> wrote: > >>>>> > > >>>>> > > > >>>>> > > > >>>>> ******************************************************************************* > >>>>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see > >>>>> configure.log for > >>>>> > > details): > >>>>> > > > >>>>> > > > >>>>> ------------------------------------------------------------------------------- > >>>>> > > Package mumps requested requires Fortran but compiler turned off. > >>>>> > > > >>>>> > > > >>>>> ******************************************************************************* > >>>>> > > > >>>>> > > i.e remove '--with-fc=0' and rerun configure. > >>>>> > > > >>>>> > > Satish > >>>>> > > > >>>>> > > On Tue, 31 Aug 2021, Sam Guo wrote: > >>>>> > > > >>>>> > > > Attached please find the latest configure.log. > >>>>> > > > > >>>>> > > > grep MUMPS_VERSION > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > >>>>> > > > MUMPS_VERSION "5.2.1" > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION_MAX_LEN > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > >>>>> > > > MUMPS_VERSION_MAX_LEN 30 > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > >>>>> > > > MUMPS_VERSION "5.2.1" > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION_MAX_LEN > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > >>>>> > > > MUMPS_VERSION_MAX_LEN 30 > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > >>>>> > > > MUMPS_VERSION "5.2.1" > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION_MAX_LEN > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > >>>>> > > > 
MUMPS_VERSION_MAX_LEN 30 > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > >>>>> > > > MUMPS_VERSION "5.2.1" > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > >>>>> > > > MUMPS_VERSION_MAX_LEN > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > >>>>> > > > MUMPS_VERSION_MAX_LEN 30 > >>>>> > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > >>>>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>>>> > > > > >>>>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay > >>>>> wrote: > >>>>> > > > > >>>>> > > > > Also - what do you have for: > >>>>> > > > > > >>>>> > > > > grep MUMPS_VERSION > >>>>> > > > > > >>>>> > > > >>>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > >>>>> > > > > > >>>>> > > > > Satish > >>>>> > > > > > >>>>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > >>>>> > > > > > >>>>> > > > > > please resend the logs > >>>>> > > > > > > >>>>> > > > > > Satish > >>>>> > > > > > > >>>>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > >>>>> > > > > > > >>>>> > > > > > > Same compiling error with --with-mumps-serial=1. > >>>>> > > > > > > > >>>>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < > >>>>> balay at mcs.anl.gov> > >>>>> > > > > wrote: > >>>>> > > > > > > > >>>>> > > > > > > > Use the additional option: -with-mumps-serial > >>>>> > > > > > > > > >>>>> > > > > > > > Satish > >>>>> > > > > > > > > >>>>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > >>>>> > > > > > > > > >>>>> > > > > > > > > Attached please find the configure.log. I use my own > >>>>> CMake. I > >>>>> > > have > >>>>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > >>>>> > > > > > > > > > >>>>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < > >>>>> sam.guo at cd-adapco.com > >>>>> > > > > >>>>> > > > > wrote: > >>>>> > > > > > > > > > >>>>> > > > > > > > > > I use pre-installed > >>>>> > > > > > > > > > > >>>>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > >>>>> > > balay at mcs.anl.gov> > >>>>> > > > > > > > wrote: > >>>>> > > > > > > > > > > >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> Are you using --download-mumps or pre-installed > >>>>> mumps? If > >>>>> > > using > >>>>> > > > > > > > > >> pre-installed - try --download-mumps. > >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> If you still have issues - send us configure.log and > >>>>> > > make.log > >>>>> > > > > from the > >>>>> > > > > > > > > >> failed build. 
> >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> Satish > >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> > Dear PETSc dev team, > >>>>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following > >>>>> compiling > >>>>> > > > > error > >>>>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: > >>>>> error: > >>>>> > > > > missing > >>>>> > > > > > > > binary > >>>>> > > > > > > > > >> > operator before token "(" > >>>>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > >>>>> > > > > > > > > >> > Any idea what I did wrong? > >>>>> > > > > > > > > >> > > >>>>> > > > > > > > > >> > Thanks, > >>>>> > > > > > > > > >> > Sam > >>>>> > > > > > > > > >> > > >>>>> > > > > > > > > >> > >>>>> > > > > > > > > >> > >>>>> > > > > > > > > > >>>>> > > > > > > > > >>>>> > > > > > > > > >>>>> > > > > > > > >>>>> > > > > > > >>>>> > > > > > >>>>> > > > > > >>>>> > > > > >>>>> > > > >>>>> > > > >>>>> > > >>>>> > >>>>> > > From junchao.zhang at gmail.com Wed Sep 1 14:46:20 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 1 Sep 2021 14:46:20 -0500 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: > If we go back to the original compiling error, > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary > operator before token "(" > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. > When petsc is configured with mumps, you will find the macro PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in $PETSC_ARCH/include/petscpkg_version.h Sam, you can manually compile the failed file, mumps.c, with preprocessing, to see what is wrong in the expansion of the macro. > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > >> I believe I am using MUMPS since I have done following >> (1) defined -DPETSC_HAVE_MUMPS, >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >> (3) link my pre-compiled MUMPS, and >> (4) specifies following PETSc options >> checkError(EPSGetST(eps, &st)); >> checkError(STSetType(st, STSINVERT)); >> //if(useShellMatrix) checkError(STSetMatMode(st, >> ST_MATMODE_SHELL)); >> checkError(STGetKSP(st, &ksp)); >> checkError(KSPSetOperators(ksp, A, A)); >> checkError(KSPSetType(ksp, KSPPREONLY)); >> checkError(KSPGetPC(ksp, &pc)); >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >> checkError(PCSetType(pc, PCCHOLESKY)); >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >> checkError(PCFactorSetUpMatSolverType(pc)); >> checkError(PetscOptionsSetValue(NULL, "-mat_mumps_icntl_13","1")); >> >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got >> the PETSc error saying that MUMPS is required. >> >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: >> >>> mumps is a fortran package - so best to specify fc. Any specific reason >>> for needing to force '--with-fc=0'? >>> >>> The attached configure.log is not using mumps. >>> >>> Satish >>> >>> On Wed, 1 Sep 2021, Sam Guo wrote: >>> >>> > fc should not be required since I link PETSc with pre-compiled MUMPS. >>> In >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should >>> not >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my >>> > pre-compiled MUMPS. 
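Junchao's suggestion above can be made concrete with two quick checks. The first shows whether configure generated the MUMPS version macros at all; the second runs only the preprocessor over the failing file so the expansion (or non-expansion) of PETSC_PKG_MUMPS_VERSION_GE can be inspected. The compiler and the -I/-D flags here are illustrative and have to mirror whatever compile line the CMake setup really uses for mumps.c:

# Were the MUMPS version macros generated for this build at all?
grep PETSC_PKG_MUMPS $PETSC_ARCH/include/petscpkg_version.h

# Stop after preprocessing (-E) and keep the result for inspection;
# the include paths and defines must match the real compile line.
mpicc -E -DPETSC_HAVE_MUMPS \
    -I$PETSC_DIR/include -I$PETSC_DIR/$PETSC_ARCH/include \
    -I/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include \
    $PETSC_DIR/src/mat/impls/aij/mpi/mumps/mumps.c > mumps.i

An empty result from the grep is the situation described in the next message: configure was skipped for the MUMPS part, so the macros were never written.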
>>> > >>> > I am able to make it work using PETSc 3.11.3. Attached please find the >>> > cPETSc 3.11.3 onfigure.log PETSc. >>> > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >>> wrote: >>> > >>> > > >>> > > >>> ******************************************************************************* >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >>> configure.log for >>> > > details): >>> > > >>> > > >>> ------------------------------------------------------------------------------- >>> > > Package mumps requested requires Fortran but compiler turned off. >>> > > >>> > > >>> ******************************************************************************* >>> > > >>> > > i.e remove '--with-fc=0' and rerun configure. >>> > > >>> > > Satish >>> > > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >>> > > >>> > > > Attached please find the latest configure.log. >>> > > > >>> > > > grep MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > 
>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>> > > > MUMPS_VERSION >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>> > > > MUMPS_VERSION "5.2.1" >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>> > > > MUMPS_VERSION_MAX_LEN >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>> > > > MUMPS_VERSION_MAX_LEN 30 >>> > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>> > > > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay >>> wrote: >>> > > > >>> > > > > Also - what do you have for: >>> > > > > >>> > > > > grep MUMPS_VERSION >>> > > > > >>> > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>> > > > > >>> > > > > Satish >>> > > > > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >>> > > > > >>> > > > > > please resend the logs >>> > > > > > >>> > > > > > Satish >>> > > > > > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>> > > > > > >>> > > > > > > Same compiling error with --with-mumps-serial=1. >>> > > > > > > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >>> balay at mcs.anl.gov> >>> > > > > wrote: >>> > > > > > > >>> > > > > > > > Use the additional option: -with-mumps-serial >>> > > > > > > > >>> > > > > > > > Satish >>> > > > > > > > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>> > > > > > > > >>> > > > > > > > > Attached please find the configure.log. I use my own >>> CMake. I >>> > > have >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >>> > > > > > > > > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >>> sam.guo at cd-adapco.com >>> > > > >>> > > > > wrote: >>> > > > > > > > > >>> > > > > > > > > > I use pre-installed >>> > > > > > > > > > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >>> > > balay at mcs.anl.gov> >>> > > > > > > > wrote: >>> > > > > > > > > > >>> > > > > > > > > >> >>> > > > > > > > > >> Are you using --download-mumps or pre-installed >>> mumps? If >>> > > using >>> > > > > > > > > >> pre-installed - try --download-mumps. >>> > > > > > > > > >> >>> > > > > > > > > >> If you still have issues - send us configure.log and >>> > > make.log >>> > > > > from the >>> > > > > > > > > >> failed build. >>> > > > > > > > > >> >>> > > > > > > > > >> Satish >>> > > > > > > > > >> >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >>> > > > > > > > > >> >>> > > > > > > > > >> > Dear PETSc dev team, >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following >>> compiling >>> > > > > error >>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >>> error: >>> > > > > missing >>> > > > > > > > binary >>> > > > > > > > > >> > operator before token "(" >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >>> > > > > > > > > >> > Any idea what I did wrong? 
>>> > > > > > > > > >> > >>> > > > > > > > > >> > Thanks, >>> > > > > > > > > >> > Sam >>> > > > > > > > > >> > >>> > > > > > > > > >> >>> > > > > > > > > >> >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > > >>> > > > >>> > > >>> > > >>> > >>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Wed Sep 1 14:52:06 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Wed, 1 Sep 2021 14:52:06 -0500 (CDT) Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> Message-ID: <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> Well the build process used here is: >> (1) defined -DPETSC_HAVE_MUMPS, >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c i.e configure is skipped [for mumps part] so PETSC_PKG_MUMPS_VERSION_GE etc are missing [hence this error] Satish On Wed, 1 Sep 2021, Junchao Zhang wrote: > On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: > > > If we go back to the original compiling error, > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary > > operator before token "(" > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. > > > When petsc is configured with mumps, you will find the macro > PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in > $PETSC_ARCH/include/petscpkg_version.h > Sam, you can manually compile the failed file, mumps.c, with preprocessing, > to see what is wrong in the expansion of the macro. > > > > > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > > > >> I believe I am using MUMPS since I have done following > >> (1) defined -DPETSC_HAVE_MUMPS, > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > >> (3) link my pre-compiled MUMPS, and > >> (4) specifies following PETSc options > >> checkError(EPSGetST(eps, &st)); > >> checkError(STSetType(st, STSINVERT)); > >> //if(useShellMatrix) checkError(STSetMatMode(st, > >> ST_MATMODE_SHELL)); > >> checkError(STGetKSP(st, &ksp)); > >> checkError(KSPSetOperators(ksp, A, A)); > >> checkError(KSPSetType(ksp, KSPPREONLY)); > >> checkError(KSPGetPC(ksp, &pc)); > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); > >> checkError(PCSetType(pc, PCCHOLESKY)); > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); > >> checkError(PCFactorSetUpMatSolverType(pc)); > >> checkError(PetscOptionsSetValue(NULL, "-mat_mumps_icntl_13","1")); > >> > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I got > >> the PETSc error saying that MUMPS is required. > >> > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay wrote: > >> > >>> mumps is a fortran package - so best to specify fc. Any specific reason > >>> for needing to force '--with-fc=0'? > >>> > >>> The attached configure.log is not using mumps. > >>> > >>> Satish > >>> > >>> On Wed, 1 Sep 2021, Sam Guo wrote: > >>> > >>> > fc should not be required since I link PETSc with pre-compiled MUMPS. > >>> In > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial should > >>> not > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and links my > >>> > pre-compiled MUMPS. > >>> > > >>> > I am able to make it work using PETSc 3.11.3. Attached please find the > >>> > cPETSc 3.11.3 onfigure.log PETSc. 
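
A note on the diagnostic itself, and on what the missing header normally contains. Inside an #if directive the preprocessor replaces any identifier that is not a defined macro with 0, so when no configure-generated petscpkg_version.h exists the guard "#if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" is evaluated as "#if 0 (5,3,0)", which is exactly GCC's "missing binary operator before token '('" message. Preprocessing mumps.c by hand (compiler flag -E), as suggested above, makes this visible. For a configured build against MUMPS 5.2.1, $PETSC_ARCH/include/petscpkg_version.h would instead provide something along the lines of the sketch below; the exact macro bodies written by PETSc configure vary between releases, so treat this as illustrative only, not the verbatim generated header:

  #define PETSC_PKG_MUMPS_VERSION_MAJOR    5
  #define PETSC_PKG_MUMPS_VERSION_MINOR    2
  #define PETSC_PKG_MUMPS_VERSION_SUBMINOR 1

  /* lexicographic "installed version >= MAJOR.MINOR.SUBMINOR" test; for 5.2.1 the
     guard PETSC_PKG_MUMPS_VERSION_GE(5,3,0) evaluates to 0, i.e. the pre-5.3 path */
  #define PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) \
    ((PETSC_PKG_MUMPS_VERSION_MAJOR > (MAJOR)) || \
     ((PETSC_PKG_MUMPS_VERSION_MAJOR == (MAJOR)) && (PETSC_PKG_MUMPS_VERSION_MINOR > (MINOR))) || \
     ((PETSC_PKG_MUMPS_VERSION_MAJOR == (MAJOR)) && (PETSC_PKG_MUMPS_VERSION_MINOR == (MINOR)) && \
      (PETSC_PKG_MUMPS_VERSION_SUBMINOR >= (SUBMINOR))))

  /* strictly-older-than test, the complement of GE */
  #define PETSC_PKG_MUMPS_VERSION_LT(MAJOR,MINOR,SUBMINOR) \
    (!PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR))
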
> >>> > > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay > >>> wrote: > >>> > > >>> > > > >>> > > > >>> ******************************************************************************* > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see > >>> configure.log for > >>> > > details): > >>> > > > >>> > > > >>> ------------------------------------------------------------------------------- > >>> > > Package mumps requested requires Fortran but compiler turned off. > >>> > > > >>> > > > >>> ******************************************************************************* > >>> > > > >>> > > i.e remove '--with-fc=0' and rerun configure. > >>> > > > >>> > > Satish > >>> > > > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: > >>> > > > >>> > > > Attached please find the latest configure.log. > >>> > > > > >>> > > > grep MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > >>> > > > char 
version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > >>> > > > MUMPS_VERSION "5.2.1" > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > >>> > > > MUMPS_VERSION_MAX_LEN > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > >>> > > > MUMPS_VERSION_MAX_LEN 30 > >>> > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > >>> > > > > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay > >>> wrote: > >>> > > > > >>> > > > > Also - what do you have for: > >>> > > > > > >>> > > > > grep MUMPS_VERSION > >>> > > > > > >>> > > > >>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > >>> > > > > > >>> > > > > Satish > >>> > > > > > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > >>> > > > > > >>> > > > > > please resend the logs > >>> > > > > > > >>> > > > > > Satish > >>> > > > > > > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > >>> > > > > > > >>> > > > > > > Same compiling error with --with-mumps-serial=1. > >>> > > > > > > > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < > >>> balay at mcs.anl.gov> > >>> > > > > wrote: > >>> > > > > > > > >>> > > > > > > > Use the additional option: -with-mumps-serial > >>> > > > > > > > > >>> > > > > > > > Satish > >>> > > > > > > > > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > >>> > > > > > > > > >>> > > > > > > > > Attached please find the configure.log. I use my own > >>> CMake. I > >>> > > have > >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > >>> > > > > > > > > > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < > >>> sam.guo at cd-adapco.com > >>> > > > > >>> > > > > wrote: > >>> > > > > > > > > > >>> > > > > > > > > > I use pre-installed > >>> > > > > > > > > > > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > >>> > > balay at mcs.anl.gov> > >>> > > > > > > > wrote: > >>> > > > > > > > > > > >>> > > > > > > > > >> > >>> > > > > > > > > >> Are you using --download-mumps or pre-installed > >>> mumps? If > >>> > > using > >>> > > > > > > > > >> pre-installed - try --download-mumps. > >>> > > > > > > > > >> > >>> > > > > > > > > >> If you still have issues - send us configure.log and > >>> > > make.log > >>> > > > > from the > >>> > > > > > > > > >> failed build. > >>> > > > > > > > > >> > >>> > > > > > > > > >> Satish > >>> > > > > > > > > >> > >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > >>> > > > > > > > > >> > >>> > > > > > > > > >> > Dear PETSc dev team, > >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following > >>> compiling > >>> > > > > error > >>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: > >>> error: > >>> > > > > missing > >>> > > > > > > > binary > >>> > > > > > > > > >> > operator before token "(" > >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > >>> > > > > > > > > >> > Any idea what I did wrong? 
> >>> > > > > > > > > >> > > >>> > > > > > > > > >> > Thanks, > >>> > > > > > > > > >> > Sam > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > >>> > > > > > > > > >> > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > > >>> > > > > >>> > > > >>> > > > >>> > > >>> > >>> > From junchao.zhang at gmail.com Wed Sep 1 14:59:09 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 1 Sep 2021 14:59:09 -0500 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> Message-ID: On Wed, Sep 1, 2021 at 2:52 PM Satish Balay wrote: > Well the build process used here is: > > >> (1) defined -DPETSC_HAVE_MUMPS, > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > > > i.e configure is skipped [for mumps part] so PETSC_PKG_MUMPS_VERSION_GE > etc are missing [hence this error] > > Then, a hack for use of MUMPS 5.2.1 is at the beginning of mumps.c, add two lines #define PETSC_PKG_MUMPS_VERSION_GE(x,y,z) 0 #define PETSC_PKG_MUMPS_VERSION_LT(x,y,z) 1 > Satish > > On Wed, 1 Sep 2021, Junchao Zhang wrote: > > > On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: > > > > > If we go back to the original compiling error, > > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary > > > operator before token "(" > > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" > > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. > > > > > When petsc is configured with mumps, you will find the macro > > PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in > > $PETSC_ARCH/include/petscpkg_version.h > > Sam, you can manually compile the failed file, mumps.c, with > preprocessing, > > to see what is wrong in the expansion of the macro. > > > > > > > > > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > > > > > >> I believe I am using MUMPS since I have done following > > >> (1) defined -DPETSC_HAVE_MUMPS, > > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > > >> (3) link my pre-compiled MUMPS, and > > >> (4) specifies following PETSc options > > >> checkError(EPSGetST(eps, &st)); > > >> checkError(STSetType(st, STSINVERT)); > > >> //if(useShellMatrix) checkError(STSetMatMode(st, > > >> ST_MATMODE_SHELL)); > > >> checkError(STGetKSP(st, &ksp)); > > >> checkError(KSPSetOperators(ksp, A, A)); > > >> checkError(KSPSetType(ksp, KSPPREONLY)); > > >> checkError(KSPGetPC(ksp, &pc)); > > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); > > >> checkError(PCSetType(pc, PCCHOLESKY)); > > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); > > >> checkError(PCFactorSetUpMatSolverType(pc)); > > >> checkError(PetscOptionsSetValue(NULL, > "-mat_mumps_icntl_13","1")); > > >> > > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I > got > > >> the PETSc error saying that MUMPS is required. > > >> > > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay > wrote: > > >> > > >>> mumps is a fortran package - so best to specify fc. Any specific > reason > > >>> for needing to force '--with-fc=0'? > > >>> > > >>> The attached configure.log is not using mumps. > > >>> > > >>> Satish > > >>> > > >>> On Wed, 1 Sep 2021, Sam Guo wrote: > > >>> > > >>> > fc should not be required since I link PETSc with pre-compiled > MUMPS. 
> > >>> In > > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial > should > > >>> not > > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and > links my > > >>> > pre-compiled MUMPS. > > >>> > > > >>> > I am able to make it work using PETSc 3.11.3. Attached please find > the > > >>> > cPETSc 3.11.3 onfigure.log PETSc. > > >>> > > > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay > > >>> wrote: > > >>> > > > >>> > > > > >>> > > > > >>> > ******************************************************************************* > > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see > > >>> configure.log for > > >>> > > details): > > >>> > > > > >>> > > > > >>> > ------------------------------------------------------------------------------- > > >>> > > Package mumps requested requires Fortran but compiler turned off. > > >>> > > > > >>> > > > > >>> > ******************************************************************************* > > >>> > > > > >>> > > i.e remove '--with-fc=0' and rerun configure. > > >>> > > > > >>> > > Satish > > >>> > > > > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: > > >>> > > > > >>> > > > Attached please find the latest configure.log. > > >>> > > > > > >>> > > > grep MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay < > balay at mcs.anl.gov> > > >>> wrote: > > >>> > > > > > >>> > > > > Also - what do you have for: > > >>> > > > > > > >>> > > > > grep MUMPS_VERSION > > >>> > > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > >>> > > > > > > >>> > > > > Satish > > >>> > > > > > > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > > >>> > > > > > > >>> > > > > > please resend the logs > > >>> > > > > > > > >>> > > > > > Satish > > >>> > > > > > > > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > >>> > > > > > > > >>> > > > > > > Same compiling error with --with-mumps-serial=1. > > >>> > > > > > > > > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < > > >>> balay at mcs.anl.gov> > > >>> > > > > wrote: > > >>> > > > > > > > > >>> > > > > > > > Use the additional option: -with-mumps-serial > > >>> > > > > > > > > > >>> > > > > > > > Satish > > >>> > > > > > > > > > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > >>> > > > > > > > > > >>> > > > > > > > > Attached please find the configure.log. I use my own > > >>> CMake. I > > >>> > > have > > >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > > >>> > > > > > > > > > > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < > > >>> sam.guo at cd-adapco.com > > >>> > > > > > >>> > > > > wrote: > > >>> > > > > > > > > > > >>> > > > > > > > > > I use pre-installed > > >>> > > > > > > > > > > > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > > >>> > > balay at mcs.anl.gov> > > >>> > > > > > > > wrote: > > >>> > > > > > > > > > > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> Are you using --download-mumps or pre-installed > > >>> mumps? 
If > > >>> > > using > > >>> > > > > > > > > >> pre-installed - try --download-mumps. > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> If you still have issues - send us configure.log > and > > >>> > > make.log > > >>> > > > > from the > > >>> > > > > > > > > >> failed build. > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> Satish > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > Dear PETSc dev team, > > >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following > > >>> compiling > > >>> > > > > error > > >>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: > > >>> error: > > >>> > > > > missing > > >>> > > > > > > > binary > > >>> > > > > > > > > >> > operator before token "(" > > >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > > >>> > > > > > > > > >> > Any idea what I did wrong? > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > Thanks, > > >>> > > > > > > > > >> > Sam > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > > >>> > > > > > > > > > > >>> > > > > > > > > > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > > >>> > > > >>> > > >>> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Wed Sep 1 15:02:32 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Wed, 1 Sep 2021 13:02:32 -0700 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> Message-ID: Hi Matt, I tried --with-mumps-dir but same error. Hi Junchao, That's a very good clue and suggestion. I looked petscpkg_version.h. It is empty as follows. I'll follow your suggestion and define those macros in mumps.c. #if !defined(INCLUDED_PETSCPKG_VERSION_H) #define INCLUDED_PETSCPKG_VERSION_H #endif Hi Satish, Yes, what I am doing is hacking but it is necessary since have own own mpi wrapper. Thank you all, Sam On Wed, Sep 1, 2021 at 12:52 PM Satish Balay wrote: > Well the build process used here is: > > >> (1) defined -DPETSC_HAVE_MUMPS, > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > > > i.e configure is skipped [for mumps part] so PETSC_PKG_MUMPS_VERSION_GE > etc are missing [hence this error] > > Satish > > On Wed, 1 Sep 2021, Junchao Zhang wrote: > > > On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: > > > > > If we go back to the original compiling error, > > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing binary > > > operator before token "(" > > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" > > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. > > > > > When petsc is configured with mumps, you will find the macro > > PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in > > $PETSC_ARCH/include/petscpkg_version.h > > Sam, you can manually compile the failed file, mumps.c, with > preprocessing, > > to see what is wrong in the expansion of the macro. 
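
To make the two-line workaround mentioned above concrete, here is a minimal sketch of how it could sit near the top of src/mat/impls/aij/mpi/mumps/mumps.c, assuming the library actually being linked is MUMPS 5.2.1 and PETSc configure is being bypassed (the #ifndef wrapper is an extra precaution added here, not part of the original two-line suggestion):

  /* Stand-ins for the configure-generated petscpkg_version.h entries.
     GE -> 0 and LT -> 1 force every MUMPS version guard in this file onto its
     old-version branch, which matches a linked MUMPS 5.2.1. */
  #ifndef PETSC_PKG_MUMPS_VERSION_GE
  #define PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) 0
  #define PETSC_PKG_MUMPS_VERSION_LT(MAJOR,MINOR,SUBMINOR) 1
  #endif

Note that this answers "no" to every GE test, not only GE(5,3,0), so it is only safe while the guards that matter in mumps.c are the 5.3-and-newer ones; running configure so the real header gets generated remains the clean fix.
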
> > > > > > > > > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo wrote: > > > > > >> I believe I am using MUMPS since I have done following > > >> (1) defined -DPETSC_HAVE_MUMPS, > > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c > > >> (3) link my pre-compiled MUMPS, and > > >> (4) specifies following PETSc options > > >> checkError(EPSGetST(eps, &st)); > > >> checkError(STSetType(st, STSINVERT)); > > >> //if(useShellMatrix) checkError(STSetMatMode(st, > > >> ST_MATMODE_SHELL)); > > >> checkError(STGetKSP(st, &ksp)); > > >> checkError(KSPSetOperators(ksp, A, A)); > > >> checkError(KSPSetType(ksp, KSPPREONLY)); > > >> checkError(KSPGetPC(ksp, &pc)); > > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); > > >> checkError(PCSetType(pc, PCCHOLESKY)); > > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); > > >> checkError(PCFactorSetUpMatSolverType(pc)); > > >> checkError(PetscOptionsSetValue(NULL, > "-mat_mumps_icntl_13","1")); > > >> > > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I > got > > >> the PETSc error saying that MUMPS is required. > > >> > > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay > wrote: > > >> > > >>> mumps is a fortran package - so best to specify fc. Any specific > reason > > >>> for needing to force '--with-fc=0'? > > >>> > > >>> The attached configure.log is not using mumps. > > >>> > > >>> Satish > > >>> > > >>> On Wed, 1 Sep 2021, Sam Guo wrote: > > >>> > > >>> > fc should not be required since I link PETSc with pre-compiled > MUMPS. > > >>> In > > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial > should > > >>> not > > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and > links my > > >>> > pre-compiled MUMPS. > > >>> > > > >>> > I am able to make it work using PETSc 3.11.3. Attached please find > the > > >>> > cPETSc 3.11.3 onfigure.log PETSc. > > >>> > > > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay > > >>> wrote: > > >>> > > > >>> > > > > >>> > > > > >>> > ******************************************************************************* > > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see > > >>> configure.log for > > >>> > > details): > > >>> > > > > >>> > > > > >>> > ------------------------------------------------------------------------------- > > >>> > > Package mumps requested requires Fortran but compiler turned off. > > >>> > > > > >>> > > > > >>> > ******************************************************************************* > > >>> > > > > >>> > > i.e remove '--with-fc=0' and rerun configure. > > >>> > > > > >>> > > Satish > > >>> > > > > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: > > >>> > > > > >>> > > > Attached please find the latest configure.log. 
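
For context on the solver configuration quoted above: checkError(...) appears to be the poster's own error-checking wrapper, not a PETSc or SLEPc routine. A condensed, self-contained sketch of the same setup in plain PETSc 3.15-era error-handling style follows; the function name and its arguments are chosen here purely for illustration:

  #include <slepceps.h>

  /* Configure eps (a SLEPc eigensolver) to use shift-and-invert, with the inner
     linear solves done as a single MUMPS Cholesky factorization of A. */
  static PetscErrorCode SetupShiftInvertWithMUMPS(EPS eps, Mat A)
  {
    ST             st;
    KSP            ksp;
    PC             pc;
    PetscErrorCode ierr;

    PetscFunctionBegin;
    ierr = EPSGetST(eps,&st);CHKERRQ(ierr);
    ierr = STSetType(st,STSINVERT);CHKERRQ(ierr);             /* shift-and-invert spectral transform */
    ierr = STGetKSP(st,&ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp,A,A);CHKERRQ(ierr);
    ierr = KSPSetType(ksp,KSPPREONLY);CHKERRQ(ierr);          /* apply the preconditioner only: a direct solve */
    ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr);
    ierr = MatSetOption(A,MAT_SPD,PETSC_TRUE);CHKERRQ(ierr);  /* declare A symmetric positive definite */
    ierr = PCSetType(pc,PCCHOLESKY);CHKERRQ(ierr);            /* Cholesky factorization ... */
    ierr = PCFactorSetMatSolverType(pc,MATSOLVERMUMPS);CHKERRQ(ierr); /* ... delegated to MUMPS */
    ierr = PCFactorSetUpMatSolverType(pc);CHKERRQ(ierr);
    ierr = PetscOptionsSetValue(NULL,"-mat_mumps_icntl_13","1");CHKERRQ(ierr); /* pass MUMPS control option ICNTL(13) */
    PetscFunctionReturn(0);
  }

This is also why MUMPS must be present at link time even though the application never calls MUMPS directly: PCCHOLESKY with MATSOLVERMUMPS hands the factorization to the external library.
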
> > >>> > > > > > >>> > > > grep MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > >>> > > > MUMPS_VERSION "5.2.1" > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef > > >>> > > > MUMPS_VERSION_MAX_LEN > > >>> > > > > > >>> > > > > >>> > 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define > > >>> > > > MUMPS_VERSION_MAX_LEN 30 > > >>> > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: > > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; > > >>> > > > > > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay < > balay at mcs.anl.gov> > > >>> wrote: > > >>> > > > > > >>> > > > > Also - what do you have for: > > >>> > > > > > > >>> > > > > grep MUMPS_VERSION > > >>> > > > > > > >>> > > > > >>> > /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h > > >>> > > > > > > >>> > > > > Satish > > >>> > > > > > > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: > > >>> > > > > > > >>> > > > > > please resend the logs > > >>> > > > > > > > >>> > > > > > Satish > > >>> > > > > > > > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > >>> > > > > > > > >>> > > > > > > Same compiling error with --with-mumps-serial=1. > > >>> > > > > > > > > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < > > >>> balay at mcs.anl.gov> > > >>> > > > > wrote: > > >>> > > > > > > > > >>> > > > > > > > Use the additional option: -with-mumps-serial > > >>> > > > > > > > > > >>> > > > > > > > Satish > > >>> > > > > > > > > > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: > > >>> > > > > > > > > > >>> > > > > > > > > Attached please find the configure.log. I use my own > > >>> CMake. I > > >>> > > have > > >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. > > >>> > > > > > > > > > > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < > > >>> sam.guo at cd-adapco.com > > >>> > > > > > >>> > > > > wrote: > > >>> > > > > > > > > > > >>> > > > > > > > > > I use pre-installed > > >>> > > > > > > > > > > > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < > > >>> > > balay at mcs.anl.gov> > > >>> > > > > > > > wrote: > > >>> > > > > > > > > > > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> Are you using --download-mumps or pre-installed > > >>> mumps? If > > >>> > > using > > >>> > > > > > > > > >> pre-installed - try --download-mumps. > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> If you still have issues - send us configure.log > and > > >>> > > make.log > > >>> > > > > from the > > >>> > > > > > > > > >> failed build. > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> Satish > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > Dear PETSc dev team, > > >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got following > > >>> compiling > > >>> > > > > error > > >>> > > > > > > > > >> > petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: > > >>> error: > > >>> > > > > missing > > >>> > > > > > > > binary > > >>> > > > > > > > > >> > operator before token "(" > > >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) > > >>> > > > > > > > > >> > Any idea what I did wrong? 
> > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > Thanks, > > >>> > > > > > > > > >> > Sam > > >>> > > > > > > > > >> > > > >>> > > > > > > > > >> > > >>> > > > > > > > > >> > > >>> > > > > > > > > > > >>> > > > > > > > > > >>> > > > > > > > > > >>> > > > > > > > > >>> > > > > > > > >>> > > > > > > >>> > > > > > > >>> > > > > > >>> > > > > >>> > > > > >>> > > > >>> > > >>> > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Wed Sep 1 15:06:07 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Wed, 1 Sep 2021 13:06:07 -0700 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> Message-ID: Hi Junchao, Your suggestion works. Thanks a lot. BR, Sam On Wed, Sep 1, 2021 at 1:02 PM Sam Guo wrote: > Hi Matt, > I tried --with-mumps-dir but same error. > > Hi Junchao, > That's a very good clue and suggestion. I looked petscpkg_version.h. It > is empty as follows. I'll follow your suggestion and define those macros in > mumps.c. > > #if !defined(INCLUDED_PETSCPKG_VERSION_H) > #define INCLUDED_PETSCPKG_VERSION_H > > #endif > > Hi Satish, > Yes, what I am doing is hacking but it is necessary since have own own > mpi wrapper. > > Thank you all, > Sam > > On Wed, Sep 1, 2021 at 12:52 PM Satish Balay wrote: > >> Well the build process used here is: >> >> >> (1) defined -DPETSC_HAVE_MUMPS, >> >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >> >> >> i.e configure is skipped [for mumps part] so PETSC_PKG_MUMPS_VERSION_GE >> etc are missing [hence this error] >> >> Satish >> >> On Wed, 1 Sep 2021, Junchao Zhang wrote: >> >> > On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: >> > >> > > If we go back to the original compiling error, >> > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing >> binary >> > > operator before token "(" >> > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" >> > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. >> > > >> > When petsc is configured with mumps, you will find the macro >> > PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in >> > $PETSC_ARCH/include/petscpkg_version.h >> > Sam, you can manually compile the failed file, mumps.c, with >> preprocessing, >> > to see what is wrong in the expansion of the macro. 
>> > >> > >> > > >> > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo >> wrote: >> > > >> > >> I believe I am using MUMPS since I have done following >> > >> (1) defined -DPETSC_HAVE_MUMPS, >> > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >> > >> (3) link my pre-compiled MUMPS, and >> > >> (4) specifies following PETSc options >> > >> checkError(EPSGetST(eps, &st)); >> > >> checkError(STSetType(st, STSINVERT)); >> > >> //if(useShellMatrix) checkError(STSetMatMode(st, >> > >> ST_MATMODE_SHELL)); >> > >> checkError(STGetKSP(st, &ksp)); >> > >> checkError(KSPSetOperators(ksp, A, A)); >> > >> checkError(KSPSetType(ksp, KSPPREONLY)); >> > >> checkError(KSPGetPC(ksp, &pc)); >> > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >> > >> checkError(PCSetType(pc, PCCHOLESKY)); >> > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >> > >> checkError(PCFactorSetUpMatSolverType(pc)); >> > >> checkError(PetscOptionsSetValue(NULL, >> "-mat_mumps_icntl_13","1")); >> > >> >> > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I >> got >> > >> the PETSc error saying that MUMPS is required. >> > >> >> > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay >> wrote: >> > >> >> > >>> mumps is a fortran package - so best to specify fc. Any specific >> reason >> > >>> for needing to force '--with-fc=0'? >> > >>> >> > >>> The attached configure.log is not using mumps. >> > >>> >> > >>> Satish >> > >>> >> > >>> On Wed, 1 Sep 2021, Sam Guo wrote: >> > >>> >> > >>> > fc should not be required since I link PETSc with pre-compiled >> MUMPS. >> > >>> In >> > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial >> should >> > >>> not >> > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and >> links my >> > >>> > pre-compiled MUMPS. >> > >>> > >> > >>> > I am able to make it work using PETSc 3.11.3. Attached please >> find the >> > >>> > cPETSc 3.11.3 onfigure.log PETSc. >> > >>> > >> > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >> > >>> wrote: >> > >>> > >> > >>> > > >> > >>> > > >> > >>> >> ******************************************************************************* >> > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >> > >>> configure.log for >> > >>> > > details): >> > >>> > > >> > >>> > > >> > >>> >> ------------------------------------------------------------------------------- >> > >>> > > Package mumps requested requires Fortran but compiler turned >> off. >> > >>> > > >> > >>> > > >> > >>> >> ******************************************************************************* >> > >>> > > >> > >>> > > i.e remove '--with-fc=0' and rerun configure. >> > >>> > > >> > >>> > > Satish >> > >>> > > >> > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >> > >>> > > >> > >>> > > > Attached please find the latest configure.log. 
>> > >>> > > > >> > >>> > > > grep MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay < >> balay at mcs.anl.gov> >> > >>> wrote: >> > >>> > > > >> > >>> > > > > Also - what do you have for: >> > >>> > > > > >> > >>> > > > > grep MUMPS_VERSION >> > >>> > > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >> > >>> > > > > >> > >>> > > > > Satish >> > >>> > > > > >> > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >> > >>> > > > > >> > >>> > > > > > please resend the logs >> > >>> > > > > > >> > >>> > > > > > Satish >> > >>> > > > > > >> > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >> > >>> > > > > > >> > >>> > > > > > > Same compiling error with --with-mumps-serial=1. >> > >>> > > > > > > >> > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >> > >>> balay at mcs.anl.gov> >> > >>> > > > > wrote: >> > >>> > > > > > > >> > >>> > > > > > > > Use the additional option: -with-mumps-serial >> > >>> > > > > > > > >> > >>> > > > > > > > Satish >> > >>> > > > > > > > >> > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >> > >>> > > > > > > > >> > >>> > > > > > > > > Attached please find the configure.log. I use my own >> > >>> CMake. I >> > >>> > > have >> > >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >> > >>> > > > > > > > > >> > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >> > >>> sam.guo at cd-adapco.com >> > >>> > > > >> > >>> > > > > wrote: >> > >>> > > > > > > > > >> > >>> > > > > > > > > > I use pre-installed >> > >>> > > > > > > > > > >> > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >> > >>> > > balay at mcs.anl.gov> >> > >>> > > > > > > > wrote: >> > >>> > > > > > > > > > >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> Are you using --download-mumps or pre-installed >> > >>> mumps? If >> > >>> > > using >> > >>> > > > > > > > > >> pre-installed - try --download-mumps. >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> If you still have issues - send us configure.log >> and >> > >>> > > make.log >> > >>> > > > > from the >> > >>> > > > > > > > > >> failed build. >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> Satish >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> > Dear PETSc dev team, >> > >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got >> following >> > >>> compiling >> > >>> > > > > error >> > >>> > > > > > > > > >> > >> petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >> > >>> error: >> > >>> > > > > missing >> > >>> > > > > > > > binary >> > >>> > > > > > > > > >> > operator before token "(" >> > >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >> > >>> > > > > > > > > >> > Any idea what I did wrong? 
>> > >>> > > > > > > > > >> > >> > >>> > > > > > > > > >> > Thanks, >> > >>> > > > > > > > > >> > Sam >> > >>> > > > > > > > > >> > >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > >> > >>> > > > > > >> > >>> > > > > >> > >>> > > > > >> > >>> > > > >> > >>> > > >> > >>> > > >> > >>> > >> > >>> >> > >>> >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 1 15:58:17 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 1 Sep 2021 16:58:17 -0400 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> Message-ID: On Wed, Sep 1, 2021 at 4:03 PM Sam Guo wrote: > Hi Matt, > I tried --with-mumps-dir but same error. > How can you build MUMPS without a Fortran compiler? And if you have one, why are you not telling PETSc about it? Thanks, Matt > Hi Junchao, > That's a very good clue and suggestion. I looked petscpkg_version.h. It > is empty as follows. I'll follow your suggestion and define those macros in > mumps.c. > > #if !defined(INCLUDED_PETSCPKG_VERSION_H) > #define INCLUDED_PETSCPKG_VERSION_H > > #endif > > Hi Satish, > Yes, what I am doing is hacking but it is necessary since have own own > mpi wrapper. > > Thank you all, > Sam > > On Wed, Sep 1, 2021 at 12:52 PM Satish Balay wrote: > >> Well the build process used here is: >> >> >> (1) defined -DPETSC_HAVE_MUMPS, >> >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >> >> >> i.e configure is skipped [for mumps part] so PETSC_PKG_MUMPS_VERSION_GE >> etc are missing [hence this error] >> >> Satish >> >> On Wed, 1 Sep 2021, Junchao Zhang wrote: >> >> > On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: >> > >> > > If we go back to the original compiling error, >> > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing >> binary >> > > operator before token "(" >> > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" >> > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. >> > > >> > When petsc is configured with mumps, you will find the macro >> > PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in >> > $PETSC_ARCH/include/petscpkg_version.h >> > Sam, you can manually compile the failed file, mumps.c, with >> preprocessing, >> > to see what is wrong in the expansion of the macro. 
>> > >> > >> > > >> > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo >> wrote: >> > > >> > >> I believe I am using MUMPS since I have done following >> > >> (1) defined -DPETSC_HAVE_MUMPS, >> > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >> > >> (3) link my pre-compiled MUMPS, and >> > >> (4) specifies following PETSc options >> > >> checkError(EPSGetST(eps, &st)); >> > >> checkError(STSetType(st, STSINVERT)); >> > >> //if(useShellMatrix) checkError(STSetMatMode(st, >> > >> ST_MATMODE_SHELL)); >> > >> checkError(STGetKSP(st, &ksp)); >> > >> checkError(KSPSetOperators(ksp, A, A)); >> > >> checkError(KSPSetType(ksp, KSPPREONLY)); >> > >> checkError(KSPGetPC(ksp, &pc)); >> > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >> > >> checkError(PCSetType(pc, PCCHOLESKY)); >> > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >> > >> checkError(PCFactorSetUpMatSolverType(pc)); >> > >> checkError(PetscOptionsSetValue(NULL, >> "-mat_mumps_icntl_13","1")); >> > >> >> > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, I >> got >> > >> the PETSc error saying that MUMPS is required. >> > >> >> > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay >> wrote: >> > >> >> > >>> mumps is a fortran package - so best to specify fc. Any specific >> reason >> > >>> for needing to force '--with-fc=0'? >> > >>> >> > >>> The attached configure.log is not using mumps. >> > >>> >> > >>> Satish >> > >>> >> > >>> On Wed, 1 Sep 2021, Sam Guo wrote: >> > >>> >> > >>> > fc should not be required since I link PETSc with pre-compiled >> MUMPS. >> > >>> In >> > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial >> should >> > >>> not >> > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and >> links my >> > >>> > pre-compiled MUMPS. >> > >>> > >> > >>> > I am able to make it work using PETSc 3.11.3. Attached please >> find the >> > >>> > cPETSc 3.11.3 onfigure.log PETSc. >> > >>> > >> > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >> > >>> wrote: >> > >>> > >> > >>> > > >> > >>> > > >> > >>> >> ******************************************************************************* >> > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >> > >>> configure.log for >> > >>> > > details): >> > >>> > > >> > >>> > > >> > >>> >> ------------------------------------------------------------------------------- >> > >>> > > Package mumps requested requires Fortran but compiler turned >> off. >> > >>> > > >> > >>> > > >> > >>> >> ******************************************************************************* >> > >>> > > >> > >>> > > i.e remove '--with-fc=0' and rerun configure. >> > >>> > > >> > >>> > > Satish >> > >>> > > >> > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >> > >>> > > >> > >>> > > > Attached please find the latest configure.log. 
>> > >>> > > > >> > >>> > > > grep MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >> > >>> > > > MUMPS_VERSION "5.2.1" >> > >>> > > > >> > >>> > > >> > >>> >> 
/u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >> > >>> > > > MUMPS_VERSION_MAX_LEN >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >> > >>> > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >> > >>> > > > >> > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay < >> balay at mcs.anl.gov> >> > >>> wrote: >> > >>> > > > >> > >>> > > > > Also - what do you have for: >> > >>> > > > > >> > >>> > > > > grep MUMPS_VERSION >> > >>> > > > > >> > >>> > > >> > >>> >> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >> > >>> > > > > >> > >>> > > > > Satish >> > >>> > > > > >> > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >> > >>> > > > > >> > >>> > > > > > please resend the logs >> > >>> > > > > > >> > >>> > > > > > Satish >> > >>> > > > > > >> > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >> > >>> > > > > > >> > >>> > > > > > > Same compiling error with --with-mumps-serial=1. >> > >>> > > > > > > >> > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >> > >>> balay at mcs.anl.gov> >> > >>> > > > > wrote: >> > >>> > > > > > > >> > >>> > > > > > > > Use the additional option: -with-mumps-serial >> > >>> > > > > > > > >> > >>> > > > > > > > Satish >> > >>> > > > > > > > >> > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >> > >>> > > > > > > > >> > >>> > > > > > > > > Attached please find the configure.log. I use my own >> > >>> CMake. I >> > >>> > > have >> > >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >> > >>> > > > > > > > > >> > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >> > >>> sam.guo at cd-adapco.com >> > >>> > > > >> > >>> > > > > wrote: >> > >>> > > > > > > > > >> > >>> > > > > > > > > > I use pre-installed >> > >>> > > > > > > > > > >> > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >> > >>> > > balay at mcs.anl.gov> >> > >>> > > > > > > > wrote: >> > >>> > > > > > > > > > >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> Are you using --download-mumps or pre-installed >> > >>> mumps? If >> > >>> > > using >> > >>> > > > > > > > > >> pre-installed - try --download-mumps. >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> If you still have issues - send us configure.log >> and >> > >>> > > make.log >> > >>> > > > > from the >> > >>> > > > > > > > > >> failed build. >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> Satish >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> > Dear PETSc dev team, >> > >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got >> following >> > >>> compiling >> > >>> > > > > error >> > >>> > > > > > > > > >> > >> petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >> > >>> error: >> > >>> > > > > missing >> > >>> > > > > > > > binary >> > >>> > > > > > > > > >> > operator before token "(" >> > >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >> > >>> > > > > > > > > >> > Any idea what I did wrong? 
>> > >>> > > > > > > > > >> > >> > >>> > > > > > > > > >> > Thanks, >> > >>> > > > > > > > > >> > Sam >> > >>> > > > > > > > > >> > >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> >> > >>> > > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > > >> > >>> > > > > > > >> > >>> > > > > > >> > >>> > > > > >> > >>> > > > > >> > >>> > > > >> > >>> > > >> > >>> > > >> > >>> > >> > >>> >> > >>> >> > >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sam.guo at cd-adapco.com Wed Sep 1 16:19:29 2021 From: sam.guo at cd-adapco.com (Sam Guo) Date: Wed, 1 Sep 2021 14:19:29 -0700 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> Message-ID: I build MUMPS at the designated machine but my local machine does not have fortran compiler. On Wed, Sep 1, 2021 at 1:58 PM Matthew Knepley wrote: > On Wed, Sep 1, 2021 at 4:03 PM Sam Guo wrote: > >> Hi Matt, >> I tried --with-mumps-dir but same error. >> > > How can you build MUMPS without a Fortran compiler? And if you have one, > why are you not telling PETSc about it? > > Thanks, > > Matt > > >> Hi Junchao, >> That's a very good clue and suggestion. I looked petscpkg_version.h. >> It is empty as follows. I'll follow your suggestion and define those macros >> in mumps.c. >> >> #if !defined(INCLUDED_PETSCPKG_VERSION_H) >> #define INCLUDED_PETSCPKG_VERSION_H >> >> #endif >> >> Hi Satish, >> Yes, what I am doing is hacking but it is necessary since have own own >> mpi wrapper. >> >> Thank you all, >> Sam >> >> On Wed, Sep 1, 2021 at 12:52 PM Satish Balay wrote: >> >>> Well the build process used here is: >>> >>> >> (1) defined -DPETSC_HAVE_MUMPS, >>> >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >>> >>> >>> i.e configure is skipped [for mumps part] so PETSC_PKG_MUMPS_VERSION_GE >>> etc are missing [hence this error] >>> >>> Satish >>> >>> On Wed, 1 Sep 2021, Junchao Zhang wrote: >>> >>> > On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: >>> > >>> > > If we go back to the original compiling error, >>> > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing >>> binary >>> > > operator before token "(" >>> > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" >>> > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. >>> > > >>> > When petsc is configured with mumps, you will find the macro >>> > PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in >>> > $PETSC_ARCH/include/petscpkg_version.h >>> > Sam, you can manually compile the failed file, mumps.c, with >>> preprocessing, >>> > to see what is wrong in the expansion of the macro. 
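For comparison, when configure itself handles MUMPS, $PETSC_ARCH/include/petscpkg_version.h ends up providing the version macros that mumps.c tests. The block below is only a hand-written sketch of what such definitions amount to, not the literal generated file; the only macro name taken from the error message is PETSC_PKG_MUMPS_VERSION_GE, and the 5.2.1 numbers simply match the pre-installed MUMPS headers quoted earlier in the thread. It also shows why the empty header fails: with PETSC_PKG_MUMPS_VERSION_GE undefined, the preprocessor replaces the bare identifier with 0 inside #if, and the leftover "(5,3,0)" produces exactly the reported "missing binary operator before token '('".

/* Hypothetical sketch of $PETSC_ARCH/include/petscpkg_version.h, assuming a
   configure run that detected MUMPS 5.2.1; the file generated by configure
   differs in detail but provides equivalent definitions. */
#if !defined(INCLUDED_PETSCPKG_VERSION_H)
#define INCLUDED_PETSCPKG_VERSION_H

#define PETSC_PKG_MUMPS_VERSION_MAJOR    5
#define PETSC_PKG_MUMPS_VERSION_MINOR    2
#define PETSC_PKG_MUMPS_VERSION_SUBMINOR 1

/* Lexicographic "greater or equal" on (major, minor, subminor), evaluated by
   the preprocessor, so that "#if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" becomes an
   ordinary integer expression (here it evaluates to 0, i.e. the 5.3-only code
   path in mumps.c is skipped). */
#define PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) \
  ((PETSC_PKG_MUMPS_VERSION_MAJOR > (MAJOR)) || \
   ((PETSC_PKG_MUMPS_VERSION_MAJOR == (MAJOR)) && \
    ((PETSC_PKG_MUMPS_VERSION_MINOR > (MINOR)) || \
     ((PETSC_PKG_MUMPS_VERSION_MINOR == (MINOR)) && \
      (PETSC_PKG_MUMPS_VERSION_SUBMINOR >= (SUBMINOR))))))

#endif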
>>> > >>> > >>> > > >>> > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo >>> wrote: >>> > > >>> > >> I believe I am using MUMPS since I have done following >>> > >> (1) defined -DPETSC_HAVE_MUMPS, >>> > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >>> > >> (3) link my pre-compiled MUMPS, and >>> > >> (4) specifies following PETSc options >>> > >> checkError(EPSGetST(eps, &st)); >>> > >> checkError(STSetType(st, STSINVERT)); >>> > >> //if(useShellMatrix) checkError(STSetMatMode(st, >>> > >> ST_MATMODE_SHELL)); >>> > >> checkError(STGetKSP(st, &ksp)); >>> > >> checkError(KSPSetOperators(ksp, A, A)); >>> > >> checkError(KSPSetType(ksp, KSPPREONLY)); >>> > >> checkError(KSPGetPC(ksp, &pc)); >>> > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >>> > >> checkError(PCSetType(pc, PCCHOLESKY)); >>> > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >>> > >> checkError(PCFactorSetUpMatSolverType(pc)); >>> > >> checkError(PetscOptionsSetValue(NULL, >>> "-mat_mumps_icntl_13","1")); >>> > >> >>> > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, >>> I got >>> > >> the PETSc error saying that MUMPS is required. >>> > >> >>> > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay >>> wrote: >>> > >> >>> > >>> mumps is a fortran package - so best to specify fc. Any specific >>> reason >>> > >>> for needing to force '--with-fc=0'? >>> > >>> >>> > >>> The attached configure.log is not using mumps. >>> > >>> >>> > >>> Satish >>> > >>> >>> > >>> On Wed, 1 Sep 2021, Sam Guo wrote: >>> > >>> >>> > >>> > fc should not be required since I link PETSc with pre-compiled >>> MUMPS. >>> > >>> In >>> > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial >>> should >>> > >>> not >>> > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and >>> links my >>> > >>> > pre-compiled MUMPS. >>> > >>> > >>> > >>> > I am able to make it work using PETSc 3.11.3. Attached please >>> find the >>> > >>> > cPETSc 3.11.3 onfigure.log PETSc. >>> > >>> > >>> > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >>> > >>> wrote: >>> > >>> > >>> > >>> > > >>> > >>> > > >>> > >>> >>> ******************************************************************************* >>> > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >>> > >>> configure.log for >>> > >>> > > details): >>> > >>> > > >>> > >>> > > >>> > >>> >>> ------------------------------------------------------------------------------- >>> > >>> > > Package mumps requested requires Fortran but compiler turned >>> off. >>> > >>> > > >>> > >>> > > >>> > >>> >>> ******************************************************************************* >>> > >>> > > >>> > >>> > > i.e remove '--with-fc=0' and rerun configure. >>> > >>> > > >>> > >>> > > Satish >>> > >>> > > >>> > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >>> > >>> > > >>> > >>> > > > Attached please find the latest configure.log. 
>>> > >>> > > > > > > > > >> > >>> > >>> > > > > > > > > >> > Thanks, >>> > >>> > > > > > > > > >> > Sam >>> > >>> > > > > > > > > >> > >>> > >>> > > > > > > > > >> >>> > >>> > > > > > > > > >> >>> > >>> > > > > > > > > >>> > >>> > > > > > > > >>> > >>> > > > > > > > >>> > >>> > > > > > > >>> > >>> > > > > > >>> > >>> > > > > >>> > >>> > > > > >>> > >>> > > > >>> > >>> > > >>> > >>> > > >>> > >>> > >>> > >>> >>> > >>> >>> > >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 1 19:02:41 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 1 Sep 2021 20:02:41 -0400 Subject: [petsc-users] PETSc 3.15.3 compiling error In-Reply-To: References: <65d5cb9a-2dc0-8362-6a7-5acf784e7138@mcs.anl.gov> <575fd7-61c5-b983-5ad0-4c2748b6b6d2@mcs.anl.gov> <408d7d73-4da-8d73-97c-15e91855922@mcs.anl.gov> Message-ID: On Wed, Sep 1, 2021 at 5:19 PM Sam Guo wrote: > I build MUMPS at the designated machine but my local machine does not have > fortran compiler. > Can you run the configure there? THanks, Matt > On Wed, Sep 1, 2021 at 1:58 PM Matthew Knepley wrote: > >> On Wed, Sep 1, 2021 at 4:03 PM Sam Guo wrote: >> >>> Hi Matt, >>> I tried --with-mumps-dir but same error. >>> >> >> How can you build MUMPS without a Fortran compiler? And if you have one, >> why are you not telling PETSc about it? >> >> Thanks, >> >> Matt >> >> >>> Hi Junchao, >>> That's a very good clue and suggestion. I looked petscpkg_version.h. >>> It is empty as follows. I'll follow your suggestion and define those macros >>> in mumps.c. >>> >>> #if !defined(INCLUDED_PETSCPKG_VERSION_H) >>> #define INCLUDED_PETSCPKG_VERSION_H >>> >>> #endif >>> >>> Hi Satish, >>> Yes, what I am doing is hacking but it is necessary since have own >>> own mpi wrapper. >>> >>> Thank you all, >>> Sam >>> >>> On Wed, Sep 1, 2021 at 12:52 PM Satish Balay wrote: >>> >>>> Well the build process used here is: >>>> >>>> >> (1) defined -DPETSC_HAVE_MUMPS, >>>> >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >>>> >>>> >>>> i.e configure is skipped [for mumps part] so PETSC_PKG_MUMPS_VERSION_GE >>>> etc are missing [hence this error] >>>> >>>> Satish >>>> >>>> On Wed, 1 Sep 2021, Junchao Zhang wrote: >>>> >>>> > On Wed, Sep 1, 2021 at 2:20 PM Sam Guo wrote: >>>> > >>>> > > If we go back to the original compiling error, >>>> > > "petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: error: missing >>>> binary >>>> > > operator before token "(" >>>> > > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0)" >>>> > > I don't understand what PETSC_PKG_MUMPS_VERSION_GE(5,3,0) is doing. >>>> > > >>>> > When petsc is configured with mumps, you will find the macro >>>> > PETSC_PKG_MUMPS_VERSION_GE(MAJOR,MINOR,SUBMINOR) in >>>> > $PETSC_ARCH/include/petscpkg_version.h >>>> > Sam, you can manually compile the failed file, mumps.c, with >>>> preprocessing, >>>> > to see what is wrong in the expansion of the macro. 
>>>> > >>>> > >>>> > > >>>> > > On Wed, Sep 1, 2021 at 12:12 PM Sam Guo >>>> wrote: >>>> > > >>>> > >> I believe I am using MUMPS since I have done following >>>> > >> (1) defined -DPETSC_HAVE_MUMPS, >>>> > >> (2) compiles and links mat/impls/aij/mpi/mumps/mumps.c >>>> > >> (3) link my pre-compiled MUMPS, and >>>> > >> (4) specifies following PETSc options >>>> > >> checkError(EPSGetST(eps, &st)); >>>> > >> checkError(STSetType(st, STSINVERT)); >>>> > >> //if(useShellMatrix) checkError(STSetMatMode(st, >>>> > >> ST_MATMODE_SHELL)); >>>> > >> checkError(STGetKSP(st, &ksp)); >>>> > >> checkError(KSPSetOperators(ksp, A, A)); >>>> > >> checkError(KSPSetType(ksp, KSPPREONLY)); >>>> > >> checkError(KSPGetPC(ksp, &pc)); >>>> > >> checkError(MatSetOption(A, MAT_SPD, PETSC_TRUE)); >>>> > >> checkError(PCSetType(pc, PCCHOLESKY)); >>>> > >> checkError(PCFactorSetMatSolverType(pc, MATSOLVERMUMPS)); >>>> > >> checkError(PCFactorSetUpMatSolverType(pc)); >>>> > >> checkError(PetscOptionsSetValue(NULL, >>>> "-mat_mumps_icntl_13","1")); >>>> > >> >>>> > >> Another evidence I am using MUMPS is that If I skip (1)-(3) above, >>>> I got >>>> > >> the PETSc error saying that MUMPS is required. >>>> > >> >>>> > >> On Wed, Sep 1, 2021 at 12:00 PM Satish Balay >>>> wrote: >>>> > >> >>>> > >>> mumps is a fortran package - so best to specify fc. Any specific >>>> reason >>>> > >>> for needing to force '--with-fc=0'? >>>> > >>> >>>> > >>> The attached configure.log is not using mumps. >>>> > >>> >>>> > >>> Satish >>>> > >>> >>>> > >>> On Wed, 1 Sep 2021, Sam Guo wrote: >>>> > >>> >>>> > >>> > fc should not be required since I link PETSc with pre-compiled >>>> MUMPS. >>>> > >>> In >>>> > >>> > fact, --with-mumps-include --with-mumps-lib --with-mumps-serial >>>> should >>>> > >>> not >>>> > >>> > be required since my own CMake defines -DPETSC_HAVE_MUMPS and >>>> links my >>>> > >>> > pre-compiled MUMPS. >>>> > >>> > >>>> > >>> > I am able to make it work using PETSc 3.11.3. Attached please >>>> find the >>>> > >>> > cPETSc 3.11.3 onfigure.log PETSc. >>>> > >>> > >>>> > >>> > On Tue, Aug 31, 2021 at 4:47 PM Satish Balay >>> > >>>> > >>> wrote: >>>> > >>> > >>>> > >>> > > >>>> > >>> > > >>>> > >>> >>>> ******************************************************************************* >>>> > >>> > > UNABLE to CONFIGURE with GIVEN OPTIONS (see >>>> > >>> configure.log for >>>> > >>> > > details): >>>> > >>> > > >>>> > >>> > > >>>> > >>> >>>> ------------------------------------------------------------------------------- >>>> > >>> > > Package mumps requested requires Fortran but compiler turned >>>> off. >>>> > >>> > > >>>> > >>> > > >>>> > >>> >>>> ******************************************************************************* >>>> > >>> > > >>>> > >>> > > i.e remove '--with-fc=0' and rerun configure. >>>> > >>> > > >>>> > >>> > > Satish >>>> > >>> > > >>>> > >>> > > On Tue, 31 Aug 2021, Sam Guo wrote: >>>> > >>> > > >>>> > >>> > > > Attached please find the latest configure.log. 
>>>> > >>> > > > >>>> > >>> > > > grep MUMPS_VERSION >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>>> > >>> > > > MUMPS_VERSION "5.2.1" >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION_MAX_LEN >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h:#define >>>> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/cmumps_c.h: >>>> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>>> > >>> > > > MUMPS_VERSION "5.2.1" >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION_MAX_LEN >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h:#define >>>> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/dmumps_c.h: >>>> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>>> > >>> > > > MUMPS_VERSION "5.2.1" >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION_MAX_LEN >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h:#define >>>> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/smumps_c.h: >>>> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define 
>>>> > >>> > > > MUMPS_VERSION "5.2.1" >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#ifndef >>>> > >>> > > > MUMPS_VERSION_MAX_LEN >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h:#define >>>> > >>> > > > MUMPS_VERSION_MAX_LEN 30 >>>> > >>> > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/zmumps_c.h: >>>> > >>> > > > char version_number[MUMPS_VERSION_MAX_LEN + 1 + 1]; >>>> > >>> > > > >>>> > >>> > > > On Mon, Aug 30, 2021 at 9:47 PM Satish Balay < >>>> balay at mcs.anl.gov> >>>> > >>> wrote: >>>> > >>> > > > >>>> > >>> > > > > Also - what do you have for: >>>> > >>> > > > > >>>> > >>> > > > > grep MUMPS_VERSION >>>> > >>> > > > > >>>> > >>> > > >>>> > >>> >>>> /u/cd4hhv/dev4/mumps/5.2.1-vanilla-parmetis3.2.0-openmp-cda-001/linux-x86_64-2.3.4/include/*.h >>>> > >>> > > > > >>>> > >>> > > > > Satish >>>> > >>> > > > > >>>> > >>> > > > > On Mon, 30 Aug 2021, Satish Balay via petsc-users wrote: >>>> > >>> > > > > >>>> > >>> > > > > > please resend the logs >>>> > >>> > > > > > >>>> > >>> > > > > > Satish >>>> > >>> > > > > > >>>> > >>> > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>>> > >>> > > > > > >>>> > >>> > > > > > > Same compiling error with --with-mumps-serial=1. >>>> > >>> > > > > > > >>>> > >>> > > > > > > On Mon, Aug 30, 2021 at 8:22 PM Satish Balay < >>>> > >>> balay at mcs.anl.gov> >>>> > >>> > > > > wrote: >>>> > >>> > > > > > > >>>> > >>> > > > > > > > Use the additional option: -with-mumps-serial >>>> > >>> > > > > > > > >>>> > >>> > > > > > > > Satish >>>> > >>> > > > > > > > >>>> > >>> > > > > > > > On Mon, 30 Aug 2021, Sam Guo wrote: >>>> > >>> > > > > > > > >>>> > >>> > > > > > > > > Attached please find the configure.log. I use my >>>> own >>>> > >>> CMake. I >>>> > >>> > > have >>>> > >>> > > > > > > > > defined -DPETSC_HAVE_MUMPS. Thanks. >>>> > >>> > > > > > > > > >>>> > >>> > > > > > > > > On Mon, Aug 30, 2021 at 4:56 PM Sam Guo < >>>> > >>> sam.guo at cd-adapco.com >>>> > >>> > > > >>>> > >>> > > > > wrote: >>>> > >>> > > > > > > > > >>>> > >>> > > > > > > > > > I use pre-installed >>>> > >>> > > > > > > > > > >>>> > >>> > > > > > > > > > On Mon, Aug 30, 2021 at 4:53 PM Satish Balay < >>>> > >>> > > balay at mcs.anl.gov> >>>> > >>> > > > > > > > wrote: >>>> > >>> > > > > > > > > > >>>> > >>> > > > > > > > > >> >>>> > >>> > > > > > > > > >> Are you using --download-mumps or pre-installed >>>> > >>> mumps? If >>>> > >>> > > using >>>> > >>> > > > > > > > > >> pre-installed - try --download-mumps. >>>> > >>> > > > > > > > > >> >>>> > >>> > > > > > > > > >> If you still have issues - send us >>>> configure.log and >>>> > >>> > > make.log >>>> > >>> > > > > from the >>>> > >>> > > > > > > > > >> failed build. 
>>>> > >>> > > > > > > > > >> >>>> > >>> > > > > > > > > >> Satish >>>> > >>> > > > > > > > > >> >>>> > >>> > > > > > > > > >> On Mon, 30 Aug 2021, Sam Guo wrote: >>>> > >>> > > > > > > > > >> >>>> > >>> > > > > > > > > >> > Dear PETSc dev team, >>>> > >>> > > > > > > > > >> > I am compiling petsc 3.15.3 and got >>>> following >>>> > >>> compiling >>>> > >>> > > > > error >>>> > >>> > > > > > > > > >> > >>>> petsc/src/mat/impls/aij/mpi/mumps/mumps.c:52:31: >>>> > >>> error: >>>> > >>> > > > > missing >>>> > >>> > > > > > > > binary >>>> > >>> > > > > > > > > >> > operator before token "(" >>>> > >>> > > > > > > > > >> > 52 | #if PETSC_PKG_MUMPS_VERSION_GE(5,3,0) >>>> > >>> > > > > > > > > >> > Any idea what I did wrong? >>>> > >>> > > > > > > > > >> > >>>> > >>> > > > > > > > > >> > Thanks, >>>> > >>> > > > > > > > > >> > Sam >>>> > >>> > > > > > > > > >> > >>>> > >>> > > > > > > > > >> >>>> > >>> > > > > > > > > >> >>>> > >>> > > > > > > > > >>>> > >>> > > > > > > > >>>> > >>> > > > > > > > >>>> > >>> > > > > > > >>>> > >>> > > > > > >>>> > >>> > > > > >>>> > >>> > > > > >>>> > >>> > > > >>>> > >>> > > >>>> > >>> > > >>>> > >>> > >>>> > >>> >>>> > >>> >>>> > >>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Thu Sep 2 00:27:17 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Thu, 2 Sep 2021 00:27:17 -0500 (CDT) Subject: [petsc-users] petsc-3.15.4 now available Message-ID: Dear PETSc users, The patch release petsc-3.15.4 is now available for download. https://petsc.org/release/download/ Satish From numbersixvs at gmail.com Thu Sep 2 07:07:20 2021 From: numbersixvs at gmail.com (Viktor Nazdrachev) Date: Thu, 2 Sep 2021 15:07:20 +0300 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: <7EFBB20A-CB8A-47BA-BDD8-4E0BD43BBC31@joliv.et> References: <7EFBB20A-CB8A-47BA-BDD8-4E0BD43BBC31@joliv.et> Message-ID: Hello, Pierre! Thank you for your response! I attached log files (txt files with convergence behavior and RAM usage log in separate txt files) and resulting table with convergence investigation data(xls). Data for main non-regular grid with 500K cells and heterogeneous properties are in 500K folder, whereas data for simple uniform 125K cells grid with constant properties are in 125K folder. >Dear Viktor, > >>* On 1 Sep 2021, at 10:42 AM, **?????????* *??????** <**numbersixvs at gmail.com **> >wrote:* *>*> *>*>* Dear all,* *>*> *>*>* I have a 3D elasticity problem with heterogeneous properties. There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs).* *>*> *>*>* The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver* *>*> *>*Block Krylov solvers are (most often) only useful if you have multiple right-hand sides, e.g., in the context of elasticity, multiple loadings. 
Is that really the case? If not, you may as well stick to ?standard? CG instead of the breakdown-free block (BFB) variant. *> * In that case only single right-hand side is utilized, so I switched to ?standard? cg solver (-ksp_hpddm_type cg), but I noticed the interesting convergence behavior. For non-regular grid with 500K cells and heterogeneous properties CG solver converged with 1 iteration (log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt), but for more simple uniform grid with 125K cells and homogeneous properties CG solves linear system successfully(log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt). BFBCG solver works properly for both grids. *>*>* and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased.* *>*> *>*>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.* >*> * *>*I?m surprised that GAMG is converging so slowly. What do you mean by "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse level solver? *>* Sorry for misleading, ICC is used only for BJACOBI preconditioner, no ICC for GAMG. *>*How many iterations are required to reach convergence? *>*Could you please maybe run the solver with -ksp_view -log_view and send us the output? *>* For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt and memory usage log in RAM_log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). *>*Most of the default parameters of GAMG should be good enough for 3D elasticity, provided that your MatNullSpace is correct. *>* How can I be sure that nullspace is attached correctly? Is there any way for self-checking (Well perhaps calculate some parameters using matrix and solution vector)? *>*One parameter that may need some adjustments though is the aggregation threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] range, that?s what I always use for elasticity problems). *> * Tried to find optimal value of this option, set -pc_gamg_threshold 0.01 and -pc_gamg_threshold_scale 2, but I didn't notice any significant changes (Need more time for experiments ) Kind regards, Viktor Nazdrachev R&D senior researcher Geosteering Technologies LLC ??, 1 ????. 2021 ?. ? 12:01, Pierre Jolivet : > Dear Viktor, > > On 1 Sep 2021, at 10:42 AM, ????????? ?????? > wrote: > > Dear all, > > I have a 3D elasticity problem with heterogeneous properties. There is > unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet > BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are > imposed on side faces. Gravity load is also accounted for. 
The grid I use > consists of 500k cells (which is approximately 1.6M of DOFs). > > The best performance and memory usage for single MPI process was obtained > with HPDDM(BFBCG) solver > > Block Krylov solvers are (most often) only useful if you have multiple > right-hand sides, e.g., in the context of elasticity, multiple loadings. > Is that really the case? If not, you may as well stick to ?standard? CG > instead of the breakdown-free block (BFB) variant. > > and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s > and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s > when using 5.6 GB of RAM. This because of number of iterations required to > achieve the same tolerance is significantly increased. > > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) > sub-precondtioner. For single MPI process, the calculation took 10 min and > 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached > using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. > This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. > Also, there is peak memory usage with 14.1 GB, which appears just before > the start of the iterations. Parallel computation with 4 MPI processes took > 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is > about 22 GB. > > I?m surprised that GAMG is converging so slowly. What do you mean by > "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse > level solver? > How many iterations are required to reach convergence? > Could you please maybe run the solver with -ksp_view -log_view and send us > the output? > Most of the default parameters of GAMG should be good enough for 3D > elasticity, provided that your MatNullSpace is correct. > One parameter that may need some adjustments though is the aggregation > threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] > range, that?s what I always use for elasticity problems). > > Thanks, > Pierre > > Are there ways to avoid decreasing of the convergence rate for bjacobi > precondtioner in parallel mode? Does it make sense to use hierarchical or > nested krylov methods with a local gmres solver (sub_pc_type gmres) and > some sub-precondtioner (for example, sub_pc_type bjacobi)? > > > Is this peak memory usage expected for gamg preconditioner? is there any > way to reduce it? > > > What advice would you give to improve the convergence rate with multiple > MPI processes, but keep memory consumption reasonable? > > > Kind regards, > > Viktor Nazdrachev > > R&D senior researcher > > Geosteering Technologies LLC > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: logs.rar Type: application/octet-stream Size: 148643 bytes Desc: not available URL: From pierre at joliv.et Thu Sep 2 07:31:39 2021 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 2 Sep 2021 14:31:39 +0200 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: <7EFBB20A-CB8A-47BA-BDD8-4E0BD43BBC31@joliv.et> Message-ID: <3789D448-CCED-403F-9984-340150F9761A@joliv.et> > On 2 Sep 2021, at 2:07 PM, Viktor Nazdrachev wrote: > > Hello, Pierre! > > Thank you for your response! > I attached log files (txt files with convergence behavior and RAM usage log in separate txt files) and resulting table with convergence investigation data(xls). 
Data for main non-regular grid with 500K cells and heterogeneous properties are in 500K folder, whereas data for simple uniform 125K cells grid with constant properties are in 125K folder. > > >Dear Viktor, > > > >> On 1 Sep 2021, at 10:42 AM, ????????? ?????? > > <>wrote: > >> > >> Dear all, > >> > >> I have a 3D elasticity problem with heterogeneous properties. There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs). > >> > >> The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver > >> > >Block Krylov solvers are (most often) only useful if you have multiple right-hand sides, e.g., in the context of elasticity, multiple loadings. > Is that really the case? If not, you may as well stick to ?standard? CG instead of the breakdown-free block (BFB) variant. > > > > In that case only single right-hand side is utilized, so I switched to ?standard? cg solver (-ksp_hpddm_type cg), but I noticed the interesting convergence behavior. For non-regular grid with 500K cells and heterogeneous properties CG solver converged with 1 iteration (log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt), but for more simple uniform grid with 125K cells and homogeneous properties CG solves linear system successfully(log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt). > BFBCG solver works properly for both grids. Just stick to -ksp_type cg or maybe -ksp_type gmres -ksp_gmres_modifiedgramschmidt (even if the problem is SPD). Sorry if I repeat myself, but KSPHPDDM methods are mostly useful for either blocking or recycling. If you use something as simple as CG, you?ll get better diagnostics and error handling if you use the native PETSc implementation (KSPCG) instead of the external implementation (-ksp_hpddm_type cg). > > >> and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased. > >> > >> I`ve also tried PCGAMG (agg) preconditioner with IC? (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB. > >> > >I?m surprised that GAMG is converging so slowly. What do you mean by "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse level solver? > > > > Sorry for misleading, ICC is used only for BJACOBI preconditioner, no ICC for GAMG. > > >How many iterations are required to reach convergence? > >Could you please maybe run the solver with -ksp_view -log_view and send us the output? > > > > For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt and memory usage log in RAM_log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). 
For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). > > > >Most of the default parameters of GAMG should be good enough for 3D elasticity, provided that your MatNullSpace is correct. > > > > How can I be sure that nullspace is attached correctly? Is there any way for self-checking (Well perhaps calculate some parameters using matrix and solution vector)? > > >One parameter that may need some adjustments though is the aggregation threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] range, that?s what I always use for elasticity problems). > > > > Tried to find optimal value of this option, set -pc_gamg_threshold 0.01 and -pc_gamg_threshold_scale 2, but I didn't notice any significant changes (Need more time for experiments ) > I don?t see anything too crazy in your logs at first sight. In addition to maybe trying GMRES with a more robust orthogonalization scheme, here is what I would do: 1) MatSetBlockSize(Pmat, 6), it seems to be missing right now, cf. linear system matrix = precond matrix: Mat Object: 4 MPI processes type: mpiaij rows=1600200, cols=1600200 total: nonzeros=124439742, allocated nonzeros=259232400 total number of mallocs used during MatSetValues calls=0 has attached near null space 2) -mg_coarse_pc_type redundant -mg_coarse_redundant_pc_type lu 3) more playing around with the threshold, this can be critical for hard problems If you can share your matrix/nullspace/RHS, we could have a crack at it as well. Thanks, Pierre > Kind regards, > > Viktor Nazdrachev > > R&D senior researcher > > Geosteering Technologies LLC > > > ??, 1 ????. 2021 ?. ? 12:01, Pierre Jolivet >: > Dear Viktor, > >> On 1 Sep 2021, at 10:42 AM, ????????? ?????? > wrote: >> >> Dear all, >> >> I have a 3D elasticity problem with heterogeneous properties. There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs). >> >> The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver >> > Block Krylov solvers are (most often) only useful if you have multiple right-hand sides, e.g., in the context of elasticity, multiple loadings. > Is that really the case? If not, you may as well stick to ?standard? CG instead of the breakdown-free block (BFB) variant. > >> and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased. >> >> I`ve also tried PCGAMG (agg) preconditioner with IC? (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB. >> > I?m surprised that GAMG is converging so slowly. What do you mean by "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse level solver? 
> How many iterations are required to reach convergence? > Could you please maybe run the solver with -ksp_view -log_view and send us the output? > Most of the default parameters of GAMG should be good enough for 3D elasticity, provided that your MatNullSpace is correct. > One parameter that may need some adjustments though is the aggregation threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] range, that?s what I always use for elasticity problems). > > Thanks, > Pierre > >> Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)? >> >> >> Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it? >> >> >> What advice would you give to improve the convergence rate with multiple MPI processes, but keep memory consumption reasonable? >> >> >> Kind regards, >> >> Viktor Nazdrachev >> >> R&D senior researcher >> >> Geosteering Technologies LLC >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Thu Sep 2 07:34:35 2021 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 2 Sep 2021 14:34:35 +0200 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: <3789D448-CCED-403F-9984-340150F9761A@joliv.et> References: <7EFBB20A-CB8A-47BA-BDD8-4E0BD43BBC31@joliv.et> <3789D448-CCED-403F-9984-340150F9761A@joliv.et> Message-ID: <70540BB9-BC66-40F3-9BD1-E7EC613AEE88@joliv.et> > On 2 Sep 2021, at 2:31 PM, Pierre Jolivet wrote: > > > >> On 2 Sep 2021, at 2:07 PM, Viktor Nazdrachev > wrote: >> >> Hello, Pierre! >> >> Thank you for your response! >> I attached log files (txt files with convergence behavior and RAM usage log in separate txt files) and resulting table with convergence investigation data(xls). Data for main non-regular grid with 500K cells and heterogeneous properties are in 500K folder, whereas data for simple uniform 125K cells grid with constant properties are in 125K folder. >> >> >Dear Viktor, >> > >> >> On 1 Sep 2021, at 10:42 AM, ????????? ?????? > > <>wrote: >> >> >> >> Dear all, >> >> >> >> I have a 3D elasticity problem with heterogeneous properties. There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs). >> >> >> >> The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver >> >> >> >Block Krylov solvers are (most often) only useful if you have multiple right-hand sides, e.g., in the context of elasticity, multiple loadings. >> Is that really the case? If not, you may as well stick to ?standard? CG instead of the breakdown-free block (BFB) variant. >> > >> >> In that case only single right-hand side is utilized, so I switched to ?standard? cg solver (-ksp_hpddm_type cg), but I noticed the interesting convergence behavior. For non-regular grid with 500K cells and heterogeneous properties CG solver converged with 1 iteration (log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt), but for more simple uniform grid with 125K cells and homogeneous properties CG solves linear system successfully(log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt). >> BFBCG solver works properly for both grids. 
> Just stick to -ksp_type cg or maybe -ksp_type gmres -ksp_gmres_modifiedgramschmidt (even if the problem is SPD).
> Sorry if I repeat myself, but KSPHPDDM methods are mostly useful for either blocking or recycling.
> If you use something as simple as CG, you'll get better diagnostics and error handling if you use the native PETSc implementation (KSPCG) instead of the external implementation (-ksp_hpddm_type cg).
>
>> and bjacobian + ICC(1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This is because the number of iterations required to achieve the same tolerance is significantly increased.
>>
>> I've also tried PCGAMG (agg) preconditioner with ICC(1) sub-preconditioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.
>>
>> >I'm surprised that GAMG is converging so slowly. What do you mean by "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse level solver?
>>
>> Sorry for misleading, ICC is used only for the BJACOBI preconditioner, no ICC for GAMG.
>>
>> >How many iterations are required to reach convergence?
>> >Could you please maybe run the solver with -ksp_view -log_view and send us the output?
>>
>> For the case with 4 MPI processes and attached nullspace, 177 iterations are required to reach convergence (you may see the detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt and the memory usage log in RAM_log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for the sequential run (log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt).
>>
>> >Most of the default parameters of GAMG should be good enough for 3D elasticity, provided that your MatNullSpace is correct.
>>
>> How can I be sure that the nullspace is attached correctly? Is there any way for self-checking (well, perhaps calculate some parameters using the matrix and solution vector)?
>>
>> >One parameter that may need some adjustments though is the aggregation threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] range, that's what I always use for elasticity problems).
>>
>> Tried to find the optimal value of this option, set -pc_gamg_threshold 0.01 and -pc_gamg_threshold_scale 2, but I didn't notice any significant changes (need more time for experiments).
>
> I don't see anything too crazy in your logs at first sight. In addition to maybe trying GMRES with a more robust orthogonalization scheme, here is what I would do:
> 1) MatSetBlockSize(Pmat, 6), it seems to be missing right now, cf.
Sorry for the noise, but this should read 3, not 6.
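In code, suggestion 1) with the corrected block size might look like the minimal sketch below. The names are placeholders rather than code from this thread: A stands for the assembled elasticity operator and coords for a Vec of interlaced nodal coordinates.

#include <petscmat.h>

/* Hypothetical helper (a sketch, not code from this thread): set the
   vertex-wise block size and attach the rigid-body near-nullspace that
   PCGAMG uses when building its aggregates.  The local row count of A is
   assumed to be divisible by 3 (3 displacement DOFs per node), and coords
   is assumed to hold the interlaced (x,y,z) coordinates of the owned nodes. */
static PetscErrorCode AttachElasticityNearNullSpace(Mat A, Vec coords)
{
  MatNullSpace   nullsp;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatSetBlockSize(A, 3);CHKERRQ(ierr);                        /* 3 DOFs per node, not 6 */
  ierr = MatNullSpaceCreateRigidBody(coords, &nullsp);CHKERRQ(ierr); /* 6 rigid-body modes in 3D */
  ierr = MatSetNearNullSpace(A, nullsp);CHKERRQ(ierr);
  ierr = MatNullSpaceDestroy(&nullsp);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

Suggestions 2) and 3) are plain runtime options, so they can simply be appended to the command line (or an options file); no code change is needed for those.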
Thanks, Pierre > linear system matrix = precond matrix: > Mat Object: 4 MPI processes > type: mpiaij > rows=1600200, cols=1600200 > total: nonzeros=124439742, allocated nonzeros=259232400 > total number of mallocs used during MatSetValues calls=0 > has attached near null space > 2) -mg_coarse_pc_type redundant -mg_coarse_redundant_pc_type lu > 3) more playing around with the threshold, this can be critical for hard problems > If you can share your matrix/nullspace/RHS, we could have a crack at it as well. > > Thanks, > Pierre > >> Kind regards, >> >> Viktor Nazdrachev >> >> R&D senior researcher >> >> Geosteering Technologies LLC >> >> >> ??, 1 ????. 2021 ?. ? 12:01, Pierre Jolivet >: >> Dear Viktor, >> >>> On 1 Sep 2021, at 10:42 AM, ????????? ?????? > wrote: >>> >>> Dear all, >>> >>> I have a 3D elasticity problem with heterogeneous properties. There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs). >>> >>> The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver >>> >> Block Krylov solvers are (most often) only useful if you have multiple right-hand sides, e.g., in the context of elasticity, multiple loadings. >> Is that really the case? If not, you may as well stick to ?standard? CG instead of the breakdown-free block (BFB) variant. >> >>> and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased. >>> >>> I`ve also tried PCGAMG (agg) preconditioner with IC? (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB. >>> >> I?m surprised that GAMG is converging so slowly. What do you mean by "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse level solver? >> How many iterations are required to reach convergence? >> Could you please maybe run the solver with -ksp_view -log_view and send us the output? >> Most of the default parameters of GAMG should be good enough for 3D elasticity, provided that your MatNullSpace is correct. >> One parameter that may need some adjustments though is the aggregation threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] range, that?s what I always use for elasticity problems). >> >> Thanks, >> Pierre >> >>> Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)? >>> >>> >>> Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it? 
>>> >>> >>> What advice would you give to improve the convergence rate with multiple MPI processes, but keep memory consumption reasonable? >>> >>> >>> Kind regards, >>> >>> Viktor Nazdrachev >>> >>> R&D senior researcher >>> >>> Geosteering Technologies LLC >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 2 07:59:10 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 2 Sep 2021 08:59:10 -0400 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: <7EFBB20A-CB8A-47BA-BDD8-4E0BD43BBC31@joliv.et> Message-ID: On Thu, Sep 2, 2021 at 8:08 AM Viktor Nazdrachev wrote: > Hello, Pierre! > > Thank you for your response! > > I attached log files (txt files with convergence behavior and RAM usage > log in separate txt files) and resulting table with convergence > investigation data(xls). Data for main non-regular grid with 500K cells and > heterogeneous properties are in 500K folder, whereas data for simple > uniform 125K cells grid with constant properties are in 125K folder. > > > > >Dear Viktor, > > > > > >>* On 1 Sep 2021, at 10:42 AM, **?????????* *??????** <**numbersixvs at > gmail.com **> > >wrote:* > > *>*> > > *>*>* Dear all,* > > *>*> > > *>*>* I have a 3D elasticity problem with heterogeneous properties. There > is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet > BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are > imposed on side faces. Gravity load is also accounted for. The grid I use > consists of 500k cells (which is approximately 1.6M of DOFs).* > > *>*> > > *>*>* The best performance and memory usage for single MPI process was > obtained with HPDDM(BFBCG) solver* > > *>*> > > *>*Block Krylov solvers are (most often) only useful if you have multiple > right-hand sides, e.g., in the context of elasticity, multiple loadings. > > Is that really the case? If not, you may as well stick to ?standard? CG > instead of the breakdown-free block (BFB) variant. > > *> * > > > > In that case only single right-hand side is utilized, so I switched to > ?standard? cg solver (-ksp_hpddm_type cg), but I noticed the interesting > convergence behavior. For non-regular grid with 500K cells and > heterogeneous properties CG solver converged with 1 iteration > (log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt), but for more simple uniform > grid with 125K cells and homogeneous properties CG solves linear system > successfully(log_hpddm(cg)_gamg_nearnullspace_1_mpi.txt). > > BFBCG solver works properly for both grids. > > > > > > *>*>* and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 > m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m > 46 s when using 5.6 GB of RAM. This because of number of iterations > required to achieve the same tolerance is significantly increased.* > > *>*> > > *>*>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) > sub-precondtioner. For single MPI process, the calculation took 10 min and > 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached > using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. > This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. > Also, there is peak memory usage with 14.1 GB, which appears just before > the start of the iterations. Parallel computation with 4 MPI processes took > 2 m 53 s when using 8.4 GB of RAM. 
In that case the peak memory usage is > about 22 GB.* > > >*> * > > *>*I?m surprised that GAMG is converging so slowly. What do you mean by > "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse > level solver? > > *>* > > > Sorry for misleading, ICC is used only for BJACOBI preconditioner, no ICC > for GAMG. > > > > *>*How many iterations are required to reach convergence? > > *>*Could you please maybe run the solver with -ksp_view -log_view and > send us the output? > > *>* > > > > For case with 4 MPI processes and attached nullspace it is required 177 > iterations > Pierre's suggestions are good ones. I am confused by the failure of GAMG, since 177 iterations is not good. Something is breaking down, either the smoother or the accuracy of the coarse grids. Can you give me an idea what your coefficient looks like? Thanks, Matt > to reach convergence (you may see detailed log in > log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt and memory usage log in > RAM_log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 > iterations are required for sequential > run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). > > > > *>*Most of the default parameters of GAMG should be good enough for 3D > elasticity, provided that your MatNullSpace is correct. > > *>* > > > > How can I be sure that nullspace is attached correctly? Is there any way > for self-checking (Well perhaps calculate some parameters using matrix and > solution vector)? > > > > *>*One parameter that may need some adjustments though is the aggregation > threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] > range, that?s what I always use for elasticity problems). > > *> * > > > > Tried to find optimal value of this option, set -pc_gamg_threshold 0.01 > and -pc_gamg_threshold_scale 2, but I didn't notice any significant > changes (Need more time for experiments ) > > > Kind regards, > > > > Viktor Nazdrachev > > > > R&D senior researcher > > > > Geosteering Technologies LLC > > ??, 1 ????. 2021 ?. ? 12:01, Pierre Jolivet : > >> Dear Viktor, >> >> On 1 Sep 2021, at 10:42 AM, ????????? ?????? >> wrote: >> >> Dear all, >> >> I have a 3D elasticity problem with heterogeneous properties. There is >> unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet >> BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are >> imposed on side faces. Gravity load is also accounted for. The grid I use >> consists of 500k cells (which is approximately 1.6M of DOFs). >> >> The best performance and memory usage for single MPI process was obtained >> with HPDDM(BFBCG) solver >> >> Block Krylov solvers are (most often) only useful if you have multiple >> right-hand sides, e.g., in the context of elasticity, multiple loadings. >> Is that really the case? If not, you may as well stick to ?standard? CG >> instead of the breakdown-free block (BFB) variant. >> >> and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s >> and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s >> when using 5.6 GB of RAM. This because of number of iterations required to >> achieve the same tolerance is significantly increased. >> >> I`ve also tried PCGAMG (agg) preconditioner with IC? (1) >> sub-precondtioner. For single MPI process, the calculation took 10 min and >> 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached >> using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. 
>> This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. >> Also, there is peak memory usage with 14.1 GB, which appears just before >> the start of the iterations. Parallel computation with 4 MPI processes took >> 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is >> about 22 GB. >> >> I?m surprised that GAMG is converging so slowly. What do you mean by >> "ICC(1) sub-preconditioner"? Do you use that as a smoother or as a coarse >> level solver? >> How many iterations are required to reach convergence? >> Could you please maybe run the solver with -ksp_view -log_view and send >> us the output? >> Most of the default parameters of GAMG should be good enough for 3D >> elasticity, provided that your MatNullSpace is correct. >> One parameter that may need some adjustments though is the aggregation >> threshold -pc_gamg_threshold (you could try values in the [0.01; 0.1] >> range, that?s what I always use for elasticity problems). >> >> Thanks, >> Pierre >> >> Are there ways to avoid decreasing of the convergence rate for bjacobi >> precondtioner in parallel mode? Does it make sense to use hierarchical or >> nested krylov methods with a local gmres solver (sub_pc_type gmres) and >> some sub-precondtioner (for example, sub_pc_type bjacobi)? >> >> >> Is this peak memory usage expected for gamg preconditioner? is there any >> way to reduce it? >> >> >> What advice would you give to improve the convergence rate with multiple >> MPI processes, but keep memory consumption reasonable? >> >> >> Kind regards, >> >> Viktor Nazdrachev >> >> R&D senior researcher >> >> Geosteering Technologies LLC >> >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From aduarteg at utexas.edu Thu Sep 2 09:24:46 2021 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Thu, 2 Sep 2021 09:24:46 -0500 Subject: [petsc-users] TSBDF info Message-ID: Good morning PETSC team, I am looking to implement a fully implicit BDF-DAE solver and I came across the TSBDF object https://petsc.org/release/src/ts/impls/bdf/bdf.c.html#TSBDF . This seems to be exactly what I am looking for, but I still have questions and I can't seem to find much from it in the documentation or the users manual. The key question for me is whether this is implementation is a *Variable Leading Coefficient BDF *or *Fixed-Leading Coefficient BDF* (such as CVODE). This impacts my ability of reusing matrix factorizations. Thank you, -Alfredo -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From hongzhang at anl.gov Thu Sep 2 11:32:34 2021 From: hongzhang at anl.gov (Zhang, Hong) Date: Thu, 2 Sep 2021 16:32:34 +0000 Subject: [petsc-users] TSBDF info In-Reply-To: References: Message-ID: <1788AED1-C7A3-45BD-BAB8-0EB2980F6FEB@anl.gov> On Sep 2, 2021, at 9:24 AM, Alfredo J Duarte Gomez > wrote: Good morning PETSC team, I am looking to implement a fully implicit BDF-DAE solver and I came across the TSBDF object https://petsc.org/release/src/ts/impls/bdf/bdf.c.html#TSBDF. This seems to be exactly what I am looking for, but I still have questions and I can't seem to find much from it in the documentation or the users manual. 
The key question for me is whether this implementation is a Variable Leading Coefficient BDF or a Fixed-Leading Coefficient BDF (such as CVODE). This affects my ability to reuse matrix factorizations. It is not FLC BDF. It uses the classic variable-step formula that computes the coefficients from the most recent step sizes. Hong (Mr.) Thank you, -Alfredo -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From numbersixvs at gmail.com Fri Sep 3 00:55:59 2021 From: numbersixvs at gmail.com (Viktor Nazdrachev) Date: Fri, 3 Sep 2021 08:55:59 +0300 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: Hello, Lawrence! Thank you for your response! I attached the log files (txt files with the convergence behavior and RAM usage logs in separate txt files) and the resulting table with the convergence investigation data (xls). Data for the main non-regular grid with 500K cells and heterogeneous properties are in the 500K folder, whereas data for the simple uniform 125K-cell grid with constant properties are in the 125K folder. >> On 1 Sep 2021, at 09:42, Viktor Nazdrachev wrote: >> >> I have a 3D elasticity problem with heterogeneous properties. > >What does your coefficient variation look like? How large is the contrast? Young's modulus varies from 1 to 10 GPa, Poisson's ratio varies from 0.3 to 0.44, and density from 1700 to 2600 kg/m^3. >> There is an unstructured grid with aspect ratio varying from 4 to 25. Zero Dirichlet BCs are imposed on the bottom face of the mesh. Also, Neumann (traction) BCs are imposed on the side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M DOFs). >> >> The best performance and memory usage for a single MPI process was obtained with the HPDDM (BFBCG) solver and bjacobi + ICC(1) in the subdomains as preconditioner; it took 1 m 45 s and 5.0 GB of RAM. Parallel computation with 4 MPI processes took 2 m 46 s while using 5.6 GB of RAM. This is because the number of iterations required to achieve the same tolerance increased significantly. > >How many iterations do you have in serial (and then in parallel)? The serial run required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), while the parallel run with 4 MPI processes required 680 iterations. I attached log files for all simulations (txt files with convergence behavior and RAM usage logs in separate txt files) and the resulting table with convergence/memory usage data (xls). Data for the main non-regular grid with 500K cells and heterogeneous properties are in the 500K folder, whereas data for the simple uniform 125K-cell grid with constant properties are in the 125K folder. >> I`ve also tried the PCGAMG (agg) preconditioner with an ICC(1) sub-preconditioner. For a single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using the MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This reduced the calculation time to 3 m 58 s while using 4.3 GB of RAM. Also, there is a peak memory usage of 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s while using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB. > >Does the number of iterates increase in parallel? Again, how many iterations do you have?
For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). >>* Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)?* > >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit integers (It is extremely necessary to perform computation on mesh with more than 10M cells). >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the geneo scheme offered by PCHPDDM. I found strange convergence behavior for HPDDM preconditioner. For 1 MPI process BFBCG solver did not converged (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation was successful (1018 to reach convergence, log_hpddm(bfbcg)_pchpddm_4_mpi.txt). But it should be mentioned that stiffness matrix was created in AIJ format (our default matrix format in program). Matrix conversion to MATIS format via MatConvert subroutine resulted in losing of convergence for both serial and parallel run. >>* Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it?* > >I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. Thanks, I`ll try to use a strong threshold only for coarse grids. Kind regards, Viktor Nazdrachev R&D senior researcher Geosteering Technologies LLC ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell : > > > > On 1 Sep 2021, at 09:42, ????????? ?????? wrote: > > > > I have a 3D elasticity problem with heterogeneous properties. > > What does your coefficient variation look like? How large is the contrast? > > > There is unstructured grid with aspect ratio varied from 4 to 25. Zero > Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) > BCs are imposed on side faces. Gravity load is also accounted for. The grid > I use consists of 500k cells (which is approximately 1.6M of DOFs). > > > > The best performance and memory usage for single MPI process was > obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as > preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with > 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of > number of iterations required to achieve the same tolerance is > significantly increased. > > How many iterations do you have in serial (and then in parallel)? > > > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) > sub-precondtioner. For single MPI process, the calculation took 10 min and > 3.4 GB of RAM. 
To improve the convergence rate, the nullspace was attached > using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. > This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. > Also, there is peak memory usage with 14.1 GB, which appears just before > the start of the iterations. Parallel computation with 4 MPI processes took > 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is > about 22 GB. > > Does the number of iterates increase in parallel? Again, how many > iterations do you have? > > > Are there ways to avoid decreasing of the convergence rate for bjacobi > precondtioner in parallel mode? Does it make sense to use hierarchical or > nested krylov methods with a local gmres solver (sub_pc_type gmres) and > some sub-precondtioner (for example, sub_pc_type bjacobi)? > > bjacobi is only a one-level method, so you would not expect > process-independent convergence rate for this kind of problem. If the > coefficient variation is not too extreme, then I would expect GAMG (or some > other smoothed aggregation package, perhaps -pc_type ml (you need > --download-ml)) would work well with some tuning. > > If you have extremely high contrast coefficients you might need something > with stronger coarse grids. If you can assemble so-called Neumann matrices ( > https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you > could try the geneo scheme offered by PCHPDDM. > > > Is this peak memory usage expected for gamg preconditioner? is there any > way to reduce it? > > I think that peak memory usage comes from building the coarse grids. Can > you run with `-info` and grep for GAMG, this will provide some output that > more expert GAMG users can interpret. > > Lawrence > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: logs.rar Type: application/octet-stream Size: 212693 bytes Desc: not available URL: From mfadams at lbl.gov Fri Sep 3 07:02:31 2021 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 3 Sep 2021 08:02:31 -0400 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: On Fri, Sep 3, 2021 at 1:57 AM Viktor Nazdrachev wrote: > Hello, Lawrence! > Thank you for your response! > > I attached log files (txt files with convergence behavior and RAM usage > log in separate txt files) and resulting table with convergence > investigation data(xls). Data for main non-regular grid with 500K cells and > heterogeneous properties are in 500K folder, whereas data for simple > uniform 125K cells grid with constant properties are in 125K folder. > > > >>* On 1 Sep 2021, at 09:42, **?????????** ??????** **> wrote:* > > >> > > >>* I have a 3D elasticity problem with heterogeneous properties.* > > > > > >What does your coefficient variation look like? How large is the contrast? > > > > Young modulus varies from 1 to 10 GPa, Poisson ratio varies from 0.3 to > 0.44 and density ? from 1700 to 2600 kg/m^3. > That is not too bad. Poorly shaped elements are the next thing to worry about. Try to keep the aspect ratio below 10 if possible. > > > > > >>* There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. 
The grid I use consists of 500k cells (which is approximately 1.6M of DOFs).* > > >> > > >>* The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased.* > > > > > >How many iterations do you have in serial (and then in parallel)? > > > > Serial run is required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), parallel run with 4 MPI ? 680 iterations. > > > > I attached log files for all simulations (txt files with convergence > behavior and RAM usage log in separate txt files) and resulting table with > convergence/memory usage data(xls). Data for main non-regular grid with > 500K cells and heterogeneous properties are in 500K folder, whereas data > for simple uniform 125K cells grid with constant properties are in 125K > folder. > > > > > > >>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.* > > > > > >Does the number of iterates increase in parallel? Again, how many iterations do you have? > > > > For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). > > Again, do not use ICC. I am surprised to see such a large jump in iteration count, but get ICC off the table. You will see variability in the iteration count with processor count with GAMG. As much as 10% +-. Maybe more (random) variability , but usually less. You can decrease the memory a little, and the setup time a lot, by aggressively coarsening, at the expense of higher iteration counts. It's a balancing act. You can run with the defaults, add '-info', grep on GAMG and send the ~30 lines of output if you want advice on parameters. Thanks, Mark > > > > > > > >>* Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)?* > > > > > >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. > > > > Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit > integers (It is extremely necessary to perform computation on mesh with > more than 10M cells). 
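To make the "balancing act" Mark describes above a little more concrete, here is a sketch of the GAMG options involved (the executable name and the MPI launcher below are placeholders, and the values are only illustrative starting points, not settings recommended anywhere in this thread):

mpiexec -n 4 ./solver -ksp_type cg -pc_type gamg -pc_gamg_agg_nsmooths 1 -pc_gamg_threshold 0.01 -pc_gamg_square_graph 2 -info

-pc_gamg_square_graph is the aggressive-coarsening knob Mark refers to (it squares the graph on the given number of levels, so the hierarchy coarsens faster), while -pc_gamg_threshold controls which weak couplings are dropped from the aggregation graph (Pierre suggested values in the 0.01-0.1 range for elasticity). Grepping the -info output for GAMG then gives the roughly 30 lines of per-level setup information that Mark and Lawrence are asking to see.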
> > > > > > >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the geneo scheme offered by PCHPDDM. > > > > > > I found strange convergence behavior for HPDDM preconditioner. For 1 MPI > process BFBCG solver did not converged > (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation > was successful (1018 to reach convergence, > log_hpddm(bfbcg)_pchpddm_4_mpi.txt). > > But it should be mentioned that stiffness matrix was created in AIJ format > (our default matrix format in program). > > Matrix conversion to MATIS format via MatConvert subroutine resulted in > losing of convergence for both serial and parallel run. > > > >>* Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it?* > > > > > >I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. > > > > Thanks, I`ll try to use a strong threshold only for coarse grids. > > > > Kind regards, > > > > Viktor Nazdrachev > > > > R&D senior researcher > > > > Geosteering Technologies LLC > > > > > > > > > > ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell : > >> >> >> > On 1 Sep 2021, at 09:42, ????????? ?????? >> wrote: >> > >> > I have a 3D elasticity problem with heterogeneous properties. >> >> What does your coefficient variation look like? How large is the contrast? >> >> > There is unstructured grid with aspect ratio varied from 4 to 25. Zero >> Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) >> BCs are imposed on side faces. Gravity load is also accounted for. The grid >> I use consists of 500k cells (which is approximately 1.6M of DOFs). >> > >> > The best performance and memory usage for single MPI process was >> obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as >> preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with >> 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of >> number of iterations required to achieve the same tolerance is >> significantly increased. >> >> How many iterations do you have in serial (and then in parallel)? >> >> > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) >> sub-precondtioner. For single MPI process, the calculation took 10 min and >> 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached >> using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. >> This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. >> Also, there is peak memory usage with 14.1 GB, which appears just before >> the start of the iterations. Parallel computation with 4 MPI processes took >> 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is >> about 22 GB. >> >> Does the number of iterates increase in parallel? Again, how many >> iterations do you have? >> >> > Are there ways to avoid decreasing of the convergence rate for bjacobi >> precondtioner in parallel mode? Does it make sense to use hierarchical or >> nested krylov methods with a local gmres solver (sub_pc_type gmres) and >> some sub-precondtioner (for example, sub_pc_type bjacobi)? >> >> bjacobi is only a one-level method, so you would not expect >> process-independent convergence rate for this kind of problem. 
If the >> coefficient variation is not too extreme, then I would expect GAMG (or some >> other smoothed aggregation package, perhaps -pc_type ml (you need >> --download-ml)) would work well with some tuning. >> >> If you have extremely high contrast coefficients you might need something >> with stronger coarse grids. If you can assemble so-called Neumann matrices ( >> https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then >> you could try the geneo scheme offered by PCHPDDM. >> >> > Is this peak memory usage expected for gamg preconditioner? is there >> any way to reduce it? >> >> I think that peak memory usage comes from building the coarse grids. Can >> you run with `-info` and grep for GAMG, this will provide some output that >> more expert GAMG users can interpret. >> >> Lawrence >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 3 07:11:35 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Sep 2021 08:11:35 -0400 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: On Fri, Sep 3, 2021 at 8:02 AM Mark Adams wrote: > > > On Fri, Sep 3, 2021 at 1:57 AM Viktor Nazdrachev > wrote: > >> Hello, Lawrence! >> Thank you for your response! >> >> I attached log files (txt files with convergence behavior and RAM usage >> log in separate txt files) and resulting table with convergence >> investigation data(xls). Data for main non-regular grid with 500K cells and >> heterogeneous properties are in 500K folder, whereas data for simple >> uniform 125K cells grid with constant properties are in 125K folder. >> >> >> >>* On 1 Sep 2021, at 09:42, **?????????** ??????** **> wrote:* >> >> >> >> >> >>* I have a 3D elasticity problem with heterogeneous properties.* >> >> > >> >> >What does your coefficient variation look like? How large is the contrast? >> >> >> >> Young modulus varies from 1 to 10 GPa, Poisson ratio varies from 0.3 to >> 0.44 and density ? from 1700 to 2600 kg/m^3. >> > > That is not too bad. Poorly shaped elements are the next thing to worry > about. Try to keep the aspect ratio below 10 if possible. > > >> >> >> >> >> >>* There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs).* >> >> >> >> >> >>* The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased.* >> >> > >> >> >How many iterations do you have in serial (and then in parallel)? >> >> >> >> Serial run is required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), parallel run with 4 MPI ? 680 iterations. >> >> >> >> I attached log files for all simulations (txt files with convergence >> behavior and RAM usage log in separate txt files) and resulting table with >> convergence/memory usage data(xls). Data for main non-regular grid with >> 500K cells and heterogeneous properties are in 500K folder, whereas data >> for simple uniform 125K cells grid with constant properties are in 125K >> folder. 
>> >> >> >> >> >> >>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.* >> >> > >> >> >Does the number of iterates increase in parallel? Again, how many iterations do you have? >> >> >> >> For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). >> >> > Again, do not use ICC. I am surprised to see such a large jump in > iteration count, but get ICC off the table. > > You will see variability in the iteration count with processor count with > GAMG. As much as 10% +-. Maybe more (random) variability , but usually less. > > You can decrease the memory a little, and the setup time a lot, by > aggressively coarsening, at the expense of higher iteration counts. It's a > balancing act. > > You can run with the defaults, add '-info', grep on GAMG and send the ~30 > lines of output if you want advice on parameters. > Can you send the output of -ksp_view -ksp_monitor_true_residual -ksp_converged_reason Thanks, Matt > Thanks, > Mark > > >> >> >> >> >> >> >> >>* Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)?* >> >> > >> >> >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. >> >> >> >> Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit >> integers (It is extremely necessary to perform computation on mesh with >> more than 10M cells). >> >> >> >> >> >> >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the geneo scheme offered by PCHPDDM. >> >> >> >> >> >> I found strange convergence behavior for HPDDM preconditioner. For 1 MPI >> process BFBCG solver did not converged >> (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation >> was successful (1018 to reach convergence, >> log_hpddm(bfbcg)_pchpddm_4_mpi.txt). >> >> But it should be mentioned that stiffness matrix was created in AIJ >> format (our default matrix format in program). >> >> Matrix conversion to MATIS format via MatConvert subroutine resulted in >> losing of convergence for both serial and parallel run. >> >> >> >>* Is this peak memory usage expected for gamg preconditioner? 
is there any way to reduce it?* >> >> > >> >> >I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. >> >> >> >> Thanks, I`ll try to use a strong threshold only for coarse grids. >> >> >> >> Kind regards, >> >> >> >> Viktor Nazdrachev >> >> >> >> R&D senior researcher >> >> >> >> Geosteering Technologies LLC >> >> >> >> >> >> >> >> >> >> ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell : >> >>> >>> >>> > On 1 Sep 2021, at 09:42, ????????? ?????? >>> wrote: >>> > >>> > I have a 3D elasticity problem with heterogeneous properties. >>> >>> What does your coefficient variation look like? How large is the >>> contrast? >>> >>> > There is unstructured grid with aspect ratio varied from 4 to 25. Zero >>> Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) >>> BCs are imposed on side faces. Gravity load is also accounted for. The grid >>> I use consists of 500k cells (which is approximately 1.6M of DOFs). >>> > >>> > The best performance and memory usage for single MPI process was >>> obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as >>> preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with >>> 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of >>> number of iterations required to achieve the same tolerance is >>> significantly increased. >>> >>> How many iterations do you have in serial (and then in parallel)? >>> >>> > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) >>> sub-precondtioner. For single MPI process, the calculation took 10 min and >>> 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached >>> using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. >>> This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. >>> Also, there is peak memory usage with 14.1 GB, which appears just before >>> the start of the iterations. Parallel computation with 4 MPI processes took >>> 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is >>> about 22 GB. >>> >>> Does the number of iterates increase in parallel? Again, how many >>> iterations do you have? >>> >>> > Are there ways to avoid decreasing of the convergence rate for bjacobi >>> precondtioner in parallel mode? Does it make sense to use hierarchical or >>> nested krylov methods with a local gmres solver (sub_pc_type gmres) and >>> some sub-precondtioner (for example, sub_pc_type bjacobi)? >>> >>> bjacobi is only a one-level method, so you would not expect >>> process-independent convergence rate for this kind of problem. If the >>> coefficient variation is not too extreme, then I would expect GAMG (or some >>> other smoothed aggregation package, perhaps -pc_type ml (you need >>> --download-ml)) would work well with some tuning. >>> >>> If you have extremely high contrast coefficients you might need >>> something with stronger coarse grids. If you can assemble so-called Neumann >>> matrices ( >>> https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then >>> you could try the geneo scheme offered by PCHPDDM. >>> >>> > Is this peak memory usage expected for gamg preconditioner? is there >>> any way to reduce it? >>> >>> I think that peak memory usage comes from building the coarse grids. Can >>> you run with `-info` and grep for GAMG, this will provide some output that >>> more expert GAMG users can interpret. 
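For reference, the diagnostics requested at different points in this thread can all be collected in one run by appending the following standard PETSc options to whatever is already being used (a sketch only, nothing here is specific to this code):

-ksp_view -ksp_monitor_true_residual -ksp_converged_reason -log_view -info

-ksp_view records the exact solver configuration, -ksp_monitor_true_residual shows whether the true residual stalls even when the preconditioned residual drops, -ksp_converged_reason reports why the iteration stopped, and -log_view together with the GAMG lines grepped from the -info output provides the timing and coarse-grid data that Pierre, Lawrence, and Mark are asking about.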
>>> >>> Lawrence >>> >>> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From numbersixvs at gmail.com Fri Sep 3 09:48:08 2021 From: numbersixvs at gmail.com (Viktor Nazdrachev) Date: Fri, 3 Sep 2021 17:48:08 +0300 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: Hello Mark and Matthew!

I attached log files for the serial and parallel cases and the corresponding information about the GAMG preconditioner (extracted with grep).

I should note that the global stiffness matrix is assembled in the code with the MatSetValues subroutine (not MatSetValuesBlocked):

!nnds - number of nodes
!dmn=3
call MatCreate(Petsc_Comm_World,Mat_K,ierr)
call MatSetFromOptions(Mat_K,ierr)
call MatSetSizes(Mat_K,Petsc_Decide,Petsc_Decide,n,n,ierr_m)
! ...
call MatMPIAIJSetPreallocation(Mat_K,0,dbw,0,obw,ierr)
! ...
call MatSetOption(Mat_K,Mat_New_Nonzero_Allocation_Err,Petsc_False,ierr)
! ...
do i=1,nels
   call FormLocalK(i,k,indx,"Kp")   ! find the local (element) stiffness matrix
   indx=indxmap(indx,2)             ! find global indices for the element DOFs
   call MatSetValues(Mat_K,ef_eldof,indx,ef_eldof,indx,k,Add_Values,ierr)
end do

But the nullspace vector was created using the VecSetBlockSize subroutine:

call VecCreate(Petsc_Comm_World,Vec_NullSpace,ierr)
call VecSetBlockSize(Vec_NullSpace,dmn,ierr)
call VecSetSizes(Vec_NullSpace,nnds*dmn,Petsc_Decide,ierr)
call VecSetUp(Vec_NullSpace,ierr)
call VecGetArrayF90(Vec_NullSpace,null_space,ierr)
! ...
call VecRestoreArrayF90(Vec_NullSpace,null_space,ierr)
call MatNullSpaceCreateRigidBody(Vec_NullSpace,matnull,ierr)
call MatSetNearNullSpace(Mat_K,matnull,ierr)

I suppose this can be one of the reasons for the slow GAMG convergence. So I also attached log files for a parallel run with the "pure" GAMG preconditioner.

Kind regards,

Viktor Nazdrachev

R&D senior researcher

Geosteering Technologies LLC

On Fri, 3 Sep 2021 at 15:11, Matthew Knepley wrote: > On Fri, Sep 3, 2021 at 8:02 AM Mark Adams wrote: >> On Fri, Sep 3, 2021 at 1:57 AM Viktor Nazdrachev wrote: >>> Hello, Lawrence! >>> Thank you for your response! >>> >>> I attached log files (txt files with convergence behavior and RAM usage >>> log in separate txt files) and resulting table with convergence >>> investigation data(xls). Data for main non-regular grid with 500K cells and >>> heterogeneous properties are in 500K folder, whereas data for simple >>> uniform 125K cells grid with constant properties are in 125K folder. >>> >>> >> On 1 Sep 2021, at 09:42, Viktor Nazdrachev wrote: >>> >> >>> >> I have a 3D elasticity problem with heterogeneous properties. >>> > >>> >What does your coefficient variation look like? How large is the contrast? >>> >>> Young's modulus varies from 1 to 10 GPa, Poisson's ratio varies from 0.3 to >>> 0.44 and density from 1700 to 2600 kg/m^3. >>> >> >> That is not too bad. Poorly shaped elements are the next thing to worry >> about. Try to keep the aspect ratio below 10 if possible. >> >>> >>* There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for.
The grid I use consists of 500k cells (which is approximately 1.6M of DOFs).* >>> >>> >> >>> >>> >>* The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased.* >>> >>> > >>> >>> >How many iterations do you have in serial (and then in parallel)? >>> >>> >>> >>> Serial run is required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), parallel run with 4 MPI ? 680 iterations. >>> >>> >>> >>> I attached log files for all simulations (txt files with convergence >>> behavior and RAM usage log in separate txt files) and resulting table with >>> convergence/memory usage data(xls). Data for main non-regular grid with >>> 500K cells and heterogeneous properties are in 500K folder, whereas data >>> for simple uniform 125K cells grid with constant properties are in 125K >>> folder. >>> >>> >>> >>> >>> >>> >>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.* >>> >>> > >>> >>> >Does the number of iterates increase in parallel? Again, how many iterations do you have? >>> >>> >>> >>> For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). >>> >>> >> Again, do not use ICC. I am surprised to see such a large jump in >> iteration count, but get ICC off the table. >> >> You will see variability in the iteration count with processor count with >> GAMG. As much as 10% +-. Maybe more (random) variability , but usually less. >> >> You can decrease the memory a little, and the setup time a lot, by >> aggressively coarsening, at the expense of higher iteration counts. It's a >> balancing act. >> >> You can run with the defaults, add '-info', grep on GAMG and send the ~30 >> lines of output if you want advice on parameters. >> > > Can you send the output of > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > Thanks, > > Matt > > >> Thanks, >> Mark >> >> >>> >>> >>> >>> >>> >>> >>> >>* Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)?* >>> >>> > >>> >>> >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. 
>>> >>> >>> >>> Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit >>> integers (It is extremely necessary to perform computation on mesh with >>> more than 10M cells). >>> >>> >>> >>> >>> >>> >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the geneo scheme offered by PCHPDDM. >>> >>> >>> >>> >>> >>> I found strange convergence behavior for HPDDM preconditioner. For 1 MPI >>> process BFBCG solver did not converged >>> (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation >>> was successful (1018 to reach convergence, >>> log_hpddm(bfbcg)_pchpddm_4_mpi.txt). >>> >>> But it should be mentioned that stiffness matrix was created in AIJ >>> format (our default matrix format in program). >>> >>> Matrix conversion to MATIS format via MatConvert subroutine resulted in >>> losing of convergence for both serial and parallel run. >>> >>> >>> >>* Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it?* >>> >>> > >>> >>> >I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. >>> >>> >>> >>> Thanks, I`ll try to use a strong threshold only for coarse grids. >>> >>> >>> >>> Kind regards, >>> >>> >>> >>> Viktor Nazdrachev >>> >>> >>> >>> R&D senior researcher >>> >>> >>> >>> Geosteering Technologies LLC >>> >>> >>> >>> >>> >>> >>> >>> >>> >>> ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell : >>> >>>> >>>> >>>> > On 1 Sep 2021, at 09:42, ????????? ?????? >>>> wrote: >>>> > >>>> > I have a 3D elasticity problem with heterogeneous properties. >>>> >>>> What does your coefficient variation look like? How large is the >>>> contrast? >>>> >>>> > There is unstructured grid with aspect ratio varied from 4 to 25. >>>> Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann >>>> (traction) BCs are imposed on side faces. Gravity load is also accounted >>>> for. The grid I use consists of 500k cells (which is approximately 1.6M of >>>> DOFs). >>>> > >>>> > The best performance and memory usage for single MPI process was >>>> obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as >>>> preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with >>>> 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of >>>> number of iterations required to achieve the same tolerance is >>>> significantly increased. >>>> >>>> How many iterations do you have in serial (and then in parallel)? >>>> >>>> > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) >>>> sub-precondtioner. For single MPI process, the calculation took 10 min and >>>> 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached >>>> using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. >>>> This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. >>>> Also, there is peak memory usage with 14.1 GB, which appears just before >>>> the start of the iterations. Parallel computation with 4 MPI processes took >>>> 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is >>>> about 22 GB. >>>> >>>> Does the number of iterates increase in parallel? Again, how many >>>> iterations do you have? 
>>>> >>>> > Are there ways to avoid decreasing of the convergence rate for >>>> bjacobi precondtioner in parallel mode? Does it make sense to use >>>> hierarchical or nested krylov methods with a local gmres solver >>>> (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type >>>> bjacobi)? >>>> >>>> bjacobi is only a one-level method, so you would not expect >>>> process-independent convergence rate for this kind of problem. If the >>>> coefficient variation is not too extreme, then I would expect GAMG (or some >>>> other smoothed aggregation package, perhaps -pc_type ml (you need >>>> --download-ml)) would work well with some tuning. >>>> >>>> If you have extremely high contrast coefficients you might need >>>> something with stronger coarse grids. If you can assemble so-called Neumann >>>> matrices ( >>>> https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then >>>> you could try the geneo scheme offered by PCHPDDM. >>>> >>>> > Is this peak memory usage expected for gamg preconditioner? is there >>>> any way to reduce it? >>>> >>>> I think that peak memory usage comes from building the coarse grids. >>>> Can you run with `-info` and grep for GAMG, this will provide some output >>>> that more expert GAMG users can interpret. >>>> >>>> Lawrence >>>> >>>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: true_residual_logs_and_greped.rar Type: application/octet-stream Size: 55998 bytes Desc: not available URL: From mfadams at lbl.gov Fri Sep 3 09:56:06 2021 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 3 Sep 2021 10:56:06 -0400 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: That does not seem to be an ASCII file. On Fri, Sep 3, 2021 at 10:48 AM Viktor Nazdrachev wrote: > Hello Mark and Matthew! > > > > I attached log files for serial and parallel cases and corresponding information about GAMG preconditioner (using grep). > > I have to notice, that assembling of global stiffness matrix in code was performed by MatSetValues subrotuine (not MatSetValuesBlocked) > > !nnds ? number of nodes > > !dmn=3 > > call MatCreate(Petsc_Comm_World,Mat_K,ierr) > > call MatSetFromOptions(Mat_K,ierr) > > call MatSetSizes(Mat_K,Petsc_Decide,Petsc_Decide,n,n,ierr_m) > > ? > > call MatMPIAIJSetPreallocation(Mat_K,0,dbw,0,obw,ierr) > > ? > > call MatSetOption(Mat_K,Mat_New_Nonzero_Allocation_Err,Petsc_False,ierr) > > ? > > do i=1,nels > > call FormLocalK(i,k,indx,"Kp") ! find local stiffness matrix > > indx=indxmap(indx,2) !find global indices for DOFs > > call MatSetValues(Mat_K,ef_eldof,indx,ef_eldof,indx,k,Add_Values,ierr) > > end do > > > > But nullspace vector was created using VecSetBlockSize subroutine. > > > > call VecCreate(Petsc_Comm_World,Vec_NullSpace,ierr) > > call VecSetBlockSize(Vec_NullSpace,dmn,ierr) > > call VecSetSizes(Vec_NullSpace,nnds*dmn,Petsc_Decide,ierr) > > call VecSetUp(Vec_NullSpace,ierr) > > call VecGetArrayF90(Vec_NullSpace,null_space,ierr) > > ? 
> > call VecRestoreArrayF90(Vec_NullSpace,null_space,ierr) > > call MatNullSpaceCreateRigidBody(Vec_NullSpace,matnull,ierr) > > call MatSetNearNullSpace(Mat_K,matnull,ierr) > > > > I suppose it can be one of the reasons of GAMG slow convergence. > > So I attached log files for parallel run with ?pure? GAMG precondtioner. > > > > > > Kind regards, > > > > Viktor Nazdrachev > > > > R&D senior researcher > > > > Geosteering Technologies LLC > > ??, 3 ????. 2021 ?. ? 15:11, Matthew Knepley : > >> On Fri, Sep 3, 2021 at 8:02 AM Mark Adams wrote: >> >>> >>> >>> On Fri, Sep 3, 2021 at 1:57 AM Viktor Nazdrachev >>> wrote: >>> >>>> Hello, Lawrence! >>>> Thank you for your response! >>>> >>>> I attached log files (txt files with convergence behavior and RAM usage >>>> log in separate txt files) and resulting table with convergence >>>> investigation data(xls). Data for main non-regular grid with 500K cells and >>>> heterogeneous properties are in 500K folder, whereas data for simple >>>> uniform 125K cells grid with constant properties are in 125K folder. >>>> >>>> >>>> >>* On 1 Sep 2021, at 09:42, **?????????** ??????** **> wrote:* >>>> >>>> >> >>>> >>>> >>* I have a 3D elasticity problem with heterogeneous properties.* >>>> >>>> > >>>> >>>> >What does your coefficient variation look like? How large is the contrast? >>>> >>>> >>>> >>>> Young modulus varies from 1 to 10 GPa, Poisson ratio varies from 0.3 to >>>> 0.44 and density ? from 1700 to 2600 kg/m^3. >>>> >>> >>> That is not too bad. Poorly shaped elements are the next thing to worry >>> about. Try to keep the aspect ratio below 10 if possible. >>> >>> >>>> >>>> >>>> >>>> >>>> >>* There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs).* >>>> >>>> >> >>>> >>>> >>* The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased.* >>>> >>>> > >>>> >>>> >How many iterations do you have in serial (and then in parallel)? >>>> >>>> >>>> >>>> Serial run is required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), parallel run with 4 MPI ? 680 iterations. >>>> >>>> >>>> >>>> I attached log files for all simulations (txt files with convergence >>>> behavior and RAM usage log in separate txt files) and resulting table with >>>> convergence/memory usage data(xls). Data for main non-regular grid with >>>> 500K cells and heterogeneous properties are in 500K folder, whereas data >>>> for simple uniform 125K cells grid with constant properties are in 125K >>>> folder. >>>> >>>> >>>> >>>> >>>> >>>> >>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. 
Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.* >>>> >>>> > >>>> >>>> >Does the number of iterates increase in parallel? Again, how many iterations do you have? >>>> >>>> >>>> >>>> For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). >>>> >>>> >>> Again, do not use ICC. I am surprised to see such a large jump in >>> iteration count, but get ICC off the table. >>> >>> You will see variability in the iteration count with processor count >>> with GAMG. As much as 10% +-. Maybe more (random) variability , but usually >>> less. >>> >>> You can decrease the memory a little, and the setup time a lot, by >>> aggressively coarsening, at the expense of higher iteration counts. It's a >>> balancing act. >>> >>> You can run with the defaults, add '-info', grep on GAMG and send the >>> ~30 lines of output if you want advice on parameters. >>> >> >> Can you send the output of >> >> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Mark >>> >>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>* Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)?* >>>> >>>> > >>>> >>>> >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. >>>> >>>> >>>> >>>> Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit >>>> integers (It is extremely necessary to perform computation on mesh with >>>> more than 10M cells). >>>> >>>> >>>> >>>> >>>> >>>> >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the geneo scheme offered by PCHPDDM. >>>> >>>> >>>> >>>> >>>> >>>> I found strange convergence behavior for HPDDM preconditioner. For 1 >>>> MPI process BFBCG solver did not converged >>>> (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation >>>> was successful (1018 to reach convergence, >>>> log_hpddm(bfbcg)_pchpddm_4_mpi.txt). >>>> >>>> But it should be mentioned that stiffness matrix was created in AIJ >>>> format (our default matrix format in program). >>>> >>>> Matrix conversion to MATIS format via MatConvert subroutine resulted >>>> in losing of convergence for both serial and parallel run. >>>> >>>> >>>> >>* Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it?* >>>> >>>> > >>>> >>>> >I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. >>>> >>>> >>>> >>>> Thanks, I`ll try to use a strong threshold only for coarse grids. 
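GAMG keeps one threshold value per level plus a scaling factor (this is what -ksp_view reports as "Threshold for dropping small values in graph on each level"), so a strong threshold on the coarse grids only can be expressed directly on the command line. A sketch, with purely illustrative numbers:

-pc_gamg_threshold 0.0,0.02,0.05 -pc_gamg_threshold_scale 2.0

-pc_gamg_threshold takes a comma-separated list, one value per level starting from the finest, and -pc_gamg_threshold_scale multiplies the last given value on each remaining coarser level, so here the fine-grid graph is left unfiltered while the coarser levels drop progressively more weak couplings. Whether that actually helps the iteration count or the setup-time memory peak is exactly what the -info/GAMG output should reveal.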
>>>> >>>> >>>> >>>> Kind regards, >>>> >>>> >>>> >>>> Viktor Nazdrachev >>>> >>>> >>>> >>>> R&D senior researcher >>>> >>>> >>>> >>>> Geosteering Technologies LLC >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell : >>>> >>>>> >>>>> >>>>> > On 1 Sep 2021, at 09:42, ????????? ?????? >>>>> wrote: >>>>> > >>>>> > I have a 3D elasticity problem with heterogeneous properties. >>>>> >>>>> What does your coefficient variation look like? How large is the >>>>> contrast? >>>>> >>>>> > There is unstructured grid with aspect ratio varied from 4 to 25. >>>>> Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann >>>>> (traction) BCs are imposed on side faces. Gravity load is also accounted >>>>> for. The grid I use consists of 500k cells (which is approximately 1.6M of >>>>> DOFs). >>>>> > >>>>> > The best performance and memory usage for single MPI process was >>>>> obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as >>>>> preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with >>>>> 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of >>>>> number of iterations required to achieve the same tolerance is >>>>> significantly increased. >>>>> >>>>> How many iterations do you have in serial (and then in parallel)? >>>>> >>>>> > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) >>>>> sub-precondtioner. For single MPI process, the calculation took 10 min and >>>>> 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached >>>>> using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. >>>>> This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. >>>>> Also, there is peak memory usage with 14.1 GB, which appears just before >>>>> the start of the iterations. Parallel computation with 4 MPI processes took >>>>> 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is >>>>> about 22 GB. >>>>> >>>>> Does the number of iterates increase in parallel? Again, how many >>>>> iterations do you have? >>>>> >>>>> > Are there ways to avoid decreasing of the convergence rate for >>>>> bjacobi precondtioner in parallel mode? Does it make sense to use >>>>> hierarchical or nested krylov methods with a local gmres solver >>>>> (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type >>>>> bjacobi)? >>>>> >>>>> bjacobi is only a one-level method, so you would not expect >>>>> process-independent convergence rate for this kind of problem. If the >>>>> coefficient variation is not too extreme, then I would expect GAMG (or some >>>>> other smoothed aggregation package, perhaps -pc_type ml (you need >>>>> --download-ml)) would work well with some tuning. >>>>> >>>>> If you have extremely high contrast coefficients you might need >>>>> something with stronger coarse grids. If you can assemble so-called Neumann >>>>> matrices ( >>>>> https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then >>>>> you could try the geneo scheme offered by PCHPDDM. >>>>> >>>>> > Is this peak memory usage expected for gamg preconditioner? is there >>>>> any way to reduce it? >>>>> >>>>> I think that peak memory usage comes from building the coarse grids. >>>>> Can you run with `-info` and grep for GAMG, this will provide some output >>>>> that more expert GAMG users can interpret. 
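On the MATIS route quoted above: the GenEO coarse space in PCHPDDM relies on the unassembled local (Neumann) matrices, and a MatConvert from an already assembled AIJ matrix cannot recover those, so the conversion Viktor tried is not expected to behave like a true Neumann assembly. A minimal, untested sketch of assembling the stiffness matrix directly in MATIS format, in the same Fortran style as the snippets elsewhere in the thread (map, l2g, ndof_loc and indx_loc are placeholder names for the local-to-global DOF data, and the preallocation numbers are illustrative):

! l2g(1:ndof_loc) holds the global DOF number of every DOF touched by local elements
call ISLocalToGlobalMappingCreate(Petsc_Comm_World,1,ndof_loc,l2g,Petsc_Copy_Values,map,ierr)
call MatCreate(Petsc_Comm_World,Mat_K,ierr)
call MatSetSizes(Mat_K,Petsc_Decide,Petsc_Decide,n,n,ierr)
call MatSetType(Mat_K,MATIS,ierr)
call MatSetLocalToGlobalMapping(Mat_K,map,map,ierr)
call MatISSetPreallocation(Mat_K,81,Petsc_Null_Integer,27,Petsc_Null_Integer,ierr)
do i=1,nels
   call FormLocalK(i,k,indx_loc,"Kp")   ! element stiffness with subdomain-local DOF numbers (placeholder)
   call MatSetValuesLocal(Mat_K,ef_eldof,indx_loc,ef_eldof,indx_loc,k,Add_Values,ierr)
end do
call MatAssemblyBegin(Mat_K,Mat_Final_Assembly,ierr)
call MatAssemblyEnd(Mat_K,Mat_Final_Assembly,ierr)

Because each process only inserts its own element contributions through the local numbering, the per-subdomain matrices stay unassembled, which is what the GenEO construction in PCHPDDM is designed to exploit.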
>>>>> >>>>> Lawrence >>>>> >>>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 3 10:07:05 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Sep 2021 11:07:05 -0400 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: It is a RAR since this is Windows :) Viktor, your system looks singular. Is it possible that you somehow have zero on the diagonal? That might make the SOR a problem. You could replace that with Jacobi using -mg_levels_pc_type jacobi 0 KSP Residual norm 2.980664994991e+02 0 KSP preconditioned resid norm 2.980664994991e+02 true resid norm 7.983356882620e+11 ||r(i)||/||b|| 1.000000000000e+00 1 KSP Residual norm 1.650358505966e+01 1 KSP preconditioned resid norm 1.650358505966e+01 true resid norm 4.601793132543e+12 ||r(i)||/||b|| 5.764233267037e+00 2 KSP Residual norm 2.086911345353e+01 2 KSP preconditioned resid norm 2.086911345353e+01 true resid norm 1.258153657657e+12 ||r(i)||/||b|| 1.575970705250e+00 3 KSP Residual norm 1.909137523120e+01 3 KSP preconditioned resid norm 1.909137523120e+01 true resid norm 2.179275269000e+12 ||r(i)||/||b|| 2.729773077969e+00 Mark, here is the solver KSP Object: 1 MPI processes type: cg maximum iterations=100000, initial guess is zero tolerances: relative=1e-08, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = 0. 0. 0. 0. Threshold scaling factor for each level not specified = 1. AGG specific options Symmetric graph false Number of levels to square graph 1 Number smoothing steps 1 Complexity: grid = 1.0042 Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi number of blocks = 1 Local solver information for first block is in the following KSP and PC objects on rank 0: Use -mg_coarse_ksp_view ::ascii_info_detail to display information for all blocks KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd factor fill ratio given 5., needed 1.19444 Factored matrix follows: Mat Object: 1 MPI processes type: seqaij rows=36, cols=36 package used to perform factorization: petsc total: nonzeros=774, allocated nonzeros=774 using I-node routines: found 22 nodes, limit used is 5 linear system matrix = precond matrix: Mat Object: (mg_coarse_sub_) 1 MPI processes type: seqaij rows=36, cols=36 total: nonzeros=648, allocated nonzeros=648 total number of mallocs used during MatSetValues calls=0 not using I-node routines linear system matrix = precond matrix: Mat Object: (mg_coarse_sub_) 1 MPI processes type: seqaij rows=36, cols=36 total: nonzeros=648, allocated nonzeros=648 total number of mallocs used during MatSetValues calls=0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0997354, max = 1.09709 eigenvalues estimate via gmres min 0.00372245, max 0.997354 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_1_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=902, cols=902 total: nonzeros=66660, allocated nonzeros=66660 total number of mallocs used during MatSetValues calls=0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0994525, max = 1.09398 eigenvalues estimate via gmres min 0.0303095, max 0.994525 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_2_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. 
linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=12043, cols=12043 total: nonzeros=455611, allocated nonzeros=455611 total number of mallocs used during MatSetValues calls=0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0.0992144, max = 1.09136 eigenvalues estimate via gmres min 0.0222691, max 0.992144 eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_3_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1600200, cols=1600200 total: nonzeros=124439742, allocated nonzeros=129616200 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 533400 nodes, limit used is 5 Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=1600200, cols=1600200 total: nonzeros=124439742, allocated nonzeros=129616200 total number of mallocs used during MatSetValues calls=0 using I-node routines: found 533400 nodes, limit used is 5 Thanks, Matt On Fri, Sep 3, 2021 at 10:56 AM Mark Adams wrote: > That does not seem to be an ASCII file. > > On Fri, Sep 3, 2021 at 10:48 AM Viktor Nazdrachev > wrote: > >> Hello Mark and Matthew! >> >> >> >> I attached log files for serial and parallel cases and corresponding information about GAMG preconditioner (using grep). >> >> I have to notice, that assembling of global stiffness matrix in code was performed by MatSetValues subrotuine (not MatSetValuesBlocked) >> >> !nnds ? number of nodes >> >> !dmn=3 >> >> call MatCreate(Petsc_Comm_World,Mat_K,ierr) >> >> call MatSetFromOptions(Mat_K,ierr) >> >> call MatSetSizes(Mat_K,Petsc_Decide,Petsc_Decide,n,n,ierr_m) >> >> ? >> >> call MatMPIAIJSetPreallocation(Mat_K,0,dbw,0,obw,ierr) >> >> ? >> >> call MatSetOption(Mat_K,Mat_New_Nonzero_Allocation_Err,Petsc_False,ierr) >> >> ? >> >> do i=1,nels >> >> call FormLocalK(i,k,indx,"Kp") ! find local stiffness matrix >> >> indx=indxmap(indx,2) !find global indices for DOFs >> >> call MatSetValues(Mat_K,ef_eldof,indx,ef_eldof,indx,k,Add_Values,ierr) >> >> end do >> >> >> >> But nullspace vector was created using VecSetBlockSize subroutine. >> >> >> >> call VecCreate(Petsc_Comm_World,Vec_NullSpace,ierr) >> >> call VecSetBlockSize(Vec_NullSpace,dmn,ierr) >> >> call VecSetSizes(Vec_NullSpace,nnds*dmn,Petsc_Decide,ierr) >> >> call VecSetUp(Vec_NullSpace,ierr) >> >> call VecGetArrayF90(Vec_NullSpace,null_space,ierr) >> >> ? 
>> >> call VecRestoreArrayF90(Vec_NullSpace,null_space,ierr) >> >> call MatNullSpaceCreateRigidBody(Vec_NullSpace,matnull,ierr) >> >> call MatSetNearNullSpace(Mat_K,matnull,ierr) >> >> >> >> I suppose it can be one of the reasons of GAMG slow convergence. >> >> So I attached log files for parallel run with ?pure? GAMG precondtioner. >> >> >> >> >> >> Kind regards, >> >> >> >> Viktor Nazdrachev >> >> >> >> R&D senior researcher >> >> >> >> Geosteering Technologies LLC >> >> ??, 3 ????. 2021 ?. ? 15:11, Matthew Knepley : >> >>> On Fri, Sep 3, 2021 at 8:02 AM Mark Adams wrote: >>> >>>> >>>> >>>> On Fri, Sep 3, 2021 at 1:57 AM Viktor Nazdrachev >>>> wrote: >>>> >>>>> Hello, Lawrence! >>>>> Thank you for your response! >>>>> >>>>> I attached log files (txt files with convergence behavior and RAM >>>>> usage log in separate txt files) and resulting table with convergence >>>>> investigation data(xls). Data for main non-regular grid with 500K cells and >>>>> heterogeneous properties are in 500K folder, whereas data for simple >>>>> uniform 125K cells grid with constant properties are in 125K folder. >>>>> >>>>> >>>>> >>* On 1 Sep 2021, at 09:42, **?????????** ??????** **> wrote:* >>>>> >>>>> >> >>>>> >>>>> >>* I have a 3D elasticity problem with heterogeneous properties.* >>>>> >>>>> > >>>>> >>>>> >What does your coefficient variation look like? How large is the contrast? >>>>> >>>>> >>>>> >>>>> Young modulus varies from 1 to 10 GPa, Poisson ratio varies from 0.3 >>>>> to 0.44 and density ? from 1700 to 2600 kg/m^3. >>>>> >>>> >>>> That is not too bad. Poorly shaped elements are the next thing to worry >>>> about. Try to keep the aspect ratio below 10 if possible. >>>> >>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>* There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs).* >>>>> >>>>> >> >>>>> >>>>> >>* The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased.* >>>>> >>>>> > >>>>> >>>>> >How many iterations do you have in serial (and then in parallel)? >>>>> >>>>> >>>>> >>>>> Serial run is required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), parallel run with 4 MPI ? 680 iterations. >>>>> >>>>> >>>>> >>>>> I attached log files for all simulations (txt files with convergence >>>>> behavior and RAM usage log in separate txt files) and resulting table with >>>>> convergence/memory usage data(xls). Data for main non-regular grid with >>>>> 500K cells and heterogeneous properties are in 500K folder, whereas data >>>>> for simple uniform 125K cells grid with constant properties are in 125K >>>>> folder. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. 
Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.* >>>>> >>>>> > >>>>> >>>>> >Does the number of iterates increase in parallel? Again, how many iterations do you have? >>>>> >>>>> >>>>> >>>>> For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). >>>>> >>>>> >>>> Again, do not use ICC. I am surprised to see such a large jump in >>>> iteration count, but get ICC off the table. >>>> >>>> You will see variability in the iteration count with processor count >>>> with GAMG. As much as 10% +-. Maybe more (random) variability , but usually >>>> less. >>>> >>>> You can decrease the memory a little, and the setup time a lot, by >>>> aggressively coarsening, at the expense of higher iteration counts. It's a >>>> balancing act. >>>> >>>> You can run with the defaults, add '-info', grep on GAMG and send the >>>> ~30 lines of output if you want advice on parameters. >>>> >>> >>> Can you send the output of >>> >>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>> >>> Thanks, >>> >>> Matt >>> >>> >>>> Thanks, >>>> Mark >>>> >>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>* Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)?* >>>>> >>>>> > >>>>> >>>>> >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. >>>>> >>>>> >>>>> >>>>> Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit >>>>> integers (It is extremely necessary to perform computation on mesh with >>>>> more than 10M cells). >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the geneo scheme offered by PCHPDDM. >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> I found strange convergence behavior for HPDDM preconditioner. For 1 >>>>> MPI process BFBCG solver did not converged >>>>> (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation >>>>> was successful (1018 to reach convergence, >>>>> log_hpddm(bfbcg)_pchpddm_4_mpi.txt). >>>>> >>>>> But it should be mentioned that stiffness matrix was created in AIJ >>>>> format (our default matrix format in program). >>>>> >>>>> Matrix conversion to MATIS format via MatConvert subroutine resulted >>>>> in losing of convergence for both serial and parallel run. >>>>> >>>>> >>>>> >>* Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it?* >>>>> >>>>> > >>>>> >>>>> >I think that peak memory usage comes from building the coarse grids. 
Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. >>>>> >>>>> >>>>> >>>>> Thanks, I`ll try to use a strong threshold only for coarse grids. >>>>> >>>>> >>>>> >>>>> Kind regards, >>>>> >>>>> >>>>> >>>>> Viktor Nazdrachev >>>>> >>>>> >>>>> >>>>> R&D senior researcher >>>>> >>>>> >>>>> >>>>> Geosteering Technologies LLC >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell : >>>>> >>>>>> >>>>>> >>>>>> > On 1 Sep 2021, at 09:42, ????????? ?????? >>>>>> wrote: >>>>>> > >>>>>> > I have a 3D elasticity problem with heterogeneous properties. >>>>>> >>>>>> What does your coefficient variation look like? How large is the >>>>>> contrast? >>>>>> >>>>>> > There is unstructured grid with aspect ratio varied from 4 to 25. >>>>>> Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann >>>>>> (traction) BCs are imposed on side faces. Gravity load is also accounted >>>>>> for. The grid I use consists of 500k cells (which is approximately 1.6M of >>>>>> DOFs). >>>>>> > >>>>>> > The best performance and memory usage for single MPI process was >>>>>> obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as >>>>>> preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with >>>>>> 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of >>>>>> number of iterations required to achieve the same tolerance is >>>>>> significantly increased. >>>>>> >>>>>> How many iterations do you have in serial (and then in parallel)? >>>>>> >>>>>> > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) >>>>>> sub-precondtioner. For single MPI process, the calculation took 10 min and >>>>>> 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached >>>>>> using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. >>>>>> This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. >>>>>> Also, there is peak memory usage with 14.1 GB, which appears just before >>>>>> the start of the iterations. Parallel computation with 4 MPI processes took >>>>>> 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is >>>>>> about 22 GB. >>>>>> >>>>>> Does the number of iterates increase in parallel? Again, how many >>>>>> iterations do you have? >>>>>> >>>>>> > Are there ways to avoid decreasing of the convergence rate for >>>>>> bjacobi precondtioner in parallel mode? Does it make sense to use >>>>>> hierarchical or nested krylov methods with a local gmres solver >>>>>> (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type >>>>>> bjacobi)? >>>>>> >>>>>> bjacobi is only a one-level method, so you would not expect >>>>>> process-independent convergence rate for this kind of problem. If the >>>>>> coefficient variation is not too extreme, then I would expect GAMG (or some >>>>>> other smoothed aggregation package, perhaps -pc_type ml (you need >>>>>> --download-ml)) would work well with some tuning. >>>>>> >>>>>> If you have extremely high contrast coefficients you might need >>>>>> something with stronger coarse grids. If you can assemble so-called Neumann >>>>>> matrices ( >>>>>> https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) >>>>>> then you could try the geneo scheme offered by PCHPDDM. >>>>>> >>>>>> > Is this peak memory usage expected for gamg preconditioner? is >>>>>> there any way to reduce it? 
>>>>>> >>>>>> I think that peak memory usage comes from building the coarse grids. >>>>>> Can you run with `-info` and grep for GAMG, this will provide some output >>>>>> that more expert GAMG users can interpret. >>>>>> >>>>>> Lawrence >>>>>> >>>>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Sep 3 11:18:59 2021 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 3 Sep 2021 12:18:59 -0400 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: The block size has not been set to 3. You need to use MatSetBlockSize. I assume the 6 rigid body modes were not added either. This will help problems where the deformation has some rotation to it. Last time I checked SOR did not check for zero on the diagonal. Jacobi does. It is possible, maybe, that SA can give a singular coarse grid with point coarsening, but I don't think so. You might run a "solve" with no preconditioner and ask for the eigen estimates to get an idea of the spectrum of the system. The eigen estimator says this matrix is very well conditioned. Is there a mass term here? On Fri, Sep 3, 2021 at 11:07 AM Matthew Knepley wrote: > It is a RAR since this is Windows :) > > Viktor, your system looks singular. Is it possible that you somehow have > zero on the diagonal? That might make the > SOR a problem. You could replace that with Jacobi using > > -mg_levels_pc_type jacobi > > 0 KSP Residual norm 2.980664994991e+02 > 0 KSP preconditioned resid norm 2.980664994991e+02 true resid norm > 7.983356882620e+11 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP Residual norm 1.650358505966e+01 > 1 KSP preconditioned resid norm 1.650358505966e+01 true resid norm > 4.601793132543e+12 ||r(i)||/||b|| 5.764233267037e+00 > 2 KSP Residual norm 2.086911345353e+01 > 2 KSP preconditioned resid norm 2.086911345353e+01 true resid norm > 1.258153657657e+12 ||r(i)||/||b|| 1.575970705250e+00 > 3 KSP Residual norm 1.909137523120e+01 > 3 KSP preconditioned resid norm 1.909137523120e+01 true resid norm > 2.179275269000e+12 ||r(i)||/||b|| 2.729773077969e+00 > > Mark, here is the solver > > KSP Object: 1 MPI processes > type: cg > maximum iterations=100000, initial guess is zero > tolerances: relative=1e-08, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=4 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each level = 0. > 0. 0. 0. > Threshold scaling factor for each level not specified = 1. 
> AGG specific options > Symmetric graph false > Number of levels to square graph 1 > Number smoothing steps 1 > Complexity: grid = 1.0042 > Coarse grid solver -- level ------------------------------- > KSP Object: (mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_coarse_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solver information for first block is in the following KSP > and PC objects on rank 0: > Use -mg_coarse_ksp_view ::ascii_info_detail to display information > for all blocks > KSP Object: (mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_coarse_sub_) 1 MPI processes > type: lu > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: nd > factor fill ratio given 5., needed 1.19444 > Factored matrix follows: > Mat Object: 1 MPI processes > type: seqaij > rows=36, cols=36 > package used to perform factorization: petsc > total: nonzeros=774, allocated nonzeros=774 > using I-node routines: found 22 nodes, limit used is 5 > linear system matrix = precond matrix: > Mat Object: (mg_coarse_sub_) 1 MPI processes > type: seqaij > rows=36, cols=36 > total: nonzeros=648, allocated nonzeros=648 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: (mg_coarse_sub_) 1 MPI processes > type: seqaij > rows=36, cols=36 > total: nonzeros=648, allocated nonzeros=648 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (mg_levels_1_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0.0997354, max = 1.09709 > eigenvalues estimate via gmres min 0.00372245, max 0.997354 > eigenvalues estimated using gmres with translations [0. 0.1; 0. > 1.1] > KSP Object: (mg_levels_1_esteig_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. 
> linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=902, cols=902 > total: nonzeros=66660, allocated nonzeros=66660 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (mg_levels_2_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0.0994525, max = 1.09398 > eigenvalues estimate via gmres min 0.0303095, max 0.994525 > eigenvalues estimated using gmres with translations [0. 0.1; 0. > 1.1] > KSP Object: (mg_levels_2_esteig_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_levels_2_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=12043, cols=12043 > total: nonzeros=455611, allocated nonzeros=455611 > total number of mallocs used during MatSetValues calls=0 > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 3 ------------------------------- > KSP Object: (mg_levels_3_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0.0992144, max = 1.09136 > eigenvalues estimate via gmres min 0.0222691, max 0.992144 > eigenvalues estimated using gmres with translations [0. 0.1; 0. > 1.1] > KSP Object: (mg_levels_3_esteig_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using PRECONDITIONED norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_levels_3_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. 
> linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1600200, cols=1600200 > total: nonzeros=124439742, allocated nonzeros=129616200 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 533400 nodes, limit used is 5 > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=1600200, cols=1600200 > total: nonzeros=124439742, allocated nonzeros=129616200 > total number of mallocs used during MatSetValues calls=0 > using I-node routines: found 533400 nodes, limit used is 5 > > Thanks, > > Matt > > On Fri, Sep 3, 2021 at 10:56 AM Mark Adams wrote: > >> That does not seem to be an ASCII file. >> >> On Fri, Sep 3, 2021 at 10:48 AM Viktor Nazdrachev >> wrote: >> >>> Hello Mark and Matthew! >>> >>> >>> >>> I attached log files for serial and parallel cases and corresponding information about GAMG preconditioner (using grep). >>> >>> I have to notice, that assembling of global stiffness matrix in code was performed by MatSetValues subrotuine (not MatSetValuesBlocked) >>> >>> !nnds ? number of nodes >>> >>> !dmn=3 >>> >>> call MatCreate(Petsc_Comm_World,Mat_K,ierr) >>> >>> call MatSetFromOptions(Mat_K,ierr) >>> >>> call MatSetSizes(Mat_K,Petsc_Decide,Petsc_Decide,n,n,ierr_m) >>> >>> ? >>> >>> call MatMPIAIJSetPreallocation(Mat_K,0,dbw,0,obw,ierr) >>> >>> ? >>> >>> call MatSetOption(Mat_K,Mat_New_Nonzero_Allocation_Err,Petsc_False,ierr) >>> >>> ? >>> >>> do i=1,nels >>> >>> call FormLocalK(i,k,indx,"Kp") ! find local stiffness matrix >>> >>> indx=indxmap(indx,2) !find global indices for DOFs >>> >>> call MatSetValues(Mat_K,ef_eldof,indx,ef_eldof,indx,k,Add_Values,ierr) >>> >>> end do >>> >>> >>> >>> But nullspace vector was created using VecSetBlockSize subroutine. >>> >>> >>> >>> call VecCreate(Petsc_Comm_World,Vec_NullSpace,ierr) >>> >>> call VecSetBlockSize(Vec_NullSpace,dmn,ierr) >>> >>> call VecSetSizes(Vec_NullSpace,nnds*dmn,Petsc_Decide,ierr) >>> >>> call VecSetUp(Vec_NullSpace,ierr) >>> >>> call VecGetArrayF90(Vec_NullSpace,null_space,ierr) >>> >>> ? >>> >>> call VecRestoreArrayF90(Vec_NullSpace,null_space,ierr) >>> >>> call MatNullSpaceCreateRigidBody(Vec_NullSpace,matnull,ierr) >>> >>> call MatSetNearNullSpace(Mat_K,matnull,ierr) >>> >>> >>> >>> I suppose it can be one of the reasons of GAMG slow convergence. >>> >>> So I attached log files for parallel run with ?pure? GAMG precondtioner. >>> >>> >>> >>> >>> >>> Kind regards, >>> >>> >>> >>> Viktor Nazdrachev >>> >>> >>> >>> R&D senior researcher >>> >>> >>> >>> Geosteering Technologies LLC >>> >>> ??, 3 ????. 2021 ?. ? 15:11, Matthew Knepley : >>> >>>> On Fri, Sep 3, 2021 at 8:02 AM Mark Adams wrote: >>>> >>>>> >>>>> >>>>> On Fri, Sep 3, 2021 at 1:57 AM Viktor Nazdrachev < >>>>> numbersixvs at gmail.com> wrote: >>>>> >>>>>> Hello, Lawrence! >>>>>> Thank you for your response! >>>>>> >>>>>> I attached log files (txt files with convergence behavior and RAM >>>>>> usage log in separate txt files) and resulting table with convergence >>>>>> investigation data(xls). Data for main non-regular grid with 500K cells and >>>>>> heterogeneous properties are in 500K folder, whereas data for simple >>>>>> uniform 125K cells grid with constant properties are in 125K folder. 
>>>>>> >>>>>> >>>>>> >>* On 1 Sep 2021, at 09:42, **?????????** ??????** **> wrote:* >>>>>> >>>>>> >> >>>>>> >>>>>> >>* I have a 3D elasticity problem with heterogeneous properties.* >>>>>> >>>>>> > >>>>>> >>>>>> >What does your coefficient variation look like? How large is the contrast? >>>>>> >>>>>> >>>>>> >>>>>> Young modulus varies from 1 to 10 GPa, Poisson ratio varies from 0.3 >>>>>> to 0.44 and density ? from 1700 to 2600 kg/m^3. >>>>>> >>>>> >>>>> That is not too bad. Poorly shaped elements are the next thing to >>>>> worry about. Try to keep the aspect ratio below 10 if possible. >>>>> >>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>* There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs).* >>>>>> >>>>>> >> >>>>>> >>>>>> >>* The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased.* >>>>>> >>>>>> > >>>>>> >>>>>> >How many iterations do you have in serial (and then in parallel)? >>>>>> >>>>>> >>>>>> >>>>>> Serial run is required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), parallel run with 4 MPI ? 680 iterations. >>>>>> >>>>>> >>>>>> >>>>>> I attached log files for all simulations (txt files with convergence >>>>>> behavior and RAM usage log in separate txt files) and resulting table with >>>>>> convergence/memory usage data(xls). Data for main non-regular grid with >>>>>> 500K cells and heterogeneous properties are in 500K folder, whereas data >>>>>> for simple uniform 125K cells grid with constant properties are in 125K >>>>>> folder. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>* I`ve also tried PCGAMG (agg) preconditioner with IC**?** (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB.* >>>>>> >>>>>> > >>>>>> >>>>>> >Does the number of iterates increase in parallel? Again, how many iterations do you have? >>>>>> >>>>>> >>>>>> >>>>>> For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). >>>>>> >>>>>> >>>>> Again, do not use ICC. I am surprised to see such a large jump in >>>>> iteration count, but get ICC off the table. >>>>> >>>>> You will see variability in the iteration count with processor count >>>>> with GAMG. As much as 10% +-. Maybe more (random) variability , but usually >>>>> less. 
>>>>> >>>>> You can decrease the memory a little, and the setup time a lot, by >>>>> aggressively coarsening, at the expense of higher iteration counts. It's a >>>>> balancing act. >>>>> >>>>> You can run with the defaults, add '-info', grep on GAMG and send the >>>>> ~30 lines of output if you want advice on parameters. >>>>> >>>> >>>> Can you send the output of >>>> >>>> -ksp_view -ksp_monitor_true_residual -ksp_converged_reason >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>>> Thanks, >>>>> Mark >>>>> >>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>* Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)?* >>>>>> >>>>>> > >>>>>> >>>>>> >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. >>>>>> >>>>>> >>>>>> >>>>>> Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit >>>>>> integers (It is extremely necessary to perform computation on mesh with >>>>>> more than 10M cells). >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) then you could try the geneo scheme offered by PCHPDDM. >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> I found strange convergence behavior for HPDDM preconditioner. For 1 >>>>>> MPI process BFBCG solver did not converged >>>>>> (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation >>>>>> was successful (1018 to reach convergence, >>>>>> log_hpddm(bfbcg)_pchpddm_4_mpi.txt). >>>>>> >>>>>> But it should be mentioned that stiffness matrix was created in AIJ >>>>>> format (our default matrix format in program). >>>>>> >>>>>> Matrix conversion to MATIS format via MatConvert subroutine resulted >>>>>> in losing of convergence for both serial and parallel run. >>>>>> >>>>>> >>>>>> >>* Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it?* >>>>>> >>>>>> > >>>>>> >>>>>> >I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. >>>>>> >>>>>> >>>>>> >>>>>> Thanks, I`ll try to use a strong threshold only for coarse grids. >>>>>> >>>>>> >>>>>> >>>>>> Kind regards, >>>>>> >>>>>> >>>>>> >>>>>> Viktor Nazdrachev >>>>>> >>>>>> >>>>>> >>>>>> R&D senior researcher >>>>>> >>>>>> >>>>>> >>>>>> Geosteering Technologies LLC >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> >>>>>> ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell : >>>>>> >>>>>>> >>>>>>> >>>>>>> > On 1 Sep 2021, at 09:42, ????????? ?????? >>>>>>> wrote: >>>>>>> > >>>>>>> > I have a 3D elasticity problem with heterogeneous properties. >>>>>>> >>>>>>> What does your coefficient variation look like? How large is the >>>>>>> contrast? >>>>>>> >>>>>>> > There is unstructured grid with aspect ratio varied from 4 to 25. >>>>>>> Zero Dirichlet BCs are imposed on bottom face of mesh. 
Also, Neumann >>>>>>> (traction) BCs are imposed on side faces. Gravity load is also accounted >>>>>>> for. The grid I use consists of 500k cells (which is approximately 1.6M of >>>>>>> DOFs). >>>>>>> > >>>>>>> > The best performance and memory usage for single MPI process was >>>>>>> obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as >>>>>>> preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with >>>>>>> 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of >>>>>>> number of iterations required to achieve the same tolerance is >>>>>>> significantly increased. >>>>>>> >>>>>>> How many iterations do you have in serial (and then in parallel)? >>>>>>> >>>>>>> > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) >>>>>>> sub-precondtioner. For single MPI process, the calculation took 10 min and >>>>>>> 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached >>>>>>> using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. >>>>>>> This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. >>>>>>> Also, there is peak memory usage with 14.1 GB, which appears just before >>>>>>> the start of the iterations. Parallel computation with 4 MPI processes took >>>>>>> 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is >>>>>>> about 22 GB. >>>>>>> >>>>>>> Does the number of iterates increase in parallel? Again, how many >>>>>>> iterations do you have? >>>>>>> >>>>>>> > Are there ways to avoid decreasing of the convergence rate for >>>>>>> bjacobi precondtioner in parallel mode? Does it make sense to use >>>>>>> hierarchical or nested krylov methods with a local gmres solver >>>>>>> (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type >>>>>>> bjacobi)? >>>>>>> >>>>>>> bjacobi is only a one-level method, so you would not expect >>>>>>> process-independent convergence rate for this kind of problem. If the >>>>>>> coefficient variation is not too extreme, then I would expect GAMG (or some >>>>>>> other smoothed aggregation package, perhaps -pc_type ml (you need >>>>>>> --download-ml)) would work well with some tuning. >>>>>>> >>>>>>> If you have extremely high contrast coefficients you might need >>>>>>> something with stronger coarse grids. If you can assemble so-called Neumann >>>>>>> matrices ( >>>>>>> https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS) >>>>>>> then you could try the geneo scheme offered by PCHPDDM. >>>>>>> >>>>>>> > Is this peak memory usage expected for gamg preconditioner? is >>>>>>> there any way to reduce it? >>>>>>> >>>>>>> I think that peak memory usage comes from building the coarse grids. >>>>>>> Can you run with `-info` and grep for GAMG, this will provide some output >>>>>>> that more expert GAMG users can interpret. >>>>>>> >>>>>>> Lawrence >>>>>>> >>>>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... 
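For reference, here is a minimal Fortran sketch (not code from the original program) of the assembly change Mark suggests above: declare the 3-DOF-per-node block size on the stiffness matrix before preallocation, so that GAMG and the rigid-body near null space both see the nodal blocks. The variable names (Mat_K, dmn=3, n, dbw, obw, nels, ef_eldof, indx, k, nnds, Vec_NullSpace, matnull) follow the fragment Viktor posted earlier in the thread; the assembly loop and the coordinate fill are placeholders.

! Sketch only: the key change is MatSetBlockSize before preallocation.
call MatCreate(Petsc_Comm_World,Mat_K,ierr)
call MatSetFromOptions(Mat_K,ierr)
call MatSetSizes(Mat_K,Petsc_Decide,Petsc_Decide,n,n,ierr)
call MatSetBlockSize(Mat_K,dmn,ierr)   ! dmn = 3 DOFs per node; set before preallocation
call MatMPIAIJSetPreallocation(Mat_K,0,dbw,0,obw,ierr)

do i=1,nels
   call FormLocalK(i,k,indx,"Kp")      ! local stiffness matrix
   indx=indxmap(indx,2)                ! global DOF indices
   call MatSetValues(Mat_K,ef_eldof,indx,ef_eldof,indx,k,Add_Values,ierr)
   ! (alternatively, MatSetValuesBlocked with global node numbers as block
   !  indices, provided the element matrix is ordered node by node)
end do
call MatAssemblyBegin(Mat_K,Mat_Final_Assembly,ierr)
call MatAssemblyEnd(Mat_K,Mat_Final_Assembly,ierr)

! Rigid-body near null space, consistent with the block size above:
! Vec_NullSpace holds the nodal coordinates, interlaced (x,y,z) per node.
call VecCreate(Petsc_Comm_World,Vec_NullSpace,ierr)
call VecSetBlockSize(Vec_NullSpace,dmn,ierr)
call VecSetSizes(Vec_NullSpace,nnds*dmn,Petsc_Decide,ierr)
call VecSetUp(Vec_NullSpace,ierr)
! ... fill Vec_NullSpace with the nodal coordinates ...
call MatNullSpaceCreateRigidBody(Vec_NullSpace,matnull,ierr)
call MatSetNearNullSpace(Mat_K,matnull,ierr)
call MatNullSpaceDestroy(matnull,ierr)

With the block size declared, the six rigid-body modes line up with GAMG's nodal aggregates; the -ksp_view -ksp_monitor_true_residual -ksp_converged_reason output requested above (and Matt's -mg_levels_pc_type jacobi suggestion) remain the quickest way to check whether the parallel iteration count comes down.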
URL: From samuelestes91 at gmail.com Fri Sep 3 11:34:06 2021 From: samuelestes91 at gmail.com (Samuel Estes) Date: Fri, 3 Sep 2021 11:34:06 -0500 Subject: [petsc-users] Solving subsystem using larger matrix Message-ID: Hi, I have a model in which we alternatively solve two submodels by the finite element method both on the same unstructured mesh. The first model (call it model 1) has three degrees of freedom per node while the second model (model 2) is scalar (one degree of freedom). We are trying several different implementations but one option that we would like to try is to use the same matrix for both models. In other words, assuming we have n nodes then we allocate a square matrix A to have 3*n rows, solve model 1, and then reuse this matrix to solve model 2. So for model 2 any non-zeros in the matrix A will be confined to the upper left ninth and the remaining 8/9ths of the matrix are irrelevant to the problem. I have several questions about how best to implement something like this which I will list below: 1. The solver requires that the matrix at least have some non-zeros in each row, otherwise the solution ends up being all NANs. This makes sense as it is solving the subsystem 0*x=0 which is clearly ill-defined. Is there any way that I can communicate to PETSc, either through the solver or matrix classes or some other way, that all I care about is a subsystem and that I would like to use only a portion of the matrix A, the right hand side and the solution vector? There are some routines involving submatrices in the man pages but I'm not sure if they are appropriate for this problem or not. In particular, the MatGetLocalSubMatrix routine might be what I need but its not clear to me whether or not this actually copies the array of values in the submatrix (not desirable due to memory concerns) or if it is just essentially a pointer to the submatrix of values in the original matrix A (ideal). Basically, the idea is to use a part of the matrix that we have without allocating unnecessary extra memory. 2. Is there a way to use multiple local to global mappings for a single matrix. I have a problem when I try to use the same local to global mapping from model 1 in model 2. I understand why this is and can fix it but being able to reset the local to global mapping without destroying the matrix would be an ideal fix. 3. Any other input on the best way to approach solving a problem like this using PETSc would be appreciated. I'm somewhat of a novice when it comes to PETSc so I don't necessarily know all the tools which are available to me. It seems clear that I've reached a point where the manual and man pages aren't quite as helpful as they were for more basic operations. I hope my explanation of the problem and my questions was clear. If not, let me know and I can try to provide more details. Thanks! Sam -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Sep 3 11:49:23 2021 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 3 Sep 2021 12:49:23 -0400 Subject: [petsc-users] Solving subsystem using larger matrix In-Reply-To: References: Message-ID: On Fri, Sep 3, 2021 at 12:34 PM Samuel Estes wrote: > Hi, > > I have a model in which we alternatively solve two submodels by the finite > element method both on the same unstructured mesh. The first model (call it > model 1) has three degrees of freedom per node while the second model > (model 2) is scalar (one degree of freedom). 
We are trying several > different implementations but one option that we would like to try is to > use the same matrix for both models. In other words, assuming we have n > nodes then we allocate a square matrix A to have 3*n rows, solve model 1, > and then reuse this matrix to solve model 2. So for model 2 any non-zeros > in the matrix A will be confined to the upper left ninth and the remaining > 8/9ths of the matrix are irrelevant to the problem. I have several > questions about how best to implement something like this which I will list > below: > I can tell you how to do the operations below. However, I think we should go over why you want to do this first. What benefit do you hope to have with this scheme over just using two matrices. Thanks, Matt > 1. The solver requires that the matrix at least have some non-zeros in > each row, otherwise the solution ends up being all NANs. This makes sense > as it is solving the subsystem 0*x=0 which is clearly ill-defined. Is there > any way that I can communicate to PETSc, either through the solver or > matrix classes or some other way, that all I care about is a subsystem and > that I would like to use only a portion of the matrix A, the right hand > side and the solution vector? There are some routines involving submatrices > in the man pages but I'm not sure if they are appropriate for this problem > or not. In particular, the MatGetLocalSubMatrix routine might be what I > need but its not clear to me whether or not this actually copies the array > of values in the submatrix (not desirable due to memory concerns) or if it > is just essentially a pointer to the submatrix of values in the original > matrix A (ideal). Basically, the idea is to use a part of the matrix that > we have without allocating unnecessary extra memory. > 2. Is there a way to use multiple local to global mappings for a single > matrix. I have a problem when I try to use the same local to global mapping > from model 1 in model 2. I understand why this is and can fix it but being > able to reset the local to global mapping without destroying the matrix > would be an ideal fix. > 3. Any other input on the best way to approach solving a problem like this > using PETSc would be appreciated. I'm somewhat of a novice when it comes to > PETSc so I don't necessarily know all the tools which are available to me. > It seems clear that I've reached a point where the manual and man pages > aren't quite as helpful as they were for more basic operations. > > I hope my explanation of the problem and my questions was clear. If not, > let me know and I can try to provide more details. Thanks! > > Sam > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulsank at msu.edu Fri Sep 3 12:53:05 2021 From: paulsank at msu.edu (Paul, Sanku) Date: Fri, 3 Sep 2021 17:53:05 +0000 Subject: [petsc-users] Matrix exponential Message-ID: Dear Sir/Ma'am, I am trying to use SLEPc to calculate matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong. This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. 
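For concreteness, a minimal slepc4py sketch of this kind of computation, pulling together the suggestions that follow in this thread: an MFN solver whose FN is set to the exponential (as in the demo ex6.py Jose points to), PETSc.Mat().createAIJWithArrays() to convert a SciPy CSR matrix, and FN setScale(-1j*t) for exp(-itH), which requires a complex-scalar build of PETSc/SLEPc. The Hamiltonian, time step and starting vector below are placeholders, and MFN computes the action y = exp(-itH)b on a vector rather than the dense matrix exponential.

# Sketch only: y = exp(-1j*t*H) b with slepc4py's MFN.
# Assumes PETSc/SLEPc built with complex scalars (PETSc.ScalarType == complex128).
import numpy as np
import scipy.sparse as sp
from petsc4py import PETSc
from slepc4py import SLEPc

t = 0.1
n = 64
# Placeholder Hamiltonian: a tridiagonal test matrix in SciPy CSR format.
H_csr = sp.diags([np.full(n - 1, -1.0), np.full(n, 2.0), np.full(n - 1, -1.0)],
                 [-1, 0, 1], format='csr')

# SciPy CSR -> PETSc AIJ (sequential case); cast arrays to PETSc's index/scalar types.
H = PETSc.Mat().createAIJWithArrays(
    size=(n, n),
    csr=(H_csr.indptr.astype(PETSc.IntType),
         H_csr.indices.astype(PETSc.IntType),
         H_csr.data.astype(PETSc.ScalarType)))

b = H.createVecRight()
y = H.createVecLeft()
b.set(1.0)                        # placeholder starting vector

M = SLEPc.MFN().create()
M.setOperator(H)
F = M.getFN()                     # work with the FN owned by the solver
F.setType(SLEPc.FN.Type.EXP)
F.setScale(-1j * t)               # f(H) = exp(-1j*t*H); needs complex scalars
M.setTolerances(1e-8)
M.setFromOptions()
M.solve(b, y)                     # y = exp(-1j*t*H) b

The FN object can equally be created on its own and attached with setFN(), as in Jose's first reply below.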
Best, Sanku -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex2.py Type: text/x-python Size: 1329 bytes Desc: ex2.py URL: From jroman at dsic.upv.es Fri Sep 3 13:13:18 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 3 Sep 2021 20:13:18 +0200 Subject: [petsc-users] Matrix exponential In-Reply-To: References: Message-ID: <699FFF16-6190-48D1-AFAE-18638531C277@dsic.upv.es> You should either create the FN object and then E.setFN(F) or extract the FN object and assign to a variable F = E.getFN() You can see an example in $SLEPC_DIR/src/binding/slepc4py/demo/ex6.py Jose > El 3 sept 2021, a las 19:53, Paul, Sanku escribió: > > Dear Sir/Ma'am, > > I am trying to use SLEPc to calculate the matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong? This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. > > Best, > Sanku > From jroman at dsic.upv.es Fri Sep 3 13:53:41 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 3 Sep 2021 20:53:41 +0200 Subject: [petsc-users] Matrix exponential In-Reply-To: References: <699FFF16-6190-48D1-AFAE-18638531C277@dsic.upv.es> Message-ID: <36DDF934-D614-405F-93CC-081AED407ECF@dsic.upv.es> Please always reply to the list (Reply-All), not to myself. You should be able to convert from a scipy sparse matrix to a PETSc matrix via PETSc.Mat().createAIJWithArrays(). I don't know if there is any example in the petsc4py documentation. Jose > El 3 sept 2021, a las 20:26, Paul, Sanku escribió: > > Dear Jose, > > Thank you very much for your help. I have another question: can we just simply pass a sparse.csr.matrix to A? For instance, if B is the sparse.csr.matrix, can we do A=B.copy()? Or do I have to do it in a different way? > > Best, > Sanku > From: Jose E. Roman > Sent: Friday, September 3, 2021 2:13 PM > To: Paul, Sanku > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix exponential > > You should either create the FN object and then > > E.setFN(F) > > or extract the FN object and assign to a variable > > F = E.getFN() > > You can see an example in $SLEPC_DIR/src/binding/slepc4py/demo/ex6.py > > > Jose > > > > El 3 sept 2021, a las 19:53, Paul, Sanku escribió: > > > > Dear Sir/Ma'am, > > > > I am trying to use SLEPc to calculate the matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong? This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. > > > > Best, > > Sanku > > From paulsank at msu.edu Fri Sep 3 18:39:19 2021 From: paulsank at msu.edu (Paul, Sanku) Date: Fri, 3 Sep 2021 23:39:19 +0000 Subject: [petsc-users] Matrix exponential In-Reply-To: <36DDF934-D614-405F-93CC-081AED407ECF@dsic.upv.es> References: <699FFF16-6190-48D1-AFAE-18638531C277@dsic.upv.es> <36DDF934-D614-405F-93CC-081AED407ECF@dsic.upv.es> Message-ID: Hi Jose, I can now compute the matrix exponential but I am facing a problem with a complex matrix. In particular, I want to do \exp(-itH), where H is a Hamiltonian. How to implement this? Thanks, Sanku ________________________________ From: Jose E.
Roman Sent: Friday, September 3, 2021 2:53 PM To: Paul, Sanku Cc: PETSc Subject: Re: [petsc-users] Matrix exponential Please always reply to the list (Reply-All), not to myself. You should be able to convert from a scipy sparse matrix to a PETSc matrix via PETSc.Mat().createAIJWithArrays(). Don't know how if there is any example in the petsc4py documentation. Jose > El 3 sept 2021, a las 20:26, Paul, Sanku escribi?: > > Dear Jose, > > Thank you very much for your help. I have another question can we just simply pass a sparse.csr.matrix to A. For instance, if B is the sparse.csr.matrix can we do A=B.copy(). Or do I have to do it in a different way? > > Best, > Sanku > From: Jose E. Roman > Sent: Friday, September 3, 2021 2:13 PM > To: Paul, Sanku > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix exponential > > You should either create the FN object and then > > E.setFN(F) > > or extract the FN object and assign to a variable > > F = E.getFN() > > You can see an example in $SLEPC_DIR/src/binding/slepc4py/demo/ex6.py > > > Jose > > > > El 3 sept 2021, a las 19:53, Paul, Sanku escribi?: > > > > Dear Sir/Ma'am, > > > > I am trying to use SLEPc to calculate matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong. This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. > > > > Best, > > Sanku > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulsank at msu.edu Fri Sep 3 19:11:00 2021 From: paulsank at msu.edu (Paul, Sanku) Date: Sat, 4 Sep 2021 00:11:00 +0000 Subject: [petsc-users] Matrix exponential In-Reply-To: References: <699FFF16-6190-48D1-AFAE-18638531C277@dsic.upv.es> <36DDF934-D614-405F-93CC-081AED407ECF@dsic.upv.es> Message-ID: Hi Jose, I tried to install ml -* foss/2019b Python SciPy-bundle/2019.10-Python-3.7.4 virtualenv slepc4py cd slepc4py source bin/activate pip install numpy mpi4py cython export PETSC_CONFIGURE_OPTIONS="--with-scalar-type=complex" export PETSC_DIR=/path/to/petsc PETSC_ARCH=your-arch-name pip install petsc petsc4py pip install slepc slepc4py But still facing problem with complex matrices. Please help me to fix this. Thanks, Sanku ________________________________ From: Paul, Sanku Sent: Friday, September 3, 2021 7:39 PM To: Jose E. Roman Cc: PETSc Subject: Re: [petsc-users] Matrix exponential Hi Jose, I could now do matrix exponential but facing a problem with a complex matrix. In particular, I want to do \exp(-itH), where H is a Hamiltonian. How to implement this? Thanks, Sanku ________________________________ From: Jose E. Roman Sent: Friday, September 3, 2021 2:53 PM To: Paul, Sanku Cc: PETSc Subject: Re: [petsc-users] Matrix exponential Please always reply to the list (Reply-All), not to myself. You should be able to convert from a scipy sparse matrix to a PETSc matrix via PETSc.Mat().createAIJWithArrays(). Don't know how if there is any example in the petsc4py documentation. Jose > El 3 sept 2021, a las 20:26, Paul, Sanku escribi?: > > Dear Jose, > > Thank you very much for your help. I have another question can we just simply pass a sparse.csr.matrix to A. For instance, if B is the sparse.csr.matrix can we do A=B.copy(). Or do I have to do it in a different way? > > Best, > Sanku > From: Jose E. 
Roman > Sent: Friday, September 3, 2021 2:13 PM > To: Paul, Sanku > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix exponential > > You should either create the FN object and then > > E.setFN(F) > > or extract the FN object and assign to a variable > > F = E.getFN() > > You can see an example in $SLEPC_DIR/src/binding/slepc4py/demo/ex6.py > > > Jose > > > > El 3 sept 2021, a las 19:53, Paul, Sanku escribi?: > > > > Dear Sir/Ma'am, > > > > I am trying to use SLEPc to calculate matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong. This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. > > > > Best, > > Sanku > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sat Sep 4 04:33:43 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 4 Sep 2021 11:33:43 +0200 Subject: [petsc-users] Matrix exponential In-Reply-To: References: <699FFF16-6190-48D1-AFAE-18638531C277@dsic.upv.es> <36DDF934-D614-405F-93CC-081AED407ECF@dsic.upv.es> Message-ID: <6D34FC62-FB0F-412F-8339-BC100400B93F@dsic.upv.es> No problem, do F.setScale(-1j*t) What do you mean "still facing problem with complex matrices"? Did you get an error during installation? Did it install for real scalars? Try setting PETSC_CONFIGURE_OPTIONS only, and unset PETSC_DIR PETSC_ARCH. Otherwise pip may not take into account --with-scalar-type=complex Jose > El 4 sept 2021, a las 2:11, Paul, Sanku escribi?: > > Hi Jose, > > I tried to install > ml -* foss/2019b Python SciPy-bundle/2019.10-Python-3.7.4 > virtualenv slepc4py > cd slepc4py > source bin/activate > pip install numpy mpi4py cython > export PETSC_CONFIGURE_OPTIONS="--with-scalar-type=complex" > export PETSC_DIR=/path/to/petsc PETSC_ARCH=your-arch-name > pip install petsc petsc4py > pip install slepc slepc4py > > But still facing problem with complex matrices. Please help me to fix this. > > Thanks, > Sanku > > > From: Paul, Sanku > Sent: Friday, September 3, 2021 7:39 PM > To: Jose E. Roman > Cc: PETSc > Subject: Re: [petsc-users] Matrix exponential > > Hi Jose, > > I could now do matrix exponential but facing a problem with a complex matrix. In particular, I want to do \exp(-itH), where H is a Hamiltonian. How to implement this? > > Thanks, > Sanku > From: Jose E. Roman > Sent: Friday, September 3, 2021 2:53 PM > To: Paul, Sanku > Cc: PETSc > Subject: Re: [petsc-users] Matrix exponential > > Please always reply to the list (Reply-All), not to myself. > > You should be able to convert from a scipy sparse matrix to a PETSc matrix via PETSc.Mat().createAIJWithArrays(). Don't know how if there is any example in the petsc4py documentation. > > Jose > > > > El 3 sept 2021, a las 20:26, Paul, Sanku escribi?: > > > > Dear Jose, > > > > Thank you very much for your help. I have another question can we just simply pass a sparse.csr.matrix to A. For instance, if B is the sparse.csr.matrix can we do A=B.copy(). Or do I have to do it in a different way? > > > > Best, > > Sanku > > From: Jose E. 
Roman > > Sent: Friday, September 3, 2021 2:13 PM > > To: Paul, Sanku > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix exponential > > > > You should either create the FN object and then > > > > E.setFN(F) > > > > or extract the FN object and assign to a variable > > > > F = E.getFN() > > > > You can see an example in $SLEPC_DIR/src/binding/slepc4py/demo/ex6.py > > > > > > Jose > > > > > > > El 3 sept 2021, a las 19:53, Paul, Sanku escribi?: > > > > > > Dear Sir/Ma'am, > > > > > > I am trying to use SLEPc to calculate matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong. This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. > > > > > > Best, > > > Sanku > > > From paulsank at msu.edu Sun Sep 5 19:30:15 2021 From: paulsank at msu.edu (Paul, Sanku) Date: Mon, 6 Sep 2021 00:30:15 +0000 Subject: [petsc-users] Matrix exponential In-Reply-To: <6D34FC62-FB0F-412F-8339-BC100400B93F@dsic.upv.es> References: <699FFF16-6190-48D1-AFAE-18638531C277@dsic.upv.es> <36DDF934-D614-405F-93CC-081AED407ECF@dsic.upv.es> <6D34FC62-FB0F-412F-8339-BC100400B93F@dsic.upv.es> Message-ID: Hi Jose, While I am running from petsc4py import PETSc >>> print(PETSc.ScalarType) only float64 I am getting not complex128. Do I have to uninstall and then install it? Otherwise doing F.setScale(-1j*t) I am getting an error File "SLEPc/FN.pyx", line 204, in slepc4py.SLEPc.FN.setScale File "SLEPc/SLEPc.pyx", line 115, in slepc4py.SLEPc.asScalar TypeError: can't convert complex to float Sanku ________________________________ From: Jose E. Roman Sent: Saturday, September 4, 2021 5:33 AM To: Paul, Sanku Cc: PETSc Subject: Re: [petsc-users] Matrix exponential No problem, do F.setScale(-1j*t) What do you mean "still facing problem with complex matrices"? Did you get an error during installation? Did it install for real scalars? Try setting PETSC_CONFIGURE_OPTIONS only, and unset PETSC_DIR PETSC_ARCH. Otherwise pip may not take into account --with-scalar-type=complex Jose > El 4 sept 2021, a las 2:11, Paul, Sanku escribi?: > > Hi Jose, > > I tried to install > ml -* foss/2019b Python SciPy-bundle/2019.10-Python-3.7.4 > virtualenv slepc4py > cd slepc4py > source bin/activate > pip install numpy mpi4py cython > export PETSC_CONFIGURE_OPTIONS="--with-scalar-type=complex" > export PETSC_DIR=/path/to/petsc PETSC_ARCH=your-arch-name > pip install petsc petsc4py > pip install slepc slepc4py > > But still facing problem with complex matrices. Please help me to fix this. > > Thanks, > Sanku > > > From: Paul, Sanku > Sent: Friday, September 3, 2021 7:39 PM > To: Jose E. Roman > Cc: PETSc > Subject: Re: [petsc-users] Matrix exponential > > Hi Jose, > > I could now do matrix exponential but facing a problem with a complex matrix. In particular, I want to do \exp(-itH), where H is a Hamiltonian. How to implement this? > > Thanks, > Sanku > From: Jose E. Roman > Sent: Friday, September 3, 2021 2:53 PM > To: Paul, Sanku > Cc: PETSc > Subject: Re: [petsc-users] Matrix exponential > > Please always reply to the list (Reply-All), not to myself. > > You should be able to convert from a scipy sparse matrix to a PETSc matrix via PETSc.Mat().createAIJWithArrays(). Don't know how if there is any example in the petsc4py documentation. 
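A rough sketch (not the slepc4py demo) pulling the two suggestions together: a scipy CSR matrix converted to a PETSc Mat, then exp(-i*t*H)*b evaluated with SLEPc's MFN solver. The operator, size and time step below are made up, and a complex-scalar build of petsc4py/slepc4py is assumed.

import numpy as np
import scipy.sparse as sp
from petsc4py import PETSc
from slepc4py import SLEPc

n, t = 100, 0.1                                  # made-up size and time step
H = sp.random(n, n, density=0.05, format='csr')
H = ((H + H.T) * 0.5).tocsr()                    # stand-in Hermitian operator

# scipy CSR -> PETSc AIJ from the (indptr, indices, data) arrays;
# createAIJWithArrays (mentioned above) works from the same triple
A = PETSc.Mat().createAIJ(size=H.shape,
                          csr=(H.indptr, H.indices,
                               H.data.astype(PETSc.ScalarType)))
A.assemble()

# y = f(A) b with f(x) = exp(-i*t*x), via the matrix-function solver MFN
mfn = SLEPc.MFN().create()
mfn.setOperator(A)
f = mfn.getFN()
f.setType(SLEPc.FN.Type.EXP)
f.setScale(-1j * t)                              # only meaningful with complex scalars
mfn.setFromOptions()

b, y = A.createVecs()
b.set(1.0)
mfn.solve(b, y)                                  # y = exp(-i*t*A) b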
> > Jose > > > > El 3 sept 2021, a las 20:26, Paul, Sanku escribi?: > > > > Dear Jose, > > > > Thank you very much for your help. I have another question can we just simply pass a sparse.csr.matrix to A. For instance, if B is the sparse.csr.matrix can we do A=B.copy(). Or do I have to do it in a different way? > > > > Best, > > Sanku > > From: Jose E. Roman > > Sent: Friday, September 3, 2021 2:13 PM > > To: Paul, Sanku > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix exponential > > > > You should either create the FN object and then > > > > E.setFN(F) > > > > or extract the FN object and assign to a variable > > > > F = E.getFN() > > > > You can see an example in $SLEPC_DIR/src/binding/slepc4py/demo/ex6.py > > > > > > Jose > > > > > > > El 3 sept 2021, a las 19:53, Paul, Sanku escribi?: > > > > > > Dear Sir/Ma'am, > > > > > > I am trying to use SLEPc to calculate matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong. This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. > > > > > > Best, > > > Sanku > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From paulsank at msu.edu Sun Sep 5 20:30:49 2021 From: paulsank at msu.edu (Paul, Sanku) Date: Mon, 6 Sep 2021 01:30:49 +0000 Subject: [petsc-users] Matrix exponential In-Reply-To: References: <699FFF16-6190-48D1-AFAE-18638531C277@dsic.upv.es> <36DDF934-D614-405F-93CC-081AED407ECF@dsic.upv.es> <6D34FC62-FB0F-412F-8339-BC100400B93F@dsic.upv.es> Message-ID: Hi Jose, Thanks for your help. I successfully reinstalled the packages and now it is running. Thank you very much for your help. Best, Sanku ________________________________ From: Paul, Sanku Sent: Sunday, September 5, 2021 8:30 PM To: Jose E. Roman Cc: PETSc Subject: Re: [petsc-users] Matrix exponential Hi Jose, While I am running from petsc4py import PETSc >>> print(PETSc.ScalarType) only float64 I am getting not complex128. Do I have to uninstall and then install it? Otherwise doing F.setScale(-1j*t) I am getting an error File "SLEPc/FN.pyx", line 204, in slepc4py.SLEPc.FN.setScale File "SLEPc/SLEPc.pyx", line 115, in slepc4py.SLEPc.asScalar TypeError: can't convert complex to float Sanku ________________________________ From: Jose E. Roman Sent: Saturday, September 4, 2021 5:33 AM To: Paul, Sanku Cc: PETSc Subject: Re: [petsc-users] Matrix exponential No problem, do F.setScale(-1j*t) What do you mean "still facing problem with complex matrices"? Did you get an error during installation? Did it install for real scalars? Try setting PETSC_CONFIGURE_OPTIONS only, and unset PETSC_DIR PETSC_ARCH. Otherwise pip may not take into account --with-scalar-type=complex Jose > El 4 sept 2021, a las 2:11, Paul, Sanku escribi?: > > Hi Jose, > > I tried to install > ml -* foss/2019b Python SciPy-bundle/2019.10-Python-3.7.4 > virtualenv slepc4py > cd slepc4py > source bin/activate > pip install numpy mpi4py cython > export PETSC_CONFIGURE_OPTIONS="--with-scalar-type=complex" > export PETSC_DIR=/path/to/petsc PETSC_ARCH=your-arch-name > pip install petsc petsc4py > pip install slepc slepc4py > > But still facing problem with complex matrices. Please help me to fix this. > > Thanks, > Sanku > > > From: Paul, Sanku > Sent: Friday, September 3, 2021 7:39 PM > To: Jose E. 
Roman > Cc: PETSc > Subject: Re: [petsc-users] Matrix exponential > > Hi Jose, > > I could now do matrix exponential but facing a problem with a complex matrix. In particular, I want to do \exp(-itH), where H is a Hamiltonian. How to implement this? > > Thanks, > Sanku > From: Jose E. Roman > Sent: Friday, September 3, 2021 2:53 PM > To: Paul, Sanku > Cc: PETSc > Subject: Re: [petsc-users] Matrix exponential > > Please always reply to the list (Reply-All), not to myself. > > You should be able to convert from a scipy sparse matrix to a PETSc matrix via PETSc.Mat().createAIJWithArrays(). Don't know how if there is any example in the petsc4py documentation. > > Jose > > > > El 3 sept 2021, a las 20:26, Paul, Sanku escribi?: > > > > Dear Jose, > > > > Thank you very much for your help. I have another question can we just simply pass a sparse.csr.matrix to A. For instance, if B is the sparse.csr.matrix can we do A=B.copy(). Or do I have to do it in a different way? > > > > Best, > > Sanku > > From: Jose E. Roman > > Sent: Friday, September 3, 2021 2:13 PM > > To: Paul, Sanku > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix exponential > > > > You should either create the FN object and then > > > > E.setFN(F) > > > > or extract the FN object and assign to a variable > > > > F = E.getFN() > > > > You can see an example in $SLEPC_DIR/src/binding/slepc4py/demo/ex6.py > > > > > > Jose > > > > > > > El 3 sept 2021, a las 19:53, Paul, Sanku escribi?: > > > > > > Dear Sir/Ma'am, > > > > > > I am trying to use SLEPc to calculate matrix exponential in my python code but I am not getting the correct result. I have attached the code. Could you let me know what I am doing wrong. This is my first time using SLEPc. So, I would like to ask you if you could send me a tutorial on matrix exponential using SLEPc in python code. > > > > > > Best, > > > Sanku > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Mon Sep 6 11:22:29 2021 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Mon, 6 Sep 2021 18:22:29 +0200 Subject: [petsc-users] Mat preallocation for SNES jacobian [WAS Re: Mat preallocation in case of variable stencils] In-Reply-To: <87k0k1zl2y.fsf@jedbrown.org> References: <87k0k1zl2y.fsf@jedbrown.org> Message-ID: <9f505269-c8d6-0847-cf05-019b36ae1aee@uninsubria.it> Il 31/08/21 17:32, Jed Brown ha scritto: > Matteo Semplice writes: > >> Hi. >> >> We are writing a code for a FD scheme on an irregular domain and thus >> the local stencil is quite variable: we have inner nodes, boundary nodes >> and inactive nodes, each with their own stencil type and offset with >> respect to the grid node. We currently create a matrix with >> DMCreateMatrix on a DMDA and for now have set the option >> MAT_NEW_NONZERO_LOCATIONS to PETSC_TRUE, but its time to render the code >> memory-efficient. The layout created automatically is correct for inner >> nodes, wrong for boundary ones (off-centered stencils) and redundant for >> outer nodes. >> >> After the preprocessing stage (including stencil creation) we'd be in >> position to set the nonzero pattern properly. >> >> Do we need to start from a Mat created by CreateMatrix? Or is it ok to >> call DMCreateMatrix (so that the splitting among CPUs and the block size >> are set by PETSc) and then call a MatSetPreallocation routine? > You can call MatXAIJSetPreallocation after. 
It'll handle all matrix types so you don't have to shepherd data for all the specific preallocations.

Hi.

Actually I am still struggling with this... Let me explain.

My code relies on a SNES and the matrix I need to preallocate is the Jacobian. So I do, in the main file:

ierr = DMCreateMatrix(ctx.daAll,&ctx.J);CHKERRQ(ierr);
ierr = setJacobianPattern(ctx,ctx.J);CHKERRQ(ierr); // calling MatXAIJSetPreallocation on the second argument
ierr = MatSetOption(ctx.J,MAT_NEW_NONZERO_LOCATIONS,*******); CHKERRQ(ierr); // allow new nonzeros?
ierr = SNESSetFunction(snes,ctx.F,FormFunction,(void *) &ctx); CHKERRQ(ierr);
ierr = SNESSetJacobian(snes,ctx.J,ctx.J,FormJacobian,(void *) &ctx); CHKERRQ(ierr);
ierr = FormSulfationRHS(ctx, ctx.U0, ctx.RHS);CHKERRQ(ierr);
ierr = SNESSolve(snes,ctx.RHS,ctx.U); CHKERRQ(ierr);

and PetscErrorCode FormJacobian(SNES snes,Vec U,Mat J, Mat P,void *_ctx) does (this is a 2-dof finite difference problem, so logically 2x2 blocks in J):

ierr = setJac00(....,P) // calls to MatSetValues in the 00 block
ierr = setJac01(....,P) // calls to MatSetValues in the 01 block
ierr = setJac1X(....,P) // calls to MatSetValues in the 10 and 11 blocks
ierr = MatAssemblyBegin(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
ierr = MatAssemblyEnd(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);

If I run with MAT_NEW_NONZERO_LOCATIONS=TRUE, all runs fine and using the -info option I see that no mallocs are performed during assembly.

Computing F
  0 SNES Function norm 7.672682917097e+02
Computing J
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage space: 17661 unneeded,191714 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 27
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 0)/(num_localrows 71874) < 0.6. Do not use CompressedRow routines.

If I omit the call to setJacobianPattern, -info reports a nonzero number of mallocs, so the setJacobianPattern routine seems to be doing its job correctly. However, if I run with MAT_NEW_NONZERO_LOCATIONS=FALSE, the Jacobian is entirely zero and no error messages appear until the KSP tries to do its job:

Computing F
  0 SNES Function norm 7.672682917097e+02
Computing J
[0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage space: 209375 unneeded,0 used
[0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0
[0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0
[0] MatCheckCompressedRow(): Found the ratio (num_zerorows 71874)/(num_localrows 71874) > 0.6. Use CompressedRow routines.

... and then KSP complains! I have tried adding MAT_FLUSH_ASSEMBLY calls inside the subroutines, but nothing changes. So I have 2 questions:

1. If, as a temporary solution, I leave MAT_NEW_NONZERO_LOCATIONS=TRUE, am I going to incur performance penalties even if no new nonzeros are created by my assembly routine?
2. Can you guess what is causing the problem?

Thanks
Matteo From knepley at gmail.com Mon Sep 6 18:34:57 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 6 Sep 2021 19:34:57 -0400 Subject: [petsc-users] Mat preallocation for SNES jacobian [WAS Re: Mat preallocation in case of variable stencils] In-Reply-To: <9f505269-c8d6-0847-cf05-019b36ae1aee@uninsubria.it> References: <87k0k1zl2y.fsf@jedbrown.org> <9f505269-c8d6-0847-cf05-019b36ae1aee@uninsubria.it> Message-ID: On Mon, Sep 6, 2021 at 12:22 PM Matteo Semplice < matteo.semplice at uninsubria.it> wrote: > > Il 31/08/21 17:32, Jed Brown ha scritto: > > Matteo Semplice writes: > > > >> Hi. > >> > >> We are writing a code for a FD scheme on an irregular domain and thus > >> the local stencil is quite variable: we have inner nodes, boundary nodes > >> and inactive nodes, each with their own stencil type and offset with > >> respect to the grid node. We currently create a matrix with > >> DMCreateMatrix on a DMDA and for now have set the option > >> MAT_NEW_NONZERO_LOCATIONS to PETSC_TRUE, but its time to render the code > >> memory-efficient. The layout created automatically is correct for inner > >> nodes, wrong for boundary ones (off-centered stencils) and redundant for > >> outer nodes. > >> > >> After the preprocessing stage (including stencil creation) we'd be in > >> position to set the nonzero pattern properly. > >> > >> Do we need to start from a Mat created by CreateMatrix? Or is it ok to > >> call DMCreateMatrix (so that the splitting among CPUs and the block size > >> are set by PETSc) and then call a MatSetPreallocation routine? > > You can call MatXAIJSetPreallocation after. It'll handle all matrix > types so you don't have to shepherd data for all the specific > preallocations. > > Hi. > > Actually I am still struggling with this... Let me explain. > > My code relies on a SNES and the matrix I need to preallocate is the > Jacobian. So I do: > > in the main file > ierr = DMCreateMatrix(ctx.daAll,&ctx.J);CHKERRQ(ierr); > ierr = setJacobianPattern(ctx,ctx.J);CHKERRQ(ierr); //calling > MatXAIJSetPreallocation on the second argument > I do not understand this. DMCreateMatrix() has already preallocated _and_ filled the matrix with zeros. Additional preallocation statements will not do anything (I think). > ierr = MatSetOption(ctx.J,MAT_NEW_NONZERO_LOCATIONS,*******); > CHKERRQ(ierr);//allow new nonzeros? > ierr = SNESSetFunction(snes,ctx.F ,FormFunction,(void *) &ctx); > CHKERRQ(ierr); > ierr = SNESSetJacobian(snes,ctx.J,ctx.J,FormJacobian,(void *) &ctx); > CHKERRQ(ierr); > > ierr = FormSulfationRHS(ctx, ctx.U0, ctx.RHS);CHKERRQ(ierr); > ierr = SNESSolve(snes,ctx.RHS,ctx.U); CHKERRQ(ierr); > > and > > PetscErrorCode FormJacobian(SNES snes,Vec U,Mat J, Mat P,void *_ctx) > > does (this is a 2 dof finite difference problem, so logically 2x2 blocks > in J) > > ierr = setJac00(....,P) //calls to MatSetValues in the 00 block > > ierr = setJac01(....,P) //calls to MatSetValues in the 01 block > > ierr = setJac1X(....,P) //calls to MatSetValues in the 10 ans 11 block > > ierr = MatAssemblyBegin(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > If I run with MAT_NEW_NONZERO_LOCATIONS=TRUE, all runs fine and using > the -info option I see that no mallocs are performed during Assembly. 
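A rough petsc4py sketch of the sequence under discussion (the thread's code is C, and the tridiagonal stencil and sizes below are invented): preallocate, insert explicit zeros so every intended location exists, assemble, and only then disallow new nonzero locations. The Mat.Option names are the petsc4py spellings of MAT_NEW_NONZERO_ALLOCATION_ERR and MAT_NEW_NONZERO_LOCATIONS and are worth double-checking against your petsc4py version.

from petsc4py import PETSc

n = 8                                            # invented size
A = PETSc.Mat().createAIJ([n, n])
A.setPreallocationNNZ(3)                         # per-row estimate for the intended stencil
A.setOption(PETSc.Mat.Option.NEW_NONZERO_ALLOCATION_ERR, True)

# insert explicit zeros at every location of the stencil, so the slots exist ...
for i in range(n):
    cols = [j for j in (i - 1, i, i + 1) if 0 <= j < n]
    A.setValues([i], cols, [0.0] * len(cols))
A.assemble()

# ... then later insertions (e.g. inside the Jacobian callback) only hit existing slots
A.setOption(PETSc.Mat.Option.NEW_NONZERO_LOCATIONS, False)
A.setValue(0, 1, 2.5)
A.assemble()

If anything outside the pre-set pattern is ever touched, the ALLOCATION_ERR option turns a silent malloc into an error, which is usually the easier failure to debug.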
> > Computing F > 0 SNES Function norm 7.672682917097e+02 > Computing J > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage space: > 17661 unneeded,191714 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 27 > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows > 0)/(num_localrows 71874) < 0.6. Do not use CompressedRow routines. > > If I omit the call to setJacobianPattern, info reports a nonsero number > of mallocs, so somehow the setJacobianPattern routine should be doing > its job correctly. > Hmm, this might mean that the second preallocation call is wiping out the info in the first. Okay, I will go back and look at the code. > However, if I run with MAT_NEW_NONZERO_LOCATIONS=FALSE, the Jacobian is > entirely zero and no error messages appear until the KSP tries to do its > job: > This sounds like your setJacobianPattern() is not filling the matrix with zeros, so that the insertions make new nonzeros. It is hard to make sense of this string of events without the code. > Computing F > 0 SNES Function norm 7.672682917097e+02 > Computing J > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage space: > 209375 unneeded,0 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows > 71874)/(num_localrows 71874) > 0.6. Use CompressedRow routines. > ... and then KSP complains! > > I have tried adding MAT_FLUSH_ASSEMBLY calls inside the subroutines, but > nothing changes. > > So I have 2 questions: > > 1. If, as a temporary solution, I leave MAT_NEW_NONZERO_LOCATIONS=TRUE, > am I going to incur in performance penalties even if no new nonzeros are > created by my assembly routine? > If you are worried about performance, I think the option you want is MAT_NEW_NONZERO_ALLOCATION_ERR since allocations, not new nonzeros, are what causes performance problems. Thanks, Matt > 2. Can you guess what is causing the problem? > > Thanks > > Matteo > > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Sep 6 20:33:59 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 6 Sep 2021 21:33:59 -0400 Subject: [petsc-users] Slow convergence while parallel computations. In-Reply-To: References: Message-ID: <159AE122-1B25-49DC-93C5-06F5CF198DAD@petsc.dev> You can use -mat_null_space_test to check it the null space you provide is within the null space of the operator. There is no practical way to test if the null space you provide is exactly the full null space of the operator but at least the check ensures that you are not providing something that is not in the null space. Barry Also if you run with GMRES and look at the norms of the residuals -ksp_monitor at each restart iteration (the default restart is 30), if they jump wildly at each restart this can indicate a problem with the nullspace. > On Sep 3, 2021, at 10:48 AM, Viktor Nazdrachev wrote: > > Hello Mark and Matthew! > > I attached log files for serial and parallel cases and corresponding information about GAMG preconditioner (using grep). 
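For comparison with the Fortran shown further down in this message, a rough petsc4py sketch of the near-null-space setup being discussed; the mesh fragment is invented, and it assumes the petsc4py bindings createRigidBody and setNearNullSpace behave like their C counterparts MatNullSpaceCreateRigidBody and MatSetNearNullSpace. The detail worth noting is the block size: the coordinate vector must carry it, and giving the matrix the same 3-dof-per-node block size is generally recommended for GAMG on elasticity.

import numpy as np
from petsc4py import PETSc

dim, nnds = 3, 4                                 # invented mesh fragment
xyz = np.random.rand(nnds, dim)

coords = PETSc.Vec().createWithArray(xyz.ravel())
coords.setBlockSize(dim)                         # (x, y, z) triples, one per node

rbm = PETSc.NullSpace().createRigidBody(coords)  # 6 rigid-body modes in 3D

A = PETSc.Mat().createAIJ([nnds * dim, nnds * dim], bsize=dim)  # 3 dofs per node
A.setUp()
# ... assemble the stiffness matrix here ...
A.assemble()
A.setNearNullSpace(rbm)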
> > I have to notice, that assembling of global stiffness matrix in code was performed by MatSetValues subrotuine (not MatSetValuesBlocked) > > !nnds ? number of nodes > !dmn=3 > call MatCreate(Petsc_Comm_World,Mat_K,ierr) > call MatSetFromOptions(Mat_K,ierr) > call MatSetSizes(Mat_K,Petsc_Decide,Petsc_Decide,n,n,ierr_m) > ? > call MatMPIAIJSetPreallocation(Mat_K,0,dbw,0,obw,ierr) > ? > call MatSetOption(Mat_K,Mat_New_Nonzero_Allocation_Err,Petsc_False,ierr) > ? > > do i=1,nels > call FormLocalK(i,k,indx,"Kp") ! find local stiffness matrix > indx=indxmap(indx,2) !find global indices for DOFs > call MatSetValues(Mat_K,ef_eldof,indx,ef_eldof,indx,k,Add_Values,ierr) > end do > > But nullspace vector was created using VecSetBlockSize subroutine. > > call VecCreate(Petsc_Comm_World,Vec_NullSpace,ierr) > call VecSetBlockSize(Vec_NullSpace,dmn,ierr) > call VecSetSizes(Vec_NullSpace,nnds*dmn,Petsc_Decide,ierr) > call VecSetUp(Vec_NullSpace,ierr) > call VecGetArrayF90(Vec_NullSpace,null_space,ierr) > ? > call VecRestoreArrayF90(Vec_NullSpace,null_space,ierr) > call MatNullSpaceCreateRigidBody(Vec_NullSpace,matnull,ierr) > call MatSetNearNullSpace(Mat_K,matnull,ierr) > > I suppose it can be one of the reasons of GAMG slow convergence. > So I attached log files for parallel run with ?pure? GAMG precondtioner. > > > Kind regards, > > Viktor Nazdrachev > > R&D senior researcher > > Geosteering Technologies LLC > > ??, 3 ????. 2021 ?. ? 15:11, Matthew Knepley >: > On Fri, Sep 3, 2021 at 8:02 AM Mark Adams > wrote: > > > On Fri, Sep 3, 2021 at 1:57 AM Viktor Nazdrachev > wrote: > Hello, Lawrence! > Thank you for your response! > I attached log files (txt files with convergence behavior and RAM usage log in separate txt files) and resulting table with convergence investigation data(xls). Data for main non-regular grid with 500K cells and heterogeneous properties are in 500K folder, whereas data for simple uniform 125K cells grid with constant properties are in 125K folder. > > >> On 1 Sep 2021, at 09:42, ????????? ?????? > wrote: > >> > >> I have a 3D elasticity problem with heterogeneous properties. > > > >What does your coefficient variation look like? How large is the contrast? > > Young modulus varies from 1 to 10 GPa, Poisson ratio varies from 0.3 to 0.44 and density ? from 1700 to 2600 kg/m^3. > > That is not too bad. Poorly shaped elements are the next thing to worry about. Try to keep the aspect ratio below 10 if possible. > > > > >> There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs). > >> > >> The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased. > > > >How many iterations do you have in serial (and then in parallel)? > > Serial run is required 112 iterations to reach convergence (log_hpddm(bfbcg)_bjacobian_icc_1_mpi.txt), parallel run with 4 MPI ? 680 iterations. > > I attached log files for all simulations (txt files with convergence behavior and RAM usage log in separate txt files) and resulting table with convergence/memory usage data(xls). 
Data for main non-regular grid with 500K cells and heterogeneous properties are in 500K folder, whereas data for simple uniform 125K cells grid with constant properties are in 125K folder. > > > >> I`ve also tried PCGAMG (agg) preconditioner with IC? (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB. > > > >Does the number of iterates increase in parallel? Again, how many iterations do you have? > > For case with 4 MPI processes and attached nullspace it is required 177 iterations to reach convergence (you may see detailed log in log_hpddm(bfbcg)_gamg_nearnullspace_4_mpi.txt). For comparison, 90 iterations are required for sequential run(log_hpddm(bfbcg)_gamg_nearnullspace_1_mpi.txt). > > Again, do not use ICC. I am surprised to see such a large jump in iteration count, but get ICC off the table. > > You will see variability in the iteration count with processor count with GAMG. As much as 10% +-. Maybe more (random) variability , but usually less. > > You can decrease the memory a little, and the setup time a lot, by aggressively coarsening, at the expense of higher iteration counts. It's a balancing act. > > You can run with the defaults, add '-info', grep on GAMG and send the ~30 lines of output if you want advice on parameters. > > Can you send the output of > > -ksp_view -ksp_monitor_true_residual -ksp_converged_reason > > Thanks, > > Matt > > Thanks, > Mark > > > > > >> Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)? > > > >bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. > > Thanks for idea, but, unfortunately, ML cannot be compiled with 64bit integers (It is extremely necessary to perform computation on mesh with more than 10M cells). > > > >If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS ) then you could try the geneo scheme offered by PCHPDDM. > > > I found strange convergence behavior for HPDDM preconditioner. For 1 MPI process BFBCG solver did not converged (log_hpddm(bfbcg)_pchpddm_1_mpi.txt), while for 4 MPI processes computation was successful (1018 to reach convergence, log_hpddm(bfbcg)_pchpddm_4_mpi.txt). > But it should be mentioned that stiffness matrix was created in AIJ format (our default matrix format in program). > Matrix conversion to MATIS format via MatConvert subroutine resulted in losing of convergence for both serial and parallel run. > > >> Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it? 
> > > >I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. > > Thanks, I`ll try to use a strong threshold only for coarse grids. > > Kind regards, > > Viktor Nazdrachev > > R&D senior researcher > > Geosteering Technologies LLC > > > > > > ??, 1 ????. 2021 ?. ? 12:02, Lawrence Mitchell >: > > > > On 1 Sep 2021, at 09:42, ????????? ?????? > wrote: > > > > I have a 3D elasticity problem with heterogeneous properties. > > What does your coefficient variation look like? How large is the contrast? > > > There is unstructured grid with aspect ratio varied from 4 to 25. Zero Dirichlet BCs are imposed on bottom face of mesh. Also, Neumann (traction) BCs are imposed on side faces. Gravity load is also accounted for. The grid I use consists of 500k cells (which is approximately 1.6M of DOFs). > > > > The best performance and memory usage for single MPI process was obtained with HPDDM(BFBCG) solver and bjacobian + ICC (1) in subdomains as preconditioner, it took 1 m 45 s and RAM 5.0 GB. Parallel computation with 4 MPI processes took 2 m 46 s when using 5.6 GB of RAM. This because of number of iterations required to achieve the same tolerance is significantly increased. > > How many iterations do you have in serial (and then in parallel)? > > > I`ve also tried PCGAMG (agg) preconditioner with IC? (1) sub-precondtioner. For single MPI process, the calculation took 10 min and 3.4 GB of RAM. To improve the convergence rate, the nullspace was attached using MatNullSpaceCreateRigidBody and MatSetNearNullSpace subroutines. This has reduced calculation time to 3 m 58 s when using 4.3 GB of RAM. Also, there is peak memory usage with 14.1 GB, which appears just before the start of the iterations. Parallel computation with 4 MPI processes took 2 m 53 s when using 8.4 GB of RAM. In that case the peak memory usage is about 22 GB. > > Does the number of iterates increase in parallel? Again, how many iterations do you have? > > > Are there ways to avoid decreasing of the convergence rate for bjacobi precondtioner in parallel mode? Does it make sense to use hierarchical or nested krylov methods with a local gmres solver (sub_pc_type gmres) and some sub-precondtioner (for example, sub_pc_type bjacobi)? > > bjacobi is only a one-level method, so you would not expect process-independent convergence rate for this kind of problem. If the coefficient variation is not too extreme, then I would expect GAMG (or some other smoothed aggregation package, perhaps -pc_type ml (you need --download-ml)) would work well with some tuning. > > If you have extremely high contrast coefficients you might need something with stronger coarse grids. If you can assemble so-called Neumann matrices (https://petsc.org/release/docs/manualpages/Mat/MATIS.html#MATIS ) then you could try the geneo scheme offered by PCHPDDM. > > > Is this peak memory usage expected for gamg preconditioner? is there any way to reduce it? > > I think that peak memory usage comes from building the coarse grids. Can you run with `-info` and grep for GAMG, this will provide some output that more expert GAMG users can interpret. > > Lawrence > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Sep 6 20:48:25 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 6 Sep 2021 21:48:25 -0400 Subject: [petsc-users] Mat preallocation for SNES jacobian [WAS Re: Mat preallocation in case of variable stencils] In-Reply-To: References: <87k0k1zl2y.fsf@jedbrown.org> <9f505269-c8d6-0847-cf05-019b36ae1aee@uninsubria.it> Message-ID: <9EE66D0B-604F-4683-A42A-D487223DEE0B@petsc.dev> Matteo, I think it might be best if you simply took our "standard" DMCreateMatrix for DMDA routine and modified exactly for your needs. You can find the source code in src/dm/impls/da/fdda.c There are a variety of routines such as DMCreateMatrix_DA_2d_MPIAIJ_Fill() DMCreateMatrix_DA_2d_MPIAIJ() etc. Pick the one that matches your needs, copy it and modify to do the exact preallocation and then filling with zeros for interior points, boundary points etc. I don't think "fixing" the incorrect default behavior after the fact will work. Barry > On Sep 6, 2021, at 7:34 PM, Matthew Knepley wrote: > > On Mon, Sep 6, 2021 at 12:22 PM Matteo Semplice > wrote: > > Il 31/08/21 17:32, Jed Brown ha scritto: > > Matteo Semplice > writes: > > > >> Hi. > >> > >> We are writing a code for a FD scheme on an irregular domain and thus > >> the local stencil is quite variable: we have inner nodes, boundary nodes > >> and inactive nodes, each with their own stencil type and offset with > >> respect to the grid node. We currently create a matrix with > >> DMCreateMatrix on a DMDA and for now have set the option > >> MAT_NEW_NONZERO_LOCATIONS to PETSC_TRUE, but its time to render the code > >> memory-efficient. The layout created automatically is correct for inner > >> nodes, wrong for boundary ones (off-centered stencils) and redundant for > >> outer nodes. > >> > >> After the preprocessing stage (including stencil creation) we'd be in > >> position to set the nonzero pattern properly. > >> > >> Do we need to start from a Mat created by CreateMatrix? Or is it ok to > >> call DMCreateMatrix (so that the splitting among CPUs and the block size > >> are set by PETSc) and then call a MatSetPreallocation routine? > > You can call MatXAIJSetPreallocation after. It'll handle all matrix types so you don't have to shepherd data for all the specific preallocations. > > Hi. > > Actually I am still struggling with this... Let me explain. > > My code relies on a SNES and the matrix I need to preallocate is the > Jacobian. So I do: > > in the main file > ierr = DMCreateMatrix(ctx.daAll,&ctx.J);CHKERRQ(ierr); > ierr = setJacobianPattern(ctx,ctx.J);CHKERRQ(ierr); //calling > MatXAIJSetPreallocation on the second argument > > I do not understand this. DMCreateMatrix() has already preallocated _and_ filled the matrix > with zeros. Additional preallocation statements will not do anything (I think). > > ierr = MatSetOption(ctx.J,MAT_NEW_NONZERO_LOCATIONS,*******); > CHKERRQ(ierr);//allow new nonzeros? 
> > ierr = SNESSetFunction(snes,ctx.F ,FormFunction,(void *) &ctx); > CHKERRQ(ierr); > ierr = SNESSetJacobian(snes,ctx.J,ctx.J,FormJacobian,(void *) &ctx); > CHKERRQ(ierr); > > ierr = FormSulfationRHS(ctx, ctx.U0, ctx.RHS);CHKERRQ(ierr); > ierr = SNESSolve(snes,ctx.RHS,ctx.U); CHKERRQ(ierr); > > and > > PetscErrorCode FormJacobian(SNES snes,Vec U,Mat J, Mat P,void *_ctx) > > does (this is a 2 dof finite difference problem, so logically 2x2 blocks > in J) > > ierr = setJac00(....,P) //calls to MatSetValues in the 00 block > > ierr = setJac01(....,P) //calls to MatSetValues in the 01 block > > ierr = setJac1X(....,P) //calls to MatSetValues in the 10 ans 11 block > > ierr = MatAssemblyBegin(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > ierr = MatAssemblyEnd(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); > > If I run with MAT_NEW_NONZERO_LOCATIONS=TRUE, all runs fine and using > the -info option I see that no mallocs are performed during Assembly. > > Computing F > 0 SNES Function norm 7.672682917097e+02 > Computing J > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage space: > 17661 unneeded,191714 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 27 > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows > 0)/(num_localrows 71874) < 0.6. Do not use CompressedRow routines. > > If I omit the call to setJacobianPattern, info reports a nonsero number > of mallocs, so somehow the setJacobianPattern routine should be doing > its job correctly. > > Hmm, this might mean that the second preallocation call is wiping out the info in the first. Okay, > I will go back and look at the code. > > However, if I run with MAT_NEW_NONZERO_LOCATIONS=FALSE, the Jacobian is > entirely zero and no error messages appear until the KSP tries to do its > job: > > This sounds like your setJacobianPattern() is not filling the matrix with zeros, so that the insertions > make new nonzeros. It is hard to make sense of this string of events without the code. > > Computing F > 0 SNES Function norm 7.672682917097e+02 > Computing J > [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage space: > 209375 unneeded,0 used > [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during MatSetValues() is 0 > [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 > [0] MatCheckCompressedRow(): Found the ratio (num_zerorows > 71874)/(num_localrows 71874) > 0.6. Use CompressedRow routines. > ... and then KSP complains! > > I have tried adding MAT_FLUSH_ASSEMBLY calls inside the subroutines, but > nothing changes. > > So I have 2 questions: > > 1. If, as a temporary solution, I leave MAT_NEW_NONZERO_LOCATIONS=TRUE, > am I going to incur in performance penalties even if no new nonzeros are > created by my assembly routine? > > If you are worried about performance, I think the option you want is > > MAT_NEW_NONZERO_ALLOCATION_ERR > > since allocations, not new nonzeros, are what causes performance problems. > > Thanks, > > Matt > > 2. Can you guess what is causing the problem? > > Thanks > > Matteo > > > > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From matteo.semplice at uninsubria.it Tue Sep 7 03:01:39 2021 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Tue, 7 Sep 2021 10:01:39 +0200 Subject: [petsc-users] Mat preallocation for SNES jacobian [WAS Re: Mat preallocation in case of variable stencils] In-Reply-To: <9EE66D0B-604F-4683-A42A-D487223DEE0B@petsc.dev> References: <87k0k1zl2y.fsf@jedbrown.org> <9f505269-c8d6-0847-cf05-019b36ae1aee@uninsubria.it> <9EE66D0B-604F-4683-A42A-D487223DEE0B@petsc.dev> Message-ID: Thanks to both of you! @Matthew: ??? Indeed my setJacobianPattern() makes the calls to MatXAIJSetPreallocation, but does not insert zeros in the matrix. ??? I had missed that actual insertions of zeros was needed before calling SNESSolve. @Barry: ??? Good idea: I'll study your DMCreateMatrix routines. Thanks ??? Matteo Il 07/09/21 03:48, Barry Smith ha scritto: > ? Matteo, > > ? ?I think it might be best if you simply took our "standard" > DMCreateMatrix for DMDA routine and modified exactly for your needs. > You can find the source code in src/dm/impls/da/fdda.c There are a > variety of routines such > as?DMCreateMatrix_DA_2d_MPIAIJ_Fill()?DMCreateMatrix_DA_2d_MPIAIJ() etc. > > ? ? Pick the one that matches your needs, copy it and modify to do the > exact preallocation and then filling with zeros for interior points, > boundary points etc. I don't think "fixing" the incorrect default > behavior after the fact will work. > > ? Barry > > >> On Sep 6, 2021, at 7:34 PM, Matthew Knepley > > wrote: >> >> On Mon, Sep 6, 2021 at 12:22 PM Matteo Semplice >> > > wrote: >> >> >> Il 31/08/21 17:32, Jed Brown ha scritto: >> > Matteo Semplice > > writes: >> > >> >> Hi. >> >> >> >> We are writing a code for a FD scheme on an irregular domain >> and thus >> >> the local stencil is quite variable: we have inner nodes, >> boundary nodes >> >> and inactive nodes, each with their own stencil type and >> offset with >> >> respect to the grid node. We currently create a matrix with >> >> DMCreateMatrix on a DMDA and for now have set the option >> >> MAT_NEW_NONZERO_LOCATIONS to PETSC_TRUE, but its time to >> render the code >> >> memory-efficient. The layout created automatically is correct >> for inner >> >> nodes, wrong for boundary ones (off-centered stencils) and >> redundant for >> >> outer nodes. >> >> >> >> After the preprocessing stage (including stencil creation) >> we'd be in >> >> position to set the nonzero pattern properly. >> >> >> >> Do we need to start from a Mat created by CreateMatrix? Or is >> it ok to >> >> call DMCreateMatrix (so that the splitting among CPUs and the >> block size >> >> are set by PETSc) and then call a MatSetPreallocation routine? >> > You can call MatXAIJSetPreallocation after. It'll handle all >> matrix types so you don't have to shepherd data for all the >> specific preallocations. >> >> Hi. >> >> Actually I am still struggling with this... Let me explain. >> >> My code relies on a SNES and the matrix I need to preallocate is the >> Jacobian. So I do: >> >> in the main file >> ?? ierr = DMCreateMatrix(ctx.daAll,&ctx.J);CHKERRQ(ierr); >> ?? ierr = setJacobianPattern(ctx,ctx.J);CHKERRQ(ierr); //calling >> MatXAIJSetPreallocation on the second argument >> >> >> I do not understand this. DMCreateMatrix() has already preallocated >> _and_ filled the matrix >> with zeros. Additional preallocation statements will not do anything >> (I think). >> >> ?? ierr = MatSetOption(ctx.J,MAT_NEW_NONZERO_LOCATIONS,*******); >> CHKERRQ(ierr);//allow new nonzeros? >> >> >> ?? 
ierr = SNESSetFunction(snes,ctx.F ,FormFunction,(void *) &ctx); >> CHKERRQ(ierr); >> ?? ierr = SNESSetJacobian(snes,ctx.J,ctx.J,FormJacobian,(void *) >> &ctx); >> CHKERRQ(ierr); >> >> ?? ierr = FormSulfationRHS(ctx, ctx.U0, ctx.RHS);CHKERRQ(ierr); >> ?? ierr = SNESSolve(snes,ctx.RHS,ctx.U); CHKERRQ(ierr); >> >> and >> >> PetscErrorCode FormJacobian(SNES snes,Vec U,Mat J, Mat P,void *_ctx) >> >> does (this is a 2 dof finite difference problem, so logically 2x2 >> blocks >> in J) >> >> ???? ierr = setJac00(....,P) //calls to MatSetValues in the 00 block >> >> ???? ierr = setJac01(....,P) //calls to MatSetValues in the 01 block >> >> ???? ierr = setJac1X(....,P) //calls to MatSetValues in the 10 >> ans 11 block >> >> ???? ierr = MatAssemblyBegin(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> ?? ? ierr = MatAssemblyEnd(P,MAT_FINAL_ASSEMBLY);CHKERRQ(ierr); >> >> If I run with MAT_NEW_NONZERO_LOCATIONS=TRUE, all runs fine and >> using >> the -info option I see that no mallocs are performed during Assembly. >> >> Computing F >> ?? 0 SNES Function norm 7.672682917097e+02 >> Computing J >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage >> space: >> 17661 unneeded,191714 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 27 >> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows >> 0)/(num_localrows 71874) < 0.6. Do not use CompressedRow routines. >> >> If I omit the call to setJacobianPattern, info reports a nonsero >> number >> of mallocs, so somehow the setJacobianPattern routine should be >> doing >> its job correctly. >> >> >> Hmm, this might mean?that the second preallocation call is wiping out >> the info in the first. Okay, >> I will go back and look at the code. >> >> However, if I run with MAT_NEW_NONZERO_LOCATIONS=FALSE, the >> Jacobian is >> entirely zero and no error messages appear until the KSP tries to >> do its >> job: >> >> >> This sounds like your setJacobianPattern() is not filling the matrix >> with zeros, so that the insertions >> make new nonzeros. It is hard to make sense of this string of events >> without the code. >> >> Computing F >> ?? 0 SNES Function norm 7.672682917097e+02 >> Computing J >> [0] MatAssemblyEnd_SeqAIJ(): Matrix size: 71874 X 71874; storage >> space: >> 209375 unneeded,0 used >> [0] MatAssemblyEnd_SeqAIJ(): Number of mallocs during >> MatSetValues() is 0 >> [0] MatAssemblyEnd_SeqAIJ(): Maximum nonzeros in any row is 0 >> [0] MatCheckCompressedRow(): Found the ratio (num_zerorows >> 71874)/(num_localrows 71874) > 0.6. Use CompressedRow routines. >> ... and then KSP complains! >> >> I have tried adding MAT_FLUSH_ASSEMBLY calls inside the >> subroutines, but >> nothing changes. >> >> So I have 2 questions: >> >> 1. If, as a temporary solution, I leave >> MAT_NEW_NONZERO_LOCATIONS=TRUE, >> am I going to incur in performance penalties even if no new >> nonzeros are >> created by my assembly routine? >> >> >> If you are worried about performance, I think the option you want is >> >> ??MAT_NEW_NONZERO_ALLOCATION_ERR >> >> since allocations, not new nonzeros, are what causes performance >> problems. >> >> ? Thanks, >> >> ? ? ?Matt >> >> 2. Can you guess what is causing the problem? >> >> Thanks >> >> ???? Matteo >> >> >> >> >> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > -- Prof. Matteo Semplice Universit? degli Studi dell?Insubria Dipartimento di Scienza e Alta Tecnologia ? DiSAT Professore Associato Via Valleggio, 11 ? 22100 Como (CO) ? Italia tel.: +39 031 2386316 -------------- next part -------------- An HTML attachment was scrubbed... URL: From gsharma4189 at gmail.com Wed Sep 8 05:27:57 2021 From: gsharma4189 at gmail.com (govind sharma) Date: Wed, 8 Sep 2021 15:57:57 +0530 Subject: [petsc-users] petsc_with_c++_python Message-ID: Hi, I need a 2d poisson solver such that I can use petsc objects between C++ and python interactively. I explain this way: -->1 Let's 2D poisson solver written in C++ with petsc -->2 With python notebook interact with solve in parallel with petscpy --> IPython parallel I have seen an example of doing this type by Aron group with C but I find it incomprehensible to understand. Can I get some clarification on this? How do I proceed? Regards, Govind -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Sep 8 08:49:44 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 8 Sep 2021 09:49:44 -0400 Subject: [petsc-users] petsc_with_c++_python In-Reply-To: References: Message-ID: <7A0C8D8E-E0DE-4640-A8AB-25633FA42AA1@petsc.dev> This should be possible, not really different than using C (instead of C++). Please send the example by Aron group and we may have suggestions. Barry > On Sep 8, 2021, at 6:27 AM, govind sharma wrote: > > Hi, > > I need a 2d poisson solver such that I can use petsc objects between C++ and python interactively. I explain this way: > > -->1 Let's 2D poisson solver written in C++ with petsc > -->2 With python notebook interact with solve in parallel with petscpy > --> IPython parallel > > I have seen an example of doing this type by Aron group with C but I find it incomprehensible to understand. > > Can I get some clarification on this? How do I proceed? > > > Regards, > Govind From facklerpw at ornl.gov Wed Sep 8 09:59:20 2021 From: facklerpw at ornl.gov (Fackler, Philip) Date: Wed, 8 Sep 2021 14:59:20 +0000 Subject: [petsc-users] Redirecting petsc output Message-ID: Is there a way to customize how petsc writes information? Instead of writing to stdout (for example: 0 TS dt 0.1 time 0.), what if we want to log that message to a file other output from Xolotl? I'm assuming there are multiple ways of getting this result. What's common practice with petsc folks? Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From gsharma4189 at gmail.com Wed Sep 8 10:12:02 2021 From: gsharma4189 at gmail.com (govind sharma) Date: Wed, 8 Sep 2021 20:42:02 +0530 Subject: [petsc-users] petsc_with_c++_python In-Reply-To: <7A0C8D8E-E0DE-4640-A8AB-25633FA42AA1@petsc.dev> References: <7A0C8D8E-E0DE-4640-A8AB-25633FA42AA1@petsc.dev> Message-ID: Thanks Smith, https://github.com/pyHPC/pyhpc-tutorial/blob/master/examples/scale/2D%20Cavity%20Flow%20using%20petsc4py.ipynb This is actually a repository. Yes, We can work. Govind On Wed, 8 Sep 2021, 7:19 pm Barry Smith, wrote: > > This should be possible, not really different than using C (instead of > C++). Please send the example by Aron group and we may have suggestions. 
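Since the original request was a 2D Poisson solver driven from Python, a much smaller starting point than the cavity-flow notebook might look like the sketch below: a 5-point Laplacian with homogeneous Dirichlet conditions folded into the stencil, a constant right-hand side, and CG with GAMG. Grid size and source term are invented; run it under mpiexec for the parallel case.

from petsc4py import PETSc

n = 32                                           # invented grid size
h2 = 1.0 / (n + 1) ** 2
A = PETSc.Mat().createAIJ([n * n, n * n])
A.setPreallocationNNZ(5)

rstart, rend = A.getOwnershipRange()
for row in range(rstart, rend):                  # each rank fills only its own rows
    i, j = divmod(row, n)
    A.setValue(row, row, 4.0 / h2)
    for di, dj in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        ii, jj = i + di, j + dj
        if 0 <= ii < n and 0 <= jj < n:
            A.setValue(row, ii * n + jj, -1.0 / h2)
A.assemble()

b = A.createVecRight()
b.set(1.0)                                       # constant source term
x = b.duplicate()

ksp = PETSc.KSP().create()
ksp.setOperators(A)
ksp.setType('cg')
ksp.getPC().setType('gamg')
ksp.setFromOptions()                             # -ksp_type, -pc_type etc. still override
ksp.solve(b, x)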
> > Barry > > > > On Sep 8, 2021, at 6:27 AM, govind sharma wrote: > > > > Hi, > > > > I need a 2d poisson solver such that I can use petsc objects between C++ > and python interactively. I explain this way: > > > > -->1 Let's 2D poisson solver written in C++ with petsc > > -->2 With python notebook interact with solve in parallel with petscpy > > --> IPython parallel > > > > I have seen an example of doing this type by Aron group with C but I > find it incomprehensible to understand. > > > > Can I get some clarification on this? How do I proceed? > > > > > > Regards, > > Govind > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Sep 8 10:16:27 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 8 Sep 2021 11:16:27 -0400 Subject: [petsc-users] Redirecting petsc output In-Reply-To: References: Message-ID: <782F63EE-1821-4607-8D80-77C543E0ACF4@petsc.dev> Philip, There a variety of techniques. Some of the command line options take an optional viewer name where the output can be redirected. For example -ts_monitor ascii:filename or -ts_view ascii:filename see https://petsc.org/release/docs/manualpages/Viewer/PetscOptionsGetViewer.html for more details It is also possible to change all stdout from PETSc to a different file by setting PETSC_STDOUT = fopen(...) Barry > On Sep 8, 2021, at 10:59 AM, Fackler, Philip via petsc-users wrote: > > Is there a way to customize how petsc writes information? Instead of writing to stdout (for example: 0 TS dt 0.1 time 0.), what if we want to log that message to a file other output from Xolotl? I'm assuming there are multiple ways of getting this result. What's common practice with petsc folks? > > Thanks, > > Philip Fackler > Research Software Engineer, Application Engineering Group > Advanced Computing Systems Research Section > Computer Science and Mathematics Division > Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Wed Sep 8 10:24:27 2021 From: facklerpw at ornl.gov (Fackler, Philip) Date: Wed, 8 Sep 2021 15:24:27 +0000 Subject: [petsc-users] [EXTERNAL] Re: Redirecting petsc output In-Reply-To: <782F63EE-1821-4607-8D80-77C543E0ACF4@petsc.dev> References: <782F63EE-1821-4607-8D80-77C543E0ACF4@petsc.dev> Message-ID: Barry, Thanks for the quick reply! I'll try those out. Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Barry Smith Sent: Wednesday, September 8, 2021 11:16 To: Fackler, Philip Cc: xolotl-psi-development at lists.sourceforge.net ; petsc-users at mcs.anl.gov Subject: [EXTERNAL] Re: [petsc-users] Redirecting petsc output Philip, There a variety of techniques. Some of the command line options take an optional viewer name where the output can be redirected. For example -ts_monitor ascii:filename or -ts_view ascii:filename see https://petsc.org/release/docs/manualpages/Viewer/PetscOptionsGetViewer.html for more details It is also possible to change all stdout from PETSc to a different file by setting PETSC_STDOUT = fopen(...) Barry On Sep 8, 2021, at 10:59 AM, Fackler, Philip via petsc-users > wrote: Is there a way to customize how petsc writes information? Instead of writing to stdout (for example: 0 TS dt 0.1 time 0.), what if we want to log that message to a file other output from Xolotl? 
I'm assuming there are multiple ways of getting this result. What's common practice with petsc folks? Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From gsharma4189 at gmail.com Wed Sep 8 13:38:02 2021 From: gsharma4189 at gmail.com (govind sharma) Date: Thu, 9 Sep 2021 00:08:02 +0530 Subject: [petsc-users] petsc_with_c++_python In-Reply-To: References: <7A0C8D8E-E0DE-4640-A8AB-25633FA42AA1@petsc.dev> Message-ID: Hi Smith, This would be quite okay if we take poisson 2d equation as a starting point and then we can move forward. Govind On Wed, 8 Sep 2021, 8:42 pm govind sharma, wrote: > Thanks Smith, > > > https://github.com/pyHPC/pyhpc-tutorial/blob/master/examples/scale/2D%20Cavity%20Flow%20using%20petsc4py.ipynb > > > This is actually a repository. Yes, We can work. > > > Govind > > On Wed, 8 Sep 2021, 7:19 pm Barry Smith, wrote: > >> >> This should be possible, not really different than using C (instead >> of C++). Please send the example by Aron group and we may have suggestions. >> >> Barry >> >> >> > On Sep 8, 2021, at 6:27 AM, govind sharma >> wrote: >> > >> > Hi, >> > >> > I need a 2d poisson solver such that I can use petsc objects between >> C++ and python interactively. I explain this way: >> > >> > -->1 Let's 2D poisson solver written in C++ with petsc >> > -->2 With python notebook interact with solve in parallel with petscpy >> > --> IPython parallel >> > >> > I have seen an example of doing this type by Aron group with C but I >> find it incomprehensible to understand. >> > >> > Can I get some clarification on this? How do I proceed? >> > >> > >> > Regards, >> > Govind >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Sep 10 05:40:13 2021 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 10 Sep 2021 06:40:13 -0400 Subject: [petsc-users] this is what mangers are for ... Message-ID: Hi Sherry, Here is a request that you do not see every day. Can you please verify that the back of your badge says this for me? I need a badge for access to Fugaku and my badge is old and unreadable. They apparently grabbed this text from some other LBNL badge and want to verify. Thanks, Mark "EMERGENCY STATUS ANNOUNCEMENT IF FOUND,DROP IN ANY MAILBOX,POSTMASTER,POSTAGE GUARANTEED. RETURN TO: LAWRENCE BERKELEY NATIONAL LABORATORY ONE CYCLOTRON ROAD BERKELEY, CALIFORNIA 94720 THIS CREDENTIAL IS THE PROPERTY OF THE U.S. GOVERNMENT. THE COUNTERFEIT, ALTERATION, OR MISUSE IS A VIOLATION OF SECTION 499 AND 701, TITLE 18,UNITED STATES CODE. OPERATED UNDER CONTRACT NO. DE-AC03-76SF00098 WITH THE U.S.DEPARTMENT OF ENERGY" -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Sep 10 07:23:47 2021 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 10 Sep 2021 08:23:47 -0400 Subject: [petsc-users] this is what mangers are for ... In-Reply-To: References: Message-ID: Whoops sorry, wrong email. On Fri, Sep 10, 2021 at 6:40 AM Mark Adams wrote: > Hi Sherry, > Here is a request that you do not see every day. > Can you please verify that the back of your badge says this for me? > I need a badge for access to Fugaku and my badge is old and unreadable. > They apparently grabbed this text from some other LBNL badge and want to > verify. 
> Thanks, > Mark > > "EMERGENCY STATUS ANNOUNCEMENT > IF FOUND,DROP IN ANY MAILBOX,POSTMASTER,POSTAGE GUARANTEED. RETURN TO: > LAWRENCE BERKELEY NATIONAL LABORATORY ONE CYCLOTRON ROAD BERKELEY, > CALIFORNIA 94720 > THIS CREDENTIAL IS THE PROPERTY OF THE U.S. GOVERNMENT. THE COUNTERFEIT, > ALTERATION, OR MISUSE IS A VIOLATION OF SECTION 499 AND 701, TITLE > 18,UNITED STATES CODE. > OPERATED UNDER CONTRACT NO. DE-AC03-76SF00098 WITH THE U.S.DEPARTMENT OF > ENERGY" > -------------- next part -------------- An HTML attachment was scrubbed... URL: From badi.hamid at gmail.com Fri Sep 10 10:55:32 2021 From: badi.hamid at gmail.com (Hamid) Date: Fri, 10 Sep 2021 17:55:32 +0200 Subject: [petsc-users] Petsc with Visual Studio Message-ID: <9844C996-71D9-4B45-A30B-0013BE78F2B5@hxcore.ol> An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Fri Sep 10 11:02:22 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Fri, 10 Sep 2021 11:02:22 -0500 (CDT) Subject: [petsc-users] Petsc with Visual Studio In-Reply-To: <9844C996-71D9-4B45-A30B-0013BE78F2B5@hxcore.ol> References: <9844C996-71D9-4B45-A30B-0013BE78F2B5@hxcore.ol> Message-ID: At some point we had some un-maintainable cmake code [that didn't work well on windows anyway and also relied on cygwin build env] - and it was removed. build of PETSc with either Intel or MS compiler is the same process [using cygwin env] Satish On Fri, 10 Sep 2021, Hamid wrote: > > Hi everybody, > > ? > > I already compiled Petsc (without MUMPS) with Intel compilers under cygwin env, it was a real pain. > > I recently tried to compile petsc with Visual using the native compiler. > > First of all, i compiled?METIS, MUMPS (with Metis only cause Scotch compilation is tricky for me) and OpenBLAS. > > When it ?comes to Petsc, the compilation process is quite hard, the configuration stage does a lot of things? Do you possible to cmake the project?? > > Is there someone who already did stuff in that way?? > > ? > > ? > > Best regards. > > ? > > Envoy? ? partir de Courrier pour Windows > > ? > > > [icon-envelope-tick-round-orange-animated-no-repeat-v1.gif] > Garanti sans virus. www.avast.com > > From bsmith at petsc.dev Fri Sep 10 11:50:03 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 10 Sep 2021 12:50:03 -0400 Subject: [petsc-users] Ainsworth formula to solve saddle point problems / preconditioner for shell matrices In-Reply-To: References: <61b8dbda-c2c4-d834-9ef9-e12c5254fb31@cea.fr> <87mu15u6kx.fsf@jedbrown.org> <5504dd4c-1846-7652-a0d2-3dc955ab20df@cea.fr> <886ADC82-ED26-448E-8B3B-5EE483AEC58F@petsc.dev> <358AC9C4-8D8E-40EE-845D-0B124D03060D@petsc.dev> <7b2d0bd6-b31b-42ff-f9fc-fb359a59549f@cea.fr> <87tuv48osv.fsf@jedbrown.org> <2B8B302F-D823-4160-B674-B3DAE78E6363@petsc.dev> <218E7696-2A50-42A3-8CF2-D58FCC17B855@petsc.dev> <15e6d4f0-8678-d43a-22d9-9818c51072e3@cea.fr> <12d9be6f-1aab-5773-f73d-a00f9106be45@cea.fr> <50af8386-6ccb-b6d4-ea2d-6d0ee68bc3f7@cea.fr> <122027EE-C4EC-4DCC-832A-ED0FBD092B30@petsc.dev> Message-ID: <6E293530-C42C-4788-876D-FE88B973514C@petsc.dev> Olivier, I believe the code is ready for your use. MatInvertVariableBlockEnvelope() will take an MPIAIJ matrix (this would be your C*C' matrix, determine the block diagonal envelope automatically and construct a new MPIAIJ matrix that is the inverse of the block diagonal envelope (by inverting each of the blocks). For small blocks, as is your case, the inverse should inexpensive. 
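Not the routine in the branch, just the structure it exploits, shown on a tiny dense stand-in with invented block sizes: when C*C^T is block diagonal with small blocks, its inverse is obtained by inverting each block independently, so inv(C*C^T) keeps the same sparsity.

import numpy as np

blocks = [np.array([[2.0]]),                     # invented constraint couplings
          np.array([[3.0, 1.0], [1.0, 2.0]])]
offsets = np.cumsum([0] + [b.shape[0] for b in blocks])

D = np.zeros((offsets[-1], offsets[-1]))         # stand-in for C*C^T
iD = np.zeros_like(D)
for b, lo, hi in zip(blocks, offsets[:-1], offsets[1:]):
    D[lo:hi, lo:hi] = b
    iD[lo:hi, lo:hi] = np.linalg.inv(b)          # cheap: blocks are 1x1, 2x2, ...

assert np.allclose(iD @ D, np.eye(D.shape[0]))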
I have a test code src/mat/tests/ex178.c that demonstrates its use and tests if for some random blocking across multiple processes. Please let me know if you have any difficulties. The branch is the same name as before barry/2020-10-08/invert-block-diagonal-aij but it has been rebased so you will need to delete your local copy of the branch and then recreate it after git fetch. Barry > On Sep 8, 2021, at 3:09 PM, Olivier Jamond wrote: > > Ok thanks a lot! I look forward to hearing from you! > > Best regards, > Olivier > > On 08/09/2021 20:56, Barry Smith wrote: >> >> Olivier, >> >> I'll refresh my memory on this and see if I can finish it up. >> >> Barry >> >> >>> On Sep 2, 2021, at 12:38 PM, Olivier Jamond > wrote: >>> >>> Dear Barry, >>> >>> First I hope that you and your relatives are doing well in these troubled times... >>> >>> I allow myself to come back to you about the subject of being able to compute something like '(C*Ct)^(-1)*C', where 'C' is a 'MPC' matrix that is used to impose some boundary conditions for a structural finite element problem: >>> >>> [K C^t][U]=[F] >>> [C 0 ][L] [D] >>> >>> as we discussed some time ago, I would like to solve such a problem using the Ainsworth method, which involves this '(C*Ct)^(-1)*C'. >>> >>> You kindly started some developments to help me on that, which worked as a 'proof of concept' in sequential, but not yet in parallel, and also kindly suggested that you could extend it to the parallel case (MR: https://gitlab.com/petsc/petsc/-/merge_requests/3544 ). Can this be still 'scheduled' on your side? >>> >>> Sorry again to "harass" you about that... >>> >>> Best regards, >>> Olivier Jamond >>> >>> On 03/02/2021 08:45, Olivier Jamond wrote: >>>> Dear Barry, >>>> >>>> I come back to you about this topic. As I wrote last time, this is not a "highly urgent" subject (whereas we will have to deal with it in the next months), but it is an important one (especially since the code I am working on just raised significantly its ambitions). So I just would like to check with you that this is still 'scheduled' on your side. >>>> >>>> I am sorry, I feel a little embarrassed about asking you again about your work schedule, but I need some kind of 'visibility' about this topic which will become critical for our application. >>>> >>>> Many thanks for helping me on that! >>>> Olivier >>>> >>>> On 02/12/2020 21:34, Barry Smith wrote: >>>>> >>>>> Sorry I have not gotten back to you quicker, give me a few days to see how viable it is. >>>>> >>>>> Barry >>>>> >>>>> >>>>>> On Nov 25, 2020, at 11:57 AM, Olivier Jamond > wrote: >>>>>> >>>>>> Hi Barry, >>>>>> >>>>>> I come back to you about the feature to unlock the Ainsworth method for saddle point problems in parallel. If I may ask (sorry for that...), is it still on your schedule (I just checked the branch, and it seems 'stuck')? >>>>>> >>>>>> This is not "highly urgent" on my side, but the ability to handle efficiently saddle point problems with iterative solvers will be a critical point for the software I am working on... >>>>>> >>>>>> Many thanks (and sorry again for asking about your work schedule...)! >>>>>> Olivier >>>>>> >>>>>> On 12/10/2020 16:49, Barry Smith wrote: >>>>>>> >>>>>>> >>>>>>>> On Oct 12, 2020, at 6:10 AM, Olivier Jamond > wrote: >>>>>>>> >>>>>>>> Hi Barry, >>>>>>>> >>>>>>>> Thanks for this work! I tried this branch with my code and sequential matrices on a small case: it does work! >>>>>>>> >>>>>>>> >>>>>>> Excellant. I will extend it to the parallel case and get it into our master release. 
>>>>>>> >>>>>>> We'd be interested in hearing about your convergence and timing experiences when you run largish jobs (even sequentially) since this type of problem comes up relatively frequently and we do need a variety of solvers that can handle it while currently we do not have great tools for it. >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>>> Thanks a lot, >>>>>>>> Olivier >>>>>>>> >>>>>>>> On 09/10/2020 03:50, Barry Smith wrote: >>>>>>>>> >>>>>>>>> Olivier, >>>>>>>>> >>>>>>>>> The branch barry/2020-10-08/invert-block-diagonal-aij contains an example src/mat/tests/ex178.c that shows how to compute inv(CC'). It works for SeqAIJ matrices. >>>>>>>>> >>>>>>>>> Please let us know if it works for you and then I will implement the parallel version. >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Oct 8, 2020, at 3:59 PM, Barry Smith > wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> Olivier >>>>>>>>>> >>>>>>>>>> I am working on extending the routines now and hopefully push a branch you can try fairly soon. >>>>>>>>>> >>>>>>>>>> Barry >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On Oct 8, 2020, at 3:07 PM, Jed Brown > wrote: >>>>>>>>>>> >>>>>>>>>>> Olivier Jamond > writes: >>>>>>>>>>> >>>>>>>>>>>>> Given the structure of C it seems you should just explicitly construct Sp and use GAMG (or other preconditioners, even a direct solver) directly on Sp. Trying to avoid explicitly forming Sp will give you a much slower performing solving for what benefit? If C was just some generic monster than forming Sp might be unrealistic but in your case CCt is is block diagonal with tiny blocks which means (C*Ct)^(-1) is block diagonal with tiny blocks (the blocks are the inverses of the blocks of (C*Ct)). >>>>>>>>>>>>> >>>>>>>>>>>>> Sp = Ct*C + Qt * S * Q = Ct*C + [I - Ct * (C*Ct)^(-1)*C] S [I - Ct * (C*Ct)^(-1)*C] >>>>>>>>>>>>> >>>>>>>>>>>>> [Ct * (C*Ct)^(-1)*C] will again be block diagonal with slightly larger blocks. >>>>>>>>>>>>> >>>>>>>>>>>>> You can do D = (C*Ct) with MatMatMult() then write custom code that zips through the diagonal blocks of D inverting all of them to get iD then use MatPtAP applied to C and iD to get Ct * (C*Ct)^(-1)*C then MatShift() to include the I then MatPtAP or MatRAR to get [I - Ct * (C*Ct)^(-1)*C] S [I - Ct * (C*Ct)^(-1)*C] then finally MatAXPY() to get Sp. The complexity of each of the Mat operations is very low because of the absurdly simple structure of C and its descendants. You might even be able to just use MUMPS to give you the explicit inv(C*Ct) without writing custom code to get iD. >>>>>>>>>>>> >>>>>>>>>>>> At this time, I didn't manage to compute iD=inv(C*Ct) without using >>>>>>>>>>>> dense matrices, what may be a shame because all matrices are sparse . Is >>>>>>>>>>>> it possible? >>>>>>>>>>>> >>>>>>>>>>>> And I get no idea of how to write code to manually zip through the >>>>>>>>>>>> diagonal blocks of D to invert them... >>>>>>>>>>> >>>>>>>>>>> You could use MatInvertVariableBlockDiagonal(), which should perhaps return a Mat instead of a raw array. >>>>>>>>>>> >>>>>>>>>>> If you have constant block sizes, MatInvertBlockDiagonalMat will return a Mat. >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>> >> -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aduarteg at utexas.edu Fri Sep 10 11:51:57 2021 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Fri, 10 Sep 2021 11:51:57 -0500 Subject: [petsc-users] DMDA matrices with one sided stencils Message-ID: Good afternoon, I have developed and validated some matrix operators using petsc with a structured dmda. Some of these operators use one-sided stencils at the boundaries, which, given the way the dmda uses the stencil width value, require me to increase the stencil width to accommodate more entries at the boundary if I want to avoid errors with the default options. This is very wasteful and affects my performance, since there are a lot of extra zeros corresponding to the inner points. What is the best way to improve this? I have read in some public threads the possibility of using a MatOption to allow us to put more entries into the matrix, but that does not allow me to use MatSetStencil? Alternatively, is there any way to use a larger stencil width and then trim the zero entries that were entered automatically? If there are any other solutions for this problem, please let me know. Thank you, -Alfredo Duarte -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Sep 10 12:52:29 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 10 Sep 2021 13:52:29 -0400 Subject: [petsc-users] Petsc with Visual Studio In-Reply-To: <9844C996-71D9-4B45-A30B-0013BE78F2B5@hxcore.ol> References: <9844C996-71D9-4B45-A30B-0013BE78F2B5@hxcore.ol> Message-ID: Please let us know of any difficulties that arose, we may be able to improve the process or the documentation to make the process less painful. Barry > On Sep 10, 2021, at 11:55 AM, Hamid wrote: > > Hi everybody, > > I already compiled PETSc (without MUMPS) with Intel compilers under the cygwin env; it was a real pain. > I recently tried to compile PETSc with Visual Studio using the native compiler. > First of all, I compiled METIS, MUMPS (with METIS only, because Scotch compilation is tricky for me) and OpenBLAS. > When it comes to PETSc, the compilation process is quite hard; the configuration stage does a lot of things. Is it possible to use CMake for the project? > Is there someone who has already done it that way? > > > Best regards. > > Sent from Mail for Windows -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Sep 10 13:04:17 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 10 Sep 2021 14:04:17 -0400 Subject: [petsc-users] DMDA matrices with one sided stencils In-Reply-To: References: Message-ID: I think the following should work for you. Create a "wide" DMDA and then call DMSetMatrixPreallocateOnly() on it, use this DMDA to create your matrix, this will ensure that only the entries you enter into the matrix are stored (so the extra "layers" of zeros will not appear in the matrix). The matrix vector products will then not use those extra entries and will be faster. Destroy the no longer needed wide DMDA. You can use MatSetValuesStencil() with this matrix. Now create your regular DMDA and use that to create your vectors and for needed DMGlobalToLocal etc. Barry > On Sep 10, 2021, at 12:51 PM, Alfredo J Duarte Gomez wrote: > > Good afternoon, > > I have developed and validated some matrix operators using petsc with a structured dmda.
> > Some of these operators use one-sided stencils at the boundaries, which following the way the dmda uses the stencil width value, requires me to increase the stencil width to accommodate more entries at the boundary only if I want to avoid errors with default options. > > This is very wasteful and affects my performance, since there are a lot of extra zeros corresponding to the inner points. > > What is the best way to improve this? > > I have read in some public threads the possibility of using MatOption to allow us to put more entries into the matrix, but that does not allow me to use MatSetStencil? > > Alternatively, is there any way to use a larger stencil width and then trim the zero entries that were entered automatically? > > If there are any other solutions for this problem, please let me know. > > Thank you, > > -Alfredo Duarte > > -- > Alfredo Duarte > Graduate Research Assistant > The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From gerhard.ungersback at uis.no Fri Sep 10 13:23:51 2021 From: gerhard.ungersback at uis.no (=?iso-8859-1?Q?Gerhard_Ungersb=E4ck?=) Date: Fri, 10 Sep 2021 18:23:51 +0000 Subject: [petsc-users] Combine DM and FFTW Message-ID: Hello, I want to solve a time dependent differential equation in 3D (Scalar field theory in Hamilton formulation) . The crucial part is that at some time steps I need to FFT the 3D grid. I have written a sequential code without petsc and now I would like to use petsc to get a parallel version. I worked through the examples and now I understand DMDACreate and also the FFTW examples. What is missing though is how I combine both! DMDACreate takes care of ghost points and periodic boundary conditions. As far as I know FFTW requires each process to have a slab of the grid to work. I know how to create this grid with DMDACreate. Normally I would proceed by creating a global vector by DMCreateGlobalVector. But this vector needs then to be linked with fftw arrays. How does this work? Or should I first allocate local memory for fftw and then somehow stitch them together to form a global petsc vector? Thanks best, gerhard -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 11 13:49:37 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 11 Sep 2021 14:49:37 -0400 Subject: [petsc-users] Combine DM and FFTW In-Reply-To: References: Message-ID: On Fri, Sep 10, 2021 at 2:26 PM Gerhard Ungersb?ck via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > > > I want to solve a time dependent differential equation in 3D (Scalar field > theory in Hamilton formulation) . > > The crucial part is that at some time steps I need to FFT the 3D grid. > > > I have written a sequential code without petsc and now I would like to use > petsc to get a parallel version. > > I worked through the examples and now I understand DMDACreate and also the > FFTW examples. > > > What is missing though is how I combine both! > > DMDACreate takes care of ghost points and periodic boundary conditions. As > far as I know FFTW requires each process to have a slab of the grid to > work. I know how to create this grid with DMDACreate. Normally I would > proceed by creating a global vector by DMCreateGlobalVector. But this > vector needs then to be linked with fftw arrays. How does this work? > > Or should I first allocate local memory for fftw and then somehow stitch > them together to form a global petsc vector? 
> It would be great if we had worked through an example with this, but I can only find a serial example: https://gitlab.com/petsc/petsc/-/blob/main/src/dm/tests/ex27.c I think you can use https://petsc.org/main/docs/manualpages/Mat/VecScatterFFTWToPetsc.html to go between FFTW and Petsc vectors, but I have not tried in parallel. Is the example in the right direction? Thanks, Matt > Thanks > > best, gerhard > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Sat Sep 11 13:55:49 2021 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 11 Sep 2021 14:55:49 -0400 Subject: [petsc-users] DMDA matrices with one sided stencils In-Reply-To: References: Message-ID: On Fri, Sep 10, 2021 at 2:04 PM Barry Smith wrote: > > I think the following should work for you. > > Create a "wide" DMDA and then call DMSetMatrixPreallocateOnly() > Or use -dm_preallocate_only Thanks, Matt > on it, use this DMDA to create your matrix, this will ensure that only the > entries you enter into the matrix are stored (so the extra "layers" of > zeros will not appear in the matrix). The matrix vector products will then > not use those extra entries and will be faster. Destroy the no longer > needed wide DMDA. You can use MatSetValuesStencil() with this matrix. > > Now create your regular DMDA and use that to create your vectors and > for needed DMGlobalToLocal etc. > > Barry > > > On Sep 10, 2021, at 12:51 PM, Alfredo J Duarte Gomez > wrote: > > Good afternoon, > > I have developed and validated some matrix operators using petsc with a > structured dmda. > > Some of these operators use one-sided stencils at the boundaries, which > following the way the dmda uses the stencil width value, requires me to > increase the stencil width to accommodate more entries at the boundary only > if I want to avoid errors with default options. > > This is very wasteful and affects my performance, since there are a lot of > extra zeros corresponding to the inner points. > > What is the best way to improve this? > > I have read in some public threads the possibility of using MatOption to > allow us to put more entries into the matrix, but that does not allow me to > use MatSetStencil? > > Alternatively, is there any way to use a larger stencil width and then > trim the zero entries that were entered automatically? > > If there are any other solutions for this problem, please let me know. > > Thank you, > > -Alfredo Duarte > > -- > Alfredo Duarte > Graduate Research Assistant > The University of Texas at Austin > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Sep 12 14:51:11 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 12 Sep 2021 15:51:11 -0400 Subject: [petsc-users] p4est error at NERSC Message-ID: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: application/octet-stream Size: 1351643 bytes Desc: not available URL: From bsmith at petsc.dev Sun Sep 12 19:27:56 2021 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 12 Sep 2021 20:27:56 -0400 Subject: [petsc-users] p4est error at NERSC In-Reply-To: References: Message-ID: <39EE145C-FA06-49B4-9AEE-3E2358961516@petsc.dev> You seem to have too many inconsistent modules loaded at the same time. It is picking up /usr/common/software/sles15_cgpu/hpcsdk/20.11/Linux_x86_64/20.11/compilers/lib/libgomp.so when it started with /usr/common/software/sles15_cgpu/gcc/8.3.0/lib/../lib64/libgomp.so Since you are using --with-mpi-dir=/usr/common/software/sles15_cgpu/openmpi/4.0.3/gcc you should not load the hpcsdk/20.11 module which contains the PGI compilers. Barry > On Sep 12, 2021, at 3:51 PM, Mark Adams wrote: > > > From stefano.zampini at gmail.com Mon Sep 13 03:02:40 2021 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Mon, 13 Sep 2021 11:02:40 +0300 Subject: [petsc-users] PETSC installation on Cray In-Reply-To: References: <4fd4fc04-9c46-51f5-2720-c682d0011224@dkrz.de> Message-ID: Enrico I have accidentally stepped on the same issue. You may want to check if it works with this branch https://gitlab.com/petsc/petsc/-/tree/stefanozampini/cray-arm Il giorno mar 2 mar 2021 alle ore 23:03 Barry Smith ha scritto: > > Please try the following. Make four files as below then compile each > with cc -c -o test.o test1.c again for test2.c etc > > Send all the output. > > > > test1.c > #include > > test2.c > #define _BSD_SOURCE > #include > > test3.c > #define _DEFAULT_SOURCE > #include > > test4.c > #define _GNU_SOURCE > #include > > > On Mar 2, 2021, at 7:33 AM, Enrico wrote: > > > > Hi, > > > > attached is the configuration and make log files. > > > > Enrico > > > > On 02/03/2021 14:13, Matthew Knepley wrote: > >> On Tue, Mar 2, 2021 at 7:49 AM Enrico degregori at dkrz.de>> wrote: > >> Hi, > >> I'm having some problems installing PETSC with Cray compiler. > >> I use this configuration: > >> ./configure --with-cc=cc --with-cxx=CC --with-fc=0 --with-debugging=1 > >> --with-shared-libraries=1 COPTFLAGS=-O0 CXXOPTFLAGS=-O0 > >> and when I do > >> make all > >> I get the following error because of cmathcalls.h: > >> CC-1043 craycc: ERROR File = /usr/include/bits/cmathcalls.h, Line = > 55 > >> _Complex can only be used with floating-point types. > >> __MATHCALL (cacos, (_Mdouble_complex_ __z)); > >> ^ > >> Am I doing something wrong? > >> This was expended from somewhere. Can you show the entire err log? > >> Thanks, > >> Matt > >> Regards, > >> Enrico Degregori > >> -- > >> What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > >> -- Norbert Wiener > >> https://www.cse.buffalo.edu/~knepley/ < > http://www.cse.buffalo.edu/~knepley/> > > > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From badi.hamid at gmail.com Mon Sep 13 06:18:05 2021 From: badi.hamid at gmail.com (Hamid) Date: Mon, 13 Sep 2021 13:18:05 +0200 Subject: [petsc-users] Configure slowness Message-ID: An HTML attachment was scrubbed... 
URL: From knepley at gmail.com Mon Sep 13 07:12:12 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 13 Sep 2021 08:12:12 -0400 Subject: [petsc-users] Configure slowness In-Reply-To: References: Message-ID: On Mon, Sep 13, 2021 at 7:18 AM Hamid wrote: > Hi, > > > > Do you know why configure is very very slow using cygwin/python ? It takes > more that 10min to finalize. > We believe this is due to a very slow filesystem. Almost all the time in configure is filesystem operations. NTFS is just very slow, and this probably accounts for the difference with the Linux runtime. One option is to build using WSL Thanks, Matt > > > Best regards. > > > > Envoy? ? partir de Courrier > pour Windows > > > > > Garanti > sans virus. www.avast.com > > <#m_7906049641795168604_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Sep 13 08:20:10 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 13 Sep 2021 08:20:10 -0500 (CDT) Subject: [petsc-users] Configure slowness In-Reply-To: References: Message-ID: <50fd3646-715-e395-9813-cd405ea74786@mcs.anl.gov> cygwin is slow. And configure does 100s of compiles and other sequential steps [and we use a win32fe compile wrapper that adds extra I/O,run-time cost for each compile] Satish On Mon, 13 Sep 2021, Matthew Knepley wrote: > On Mon, Sep 13, 2021 at 7:18 AM Hamid wrote: > > > Hi, > > > > > > > > Do you know why configure is very very slow using cygwin/python ? It takes > > more that 10min to finalize. > > > > We believe this is due to a very slow filesystem. Almost all the time in > configure is filesystem operations. > NTFS is just very slow, and this probably accounts for the difference with > the Linux runtime. > > One option is to build using WSL > > Thanks, > > Matt > > > > > > > > Best regards. > > > > > > > > Envoy? ? partir de Courrier > > pour Windows > > > > > > > > > > Garanti > > sans virus. www.avast.com > > > > <#m_7906049641795168604_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> > > > > > From aph at email.arizona.edu Mon Sep 13 14:34:40 2021 From: aph at email.arizona.edu (Anthony Paul Haas) Date: Mon, 13 Sep 2021 12:34:40 -0700 Subject: [petsc-users] MatZeroRows and full assembly Message-ID: Hello, Is it allowed after a MatZeroRows to insert more values in the row that was just zeroed with MatSetValues and then perform another full assembly of the matrix? Thanks, Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Sep 13 15:01:55 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 13 Sep 2021 16:01:55 -0400 Subject: [petsc-users] MatZeroRows and full assembly In-Reply-To: References: Message-ID: > On Sep 13, 2021, at 3:34 PM, Anthony Paul Haas wrote: > > Hello, > > Is it allowed after a MatZeroRows to insert more values in the row that was just zeroed with MatSetValues and then perform another full assembly of the matrix? Yes, if you are replacing previously zeroed values it will simply fill them in efficiently. If you are introducing new nonzero locations it will be inefficient in general because it has to allocate new space for the new locations. Barry > > Thanks, > > Anthony -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From junchao.zhang at gmail.com Mon Sep 13 15:04:04 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Mon, 13 Sep 2021 15:04:04 -0500 Subject: [petsc-users] MatZeroRows and full assembly In-Reply-To: References: Message-ID: >From https://petsc.org/release/docs/manualpages/Mat/MatSetOption.html, MAT_KEEP_NONZERO_PATTERN indicates when MatZeroRows () is called the zeroed entries are kept in the nonzero structure So, if you have this option true and you set to a previous location, then it is fine, otherwise you also need MAT_NEW_NONZERO_ALLOCATION_ERR to be false to do so. --Junchao Zhang On Mon, Sep 13, 2021 at 2:34 PM Anthony Paul Haas wrote: > Hello, > > Is it allowed after a MatZeroRows to insert more values in the row that > was just zeroed with MatSetValues and then perform another full assembly of > the matrix? > > Thanks, > > Anthony > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Sep 13 15:25:33 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 13 Sep 2021 16:25:33 -0400 Subject: [petsc-users] MatZeroRows and full assembly In-Reply-To: References: Message-ID: Sorry, my mistake. It is MatZeroRowsColumns() that ignores the MAT_KEEP_NONZERO_PATTERN option. Barry > On Sep 13, 2021, at 4:04 PM, Junchao Zhang wrote: > > From https://petsc.org/release/docs/manualpages/Mat/MatSetOption.html , > > MAT_KEEP_NONZERO_PATTERN indicates when MatZeroRows () is called the zeroed entries are kept in the nonzero structure > > So, if you have this option true and you set to a previous location, then it is fine, otherwise you also need MAT_NEW_NONZERO_ALLOCATION_ERR to be false to do so. > > --Junchao Zhang > > > On Mon, Sep 13, 2021 at 2:34 PM Anthony Paul Haas > wrote: > Hello, > > Is it allowed after a MatZeroRows to insert more values in the row that was just zeroed with MatSetValues and then perform another full assembly of the matrix? > > Thanks, > > Anthony -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Tue Sep 14 04:14:48 2021 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Tue, 14 Sep 2021 11:14:48 +0200 Subject: [petsc-users] DMView and DMLoad Message-ID: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> Dear PETSc-team, We are trying to save and load distributed DMPlex and its associated physical fields (created with DMCreateGlobalVector) (Uvelocity, VVelocity, ...) in HDF5_XDMF format. To achieve this, we do the following: 1) save in the same xdmf.h5 file: DMView( DM , H5_XDMF_Viewer ); VecView( UVelocity, H5_XDMF_Viewer ); 2) load the dm: DMPlexCreateFromfile(PETSC_COMM_WORLD, Filename, PETSC_TRUE, DM); 3) load the physical field: VecLoad( UVelocity, H5_XDMF_Viewer ); There are no errors in the execution, but the loaded DM is distributed differently to the original one, which results in the incorrect placement of the values of the physical fields (UVelocity etc.) in the domain. This approach is used to restart the simulation with the last saved DM. Is there something we are missing, or there exists alternative routes to this goal? Can we somehow get the IS of the redistribution, so we can re-distribute the vector data as well? Many thanks, best regards, Berend. 
From knepley at gmail.com Tue Sep 14 06:23:49 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 14 Sep 2021 07:23:49 -0400 Subject: [petsc-users] DMView and DMLoad In-Reply-To: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> References: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> Message-ID: On Tue, Sep 14, 2021 at 5:15 AM Berend van Wachem wrote: > Dear PETSc-team, > > We are trying to save and load distributed DMPlex and its associated > physical fields (created with DMCreateGlobalVector) (Uvelocity, > VVelocity, ...) in HDF5_XDMF format. To achieve this, we do the following: > > 1) save in the same xdmf.h5 file: > DMView( DM , H5_XDMF_Viewer ); > VecView( UVelocity, H5_XDMF_Viewer ); > > 2) load the dm: > DMPlexCreateFromfile(PETSC_COMM_WORLD, Filename, PETSC_TRUE, DM); > > 3) load the physical field: > VecLoad( UVelocity, H5_XDMF_Viewer ); > > There are no errors in the execution, but the loaded DM is distributed > differently to the original one, which results in the incorrect > placement of the values of the physical fields (UVelocity etc.) in the > domain. > > This approach is used to restart the simulation with the last saved DM. > Is there something we are missing, or there exists alternative routes to > this goal? Can we somehow get the IS of the redistribution, so we can > re-distribute the vector data as well? > > Many thanks, best regards, > Hi Berend, We are in the midst of rewriting this. We want to support saving multiple meshes, with fields attached to each, and preserving the discretization (section) information, and allowing us to load up on a different number of processes. We plan to be done by October. Vaclav and I are doing this in collaboration with Koki Sagiyama, David Ham, and Lawrence Mitchell from the Firedrake team. For this problem, we need to give hints for the distribution when you load the DM, as is done now with Vec. We have replaced the DMPlexCreateFromFile() with DMLoad() to better match the interface in the rest of PETSc. Hopefully the wait is not too big an inconvenience. Thanks, Matt > Berend. > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From facklerpw at ornl.gov Wed Sep 15 11:52:26 2021 From: facklerpw at ornl.gov (Fackler, Philip) Date: Wed, 15 Sep 2021 16:52:26 +0000 Subject: [petsc-users] [EXTERNAL] Re: Redirecting petsc output In-Reply-To: References: <782F63EE-1821-4607-8D80-77C543E0ACF4@petsc.dev> Message-ID: Just to follow up here, I figured out a different way to get what I really wanted, and that is described on this page: https://petsc.org/release/docs/manualpages/Sys/PetscVFPrintf.html Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: petsc-users on behalf of Fackler, Philip via petsc-users Sent: Wednesday, September 8, 2021 11:24 To: Barry Smith Cc: petsc-users at mcs.anl.gov ; xolotl-psi-development at lists.sourceforge.net Subject: Re: [petsc-users] [EXTERNAL] Re: Redirecting petsc output Barry, Thanks for the quick reply! I'll try those out. 
Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory ________________________________ From: Barry Smith Sent: Wednesday, September 8, 2021 11:16 To: Fackler, Philip Cc: xolotl-psi-development at lists.sourceforge.net ; petsc-users at mcs.anl.gov Subject: [EXTERNAL] Re: [petsc-users] Redirecting petsc output Philip, There a variety of techniques. Some of the command line options take an optional viewer name where the output can be redirected. For example -ts_monitor ascii:filename or -ts_view ascii:filename see https://petsc.org/release/docs/manualpages/Viewer/PetscOptionsGetViewer.html for more details It is also possible to change all stdout from PETSc to a different file by setting PETSC_STDOUT = fopen(...) Barry On Sep 8, 2021, at 10:59 AM, Fackler, Philip via petsc-users > wrote: Is there a way to customize how petsc writes information? Instead of writing to stdout (for example: 0 TS dt 0.1 time 0.), what if we want to log that message to a file other output from Xolotl? I'm assuming there are multiple ways of getting this result. What's common practice with petsc folks? Thanks, Philip Fackler Research Software Engineer, Application Engineering Group Advanced Computing Systems Research Section Computer Science and Mathematics Division Oak Ridge National Laboratory -------------- next part -------------- An HTML attachment was scrubbed... URL: From wence at gmx.li Fri Sep 17 08:46:58 2021 From: wence at gmx.li (Lawrence Mitchell) Date: Fri, 17 Sep 2021 14:46:58 +0100 Subject: [petsc-users] DMView and DMLoad In-Reply-To: References: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> Message-ID: Hi Berend, > On 14 Sep 2021, at 12:23, Matthew Knepley wrote: > > On Tue, Sep 14, 2021 at 5:15 AM Berend van Wachem wrote: > Dear PETSc-team, > > We are trying to save and load distributed DMPlex and its associated > physical fields (created with DMCreateGlobalVector) (Uvelocity, > VVelocity, ...) in HDF5_XDMF format. To achieve this, we do the following: > > 1) save in the same xdmf.h5 file: > DMView( DM , H5_XDMF_Viewer ); > VecView( UVelocity, H5_XDMF_Viewer ); > > 2) load the dm: > DMPlexCreateFromfile(PETSC_COMM_WORLD, Filename, PETSC_TRUE, DM); > > 3) load the physical field: > VecLoad( UVelocity, H5_XDMF_Viewer ); > > There are no errors in the execution, but the loaded DM is distributed > differently to the original one, which results in the incorrect > placement of the values of the physical fields (UVelocity etc.) in the > domain. > > This approach is used to restart the simulation with the last saved DM. > Is there something we are missing, or there exists alternative routes to > this goal? Can we somehow get the IS of the redistribution, so we can > re-distribute the vector data as well? > > Many thanks, best regards, > > Hi Berend, > > We are in the midst of rewriting this. We want to support saving multiple meshes, with fields attached to each, > and preserving the discretization (section) information, and allowing us to load up on a different number of > processes. We plan to be done by October. Vaclav and I are doing this in collaboration with Koki Sagiyama, > David Ham, and Lawrence Mitchell from the Firedrake team. The core load/save cycle functionality is now in PETSc main. So if you're using main rather than a release, you can get access to it now. 
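In outline, the basic cycle looks roughly like the following. This is a simplified sketch from memory, with error checking and the section/distribution handling omitted, so treat the manual section linked below as the authoritative reference; the file name is just illustrative and UVelocity stands for one of your field vectors.

  /* saving */
  PetscViewer viewer;
  PetscViewerHDF5Open(PETSC_COMM_WORLD, "dmplex.h5", FILE_MODE_WRITE, &viewer);
  DMView(dm, viewer);                                      /* topology and coordinates */
  PetscObjectSetName((PetscObject)UVelocity, "UVelocity");
  VecView(UVelocity, viewer);                              /* field data */
  PetscViewerDestroy(&viewer);

  /* loading, possibly on a different number of processes */
  DM  newdm;
  Vec newvec;
  PetscViewerHDF5Open(PETSC_COMM_WORLD, "dmplex.h5", FILE_MODE_READ, &viewer);
  DMCreate(PETSC_COMM_WORLD, &newdm);
  DMSetType(newdm, DMPLEX);
  DMLoad(newdm, viewer);                                   /* replaces DMPlexCreateFromFile() for this purpose */
  /* attach your discretisation (PetscSection/PetscFE) to newdm here before creating vectors */
  DMCreateGlobalVector(newdm, &newvec);
  PetscObjectSetName((PetscObject)newvec, "UVelocity");    /* name must match what was saved */
  VecLoad(newvec, viewer);
  PetscViewerDestroy(&viewer);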
This section of the manual shows an example of how to do things https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5 Let us know if things aren't clear! Thanks, Lawrence From samuelestes91 at gmail.com Fri Sep 17 11:21:35 2021 From: samuelestes91 at gmail.com (Samuel Estes) Date: Fri, 17 Sep 2021 11:21:35 -0500 Subject: [petsc-users] Solving two successive linear systems Message-ID: Hi, I have two related questions about the best way to use the KSP solver: First, I have an adaptive FEM code which solves the same linear system at each iteration until the grid is refined at which point, obviously, the size of the linear system changes. Currently, I just call: KSPSetOperators(ksp,A,A); KSPSetFromOptions(); KSPSolve(ksp,b,x); In a separate part of the code, I re-create the matrix and vectors and call KSPReset(ksp); whenever the grid is refined. Is this an optimal way to do things? In particular, does KSPSetFromOptions need to be called before each solve or can I just call it once somewhere else and then be done with it. Does it need to be called after each call to KSPReset? There is a section in the PETSc Manual about solving successive linear systems but it is rather terse so I'm just trying to get a sense of how to optimally code this. Second, one model in this code actually successively solves two linear systems of different sizes (one system is n x n and the other is 3*n x 3*n). I solve this by creating two matrices, two right hand sides, and two solution vectors for each system. I currently just use one KSP object which I reset after each use since the linear system changes size each time. Would it be more efficient to simply allocate a second ksp solver object so that I don't have to call KSPReset every time? I'm not sure how much memory a ksp object requires or how much computation I would save by using a second solver. Any ideas here? This part of the code is also adaptive. Thanks in advance for the help. I hope my questions are clear. If not, I'm happy to clarify. Sam -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Sep 17 11:37:04 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 17 Sep 2021 12:37:04 -0400 Subject: [petsc-users] Solving two successive linear systems In-Reply-To: References: Message-ID: <421D7542-769B-4763-9F6D-ABEC5C23C1B9@petsc.dev> > On Sep 17, 2021, at 12:21 PM, Samuel Estes wrote: > > Hi, > > I have two related questions about the best way to use the KSP solver: > > First, I have an adaptive FEM code which solves the same linear system at each iteration until the grid is refined at which point, obviously, the size of the linear system changes. Currently, I just call: > KSPSetOperators(ksp,A,A); > KSPSetFromOptions(); > KSPSolve(ksp,b,x); > In a separate part of the code, I re-create the matrix and vectors and call KSPReset(ksp); whenever the grid is refined. > Is this an optimal way to do things? In particular, does KSPSetFromOptions need to be called before each solve or can I just call it once somewhere else and then be done with it. Does it need to be called after each call to KSPReset? Yes, it is best to call the KSPSetFromOptions after each reset. You do not need to call it for each solve. > There is a section in the PETSc Manual about solving successive linear systems but it is rather terse so I'm just trying to get a sense of how to optimally code this. 
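In outline, the pattern I have in mind is the following (a sketch only, with error checking omitted, using your A, b, x):

   KSP ksp;
   KSPCreate(PETSC_COMM_WORLD, &ksp);
   /* after each grid refinement: rebuild A, b, x, then */
   KSPReset(ksp);
   KSPSetOperators(ksp, A, A);
   KSPSetFromOptions(ksp);
   /* every solve on the current grid reuses the same KSP (and its preconditioner setup) */
   KSPSolve(ksp, b, x);
   /* ... further KSPSolve() calls with updated right hand sides ... */
   KSPDestroy(&ksp);   /* once, at the very end */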
> > Second, one model in this code actually successively solves two linear systems of different sizes (one system is n x n and the other is 3*n x 3*n). I solve this by creating two matrices, two right hand sides, and two solution vectors for each system. I currently just use one KSP object which I reset after each use since the linear system changes size each time. Would it be more efficient to simply allocate a second ksp solver object so that I don't have to call KSPReset every time? I'm not sure how much memory a ksp object requires or how much computation I would save by using a second solver. Any ideas here? This part of the code is also adaptive. It is best to use two KSP. There is no advantage in reusing one. > > Thanks in advance for the help. I hope my questions are clear. If not, I'm happy to clarify. > > Sam > > From samuelestes91 at gmail.com Fri Sep 17 11:46:24 2021 From: samuelestes91 at gmail.com (Samuel Estes) Date: Fri, 17 Sep 2021 11:46:24 -0500 Subject: [petsc-users] Solving two successive linear systems In-Reply-To: <421D7542-769B-4763-9F6D-ABEC5C23C1B9@petsc.dev> References: <421D7542-769B-4763-9F6D-ABEC5C23C1B9@petsc.dev> Message-ID: Thanks for the help! On Fri, Sep 17, 2021 at 11:37 AM Barry Smith wrote: > > > > On Sep 17, 2021, at 12:21 PM, Samuel Estes > wrote: > > > > Hi, > > > > I have two related questions about the best way to use the KSP solver: > > > > First, I have an adaptive FEM code which solves the same linear system > at each iteration until the grid is refined at which point, obviously, the > size of the linear system changes. Currently, I just call: > > KSPSetOperators(ksp,A,A); > > KSPSetFromOptions(); > > KSPSolve(ksp,b,x); > > In a separate part of the code, I re-create the matrix and vectors and > call KSPReset(ksp); whenever the grid is refined. > > Is this an optimal way to do things? In particular, does > KSPSetFromOptions need to be called before each solve or can I just call it > once somewhere else and then be done with it. Does it need to be called > after each call to KSPReset? > > Yes, it is best to call the KSPSetFromOptions after each reset. You do > not need to call it for each solve. > > > There is a section in the PETSc Manual about solving successive linear > systems but it is rather terse so I'm just trying to get a sense of how to > optimally code this. > > > > Second, one model in this code actually successively solves two linear > systems of different sizes (one system is n x n and the other is 3*n x > 3*n). I solve this by creating two matrices, two right hand sides, and two > solution vectors for each system. I currently just use one KSP object which > I reset after each use since the linear system changes size each time. > Would it be more efficient to simply allocate a second ksp solver object so > that I don't have to call KSPReset every time? I'm not sure how much memory > a ksp object requires or how much computation I would save by using a > second solver. Any ideas here? This part of the code is also adaptive. > > It is best to use two KSP. There is no advantage in reusing one. > > > > Thanks in advance for the help. I hope my questions are clear. If not, > I'm happy to clarify. > > > > Sam > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Sep 19 08:29:18 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 19 Sep 2021 09:29:18 -0400 Subject: [petsc-users] Spock link error Message-ID: I am getting to see this error. 
It seems to be suggesting that I turn --no-allow-shlib-undefined off. Any ideas? Thanks, Mark 09:09 main= /gpfs/alpine/csc314/scratch/adams/petsc$ make PETSC_DIR=/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new PETSC_ARCH="" check Running check examples to verify correct installation Using PETSC_DIR=/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new and PETSC_ARCH= gmake[3]: [/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/petsc/conf/rules:301: ex19.PETSc] Error 2 (ignored) *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials ex19 ********************************************************************************* cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 -I/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/include -I/opt/rocm-4.2.0/include ex19.c -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -Wl,-rpath,/opt/rocm-4.2.0/lib -L/opt/rocm-4.2.0/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ 21.06.1.1/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/ 21.06.1.1/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/mpich/default/gtl/lib -L/opt/cray/pe/mpich/default/gtl/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.1.5/dsmml/lib -L/opt/cray/pe/dsmml/0.1.5/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.12/lib -L/opt/cray/pe/pmi/6.0.12/lib -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce/x86_64/lib -L/opt/cray/pe/cce/12.0.1/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 -L/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux -L/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -lpetsc -lparmetis -lmetis -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -lstdc++ -ldl -lmpifort_cray -lmpi_cray -lmpi_gtl_hsa -ldsmml -lpmi -lxpmem -lpgas-shmem -lquadmath -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -lstdc++ -ldl -o ex19 *ld.lld: error: /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: undefined reference to .omp_offloading.img_start.cray_amdgcn-amd-amdhsa 
[--no-allow-shlib-undefined]ld.lld: error: /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: undefined reference to .omp_offloading.img_size.cray_amdgcn-amd-amdhsa [--no-allow-shlib-undefined]ld.lld: error: /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: undefined reference to .omp_offloading.img_cache.cray_amdgcn-amd-amdhsa [--no-allow-shlib-undefined]* clang-12: error: linker command failed with exit code 1 (use -v to see invocation) gmake[4]: *** [: ex19] Error 1 *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials ex5f ********************************************************* ftn -fPIC -g -O2 -fPIC -g -O2 -I/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/include -I/opt/rocm-4.2.0/include ex5f.F90 -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -Wl,-rpath,/opt/rocm-4.2.0/lib -L/opt/rocm-4.2.0/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ 21.06.1.1/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/ 21.06.1.1/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/mpich/default/gtl/lib -L/opt/cray/pe/mpich/default/gtl/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.1.5/dsmml/lib -L/opt/cray/pe/dsmml/0.1.5/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.12/lib -L/opt/cray/pe/pmi/6.0.12/lib -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce/x86_64/lib -L/opt/cray/pe/cce/12.0.1/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 -L/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux -L/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -lpetsc -lparmetis -lmetis -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64 -lstdc++ -ldl -lmpifort_cray -lmpi_cray -lmpi_gtl_hsa -ldsmml -lpmi -lxpmem -lpgas-shmem -lquadmath -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -lstdc++ -ldl -o ex5f /opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: warning: alignment 128 of symbol `$host_init$$runtime_init_for_iso_c_binding$iso_c_binding_' in /opt/cray/pe/cce/12.0.1/cce/x86_64/lib/libmodules.so is smaller than 256 in /tmp/pe_39617/ex5f_1.o /opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: warning: alignment 64 of symbol `$data_init$iso_c_binding_' in /opt/cray/pe/cce/12.0.1/cce/x86_64/lib/libmodules.so is smaller than 
256 in /tmp/pe_39617/ex5f_1.o Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process Completed test examples 09:12 main= /gpfs/alpine/csc314/scratch/adams/petsc$ module list Currently Loaded Modules: 1) craype-x86-rome 4) perftools-base/21.05.0 7) cray-pmi-lib/6.0.12 10) cray-dsmml/0.1.5 13) PrgEnv-cray/8.1.0 16) rocm/4.2.0 19) autoconf/2.69 2) libfabric/1.11.0.4.75 5) xpmem/2.2.40-2.1_2.44__g3cf3325.shasta 8) cce/12.0.1 11) cray-mpich/8.1.7 14) DefApps/default 17) emacs/27.2 20) automake/1.16.3 3) craype-network-ofi 6) cray-pmi/6.0.12 9) craype/2.7.8 12) cray-libsci/21.06.1.1 15) craype-accel-amd-gfx908 18) zlib/1.2.11 21) libtool/2.4.6 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 113451 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1952129 bytes Desc: not available URL: From stefano.zampini at gmail.com Sun Sep 19 08:43:59 2021 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Sun, 19 Sep 2021 16:43:59 +0300 Subject: [petsc-users] Spock link error In-Reply-To: References: Message-ID: Are you following the user advices here https://docs.olcf.ornl.gov/systems/spock_quick_start_guide.html#compiling-with-the-cray-compiler-wrappers-cc-or-cc ? Il giorno dom 19 set 2021 alle ore 16:30 Mark Adams ha scritto: > I am getting to see this error. It seems to be suggesting that I turn > --no-allow-shlib-undefined off. > Any ideas? > Thanks, > Mark > > 09:09 main= /gpfs/alpine/csc314/scratch/adams/petsc$ make > PETSC_DIR=/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new > PETSC_ARCH="" check > Running check examples to verify correct installation > Using > PETSC_DIR=/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new > and PETSC_ARCH= > gmake[3]: > [/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/petsc/conf/rules:301: > ex19.PETSc] Error 2 (ignored) > *******************Error detected during compile or > link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials ex19 > > ********************************************************************************* > cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 -fPIC > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 > -I/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/include > -I/opt/rocm-4.2.0/include ex19.c > -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -Wl,-rpath,/opt/rocm-4.2.0/lib -L/opt/rocm-4.2.0/lib > -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 > -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ > 21.06.1.1/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/ > 21.06.1.1/CRAY/9.0/x86_64/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib > -L/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib > 
-Wl,-rpath,/opt/cray/pe/mpich/default/gtl/lib > -L/opt/cray/pe/mpich/default/gtl/lib > -Wl,-rpath,/opt/cray/pe/dsmml/0.1.5/dsmml/lib > -L/opt/cray/pe/dsmml/0.1.5/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.12/lib > -L/opt/cray/pe/pmi/6.0.12/lib > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce/x86_64/lib > -L/opt/cray/pe/cce/12.0.1/cce/x86_64/lib > -Wl,-rpath,/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 > -L/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux > -L/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux > -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 > -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib > -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib > -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib > -lpetsc -lparmetis -lmetis -lhipsparse -lhipblas -lrocsparse -lrocsolver > -lrocblas -lrocrand -lamdhip64 -lstdc++ -ldl -lmpifort_cray -lmpi_cray > -lmpi_gtl_hsa -ldsmml -lpmi -lxpmem -lpgas-shmem -lquadmath > -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup > -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 > -lclang_rt.builtins-x86_64 -lquadmath -lstdc++ -ldl -o ex19 > > > *ld.lld: error: > /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: > undefined reference to .omp_offloading.img_start.cray_amdgcn-amd-amdhsa > [--no-allow-shlib-undefined]ld.lld: error: > /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: > undefined reference to .omp_offloading.img_size.cray_amdgcn-amd-amdhsa > [--no-allow-shlib-undefined]ld.lld: error: > /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: > undefined reference to .omp_offloading.img_cache.cray_amdgcn-amd-amdhsa > [--no-allow-shlib-undefined]* > clang-12: error: linker command failed with exit code 1 (use -v to see > invocation) > gmake[4]: *** [: ex19] Error 1 > *******************Error detected during compile or > link!******************* > See http://www.mcs.anl.gov/petsc/documentation/faq.html > /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials ex5f > ********************************************************* > ftn -fPIC -g -O2 -fPIC -g -O2 > -I/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/include > -I/opt/rocm-4.2.0/include ex5f.F90 > -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib > -Wl,-rpath,/opt/rocm-4.2.0/lib -L/opt/rocm-4.2.0/lib > -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 > -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ > 21.06.1.1/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/ > 21.06.1.1/CRAY/9.0/x86_64/lib > -Wl,-rpath,/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib > -L/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib > -Wl,-rpath,/opt/cray/pe/mpich/default/gtl/lib > -L/opt/cray/pe/mpich/default/gtl/lib > -Wl,-rpath,/opt/cray/pe/dsmml/0.1.5/dsmml/lib > -L/opt/cray/pe/dsmml/0.1.5/dsmml/lib 
-Wl,-rpath,/opt/cray/pe/pmi/6.0.12/lib > -L/opt/cray/pe/pmi/6.0.12/lib > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce/x86_64/lib > -L/opt/cray/pe/cce/12.0.1/cce/x86_64/lib > -Wl,-rpath,/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 > -L/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux > -L/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux > -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 > -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib > -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib > -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib > -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib > -lpetsc -lparmetis -lmetis -lhipsparse -lhipblas -lrocsparse -lrocsolver > -lrocblas -lrocrand -lamdhip64 -lstdc++ -ldl -lmpifort_cray -lmpi_cray > -lmpi_gtl_hsa -ldsmml -lpmi -lxpmem -lpgas-shmem -lquadmath > -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup > -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 > -lclang_rt.builtins-x86_64 -lquadmath -lstdc++ -ldl -o ex5f > /opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: > warning: alignment 128 of symbol > `$host_init$$runtime_init_for_iso_c_binding$iso_c_binding_' in > /opt/cray/pe/cce/12.0.1/cce/x86_64/lib/libmodules.so is smaller than 256 in > /tmp/pe_39617/ex5f_1.o > /opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: > warning: alignment 64 of symbol `$data_init$iso_c_binding_' in > /opt/cray/pe/cce/12.0.1/cce/x86_64/lib/libmodules.so is smaller than 256 in > /tmp/pe_39617/ex5f_1.o > Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > Completed test examples > 09:12 main= /gpfs/alpine/csc314/scratch/adams/petsc$ module list > > Currently Loaded Modules: > 1) craype-x86-rome 4) perftools-base/21.05.0 > 7) cray-pmi-lib/6.0.12 10) cray-dsmml/0.1.5 13) PrgEnv-cray/8.1.0 > 16) rocm/4.2.0 19) autoconf/2.69 > 2) libfabric/1.11.0.4.75 5) xpmem/2.2.40-2.1_2.44__g3cf3325.shasta > 8) cce/12.0.1 11) cray-mpich/8.1.7 14) DefApps/default > 17) emacs/27.2 20) automake/1.16.3 > 3) craype-network-ofi 6) cray-pmi/6.0.12 > 9) craype/2.7.8 12) cray-libsci/21.06.1.1 15) > craype-accel-amd-gfx908 18) zlib/1.2.11 21) libtool/2.4.6 > > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Sun Sep 19 11:58:19 2021 From: mfadams at lbl.gov (Mark Adams) Date: Sun, 19 Sep 2021 12:58:19 -0400 Subject: [petsc-users] Spock link error In-Reply-To: References: Message-ID: Yes, I had the hsa lib commented out but that did not help (appended). I now see that I had this problem in July and Junchao was helping. I was able to fix it with PrgEnv-gnu. THe fortran test actually worked. Oh well, the application does their own linking so maybe that will fix it up. (They do use OMP). 
Thanks, Mark gmake[3]: [/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/petsc/conf/rules:301: ex19.PETSc] Error 2 (ignored) *******************Error detected during compile or link!******************* See http://www.mcs.anl.gov/petsc/documentation/faq.html /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials ex19 ********************************************************************************* cc *-L/opt/rocm-4.2.0/lib -lhsa-runtime64* -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 -I/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/include -I/opt/rocm-4.2.0/include ex19.c -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib -Wl,-rpath,/opt/rocm-4.2.0/lib -L/opt/rocm-4.2.0/lib -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ 21.06.1.1/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/ 21.06.1.1/CRAY/9.0/x86_64/lib -Wl,-rpath,/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib -L/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib -Wl,-rpath,/opt/cray/pe/mpich/default/gtl/lib -L/opt/cray/pe/mpich/default/gtl/lib -Wl,-rpath,/opt/cray/pe/dsmml/0.1.5/dsmml/lib -L/opt/cray/pe/dsmml/0.1.5/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.12/lib -L/opt/cray/pe/pmi/6.0.12/lib -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce/x86_64/lib -L/opt/cray/pe/cce/12.0.1/cce/x86_64/lib -Wl,-rpath,/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 -L/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux -L/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib -lpetsc -lparmetis -lmetis -lhipsparse -lhipblas -lrocsparse -lrocsolver -lrocblas -lrocrand -lamdhip64* -lhsa-runtime64 *-lstdc++ -ldl -lmpifort_cray -lmpi_cray -lmpi_gtl_hsa -ldsmml -lpmi -lxpmem -lpgas-shmem -lquadmath -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 -lclang_rt.builtins-x86_64 -lquadmath -lstdc++ -ldl -o ex19 ld.lld: error: /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: undefined reference to .omp_offloading.img_start.cray_amdgcn-amd-amdhsa [--no-allow-shlib-undefined] ld.lld: error: /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: undefined reference to .omp_offloading.img_size.cray_amdgcn-amd-amdhsa [--no-allow-shlib-undefined] ld.lld: error: /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: undefined reference to 
.omp_offloading.img_cache.cray_amdgcn-amd-amdhsa [--no-allow-shlib-undefined] clang-12: error: linker command failed with exit code 1 (use -v to see invocation) gmake[4]: *** [: ex19] Error 1 On Sun, Sep 19, 2021 at 9:44 AM Stefano Zampini wrote: > Are you following the user advices here > https://docs.olcf.ornl.gov/systems/spock_quick_start_guide.html#compiling-with-the-cray-compiler-wrappers-cc-or-cc > ? > > Il giorno dom 19 set 2021 alle ore 16:30 Mark Adams ha > scritto: > >> I am getting to see this error. It seems to be suggesting that I turn >> --no-allow-shlib-undefined off. >> Any ideas? >> Thanks, >> Mark >> >> 09:09 main= /gpfs/alpine/csc314/scratch/adams/petsc$ make >> PETSC_DIR=/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new >> PETSC_ARCH="" check >> Running check examples to verify correct installation >> Using >> PETSC_DIR=/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new >> and PETSC_ARCH= >> gmake[3]: >> [/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/petsc/conf/rules:301: >> ex19.PETSc] Error 2 (ignored) >> *******************Error detected during compile or >> link!******************* >> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials ex19 >> >> ********************************************************************************* >> cc -fPIC -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas >> -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 -fPIC >> -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas >> -fstack-protector -Qunused-arguments -fvisibility=hidden -g -O2 >> -I/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/include >> -I/opt/rocm-4.2.0/include ex19.c >> -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -Wl,-rpath,/opt/rocm-4.2.0/lib -L/opt/rocm-4.2.0/lib >> -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 >> -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ >> 21.06.1.1/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/ >> 21.06.1.1/CRAY/9.0/x86_64/lib >> -Wl,-rpath,/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib >> -L/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib >> -Wl,-rpath,/opt/cray/pe/mpich/default/gtl/lib >> -L/opt/cray/pe/mpich/default/gtl/lib >> -Wl,-rpath,/opt/cray/pe/dsmml/0.1.5/dsmml/lib >> -L/opt/cray/pe/dsmml/0.1.5/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.12/lib >> -L/opt/cray/pe/pmi/6.0.12/lib >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce/x86_64/lib >> -L/opt/cray/pe/cce/12.0.1/cce/x86_64/lib >> -Wl,-rpath,/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 >> -L/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux >> -L/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux >> -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 >> -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib >> -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib >> 
-L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib >> -lpetsc -lparmetis -lmetis -lhipsparse -lhipblas -lrocsparse -lrocsolver >> -lrocblas -lrocrand -lamdhip64 -lstdc++ -ldl -lmpifort_cray -lmpi_cray >> -lmpi_gtl_hsa -ldsmml -lpmi -lxpmem -lpgas-shmem -lquadmath >> -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup >> -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 >> -lclang_rt.builtins-x86_64 -lquadmath -lstdc++ -ldl -o ex19 >> >> >> *ld.lld: error: >> /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: >> undefined reference to .omp_offloading.img_start.cray_amdgcn-amd-amdhsa >> [--no-allow-shlib-undefined]ld.lld: error: >> /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: >> undefined reference to .omp_offloading.img_size.cray_amdgcn-amd-amdhsa >> [--no-allow-shlib-undefined]ld.lld: error: >> /gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib/libpetsc.so: >> undefined reference to .omp_offloading.img_cache.cray_amdgcn-amd-amdhsa >> [--no-allow-shlib-undefined]* >> clang-12: error: linker command failed with exit code 1 (use -v to see >> invocation) >> gmake[4]: *** [: ex19] Error 1 >> *******************Error detected during compile or >> link!******************* >> See http://www.mcs.anl.gov/petsc/documentation/faq.html >> /gpfs/alpine/csc314/scratch/adams/petsc/src/snes/tutorials ex5f >> ********************************************************* >> ftn -fPIC -g -O2 -fPIC -g -O2 >> -I/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/include >> -I/opt/rocm-4.2.0/include ex5f.F90 >> -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -Wl,-rpath,/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -L/gpfs/alpine/phy122/proj-shared/spock/petsc/current/arch-opt-cray-new/lib >> -Wl,-rpath,/opt/rocm-4.2.0/lib -L/opt/rocm-4.2.0/lib >> -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib64 >> -L/opt/cray/pe/gcc/8.1.0/snos/lib64 -Wl,-rpath,/opt/cray/pe/libsci/ >> 21.06.1.1/CRAY/9.0/x86_64/lib -L/opt/cray/pe/libsci/ >> 21.06.1.1/CRAY/9.0/x86_64/lib >> -Wl,-rpath,/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib >> -L/opt/cray/pe/mpich/8.1.7/ofi/cray/10.0/lib >> -Wl,-rpath,/opt/cray/pe/mpich/default/gtl/lib >> -L/opt/cray/pe/mpich/default/gtl/lib >> -Wl,-rpath,/opt/cray/pe/dsmml/0.1.5/dsmml/lib >> -L/opt/cray/pe/dsmml/0.1.5/dsmml/lib -Wl,-rpath,/opt/cray/pe/pmi/6.0.12/lib >> -L/opt/cray/pe/pmi/6.0.12/lib >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce/x86_64/lib >> -L/opt/cray/pe/cce/12.0.1/cce/x86_64/lib >> -Wl,-rpath,/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 >> -L/opt/cray/xpmem/2.2.40-2.1_2.44__g3cf3325.shasta/lib64 >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux >> -L/opt/cray/pe/cce/12.0.1/cce-clang/x86_64/lib/clang/12.0.0/lib/linux >> -Wl,-rpath,/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 >> -L/opt/cray/pe/gcc/8.1.0/snos/lib/gcc/x86_64-suse-linux/8.1.0 >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib >> -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-unknown-linux-gnu/lib >> -Wl,-rpath,/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib >> -L/opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/..//x86_64-unknown-linux-gnu/lib >> -lpetsc -lparmetis 
-lmetis -lhipsparse -lhipblas -lrocsparse -lrocsolver >> -lrocblas -lrocrand -lamdhip64 -lstdc++ -ldl -lmpifort_cray -lmpi_cray >> -lmpi_gtl_hsa -ldsmml -lpmi -lxpmem -lpgas-shmem -lquadmath >> -lcrayacc_amdgpu -lopenacc -lmodules -lfi -lcraymath -lf -lu -lcsup >> -lgfortran -lpthread -lgcc_eh -lm -lclang_rt.craypgo-x86_64 >> -lclang_rt.builtins-x86_64 -lquadmath -lstdc++ -ldl -o ex5f >> /opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: >> warning: alignment 128 of symbol >> `$host_init$$runtime_init_for_iso_c_binding$iso_c_binding_' in >> /opt/cray/pe/cce/12.0.1/cce/x86_64/lib/libmodules.so is smaller than 256 in >> /tmp/pe_39617/ex5f_1.o >> /opt/cray/pe/cce/12.0.1/binutils/x86_64/x86_64-pc-linux-gnu/bin/ld: >> warning: alignment 64 of symbol `$data_init$iso_c_binding_' in >> /opt/cray/pe/cce/12.0.1/cce/x86_64/lib/libmodules.so is smaller than 256 in >> /tmp/pe_39617/ex5f_1.o >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI >> process >> Completed test examples >> 09:12 main= /gpfs/alpine/csc314/scratch/adams/petsc$ module list >> >> Currently Loaded Modules: >> 1) craype-x86-rome 4) perftools-base/21.05.0 >> 7) cray-pmi-lib/6.0.12 10) cray-dsmml/0.1.5 13) PrgEnv-cray/8.1.0 >> 16) rocm/4.2.0 19) autoconf/2.69 >> 2) libfabric/1.11.0.4.75 5) xpmem/2.2.40-2.1_2.44__g3cf3325.shasta >> 8) cce/12.0.1 11) cray-mpich/8.1.7 14) DefApps/default >> 17) emacs/27.2 20) automake/1.16.3 >> 3) craype-network-ofi 6) cray-pmi/6.0.12 >> 9) craype/2.7.8 12) cray-libsci/21.06.1.1 15) >> craype-accel-amd-gfx908 18) zlib/1.2.11 21) libtool/2.4.6 >> >> >> > > -- > Stefano > -------------- next part -------------- An HTML attachment was scrubbed... URL: From carlos.v.hd1 at gmail.com Sun Sep 19 14:20:57 2021 From: carlos.v.hd1 at gmail.com (Carlos Velazquez) Date: Sun, 19 Sep 2021 14:20:57 -0500 Subject: [petsc-users] How to get specific ordering Message-ID: Hi there. I have a question whether I can get the DAG values in DMPlex in a different order. I am working on a code that uses the same node ordering to form the elements found in the mesh file. Example: If I use the file "doublet-tet.msh" that is inside the DMPlex mesh files folder, this file tells us that the elements are formed as follows: 1 - 2 4 3 1 2 - 2 3 4 5 Element 1 by nodes 2, 4, 3, 1 Element 2 by nodes 2, 3, 4, 5 So what I'm doing to get this through DMPlex is getting the transitive closure with DMPlexGetTransitiveClosure and getting the points from level 0, which is the node level, but the ordering is different. Example: With DMPlexGetTransitiveClosure I obtain that the elements are formed as follows: 0 - 5 3 4 2 1 - 4 3 5 6 Element 1 by nodes 5, 3, 4, 2 Element 2 by nodes 4, 3, 5, 6 But comparing this ordering with the previous one in the coordinate matrix I can see that the order is not equivalent. I would like to know if there is a way to modify the ordering of the graph to obtain the same ordering that is in the mesh file for the nodes that make up the elements or even if there is some way to configure it for a specific desired order. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From wence at gmx.li Mon Sep 20 08:38:29 2021 From: wence at gmx.li (Lawrence Mitchell) Date: Mon, 20 Sep 2021 14:38:29 +0100 Subject: [petsc-users] DMView and DMLoad In-Reply-To: <20210920130810.Horde.MErFrSPts47GDBNOjJQvNHt@webmailer.ovgu.de> References: <20210920130810.Horde.MErFrSPts47GDBNOjJQvNHt@webmailer.ovgu.de> Message-ID: <4BEA1166-E3AD-45E8-A758-D76767669725@gmx.li> Dear Sergio, (Added petsc-users back to cc), > On 20 Sep 2021, at 14:08, sergio.bengoechea at ovgu.de wrote: > > Dear Lawrence, > > thanks for the HDF5 saving and loading example. > > In the documentation you sent (https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5) the link to a more comprehensive example (DMPlex Tutorial ex12) is not working. Hmm, I'm not in charge of how the documentation gets built. @Patrick, if I look here: https://petsc.org/main/src/dm/impls/plex/tutorials/, I don't see ex12.c (even though it is here https://gitlab.com/petsc/petsc/-/tree/main/src/dm/impls/plex/tutorials) The relevant link in the docs goes to https://petsc.org/main/docs/src/dm/impls/plex/tutorials/ex12.c.html (which is 404), I suppose it should be main/src/... (not main/docs/src...) > I am afraid I would need to see the whole example to make our case work. If I could have access the completed source code of that example would be of a great help. You can find the relevant example in the PETSc source tree https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tutorials/ex12.c Lawrence From knepley at gmail.com Mon Sep 20 08:43:40 2021 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 20 Sep 2021 09:43:40 -0400 Subject: [petsc-users] How to get specific ordering In-Reply-To: References: Message-ID: On Sun, Sep 19, 2021 at 3:27 PM Carlos Velazquez wrote: > Hi there. I have a question whether I can get the DAG values in DMPlex in > a different order. > > I am working on a code that uses the same node ordering to form the > elements found in the mesh file. Example: > > If I use the file "doublet-tet.msh" that is inside the DMPlex mesh files > folder, this file tells us that the elements are formed as follows: > > 1 - 2 4 3 1 > 2 - 2 3 4 5 > Element 1 by nodes 2, 4, 3, 1 > Element 2 by nodes 2, 3, 4, 5 > > So what I'm doing to get this through DMPlex is getting the transitive > closure with DMPlexGetTransitiveClosure and getting the points from level > 0, which is the node level, but the ordering is different. Example: > > With DMPlexGetTransitiveClosure I obtain that the elements are formed as > follows: > > 0 - 5 3 4 2 > 1 - 4 3 5 6 > Element 1 by nodes 5, 3, 4, 2 > Element 2 by nodes 4, 3, 5, 6 > > But comparing this ordering with the previous one in the coordinate matrix > I can see that the order is not equivalent. > > I would like to know if there is a way to modify the ordering of the graph > to obtain the same ordering that is in the mesh file for the nodes that > make up the elements or even if there is some way to configure it for a > specific desired order. > The problem is that GMsh orients tetrahedra differently than Plex. We like outward normals, whereas the GMsh convention has the normal for the first face pointing inward. Thus, when we read in a GMsh tet, we flip the first two vertices. So 0 - 5 3 4 2 but if we number from 1 instead of 0, and number cells and vertices separately, 1 - 4 2 3 1 which if you flip vertices 1 and 2 is 1 - 2 4 3 1 which is what you read in. 
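A rough sketch of the kind of loop being discussed (not code from either message; the helper name is made up, and dm is assumed to be an interpolated DMPlex read from the GMsh file, e.g. with DMPlexCreateFromFile()):

#include <petscdmplex.h>

/* List the depth-0 points (vertices) in the closure of each cell.  The raw
   point numbers printed here are the ones quoted above; subtracting vStart
   and adding 1 gives the per-vertex numbering used in the reply. */
static PetscErrorCode PrintCellClosureVertices(DM dm)
{
  PetscInt       cStart, cEnd, vStart, vEnd, c;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);CHKERRQ(ierr); /* cells    */
  ierr = DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd);CHKERRQ(ierr);  /* vertices */
  for (c = cStart; c < cEnd; ++c) {
    PetscInt *closure = NULL, npoints, p;

    ierr = DMPlexGetTransitiveClosure(dm, c, PETSC_TRUE, &npoints, &closure);CHKERRQ(ierr);
    ierr = PetscPrintf(PETSC_COMM_SELF, "cell %D -", c);CHKERRQ(ierr);
    for (p = 0; p < 2*npoints; p += 2) {               /* closure stores (point, orientation) pairs */
      if (closure[p] >= vStart && closure[p] < vEnd) { /* keep only the vertex points               */
        ierr = PetscPrintf(PETSC_COMM_SELF, " %D", closure[p]);CHKERRQ(ierr);
      }
    }
    ierr = PetscPrintf(PETSC_COMM_SELF, "\n");CHKERRQ(ierr);
    ierr = DMPlexRestoreTransitiveClosure(dm, c, PETSC_TRUE, &npoints, &closure);CHKERRQ(ierr);
  }
  PetscFunctionReturn(0);
}

For a tetrahedron read from GMsh, swapping the first two vertices printed by this loop recovers the ordering in the .msh file, per the convention described above.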
Thanks, Matt -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Mon Sep 20 08:43:49 2021 From: pierre at joliv.et (Pierre Jolivet) Date: Mon, 20 Sep 2021 15:43:49 +0200 Subject: [petsc-users] DMView and DMLoad In-Reply-To: <4BEA1166-E3AD-45E8-A758-D76767669725@gmx.li> References: <20210920130810.Horde.MErFrSPts47GDBNOjJQvNHt@webmailer.ovgu.de> <4BEA1166-E3AD-45E8-A758-D76767669725@gmx.li> Message-ID: > On 20 Sep 2021, at 3:38 PM, Lawrence Mitchell wrote: > > Dear Sergio, > > (Added petsc-users back to cc), > >> On 20 Sep 2021, at 14:08, sergio.bengoechea at ovgu.de wrote: >> >> Dear Lawrence, >> >> thanks for the HDF5 saving and loading example. >> >> In the documentation you sent (https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5) the link to a more comprehensive example (DMPlex Tutorial ex12) is not working. > > Hmm, I'm not in charge of how the documentation gets built. > > @Patrick, if I look here: https://petsc.org/main/src/dm/impls/plex/tutorials/, I don't see ex12.c (even though it is here https://gitlab.com/petsc/petsc/-/tree/main/src/dm/impls/plex/tutorials) It needs to be added to EXAMPLESC in src/dm/impls/plex/tutorials/makefile By the way, none of the ?Actual source code: XYZ.c? are working anymore (not specific to DMPlex and/or tutorials), e.g., https://petsc.org/main/src/dm/impls/plex/tutorials/ex1.c.html Actual source code: ex1.c redirects to the same eye-candy/filtered .html instead of the raw/unfiltered .c Thanks, Pierre > The relevant link in the docs goes to https://petsc.org/main/docs/src/dm/impls/plex/tutorials/ex12.c.html (which is 404), I suppose it should be main/src/... (not main/docs/src...) > >> I am afraid I would need to see the whole example to make our case work. If I could have access the completed source code of that example would be of a great help. > > You can find the relevant example in the PETSc source tree https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tutorials/ex12.c > > Lawrence -------------- next part -------------- An HTML attachment was scrubbed... URL: From patrick.sanan at gmail.com Mon Sep 20 10:51:33 2021 From: patrick.sanan at gmail.com (Patrick Sanan) Date: Mon, 20 Sep 2021 17:51:33 +0200 Subject: [petsc-users] DMView and DMLoad In-Reply-To: <4BEA1166-E3AD-45E8-A758-D76767669725@gmx.li> References: <20210920130810.Horde.MErFrSPts47GDBNOjJQvNHt@webmailer.ovgu.de> <4BEA1166-E3AD-45E8-A758-D76767669725@gmx.li> Message-ID: Thanks for reporting that! There are still some things in the "classic" docs build that we could make more robust but it's not clear if we should devote the energy to that or to continuing to replace those processes with new ones which are better integrated with the Sphinx build. MR that should hopefully at least partially fix this particular issue: https://gitlab.com/petsc/petsc/-/merge_requests/4331 General issue: https://gitlab.com/petsc/petsc/-/issues/279 Am Mo., 20. Sept. 2021 um 15:38 Uhr schrieb Lawrence Mitchell : > Dear Sergio, > > (Added petsc-users back to cc), > > > On 20 Sep 2021, at 14:08, sergio.bengoechea at ovgu.de wrote: > > > > Dear Lawrence, > > > > thanks for the HDF5 saving and loading example. 
> > > > In the documentation you sent ( > https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5) > the link to a more comprehensive example (DMPlex Tutorial ex12) is not > working. > > Hmm, I'm not in charge of how the documentation gets built. > > @Patrick, if I look here: > https://petsc.org/main/src/dm/impls/plex/tutorials/, I don't see ex12.c > (even though it is here > https://gitlab.com/petsc/petsc/-/tree/main/src/dm/impls/plex/tutorials) > > The relevant link in the docs goes to > https://petsc.org/main/docs/src/dm/impls/plex/tutorials/ex12.c.html > (which is 404), I suppose it should be main/src/... (not main/docs/src...) > > > I am afraid I would need to see the whole example to make our case work. > If I could have access the completed source code of that example would be > of a great help. > > You can find the relevant example in the PETSc source tree > https://gitlab.com/petsc/petsc/-/blob/main/src/dm/impls/plex/tutorials/ex12.c > > Lawrence -------------- next part -------------- An HTML attachment was scrubbed... URL: From varunhiremath at gmail.com Mon Sep 20 17:23:09 2021 From: varunhiremath at gmail.com (Varun Hiremath) Date: Mon, 20 Sep 2021 15:23:09 -0700 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> Message-ID: Hi Jose, Sorry, it took me a while to test these settings in the new builds. I am getting good improvement in performance using the preconditioned solvers, so thanks for the suggestions! But I have some questions related to the usage. We are using SLEPc to solve the acoustic modal eigenvalue problem. Attached is a simple standalone program that computes acoustic modes in a simple rectangular box. This program illustrates the general setup I am using, though here the shell matrix and the preconditioner matrix are the same, while in my actual program the shell matrix computes A*x without explicitly forming A, and the preconditioner is a 0th order approximation of A. In the attached program I have tested both 1) the Krylov-Schur with inexact shift-and-invert (implemented under the option sinvert); 2) the JD solver with preconditioner (implemented under the option usejd) Both the solvers seem to work decently, compared to no preconditioning. This is how I run the two solvers (for a mesh size of 1600x400): $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 -eps_target 0 Both finish in about ~10 minutes on my system in serial. JD seems to be slightly faster and more accurate (for the imaginary part of eigenvalue). The program also runs in parallel using mpiexec. I use complex builds, as in my main program the matrix can be complex. Now here are my questions: 1) For this particular problem type, could you please check if these are the best settings that one could use? I have tried different combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI seems to work the best in serial and parallel. 2) When I tested these settings in my main program, for some reason the JD solver was not converging. After further testing, I found the issue was related to the setting of "-eps_target 0". I have included " EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to passing "-eps_target 0" from the command line, but that doesn't seem to be the case. 
For instance, if I run the attached program without "-eps_target 0" in the command line then it doesn't converge. $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 the above finishes in about 10 minutes $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 the above doesn't converge even though "EPSSetTarget(eps,0.0);" is included in the code This only seems to affect the JD solver, not the Krylov shift-and-invert (-sinvert 1) option. So is there any difference between passing "-eps_target 0" from the command line vs using "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line arguments in my actual program, so need to set everything internally. 3) Also, another minor related issue. While using the inexact shift-and-invert option, I was running into the following error: "" Missing or incorrect user input Shift-and-invert requires a target 'which' (see EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 -eps_target_magnitude "" I already have the below two lines in the code: EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); EPSSetTarget(eps,0.0); so shouldn't these be enough? If I comment out the first line "EPSSetWhichEigenpairs", then the code works fine. I have some more questions regarding setting the preconditioner for a quadratic eigenvalue problem, which I will ask in a follow-up email. Thanks for your help! -Varun On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath wrote: > Thank you very much for these suggestions! We are currently using version > 3.12, so I'll try to update to the latest version and try your suggestions. > Let me get back to you, thanks! > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman wrote: > >> Then I would try Davidson methods https://doi.org/10.1145/2543696 >> You can also try Krylov-Schur with "inexact" shift-and-invert, for >> instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the >> users manual. >> >> In both cases, you have to pass matrix A in the call to EPSSetOperators() >> and the preconditioner matrix via STSetPreconditionerMat() - note this >> function was introduced in version 3.15. >> >> Jose >> >> >> >> > El 1 jul 2021, a las 13:36, Varun Hiremath >> escribi?: >> > >> > Thanks. I actually do have a 1st order approximation of matrix A, that >> I can explicitly compute and also invert. Can I use that matrix as >> preconditioner to speed things up? Is there some example that explains how >> to setup and call SLEPc for this scenario? >> > >> > On Thu, Jul 1, 2021, 4:29 AM Jose E. Roman wrote: >> > For smallest real parts one could adapt ex34.c, but it is going to be >> costly >> https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html >> > Also, if eigenvalues are clustered around the origin, convergence may >> still be very slow. >> > >> > It is a tough problem, unless you are able to compute a good >> preconditioner of A (no need to compute the exact inverse). >> > >> > Jose >> > >> > >> > > El 1 jul 2021, a las 13:23, Varun Hiremath >> escribi?: >> > > >> > > I'm solving for the smallest eigenvalues in magnitude. Though is it >> cheaper to solve smallest in real part, as that might also work in my case? >> Thanks for your help. >> > > >> > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman >> wrote: >> > > Smallest eigenvalue in magnitude or real part? >> > > >> > > >> > > > El 1 jul 2021, a las 11:58, Varun Hiremath >> escribi?: >> > > > >> > > > Sorry, no both A and B are general sparse matrices (non-hermitian). >> So is there anything else I could try? 
>> > > > >> > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman >> wrote: >> > > > Is the problem symmetric (GHEP)? In that case, you can try LOBPCG >> on the pair (A,B). But this will likely be slow as well, unless you can >> provide a good preconditioner. >> > > > >> > > > Jose >> > > > >> > > > >> > > > > El 1 jul 2021, a las 11:37, Varun Hiremath < >> varunhiremath at gmail.com> escribi?: >> > > > > >> > > > > Hi All, >> > > > > >> > > > > I am trying to compute the smallest eigenvalues of a generalized >> system A*x= lambda*B*x. I don't explicitly know the matrix A (so I am using >> a shell matrix with a custom matmult function) however, the matrix B is >> explicitly known so I compute inv(B)*A within the shell matrix and solve >> inv(B)*A*x = lambda*x. >> > > > > >> > > > > To compute the smallest eigenvalues it is recommended to solve >> the inverted system, but since matrix A is not explicitly known I can't >> invert the system. Moreover, the size of the system can be really big, and >> with the default Krylov solver, it is extremely slow. So is there a better >> way for me to compute the smallest eigenvalues of this system? >> > > > > >> > > > > Thanks, >> > > > > Varun >> > > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: acoustic_box_test.cpp Type: application/octet-stream Size: 7367 bytes Desc: not available URL: From lihaolin at stu.xjtu.edu.cn Mon Sep 20 21:49:07 2021 From: lihaolin at stu.xjtu.edu.cn (=?UTF-8?B?5p2O5piK6ZyW?=) Date: Tue, 21 Sep 2021 10:49:07 +0800 (GMT+08:00) Subject: [petsc-users] Uses of VecGetArrayF90() and VecGetArrayReadF90() in Recent versions of GNU Fortran. Message-ID: <51cf3661.2408a.17c0641cb69.Coremail.lihaolin@stu.xjtu.edu.cn> Dear all, I used PETSc in my full Fortran codes and it worked well when my codes were compiled by GNU Fortran (GCC) 4.8.4. But for some reasons, I had to update the GNU Fortran (GCC) to version 10.0.1. Then I reinstalled the MPICH and PETSc with the newer complier and compiled my codes successfully. However, I got the following error massage: Index '1' of dimension 1 of array 'xx' above upper bound of 0. where xx is the Fortran pointer obtained by calling VecGetArrayF90(vec,xx,ierr). The vector was built successfully, but it seemed that the Fortran pointer xx was not built. I got the same error massage when using VecGetArrayReadF90(). So, are the VecGetArrayF90() and VecGetArrayReadF90() not compatible with the recent versions of GNU Fortran? Or is there any other way to access the vectors? For solving a linear problem Ab=x, I use VecGetArrayF90() to get the Fortran pointer to update b and use VecGetArrayReadF90() to get the values of x. After some tests, I found that VecGetArrayF90() could be replaced by VecSetValues(), but VecGetValues() used to replace VecGetArrayReadF90() could not be run with multiple threads. I look forward to your reply and thank you for any suggestions. Best regards, Haolin Li -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Sep 20 21:27:51 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 20 Sep 2021 21:27:51 -0500 (CDT) Subject: [petsc-users] Uses of VecGetArrayF90() and VecGetArrayReadF90() in Recent versions of GNU Fortran. 
In-Reply-To: <51cf3661.2408a.17c0641cb69.Coremail.lihaolin@stu.xjtu.edu.cn> References: <51cf3661.2408a.17c0641cb69.Coremail.lihaolin@stu.xjtu.edu.cn> Message-ID: <6c60abda-b135-ca8b-1c13-88c03469b0f7@mcs.anl.gov> VecGetArrayF90 should work with newer gfortran versions. https://petsc.org/release/docs/manualpages/Vec/VecGetArrayF90.html Check the examples listed above to see if usage in your code is different. [run them with your build of petsc/compilers to verify] And make sure you are using the latest version of PETSc. If you still have issues - send us a reproducible example. Satish On Tue, 21 Sep 2021, ??? wrote: > Dear all, > > I used PETSc in my full Fortran codes and it worked well when my codes were compiled by GNU Fortran (GCC) 4.8.4. But for some reasons, I had to update the GNU Fortran (GCC) to version 10.0.1. Then I reinstalled the MPICH and PETSc with the newer complier and compiled my codes successfully. However, I got the following error massage: > > Index '1' of dimension 1 of array 'xx' above upper bound of 0. > > where xx is the Fortran pointer obtained by calling VecGetArrayF90(vec,xx,ierr). > > The vector was built successfully, but it seemed that the Fortran pointer xx was not built. I got the same error massage when using VecGetArrayReadF90(). So, are the VecGetArrayF90() and VecGetArrayReadF90() not compatible with the recent versions of GNU Fortran? Or is there any other way to access the vectors? For solving a linear problem Ab=x, I use VecGetArrayF90() to get the Fortran pointer to update b and use VecGetArrayReadF90() to get the values of x. After some tests, I found that VecGetArrayF90() could be replaced by VecSetValues(), but VecGetValues() used to replace VecGetArrayReadF90() could not be run with multiple threads. > > I look forward to your reply and thank you for any suggestions. > > Best regards, > > Haolin Li From daniel.stone at opengosim.com Tue Sep 21 05:19:37 2021 From: daniel.stone at opengosim.com (Daniel Stone) Date: Tue, 21 Sep 2021 11:19:37 +0100 Subject: [petsc-users] Is this a bug? MatMultAdd_SeqBAIJ_11 Message-ID: Hello, If we look at lines 2330-2331 in file baij2.c, it looks like there are some mistakes in assigning the `sum..` variables to the z array, causing the function MatMultAdd_SeqBAIJ_11() to not produce the correct answer. I don't have a good example program to demonstrate this yet - it's currently causing problems in a dev branch of pflotan_ogs that can produce blocksize 11 matrices. When in parallel, a standard matrix-vector multiplication calls MatMultAdd for the off-proc contributions, and the result is wrong when this is redirected to MatMultAdd_SeqBAIJ_11. Seems to be the root cause of several solvers failing such as fgmres. Can anyone confirm that these two lines seem incorrect? Thanks, Daniel -------------- next part -------------- An HTML attachment was scrubbed... URL: From wence at gmx.li Tue Sep 21 05:50:01 2021 From: wence at gmx.li (Lawrence Mitchell) Date: Tue, 21 Sep 2021 11:50:01 +0100 Subject: [petsc-users] Is this a bug? MatMultAdd_SeqBAIJ_11 In-Reply-To: References: Message-ID: Hi Daniel, > On 21 Sep 2021, at 11:19, Daniel Stone wrote: > > Hello, > > If we look at lines 2330-2331 in file baij2.c, it looks like there are some > mistakes in assigning the `sum..` variables to the z array, causing > the function MatMultAdd_SeqBAIJ_11() to not produce the correct > answer. 
> > I don't have a good example program to demonstrate this yet - it's > currently causing problems in a dev branch of pflotan_ogs that > can produce blocksize 11 matrices. When in parallel, a standard > matrix-vector multiplication calls MatMultAdd for the off-proc > contributions, and the result is wrong when this is redirected > to MatMultAdd_SeqBAIJ_11. Seems to be the root cause of > several solvers failing such as fgmres. > > Can anyone confirm that these two lines seem incorrect? Looks wrong to me, I guess this patch is correct? diff --git a/src/mat/impls/baij/seq/baij2.c b/src/mat/impls/baij/seq/baij2.c index 2849ef9051..65513c8989 100644 --- a/src/mat/impls/baij/seq/baij2.c +++ b/src/mat/impls/baij/seq/baij2.c @@ -2328,7 +2328,7 @@ PetscErrorCode MatMultAdd_SeqBAIJ_11(Mat A,Vec xx,Vec yy,Vec zz) v += 121; } z[0] = sum1; z[1] = sum2; z[2] = sum3; z[3] = sum4; z[4] = sum5; z[5] = sum6; z[6] = sum7; - z[7] = sum6; z[8] = sum7; z[9] = sum8; z[10] = sum9; z[11] = sum10; + z[7] = sum8; z[8] = sum9; z[9] = sum10; z[10] = sum11; if (!usecprow) { z += 11; y += 11; } Lawrence From daniel.stone at opengosim.com Tue Sep 21 08:51:47 2021 From: daniel.stone at opengosim.com (Daniel Stone) Date: Tue, 21 Sep 2021 14:51:47 +0100 Subject: [petsc-users] Is this a bug? MatMultAdd_SeqBAIJ_11 In-Reply-To: References: Message-ID: I seem to have confirmed that making the change suggested by Lawrence fixes things in our case. Some alternate runs by a colleague that result in a blocksize 12 matrix also work fine - I think in that case MAtMultAdd_SeqBAIJ_N must be being used as I can't find a blocksize 12 analogue. Is there by any chance a setting somewhere that can tell petsc to override the use of blocksize specific routines like this and always go to, e.g., MatMultAdd_SeqBAIJ_N etc? Might be useful as a short term fix. Thanks, Daniel On Tue, Sep 21, 2021 at 11:50 AM Lawrence Mitchell wrote: > Hi Daniel, > > > On 21 Sep 2021, at 11:19, Daniel Stone > wrote: > > > > Hello, > > > > If we look at lines 2330-2331 in file baij2.c, it looks like there are > some > > mistakes in assigning the `sum..` variables to the z array, causing > > the function MatMultAdd_SeqBAIJ_11() to not produce the correct > > answer. > > > > I don't have a good example program to demonstrate this yet - it's > > currently causing problems in a dev branch of pflotan_ogs that > > can produce blocksize 11 matrices. When in parallel, a standard > > matrix-vector multiplication calls MatMultAdd for the off-proc > > contributions, and the result is wrong when this is redirected > > to MatMultAdd_SeqBAIJ_11. Seems to be the root cause of > > several solvers failing such as fgmres. > > > > Can anyone confirm that these two lines seem incorrect? > > Looks wrong to me, I guess this patch is correct? > > diff --git a/src/mat/impls/baij/seq/baij2.c > b/src/mat/impls/baij/seq/baij2.c > index 2849ef9051..65513c8989 100644 > --- a/src/mat/impls/baij/seq/baij2.c > +++ b/src/mat/impls/baij/seq/baij2.c > @@ -2328,7 +2328,7 @@ PetscErrorCode MatMultAdd_SeqBAIJ_11(Mat A,Vec > xx,Vec yy,Vec zz) > v += 121; > } > z[0] = sum1; z[1] = sum2; z[2] = sum3; z[3] = sum4; z[4] = sum5; z[5] > = sum6; z[6] = sum7; > - z[7] = sum6; z[8] = sum7; z[9] = sum8; z[10] = sum9; z[11] = sum10; > + z[7] = sum8; z[8] = sum9; z[9] = sum10; z[10] = sum11; > if (!usecprow) { > z += 11; y += 11; > } > > > Lawrence -------------- next part -------------- An HTML attachment was scrubbed... 
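Since a standalone reproducer was mentioned as missing above, something along these lines (only a sketch, not the pflotan_ogs case; the fill pattern is arbitrary) should go through MatMultAdd_SeqBAIJ_11, because MatMultAdd on a SeqBAIJ matrix with block size 11 dispatches to the block-size-specific routine. With the patch above the reported difference should be at rounding level; without it, the trailing rows of each block come out wrong:

#include <petscmat.h>

int main(int argc, char **argv)
{
  Mat            A;
  Vec            x, y, z, w;
  PetscInt       bs = 11, nb = 4, n, i;
  PetscReal      nrm;
  PetscErrorCode ierr;

  ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;
  n    = bs*nb;                                           /* 4 block rows/columns of size 11 */
  ierr = MatCreateSeqBAIJ(PETSC_COMM_SELF, bs, n, n, nb, NULL, &A);CHKERRQ(ierr);
  for (i = 0; i < n; i++) {                               /* arbitrary nonsymmetric fill     */
    ierr = MatSetValue(A, i, i, 2.0 + i, INSERT_VALUES);CHKERRQ(ierr);
    ierr = MatSetValue(A, i, (i + 1) % n, 0.5, INSERT_VALUES);CHKERRQ(ierr);
    ierr = MatSetValue(A, i, (i + bs) % n, -1.0, INSERT_VALUES);CHKERRQ(ierr);
  }
  ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
  ierr = MatCreateVecs(A, &x, &y);CHKERRQ(ierr);
  ierr = VecDuplicate(y, &z);CHKERRQ(ierr);
  ierr = VecDuplicate(y, &w);CHKERRQ(ierr);
  ierr = VecSetRandom(x, NULL);CHKERRQ(ierr);
  ierr = VecSetRandom(y, NULL);CHKERRQ(ierr);

  ierr = MatMultAdd(A, x, y, z);CHKERRQ(ierr);            /* z = A*x + y, routine under test */
  ierr = MatMult(A, x, w);CHKERRQ(ierr);                  /* w = A*x                         */
  ierr = VecAXPY(w, 1.0, y);CHKERRQ(ierr);                /* w = A*x + y, reference          */
  ierr = VecAXPY(w, -1.0, z);CHKERRQ(ierr);               /* w = reference - test            */
  ierr = VecNorm(w, NORM_INFINITY, &nrm);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_SELF, "bs = %D, max |MatMultAdd - (MatMult + VecAXPY)| = %g\n", bs, (double)nrm);CHKERRQ(ierr);

  ierr = MatDestroy(&A);CHKERRQ(ierr);
  ierr = VecDestroy(&x);CHKERRQ(ierr);
  ierr = VecDestroy(&y);CHKERRQ(ierr);
  ierr = VecDestroy(&z);CHKERRQ(ierr);
  ierr = VecDestroy(&w);CHKERRQ(ierr);
  ierr = PetscFinalize();
  return ierr;
}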
URL: From bsmith at petsc.dev Tue Sep 21 09:02:33 2021 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 21 Sep 2021 10:02:33 -0400 Subject: [petsc-users] Is this a bug? MatMultAdd_SeqBAIJ_11 In-Reply-To: References: Message-ID: Daniel, Thanks for the update. We can patch the release version of PETSc (normally we don't have a way to patch previous release versions) There is no direct option to always use the _N version. (For smaller blocks using _N is very slow). Barry > On Sep 21, 2021, at 9:51 AM, Daniel Stone wrote: > > I seem to have confirmed that making the change suggested by Lawrence fixes things in our > case. Some alternate runs by a colleague that result in a blocksize 12 matrix also work > fine - I think in that case MAtMultAdd_SeqBAIJ_N must be being used as I can't find > a blocksize 12 analogue. > > Is there by any chance a setting somewhere that can tell petsc to override the use of > blocksize specific routines like this and always go to, e.g., MatMultAdd_SeqBAIJ_N > etc? Might be useful as a short term fix. > > Thanks, > > Daniel > > On Tue, Sep 21, 2021 at 11:50 AM Lawrence Mitchell > wrote: > Hi Daniel, > > > On 21 Sep 2021, at 11:19, Daniel Stone > wrote: > > > > Hello, > > > > If we look at lines 2330-2331 in file baij2.c, it looks like there are some > > mistakes in assigning the `sum..` variables to the z array, causing > > the function MatMultAdd_SeqBAIJ_11() to not produce the correct > > answer. > > > > I don't have a good example program to demonstrate this yet - it's > > currently causing problems in a dev branch of pflotan_ogs that > > can produce blocksize 11 matrices. When in parallel, a standard > > matrix-vector multiplication calls MatMultAdd for the off-proc > > contributions, and the result is wrong when this is redirected > > to MatMultAdd_SeqBAIJ_11. Seems to be the root cause of > > several solvers failing such as fgmres. > > > > Can anyone confirm that these two lines seem incorrect? > > Looks wrong to me, I guess this patch is correct? > > diff --git a/src/mat/impls/baij/seq/baij2.c b/src/mat/impls/baij/seq/baij2.c > index 2849ef9051..65513c8989 100644 > --- a/src/mat/impls/baij/seq/baij2.c > +++ b/src/mat/impls/baij/seq/baij2.c > @@ -2328,7 +2328,7 @@ PetscErrorCode MatMultAdd_SeqBAIJ_11(Mat A,Vec xx,Vec yy,Vec zz) > v += 121; > } > z[0] = sum1; z[1] = sum2; z[2] = sum3; z[3] = sum4; z[4] = sum5; z[5] = sum6; z[6] = sum7; > - z[7] = sum6; z[8] = sum7; z[9] = sum8; z[10] = sum9; z[11] = sum10; > + z[7] = sum8; z[8] = sum9; z[9] = sum10; z[10] = sum11; > if (!usecprow) { > z += 11; y += 11; > } > > > Lawrence -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Sep 21 09:05:15 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Sep 2021 10:05:15 -0400 Subject: [petsc-users] Is this a bug? MatMultAdd_SeqBAIJ_11 In-Reply-To: References: Message-ID: On Tue, Sep 21, 2021 at 10:04 AM Barry Smith wrote: > > Daniel, > > Thanks for the update. We can patch the release version of PETSc > (normally we don't have a way to patch previous release versions) > > There is no direct option to always use the _N version. (For smaller > blocks using _N is very slow). > It is in an MR: https://gitlab.com/petsc/petsc/-/merge_requests/4338 Thanks, Matt > Barry > > > On Sep 21, 2021, at 9:51 AM, Daniel Stone > wrote: > > I seem to have confirmed that making the change suggested by Lawrence > fixes things in our > case. 
Some alternate runs by a colleague that result in a blocksize 12 > matrix also work > fine - I think in that case MAtMultAdd_SeqBAIJ_N must be being used as I > can't find > a blocksize 12 analogue. > > Is there by any chance a setting somewhere that can tell petsc to override > the use of > blocksize specific routines like this and always go to, e.g., > MatMultAdd_SeqBAIJ_N > etc? Might be useful as a short term fix. > > Thanks, > > Daniel > > On Tue, Sep 21, 2021 at 11:50 AM Lawrence Mitchell wrote: > >> Hi Daniel, >> >> > On 21 Sep 2021, at 11:19, Daniel Stone >> wrote: >> > >> > Hello, >> > >> > If we look at lines 2330-2331 in file baij2.c, it looks like there are >> some >> > mistakes in assigning the `sum..` variables to the z array, causing >> > the function MatMultAdd_SeqBAIJ_11() to not produce the correct >> > answer. >> > >> > I don't have a good example program to demonstrate this yet - it's >> > currently causing problems in a dev branch of pflotan_ogs that >> > can produce blocksize 11 matrices. When in parallel, a standard >> > matrix-vector multiplication calls MatMultAdd for the off-proc >> > contributions, and the result is wrong when this is redirected >> > to MatMultAdd_SeqBAIJ_11. Seems to be the root cause of >> > several solvers failing such as fgmres. >> > >> > Can anyone confirm that these two lines seem incorrect? >> >> Looks wrong to me, I guess this patch is correct? >> >> diff --git a/src/mat/impls/baij/seq/baij2.c >> b/src/mat/impls/baij/seq/baij2.c >> index 2849ef9051..65513c8989 100644 >> --- a/src/mat/impls/baij/seq/baij2.c >> +++ b/src/mat/impls/baij/seq/baij2.c >> @@ -2328,7 +2328,7 @@ PetscErrorCode MatMultAdd_SeqBAIJ_11(Mat A,Vec >> xx,Vec yy,Vec zz) >> v += 121; >> } >> z[0] = sum1; z[1] = sum2; z[2] = sum3; z[3] = sum4; z[4] = sum5; >> z[5] = sum6; z[6] = sum7; >> - z[7] = sum6; z[8] = sum7; z[9] = sum8; z[10] = sum9; z[11] = sum10; >> + z[7] = sum8; z[8] = sum9; z[9] = sum10; z[10] = sum11; >> if (!usecprow) { >> z += 11; y += 11; >> } >> >> >> Lawrence > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko.karin at gmail.com Tue Sep 21 09:30:49 2021 From: niko.karin at gmail.com (Karin&NiKo) Date: Tue, 21 Sep 2021 16:30:49 +0200 Subject: [petsc-users] How to combine different element types into a single DMPlex? In-Reply-To: References: Message-ID: Dear Eric, dear Matthew, I share Eric's desire to be able to manipulate meshes composed of different types of elements in a PETSc's DMPlex. Since this discussion, is there anything new on this feature for the DMPlex object or am I missing something? Thanks, Nicolas Le mer. 21 juil. 2021 ? 04:25, Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> a ?crit : > Hi, > On 2021-07-14 3:14 p.m., Matthew Knepley wrote: > > On Wed, Jul 14, 2021 at 1:25 PM Eric Chamberland < > Eric.Chamberland at giref.ulaval.ca> wrote: > >> Hi, >> >> while playing with DMPlexBuildFromCellListParallel, I noticed we have to >> specify "numCorners" which is a fixed value, then gives a fixed number >> of nodes for a series of elements. >> >> How can I then add, for example, triangles and quadrangles into a DMPlex? >> > > You can't with that function. It would be much mich more complicated if > you could, and I am not sure > it is worth it for that function. 
The reason is that you would need index > information to offset into the > connectivity list, and that would need to be replicated to some extent so > that all processes know what > the others are doing. Possible, but complicated. > > Maybe I can help suggest something for what you are trying to do? > > Yes: we are trying to partition our parallel mesh with PETSc functions. > The mesh has been read in parallel so each process owns a part of it, but > we have to manage mixed elements types. > > When we directly use ParMETIS_V3_PartMeshKway, we give two arrays to > describe the elements which allows mixed elements. > > So, how would I read my mixed mesh in parallel and give it to PETSc DMPlex > so I can use a PetscPartitioner with DMPlexDistribute ? > > A second goal we have is to use PETSc to compute the overlap, which is > something I can't find in PARMetis (and any other partitionning library?) > > Thanks, > > Eric > > > > Thanks, > > Matt > > > >> Thanks, >> >> Eric >> >> -- >> Eric Chamberland, ing., M. Ing >> Professionnel de recherche >> GIREF/Universit? Laval >> (418) 656-2131 poste 41 22 42 >> >> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > Eric Chamberland, ing., M. Ing > Professionnel de recherche > GIREF/Universit? Laval > (418) 656-2131 poste 41 22 42 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Sep 21 13:14:57 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 21 Sep 2021 20:14:57 +0200 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> Message-ID: <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> I will have a look at your code when I have more time. Meanwhile, I am answering 3) below... > El 21 sept 2021, a las 0:23, Varun Hiremath escribi?: > > Hi Jose, > > Sorry, it took me a while to test these settings in the new builds. I am getting good improvement in performance using the preconditioned solvers, so thanks for the suggestions! But I have some questions related to the usage. > > We are using SLEPc to solve the acoustic modal eigenvalue problem. Attached is a simple standalone program that computes acoustic modes in a simple rectangular box. This program illustrates the general setup I am using, though here the shell matrix and the preconditioner matrix are the same, while in my actual program the shell matrix computes A*x without explicitly forming A, and the preconditioner is a 0th order approximation of A. > > In the attached program I have tested both > 1) the Krylov-Schur with inexact shift-and-invert (implemented under the option sinvert); > 2) the JD solver with preconditioner (implemented under the option usejd) > > Both the solvers seem to work decently, compared to no preconditioning. This is how I run the two solvers (for a mesh size of 1600x400): > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 -eps_target 0 > Both finish in about ~10 minutes on my system in serial. JD seems to be slightly faster and more accurate (for the imaginary part of eigenvalue). > The program also runs in parallel using mpiexec. 
I use complex builds, as in my main program the matrix can be complex. > > Now here are my questions: > 1) For this particular problem type, could you please check if these are the best settings that one could use? I have tried different combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI seems to work the best in serial and parallel. > > 2) When I tested these settings in my main program, for some reason the JD solver was not converging. After further testing, I found the issue was related to the setting of "-eps_target 0". I have included "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to passing "-eps_target 0" from the command line, but that doesn't seem to be the case. For instance, if I run the attached program without "-eps_target 0" in the command line then it doesn't converge. > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > the above finishes in about 10 minutes > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is included in the code > > This only seems to affect the JD solver, not the Krylov shift-and-invert (-sinvert 1) option. So is there any difference between passing "-eps_target 0" from the command line vs using "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line arguments in my actual program, so need to set everything internally. > > 3) Also, another minor related issue. While using the inexact shift-and-invert option, I was running into the following error: > > "" > Missing or incorrect user input > Shift-and-invert requires a target 'which' (see EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 -eps_target_magnitude > "" > > I already have the below two lines in the code: > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > EPSSetTarget(eps,0.0); > > so shouldn't these be enough? If I comment out the first line "EPSSetWhichEigenpairs", then the code works fine. You should either do EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); without shift-and-invert or EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); EPSSetTarget(eps,0.0); with shift-and-invert. The latter can also be used without shift-and-invert (e.g. in JD). I have to check, but a possible explanation why in your comment above (2) the command-line option -eps_target 0 works differently is that it also sets -eps_target_magnitude if omitted, so to be equivalent in source code you have to call both EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); EPSSetTarget(eps,0.0); Jose > I have some more questions regarding setting the preconditioner for a quadratic eigenvalue problem, which I will ask in a follow-up email. > > Thanks for your help! > > -Varun > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath wrote: > Thank you very much for these suggestions! We are currently using version 3.12, so I'll try to update to the latest version and try your suggestions. Let me get back to you, thanks! > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman wrote: > Then I would try Davidson methods https://doi.org/10.1145/2543696 > You can also try Krylov-Schur with "inexact" shift-and-invert, for instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the users manual. > > In both cases, you have to pass matrix A in the call to EPSSetOperators() and the preconditioner matrix via STSetPreconditionerMat() - note this function was introduced in version 3.15. 
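To collect the source-code-only settings discussed in this thread in one place, a rough sketch could look like the following; matA (the shell operator) and matP (its 0th-order approximation) are placeholder names, not the variables of the attached program, and the solver choices simply mirror the command lines quoted above:

#include <slepceps.h>

static PetscErrorCode SolveNearZero(Mat matA, Mat matP, PetscBool use_jd)
{
  EPS            eps;
  ST             st;
  KSP            ksp;
  PC             pc;
  PetscInt       nconv;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = EPSCreate(PetscObjectComm((PetscObject)matA), &eps);CHKERRQ(ierr);
  ierr = EPSSetOperators(eps, matA, NULL);CHKERRQ(ierr);     /* shell matrix for A           */
  ierr = EPSSetProblemType(eps, EPS_NHEP);CHKERRQ(ierr);
  /* "-eps_target 0" on the command line implies both of the following calls */
  ierr = EPSSetWhichEigenpairs(eps, EPS_TARGET_MAGNITUDE);CHKERRQ(ierr);
  ierr = EPSSetTarget(eps, 0.0);CHKERRQ(ierr);

  ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
  ierr = STSetPreconditionerMat(st, matP);CHKERRQ(ierr);     /* needs SLEPc >= 3.15          */
  ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
  ierr = KSPSetType(ksp, KSPBCGSL);CHKERRQ(ierr);            /* BCGSL + BJACOBI, as reported */
  ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCBJACOBI);CHKERRQ(ierr);

  if (use_jd) {
    ierr = EPSSetType(eps, EPSJD);CHKERRQ(ierr);             /* Davidson with preconditioner           */
  } else {
    ierr = STSetType(st, STSINVERT);CHKERRQ(ierr);           /* Krylov-Schur, inexact shift-and-invert */
  }
  ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
  ierr = EPSSolve(eps);CHKERRQ(ierr);
  ierr = EPSGetConverged(eps, &nconv);CHKERRQ(ierr);
  ierr = PetscPrintf(PETSC_COMM_WORLD, "converged eigenpairs: %D\n", nconv);CHKERRQ(ierr);
  ierr = EPSDestroy(&eps);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}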
> > Jose > > > > > El 1 jul 2021, a las 13:36, Varun Hiremath escribi?: > > > > Thanks. I actually do have a 1st order approximation of matrix A, that I can explicitly compute and also invert. Can I use that matrix as preconditioner to speed things up? Is there some example that explains how to setup and call SLEPc for this scenario? > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. Roman wrote: > > For smallest real parts one could adapt ex34.c, but it is going to be costly https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html > > Also, if eigenvalues are clustered around the origin, convergence may still be very slow. > > > > It is a tough problem, unless you are able to compute a good preconditioner of A (no need to compute the exact inverse). > > > > Jose > > > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath escribi?: > > > > > > I'm solving for the smallest eigenvalues in magnitude. Though is it cheaper to solve smallest in real part, as that might also work in my case? Thanks for your help. > > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman wrote: > > > Smallest eigenvalue in magnitude or real part? > > > > > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath escribi?: > > > > > > > > Sorry, no both A and B are general sparse matrices (non-hermitian). So is there anything else I could try? > > > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman wrote: > > > > Is the problem symmetric (GHEP)? In that case, you can try LOBPCG on the pair (A,B). But this will likely be slow as well, unless you can provide a good preconditioner. > > > > > > > > Jose > > > > > > > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath escribi?: > > > > > > > > > > Hi All, > > > > > > > > > > I am trying to compute the smallest eigenvalues of a generalized system A*x= lambda*B*x. I don't explicitly know the matrix A (so I am using a shell matrix with a custom matmult function) however, the matrix B is explicitly known so I compute inv(B)*A within the shell matrix and solve inv(B)*A*x = lambda*x. > > > > > > > > > > To compute the smallest eigenvalues it is recommended to solve the inverted system, but since matrix A is not explicitly known I can't invert the system. Moreover, the size of the system can be really big, and with the default Krylov solver, it is extremely slow. So is there a better way for me to compute the smallest eigenvalues of this system? > > > > > > > > > > Thanks, > > > > > Varun > > > > > > > > > > > From knepley at gmail.com Tue Sep 21 14:55:56 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 21 Sep 2021 15:55:56 -0400 Subject: [petsc-users] How to combine different element types into a single DMPlex? In-Reply-To: References: Message-ID: On Tue, Sep 21, 2021 at 10:31 AM Karin&NiKo wrote: > Dear Eric, dear Matthew, > > I share Eric's desire to be able to manipulate meshes composed of > different types of elements in a PETSc's DMPlex. > Since this discussion, is there anything new on this feature for the > DMPlex object or am I missing something? > Thanks for finding this! Okay, I did a rewrite of the Plex internals this summer. It should now be possible to interpolate a mesh with any number of cell types, partition it, redistribute it, and many other manipulations. You can read in some formats that support hybrid meshes. If you let me know how you plan to read it in, we can make it work. Right now, I don't want to make input interfaces that no one will ever use. We have a project, joint with Firedrake, to finalize parallel I/O. 
This will make parallel reading and writing for checkpointing possible, supporting topology, geometry, fields and layouts, for many meshes in one HDF5 file. I think we will finish in November. Thanks, Matt > Thanks, > Nicolas > > Le mer. 21 juil. 2021 ? 04:25, Eric Chamberland < > Eric.Chamberland at giref.ulaval.ca> a ?crit : > >> Hi, >> On 2021-07-14 3:14 p.m., Matthew Knepley wrote: >> >> On Wed, Jul 14, 2021 at 1:25 PM Eric Chamberland < >> Eric.Chamberland at giref.ulaval.ca> wrote: >> >>> Hi, >>> >>> while playing with DMPlexBuildFromCellListParallel, I noticed we have to >>> specify "numCorners" which is a fixed value, then gives a fixed number >>> of nodes for a series of elements. >>> >>> How can I then add, for example, triangles and quadrangles into a DMPlex? >>> >> >> You can't with that function. It would be much mich more complicated if >> you could, and I am not sure >> it is worth it for that function. The reason is that you would need index >> information to offset into the >> connectivity list, and that would need to be replicated to some extent so >> that all processes know what >> the others are doing. Possible, but complicated. >> >> Maybe I can help suggest something for what you are trying to do? >> >> Yes: we are trying to partition our parallel mesh with PETSc functions. >> The mesh has been read in parallel so each process owns a part of it, but >> we have to manage mixed elements types. >> >> When we directly use ParMETIS_V3_PartMeshKway, we give two arrays to >> describe the elements which allows mixed elements. >> >> So, how would I read my mixed mesh in parallel and give it to PETSc >> DMPlex so I can use a PetscPartitioner with DMPlexDistribute ? >> >> A second goal we have is to use PETSc to compute the overlap, which is >> something I can't find in PARMetis (and any other partitionning library?) >> >> Thanks, >> >> Eric >> >> >> >> Thanks, >> >> Matt >> >> >> >>> Thanks, >>> >>> Eric >>> >>> -- >>> Eric Chamberland, ing., M. Ing >>> Professionnel de recherche >>> GIREF/Universit? Laval >>> (418) 656-2131 poste 41 22 42 >>> >>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> >> -- >> Eric Chamberland, ing., M. Ing >> Professionnel de recherche >> GIREF/Universit? Laval >> (418) 656-2131 poste 41 22 42 >> >> -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From niko.karin at gmail.com Wed Sep 22 02:04:35 2021 From: niko.karin at gmail.com (Karin&NiKo) Date: Wed, 22 Sep 2021 09:04:35 +0200 Subject: [petsc-users] How to combine different element types into a single DMPlex? In-Reply-To: References: Message-ID: Dear Matthew, This is great news! For my part, I would be mostly interested in the parallel input interface. Sorry for that... Indeed, in our application, we already have a parallel mesh data structure that supports hybrid meshes with parallel I/O and distribution (based on the MED format). We would like to use a DMPlex to make parallel mesh adaptation. As a matter of fact, all our meshes are in the MED format. We could also contribute to extend the interface of DMPlex with MED (if you consider it could be usefull). 
Best regards, Nicolas Le mar. 21 sept. 2021 ? 21:56, Matthew Knepley a ?crit : > On Tue, Sep 21, 2021 at 10:31 AM Karin&NiKo wrote: > >> Dear Eric, dear Matthew, >> >> I share Eric's desire to be able to manipulate meshes composed of >> different types of elements in a PETSc's DMPlex. >> Since this discussion, is there anything new on this feature for the >> DMPlex object or am I missing something? >> > > Thanks for finding this! > > Okay, I did a rewrite of the Plex internals this summer. It should now be > possible to interpolate a mesh with any > number of cell types, partition it, redistribute it, and many other > manipulations. > > You can read in some formats that support hybrid meshes. If you let me > know how you plan to read it in, we can make it work. > Right now, I don't want to make input interfaces that no one will ever > use. We have a project, joint with Firedrake, to finalize > parallel I/O. This will make parallel reading and writing for > checkpointing possible, supporting topology, geometry, fields and > layouts, for many meshes in one HDF5 file. I think we will finish in > November. > > Thanks, > > Matt > > >> Thanks, >> Nicolas >> >> Le mer. 21 juil. 2021 ? 04:25, Eric Chamberland < >> Eric.Chamberland at giref.ulaval.ca> a ?crit : >> >>> Hi, >>> On 2021-07-14 3:14 p.m., Matthew Knepley wrote: >>> >>> On Wed, Jul 14, 2021 at 1:25 PM Eric Chamberland < >>> Eric.Chamberland at giref.ulaval.ca> wrote: >>> >>>> Hi, >>>> >>>> while playing with DMPlexBuildFromCellListParallel, I noticed we have >>>> to >>>> specify "numCorners" which is a fixed value, then gives a fixed number >>>> of nodes for a series of elements. >>>> >>>> How can I then add, for example, triangles and quadrangles into a >>>> DMPlex? >>>> >>> >>> You can't with that function. It would be much mich more complicated if >>> you could, and I am not sure >>> it is worth it for that function. The reason is that you would need >>> index information to offset into the >>> connectivity list, and that would need to be replicated to some extent >>> so that all processes know what >>> the others are doing. Possible, but complicated. >>> >>> Maybe I can help suggest something for what you are trying to do? >>> >>> Yes: we are trying to partition our parallel mesh with PETSc functions. >>> The mesh has been read in parallel so each process owns a part of it, but >>> we have to manage mixed elements types. >>> >>> When we directly use ParMETIS_V3_PartMeshKway, we give two arrays to >>> describe the elements which allows mixed elements. >>> >>> So, how would I read my mixed mesh in parallel and give it to PETSc >>> DMPlex so I can use a PetscPartitioner with DMPlexDistribute ? >>> >>> A second goal we have is to use PETSc to compute the overlap, which is >>> something I can't find in PARMetis (and any other partitionning library?) >>> >>> Thanks, >>> >>> Eric >>> >>> >>> >>> Thanks, >>> >>> Matt >>> >>> >>> >>>> Thanks, >>>> >>>> Eric >>>> >>>> -- >>>> Eric Chamberland, ing., M. Ing >>>> Professionnel de recherche >>>> GIREF/Universit? Laval >>>> (418) 656-2131 poste 41 22 42 >>>> >>>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which their >>> experiments lead. >>> -- Norbert Wiener >>> >>> https://www.cse.buffalo.edu/~knepley/ >>> >>> >>> -- >>> Eric Chamberland, ing., M. Ing >>> Professionnel de recherche >>> GIREF/Universit? 
Laval >>> (418) 656-2131 poste 41 22 42 >>> >>> > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 22 06:20:29 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 22 Sep 2021 07:20:29 -0400 Subject: [petsc-users] How to combine different element types into a single DMPlex? In-Reply-To: References: Message-ID: On Wed, Sep 22, 2021 at 3:04 AM Karin&NiKo wrote: > Dear Matthew, > > This is great news! > For my part, I would be mostly interested in the parallel input interface. > Sorry for that... > Indeed, in our application, we already have a parallel mesh data > structure that supports hybrid meshes with parallel I/O and distribution > (based on the MED format). We would like to use a DMPlex to make parallel > mesh adaptation. > As a matter of fact, all our meshes are in the MED format. We could > also contribute to extend the interface of DMPlex with MED (if you consider > it could be usefull). > An MED interface does exist. I stopped using it for two reasons: 1) The code was not portable and the build was failing on different architectures. I had to manually fix it. 2) The boundary markers did not provide global information, so that parallel reading was much harder. Feel free to update my MED reader to a better design. Thanks, Matt > Best regards, > Nicolas > > > Le mar. 21 sept. 2021 ? 21:56, Matthew Knepley a > ?crit : > >> On Tue, Sep 21, 2021 at 10:31 AM Karin&NiKo wrote: >> >>> Dear Eric, dear Matthew, >>> >>> I share Eric's desire to be able to manipulate meshes composed of >>> different types of elements in a PETSc's DMPlex. >>> Since this discussion, is there anything new on this feature for the >>> DMPlex object or am I missing something? >>> >> >> Thanks for finding this! >> >> Okay, I did a rewrite of the Plex internals this summer. It should now be >> possible to interpolate a mesh with any >> number of cell types, partition it, redistribute it, and many other >> manipulations. >> >> You can read in some formats that support hybrid meshes. If you let me >> know how you plan to read it in, we can make it work. >> Right now, I don't want to make input interfaces that no one will ever >> use. We have a project, joint with Firedrake, to finalize >> parallel I/O. This will make parallel reading and writing for >> checkpointing possible, supporting topology, geometry, fields and >> layouts, for many meshes in one HDF5 file. I think we will finish in >> November. >> >> Thanks, >> >> Matt >> >> >>> Thanks, >>> Nicolas >>> >>> Le mer. 21 juil. 2021 ? 04:25, Eric Chamberland < >>> Eric.Chamberland at giref.ulaval.ca> a ?crit : >>> >>>> Hi, >>>> On 2021-07-14 3:14 p.m., Matthew Knepley wrote: >>>> >>>> On Wed, Jul 14, 2021 at 1:25 PM Eric Chamberland < >>>> Eric.Chamberland at giref.ulaval.ca> wrote: >>>> >>>>> Hi, >>>>> >>>>> while playing with DMPlexBuildFromCellListParallel, I noticed we have >>>>> to >>>>> specify "numCorners" which is a fixed value, then gives a fixed number >>>>> of nodes for a series of elements. >>>>> >>>>> How can I then add, for example, triangles and quadrangles into a >>>>> DMPlex? >>>>> >>>> >>>> You can't with that function. It would be much mich more complicated if >>>> you could, and I am not sure >>>> it is worth it for that function. 
The reason is that you would need >>>> index information to offset into the >>>> connectivity list, and that would need to be replicated to some extent >>>> so that all processes know what >>>> the others are doing. Possible, but complicated. >>>> >>>> Maybe I can help suggest something for what you are trying to do? >>>> >>>> Yes: we are trying to partition our parallel mesh with PETSc >>>> functions. The mesh has been read in parallel so each process owns a part >>>> of it, but we have to manage mixed elements types. >>>> >>>> When we directly use ParMETIS_V3_PartMeshKway, we give two arrays to >>>> describe the elements which allows mixed elements. >>>> >>>> So, how would I read my mixed mesh in parallel and give it to PETSc >>>> DMPlex so I can use a PetscPartitioner with DMPlexDistribute ? >>>> >>>> A second goal we have is to use PETSc to compute the overlap, which is >>>> something I can't find in PARMetis (and any other partitionning library?) >>>> >>>> Thanks, >>>> >>>> Eric >>>> >>>> >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> >>>> >>>>> Thanks, >>>>> >>>>> Eric >>>>> >>>>> -- >>>>> Eric Chamberland, ing., M. Ing >>>>> Professionnel de recherche >>>>> GIREF/Universit? Laval >>>>> (418) 656-2131 poste 41 22 42 >>>>> >>>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which their >>>> experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://www.cse.buffalo.edu/~knepley/ >>>> >>>> >>>> -- >>>> Eric Chamberland, ing., M. Ing >>>> Professionnel de recherche >>>> GIREF/Universit? Laval >>>> (418) 656-2131 poste 41 22 42 >>>> >>>> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From varunhiremath at gmail.com Wed Sep 22 12:38:27 2021 From: varunhiremath at gmail.com (Varun Hiremath) Date: Wed, 22 Sep 2021 10:38:27 -0700 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> Message-ID: Hi Jose, Thank you, that explains it and my example code works now without specifying "-eps_target 0" in the command line. However, both the Krylov inexact shift-invert and JD solvers are struggling to converge for some of my actual problems. The issue seems to be related to non-symmetric general matrices. I have extracted one such matrix attached here as MatA.gz (size 100k), and have also included a short program that loads this matrix and then computes the smallest eigenvalues as I described earlier. For this matrix, if I compute the eigenvalues directly (without using the shell matrix) using shift-and-invert (as below) then it converges in less than a minute. 
$ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 However, if I use the shell matrix and use any of the preconditioned solvers JD or Krylov shift-invert (as shown below) with the same matrix as the preconditioner, then they struggle to converge. $ ./acoustic_matrix_test.o -usejd 1 -deflate 1 $ ./acoustic_matrix_test.o -sinvert 1 -deflate 1 Could you please check the attached code and suggest any changes in settings that might help with convergence for these kinds of matrices? I appreciate your help! Thanks, Varun On Tue, Sep 21, 2021 at 11:14 AM Jose E. Roman wrote: > I will have a look at your code when I have more time. Meanwhile, I am > answering 3) below... > > > El 21 sept 2021, a las 0:23, Varun Hiremath > escribi?: > > > > Hi Jose, > > > > Sorry, it took me a while to test these settings in the new builds. I am > getting good improvement in performance using the preconditioned solvers, > so thanks for the suggestions! But I have some questions related to the > usage. > > > > We are using SLEPc to solve the acoustic modal eigenvalue problem. > Attached is a simple standalone program that computes acoustic modes in a > simple rectangular box. This program illustrates the general setup I am > using, though here the shell matrix and the preconditioner matrix are the > same, while in my actual program the shell matrix computes A*x without > explicitly forming A, and the preconditioner is a 0th order approximation > of A. > > > > In the attached program I have tested both > > 1) the Krylov-Schur with inexact shift-and-invert (implemented under the > option sinvert); > > 2) the JD solver with preconditioner (implemented under the option usejd) > > > > Both the solvers seem to work decently, compared to no preconditioning. > This is how I run the two solvers (for a mesh size of 1600x400): > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target > 0 > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 > -eps_target 0 > > Both finish in about ~10 minutes on my system in serial. JD seems to be > slightly faster and more accurate (for the imaginary part of eigenvalue). > > The program also runs in parallel using mpiexec. I use complex builds, > as in my main program the matrix can be complex. > > > > Now here are my questions: > > 1) For this particular problem type, could you please check if these are > the best settings that one could use? I have tried different combinations > of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI seems to work > the best in serial and parallel. > > > > 2) When I tested these settings in my main program, for some reason the > JD solver was not converging. After further testing, I found the issue was > related to the setting of "-eps_target 0". I have included > "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to > passing "-eps_target 0" from the command line, but that doesn't seem to be > the case. For instance, if I run the attached program without "-eps_target > 0" in the command line then it doesn't converge. > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target > 0 > > the above finishes in about 10 minutes > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is > included in the code > > > > This only seems to affect the JD solver, not the Krylov shift-and-invert > (-sinvert 1) option. 
So is there any difference between passing > "-eps_target 0" from the command line vs using "EPSSetTarget(eps,0.0);" in > the code? I cannot pass any command line arguments in my actual program, so > need to set everything internally. > > > > 3) Also, another minor related issue. While using the inexact > shift-and-invert option, I was running into the following error: > > > > "" > > Missing or incorrect user input > > Shift-and-invert requires a target 'which' (see EPSSetWhichEigenpairs), > for instance -st_type sinvert -eps_target 0 -eps_target_magnitude > > "" > > > > I already have the below two lines in the code: > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > EPSSetTarget(eps,0.0); > > > > so shouldn't these be enough? If I comment out the first line > "EPSSetWhichEigenpairs", then the code works fine. > > You should either do > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > without shift-and-invert or > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > EPSSetTarget(eps,0.0); > > with shift-and-invert. The latter can also be used without > shift-and-invert (e.g. in JD). > > I have to check, but a possible explanation why in your comment above (2) > the command-line option -eps_target 0 works differently is that it also > sets -eps_target_magnitude if omitted, so to be equivalent in source code > you have to call both > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > EPSSetTarget(eps,0.0); > > Jose > > > I have some more questions regarding setting the preconditioner for a > quadratic eigenvalue problem, which I will ask in a follow-up email. > > > > Thanks for your help! > > > > -Varun > > > > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath > wrote: > > Thank you very much for these suggestions! We are currently using > version 3.12, so I'll try to update to the latest version and try your > suggestions. Let me get back to you, thanks! > > > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman wrote: > > Then I would try Davidson methods https://doi.org/10.1145/2543696 > > You can also try Krylov-Schur with "inexact" shift-and-invert, for > instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the > users manual. > > > > In both cases, you have to pass matrix A in the call to > EPSSetOperators() and the preconditioner matrix via > STSetPreconditionerMat() - note this function was introduced in version > 3.15. > > > > Jose > > > > > > > > > El 1 jul 2021, a las 13:36, Varun Hiremath > escribi?: > > > > > > Thanks. I actually do have a 1st order approximation of matrix A, that > I can explicitly compute and also invert. Can I use that matrix as > preconditioner to speed things up? Is there some example that explains how > to setup and call SLEPc for this scenario? > > > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. Roman wrote: > > > For smallest real parts one could adapt ex34.c, but it is going to be > costly > https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html > > > Also, if eigenvalues are clustered around the origin, convergence may > still be very slow. > > > > > > It is a tough problem, unless you are able to compute a good > preconditioner of A (no need to compute the exact inverse). > > > > > > Jose > > > > > > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath > escribi?: > > > > > > > > I'm solving for the smallest eigenvalues in magnitude. Though is it > cheaper to solve smallest in real part, as that might also work in my case? > Thanks for your help. > > > > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. 
Roman > wrote: > > > > Smallest eigenvalue in magnitude or real part? > > > > > > > > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > Sorry, no both A and B are general sparse matrices > (non-hermitian). So is there anything else I could try? > > > > > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman > wrote: > > > > > Is the problem symmetric (GHEP)? In that case, you can try LOBPCG > on the pair (A,B). But this will likely be slow as well, unless you can > provide a good preconditioner. > > > > > > > > > > Jose > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > > > Hi All, > > > > > > > > > > > > I am trying to compute the smallest eigenvalues of a generalized > system A*x= lambda*B*x. I don't explicitly know the matrix A (so I am using > a shell matrix with a custom matmult function) however, the matrix B is > explicitly known so I compute inv(B)*A within the shell matrix and solve > inv(B)*A*x = lambda*x. > > > > > > > > > > > > To compute the smallest eigenvalues it is recommended to solve > the inverted system, but since matrix A is not explicitly known I can't > invert the system. Moreover, the size of the system can be really big, and > with the default Krylov solver, it is extremely slow. So is there a better > way for me to compute the smallest eigenvalues of this system? > > > > > > > > > > > > Thanks, > > > > > > Varun > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: acoustic_matrix_test.cpp Type: application/octet-stream Size: 5467 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: MatA.gz Type: application/x-gzip Size: 12596169 bytes Desc: not available URL: From vaclav.hapla at erdw.ethz.ch Wed Sep 22 13:59:30 2021 From: vaclav.hapla at erdw.ethz.ch (Hapla Vaclav) Date: Wed, 22 Sep 2021 18:59:30 +0000 Subject: [petsc-users] DMView and DMLoad In-Reply-To: References: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> Message-ID: <056E066F-D596-4254-A44A-60BFFD30FE82@erdw.ethz.ch> To avoid confusions here, Berend seems to be specifically demanding XDMF (PETSC_VIEWER_HDF5_XDMF). The stuff we are now working on is parallel checkpointing in our own HDF5 format (PETSC_VIEWER_HDF5_PETSC), I will make a series of MRs on this topic in the following days. For XDMF, we are specifically missing the ability to write/load DMLabels properly. XDMF uses specific cell-local numbering for faces for specification of face sets, and face-local numbering for specification of edge sets, which is not great wrt DMPlex design. And ParaView doesn't show any of these properly so it's hard to debug. Matt, we should talk about this soon. Berend, for now, could you just load the mesh initially from XDMF and then use our PETSC_VIEWER_HDF5_PETSC format for subsequent saving/loading? Thanks, Vaclav On 17 Sep 2021, at 15:46, Lawrence Mitchell > wrote: Hi Berend, On 14 Sep 2021, at 12:23, Matthew Knepley > wrote: On Tue, Sep 14, 2021 at 5:15 AM Berend van Wachem > wrote: Dear PETSc-team, We are trying to save and load distributed DMPlex and its associated physical fields (created with DMCreateGlobalVector) (Uvelocity, VVelocity, ...) in HDF5_XDMF format. 
To achieve this, we do the following: 1) save in the same xdmf.h5 file: DMView( DM , H5_XDMF_Viewer ); VecView( UVelocity, H5_XDMF_Viewer ); 2) load the dm: DMPlexCreateFromfile(PETSC_COMM_WORLD, Filename, PETSC_TRUE, DM); 3) load the physical field: VecLoad( UVelocity, H5_XDMF_Viewer ); There are no errors in the execution, but the loaded DM is distributed differently to the original one, which results in the incorrect placement of the values of the physical fields (UVelocity etc.) in the domain. This approach is used to restart the simulation with the last saved DM. Is there something we are missing, or there exists alternative routes to this goal? Can we somehow get the IS of the redistribution, so we can re-distribute the vector data as well? Many thanks, best regards, Hi Berend, We are in the midst of rewriting this. We want to support saving multiple meshes, with fields attached to each, and preserving the discretization (section) information, and allowing us to load up on a different number of processes. We plan to be done by October. Vaclav and I are doing this in collaboration with Koki Sagiyama, David Ham, and Lawrence Mitchell from the Firedrake team. The core load/save cycle functionality is now in PETSc main. So if you're using main rather than a release, you can get access to it now. This section of the manual shows an example of how to do things https://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5 Let us know if things aren't clear! Thanks, Lawrence -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Sep 23 07:56:58 2021 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 23 Sep 2021 08:56:58 -0400 Subject: [petsc-users] New error on Summit Message-ID: This was working before but now I get this (strange) error: Run Build Command(s):/usr/bin/gmake -f Makefile cmTC_429f1/fast && /usr/bin/gmake -f CMakeFiles/cmTC_429f1.dir/build.make CMakeFiles/cmTC_429f1.dir/build gmake[1]: Entering directory '/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp' Building CXX object CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/bin/nvcc_wrapper -fPIC -g -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4 -O -+ -qPIC -std=gnu++14 -o CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o -c /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx nvcc_wrapper has been given GNU extension standard flag -std=gnu++14 - reverting flag to -std=c++14 nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). g++: error: unrecognized command line option ?~@~X-qPIC?~@~Y; did you mean ?~@~X-fPIC?~@~Y? gmake[1]: *** [CMakeFiles/cmTC_429f1.dir/build.make:78: CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o] Error 1 gmake[1]: Leaving directory '/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp' gmake: *** [Makefile:127: cmTC_429f1/fast] Error 2 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: application/octet-stream Size: 1940254 bytes Desc: not available URL: From mfadams at lbl.gov Thu Sep 23 09:16:10 2021 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 23 Sep 2021 10:16:10 -0400 Subject: [petsc-users] New error on Summit In-Reply-To: References: Message-ID: This is fixed now. I did not have a gcc module loaded so it was picking up some default. On Thu, Sep 23, 2021 at 8:56 AM Mark Adams wrote: > This was working before but now I get this (strange) error: > > Run Build Command(s):/usr/bin/gmake -f Makefile cmTC_429f1/fast && > /usr/bin/gmake -f CMakeFiles/cmTC_429f1.dir/build.make > CMakeFiles/cmTC_429f1.dir/build > gmake[1]: Entering directory > '/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp' > Building CXX object CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o > > /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/bin/nvcc_wrapper > -fPIC -g -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4 -O -+ > -qPIC -std=gnu++14 -o CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o -c > /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx > nvcc_wrapper has been given GNU extension standard flag -std=gnu++14 - > reverting flag to -std=c++14 > nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', > 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a > future release (Use -Wno-deprecated-gpu-targets to suppress warning). > g++: error: unrecognized command line option ?~@~X-qPIC?~@~Y; did you > mean ?~@~X-fPIC?~@~Y? > gmake[1]: *** [CMakeFiles/cmTC_429f1.dir/build.make:78: > CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o] Error 1 > gmake[1]: Leaving directory > '/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp' > gmake: *** [Makefile:127: cmTC_429f1/fast] Error 2 > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Sep 23 09:24:20 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 23 Sep 2021 10:24:20 -0400 Subject: [petsc-users] New error on Summit In-Reply-To: References: Message-ID: Not really fixed, the problem is still there. You just bypassed the problem that BuildSystem has. > On Sep 23, 2021, at 10:16 AM, Mark Adams wrote: > > This is fixed now. I did not have a gcc module loaded so it was picking up some default. 
> > On Thu, Sep 23, 2021 at 8:56 AM Mark Adams > wrote: > This was working before but now I get this (strange) error: > > Run Build Command(s):/usr/bin/gmake -f Makefile cmTC_429f1/fast && /usr/bin/gmake -f CMakeFiles/cmTC_429f1.dir/build.make CMakeFiles/cmTC_429f1.dir/build > gmake[1]: Entering directory '/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp' > Building CXX object CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o > /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/bin/nvcc_wrapper -fPIC -g -DLANDAU_DIM=2 -DLANDAU_MAX_SPECIES=10 -DLANDAU_MAX_Q=4 -O -+ -qPIC -std=gnu++14 -o CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o -c /gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp/testCXXCompiler.cxx > nvcc_wrapper has been given GNU extension standard flag -std=gnu++14 - reverting flag to -std=c++14 > nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning). > g++: error: unrecognized command line option ?~@~X-qPIC?~@~Y; did you mean ?~@~X-fPIC?~@~Y? > gmake[1]: *** [CMakeFiles/cmTC_429f1.dir/build.make:78: CMakeFiles/cmTC_429f1.dir/testCXXCompiler.cxx.o] Error 1 > gmake[1]: Leaving directory '/gpfs/alpine/csc314/scratch/adams/petsc/arch-summit-opt-gnu-kokkos-cuda/externalpackages/git.kokkos/petsc-build/CMakeFiles/CMakeTmp' > gmake: *** [Makefile:127: cmTC_429f1/fast] Error 2 > -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Thu Sep 23 09:53:11 2021 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Thu, 23 Sep 2021 10:53:11 -0400 Subject: [petsc-users] How to combine different element types into a single DMPlex? In-Reply-To: References: Message-ID: <6e78845e-2054-92b1-d6db-2c0820c05b64@giref.ulaval.ca> Hi, oh, that's a great news! In our case we have our home-made file-format, invariant to the number of processes (thanks to MPI_File_set_view), that uses collective, asynchronous MPI I/O native calls for unstructured hybrid meshes and fields . So our needs are not for reading meshes but only to fill an hybrid DMPlex with DMPlexBuildFromCellListParallel (or something else to come?)... to exploit petsc partitioners and parallel overlap computation... Thanks for the follow-up! :) Eric On 2021-09-22 7:20 a.m., Matthew Knepley wrote: > On Wed, Sep 22, 2021 at 3:04 AM Karin&NiKo > wrote: > > Dear Matthew, > > This is great news! > For my part, I would be mostly interested?in the parallel input > interface. Sorry for that... > Indeed, in our application,? we already have a parallel mesh data > structure that supports hybrid meshes with parallel I/O and > distribution (based on the MED format). We would like to use a > DMPlex to make parallel mesh adaptation. > ?As a matter of fact, all our meshes are in the MED format. We > could also?contribute to extend the interface of DMPlex with MED > (if you consider it could be usefull). > > > An MED interface does exist. I stopped using it for two reasons: > > ? 1) The code was not portable and the build was failing on different > architectures. I had to manually fix it. > > ? 2) The boundary markers did not provide global information, so that > parallel reading was much harder. 
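For reference, a minimal sketch of the partitioning/overlap step Eric mentions above (a DMPlex already built from the locally read cell lists, then repartitioned by PETSc and given a one-cell overlap). This is only an illustration: it assumes it sits inside a function where dm and ierr already exist, and the ParMETIS choice assumes PETSc was configured with that package.

PetscPartitioner part;
DM               dmDist = NULL;

ierr = DMPlexGetPartitioner(dm, &part);CHKERRQ(ierr);
ierr = PetscPartitionerSetType(part, PETSCPARTITIONERPARMETIS);CHKERRQ(ierr); /* or -petscpartitioner_type parmetis */
ierr = DMPlexDistribute(dm, 1, NULL, &dmDist);CHKERRQ(ierr);                  /* second argument = overlap in cell layers */
if (dmDist) { ierr = DMDestroy(&dm);CHKERRQ(ierr); dm = dmDist; }

The third argument (NULL here) can instead return the migration PetscSF, which is useful if field data has to be moved along with the mesh.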
> > Feel free to update my MED reader to a better design. > > ? Thanks, > > ? ? ?Matt > > Best regards, > Nicolas > > > Le?mar. 21 sept. 2021 ??21:56, Matthew Knepley > a ?crit?: > > On Tue, Sep 21, 2021 at 10:31 AM Karin&NiKo > > wrote: > > Dear Eric, dear Matthew, > > I share Eric's desire to be able to manipulate meshes > composed of different types of elements in a PETSc's DMPlex. > Since this discussion, is there anything new on this > feature for the DMPlex?object or am I missing something? > > > Thanks for finding this! > > Okay, I did a rewrite of the Plex internals this summer. It > should now be possible to interpolate a mesh with any > number of cell types, partition it, redistribute it, and many > other manipulations. > > You can read in some formats that support hybrid?meshes. If > you let me know how you plan to read it in, we can make it work. > Right now, I don't want to make input interfaces that no one > will ever use. We have a project, joint with Firedrake, to > finalize > parallel I/O. This will make parallel reading and writing for > checkpointing possible, supporting topology, geometry, fields and > layouts, for many meshes?in one HDF5 file. I think we will > finish in November. > > ? Thanks, > > ? ? ?Matt > > Thanks, > Nicolas > > Le?mer. 21 juil. 2021 ??04:25, Eric Chamberland > > a ?crit?: > > Hi, > > On 2021-07-14 3:14 p.m., Matthew Knepley wrote: >> On Wed, Jul 14, 2021 at 1:25 PM Eric Chamberland >> > > wrote: >> >> Hi, >> >> while playing with >> DMPlexBuildFromCellListParallel, I noticed we >> have to >> specify "numCorners" which is a fixed value, then >> gives a fixed number >> of nodes for a series of elements. >> >> How can I then add, for example, triangles and >> quadrangles into a DMPlex? >> >> >> You can't with that function. It would be much mich >> more complicated if you could, and I am not sure >> it is worth it for that function. The reason is that >> you would need index information to offset?into the >> connectivity list, and that would need to be >> replicated to some extent so that all processes know what >> the others are doing. Possible, but complicated. >> >> Maybe I can help suggest something for what you are >> trying?to do? > > Yes: we are trying to partition our parallel mesh with > PETSc functions.? The mesh has been read in parallel > so each process owns a part of it, but we have to > manage mixed elements types. > > When we directly use ParMETIS_V3_PartMeshKway, we give > two arrays to describe the elements which allows mixed > elements. > > So, how would I read my mixed mesh in parallel and > give it to PETSc DMPlex so I can use a > PetscPartitioner with DMPlexDistribute ? > > A second goal we have is to use PETSc to compute the > overlap, which is something I can't find in PARMetis > (and any other partitionning library?) > > Thanks, > > Eric > > >> >> ? Thanks, >> >> ? ? ? Matt >> >> Thanks, >> >> Eric >> >> -- >> Eric Chamberland, ing., M. Ing >> Professionnel de recherche >> GIREF/Universit? Laval >> (418) 656-2131 poste 41 22 42 >> >> >> >> -- >> What most experimenters take for granted before they >> begin their experiments is infinitely more >> interesting than any results to which their >> experiments lead. >> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> > > -- > Eric Chamberland, ing., M. Ing > Professionnel de recherche > GIREF/Universit? 
Laval > (418) 656-2131 poste 41 22 42 > > > > -- > What most experimenters take for granted before they begin > their experiments is infinitely more interesting than any > results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? Laval (418) 656-2131 poste 41 22 42 -------------- next part -------------- An HTML attachment was scrubbed... URL: From vaclav.hapla at erdw.ethz.ch Thu Sep 23 10:30:45 2021 From: vaclav.hapla at erdw.ethz.ch (Hapla Vaclav) Date: Thu, 23 Sep 2021 15:30:45 +0000 Subject: [petsc-users] How to combine different element types into a single DMPlex? In-Reply-To: <6e78845e-2054-92b1-d6db-2c0820c05b64@giref.ulaval.ca> References: <6e78845e-2054-92b1-d6db-2c0820c05b64@giref.ulaval.ca> Message-ID: Note there will soon be a generalization of DMPlexBuildFromCellListParallel() around, as a side product of our current collaborative efforts with Firedrake guys. It will take a PetscSection instead of relying on the blocksize [which is indeed always constant for the given dataset]. Stay tuned. https://gitlab.com/petsc/petsc/-/merge_requests/4350 Thanks, Vaclav On 23 Sep 2021, at 16:53, Eric Chamberland > wrote: Hi, oh, that's a great news! In our case we have our home-made file-format, invariant to the number of processes (thanks to MPI_File_set_view), that uses collective, asynchronous MPI I/O native calls for unstructured hybrid meshes and fields . So our needs are not for reading meshes but only to fill an hybrid DMPlex with DMPlexBuildFromCellListParallel (or something else to come?)... to exploit petsc partitioners and parallel overlap computation... Thanks for the follow-up! :) Eric On 2021-09-22 7:20 a.m., Matthew Knepley wrote: On Wed, Sep 22, 2021 at 3:04 AM Karin&NiKo > wrote: Dear Matthew, This is great news! For my part, I would be mostly interested in the parallel input interface. Sorry for that... Indeed, in our application, we already have a parallel mesh data structure that supports hybrid meshes with parallel I/O and distribution (based on the MED format). We would like to use a DMPlex to make parallel mesh adaptation. As a matter of fact, all our meshes are in the MED format. We could also contribute to extend the interface of DMPlex with MED (if you consider it could be usefull). An MED interface does exist. I stopped using it for two reasons: 1) The code was not portable and the build was failing on different architectures. I had to manually fix it. 2) The boundary markers did not provide global information, so that parallel reading was much harder. Feel free to update my MED reader to a better design. Thanks, Matt Best regards, Nicolas Le mar. 21 sept. 2021 ? 21:56, Matthew Knepley > a ?crit : On Tue, Sep 21, 2021 at 10:31 AM Karin&NiKo > wrote: Dear Eric, dear Matthew, I share Eric's desire to be able to manipulate meshes composed of different types of elements in a PETSc's DMPlex. Since this discussion, is there anything new on this feature for the DMPlex object or am I missing something? Thanks for finding this! Okay, I did a rewrite of the Plex internals this summer. 
It should now be possible to interpolate a mesh with any number of cell types, partition it, redistribute it, and many other manipulations. You can read in some formats that support hybrid meshes. If you let me know how you plan to read it in, we can make it work. Right now, I don't want to make input interfaces that no one will ever use. We have a project, joint with Firedrake, to finalize parallel I/O. This will make parallel reading and writing for checkpointing possible, supporting topology, geometry, fields and layouts, for many meshes in one HDF5 file. I think we will finish in November. Thanks, Matt Thanks, Nicolas Le mer. 21 juil. 2021 ? 04:25, Eric Chamberland > a ?crit : Hi, On 2021-07-14 3:14 p.m., Matthew Knepley wrote: On Wed, Jul 14, 2021 at 1:25 PM Eric Chamberland > wrote: Hi, while playing with DMPlexBuildFromCellListParallel, I noticed we have to specify "numCorners" which is a fixed value, then gives a fixed number of nodes for a series of elements. How can I then add, for example, triangles and quadrangles into a DMPlex? You can't with that function. It would be much mich more complicated if you could, and I am not sure it is worth it for that function. The reason is that you would need index information to offset into the connectivity list, and that would need to be replicated to some extent so that all processes know what the others are doing. Possible, but complicated. Maybe I can help suggest something for what you are trying to do? Yes: we are trying to partition our parallel mesh with PETSc functions. The mesh has been read in parallel so each process owns a part of it, but we have to manage mixed elements types. When we directly use ParMETIS_V3_PartMeshKway, we give two arrays to describe the elements which allows mixed elements. So, how would I read my mixed mesh in parallel and give it to PETSc DMPlex so I can use a PetscPartitioner with DMPlexDistribute ? A second goal we have is to use PETSc to compute the overlap, which is something I can't find in PARMetis (and any other partitionning library?) Thanks, Eric Thanks, Matt Thanks, Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? Laval (418) 656-2131 poste 41 22 42 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? Laval (418) 656-2131 poste 41 22 42 -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? Laval (418) 656-2131 poste 41 22 42 -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Fri Sep 24 09:08:01 2021 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Fri, 24 Sep 2021 16:08:01 +0200 Subject: [petsc-users] Petsc memory consumption keep increasing in my loop Message-ID: <48da1142-7ab9-e549-dc15-63a74dc093b2@univ-fcomte.fr> Hello, I have problem with a code i'am working on. 
To illustrate my problem, here is an example: int main(int argc, char *argv[]) { ??? PetscErrorCode ierr; ??? ierr = PetscInitialize(&argc, &argv, (char *)0, NULL); ??? if (ierr) ??????? return ierr; ??? int i = 0; ??? for (i = 0; i < 1; i++) ??? { ??????? Mat A; ??????? ierr = MatCreate(PETSC_COMM_WORLD, &A); ??????? CHKERRQ(ierr); ??????? ierr = MatSetSizes(A, 16, 16, PETSC_DECIDE,PETSC_DECIDE); ??????? CHKERRQ(ierr); ??????? ierr = MatSetFromOptions(A); ??????? CHKERRQ(ierr); ??????? ierr = MatSetUp(A); ??????? CHKERRQ(ierr); ??????? ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); ??????? CHKERRQ(ierr); ??????? ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); ??????? CHKERRQ(ierr); ??????? /// SOME CODE HERE.... ??????? MatDestroy(&A); ??? } ??? FILE *fPtr; ??? fPtr = fopen("petsc_dump_file.txt", "a"); ??? PetscMallocDump(fPtr); ??? fclose(fPtr); ??? ierr = PetscFinalize(); ??? CHKERRQ(ierr); ??? return 0; } The problem is , in the loop, the memory consumption keep increasing till the end of the program. I checked memory leak with PetscMallocDump, and found out that the problem may be due to matrix creation. I'am new to Petsc and i don't know if i'am doing something wrong. Thanks M?dane From bsmith at petsc.dev Fri Sep 24 09:13:59 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 24 Sep 2021 10:13:59 -0400 Subject: [petsc-users] Petsc memory consumption keep increasing in my loop In-Reply-To: <48da1142-7ab9-e549-dc15-63a74dc093b2@univ-fcomte.fr> References: <48da1142-7ab9-e549-dc15-63a74dc093b2@univ-fcomte.fr> Message-ID: <6D5C33D0-A02E-443F-8A7C-9ADBB89369CB@petsc.dev> The code you sent looks fine, it should not leak memory. Perhaps the /// SOME CODE HERE.... is doing something that prevents the matrix from being actually freed. PETSc uses reference counting on its objects so if another object keeps a reference to the matrix then the memory of the matrix will not be freed until the reference count drops back to zero. For example if a KSP has a reference to the matrix and the KSP has not been completely freed the matrix memory will remain. We would need to see the full code to understand why the matrix is not being freed. Barry > On Sep 24, 2021, at 10:08 AM, Medane TCHAKOROM wrote: > > Hello, > > I have problem with a code i'am working on. > > To illustrate my problem, here is an example: > > > int main(int argc, char *argv[]) > { > > PetscErrorCode ierr; > > ierr = PetscInitialize(&argc, &argv, (char *)0, NULL); > if (ierr) > return ierr; > > int i = 0; > for (i = 0; i < 1; i++) > { > Mat A; > ierr = MatCreate(PETSC_COMM_WORLD, &A); > CHKERRQ(ierr); > ierr = MatSetSizes(A, 16, 16, PETSC_DECIDE,PETSC_DECIDE); > CHKERRQ(ierr); > ierr = MatSetFromOptions(A); > CHKERRQ(ierr); > ierr = MatSetUp(A); > CHKERRQ(ierr); > > > ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); > CHKERRQ(ierr); > ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); > CHKERRQ(ierr); > > > > /// SOME CODE HERE.... > > MatDestroy(&A); > > > } > > > FILE *fPtr; > fPtr = fopen("petsc_dump_file.txt", "a"); > PetscMallocDump(fPtr); > fclose(fPtr); > > ierr = PetscFinalize(); > CHKERRQ(ierr); > > return 0; > } > > > > The problem is , in the loop, the memory consumption keep increasing till the end of the program. > > I checked memory leak with PetscMallocDump, and found out that the problem may be due to matrix creation. > > I'am new to Petsc and i don't know if i'am doing something wrong. 
Thanks > > > M?dane > From medane.tchakorom at univ-fcomte.fr Fri Sep 24 09:31:51 2021 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Fri, 24 Sep 2021 16:31:51 +0200 Subject: [petsc-users] Petsc memory consumption keep increasing in my loop In-Reply-To: <6D5C33D0-A02E-443F-8A7C-9ADBB89369CB@petsc.dev> References: <48da1142-7ab9-e549-dc15-63a74dc093b2@univ-fcomte.fr> <6D5C33D0-A02E-443F-8A7C-9ADBB89369CB@petsc.dev> Message-ID: <5bb0c397-4558-b6d9-8dd7-19226cb0009c@univ-fcomte.fr> Thanks Barry, I can't share the orginal code i'am working on unfortunately. But the example i wrote -- even if you do not that into account //SOME CODE HERE .. part -- give me , by using PetscMallocDump, some informations about memory that was not freed. Based on the example code i sent, i was expecting that PetscMallocDump give no output. M?dane On 24/09/2021 16:13, Barry Smith wrote: > The code you sent looks fine, it should not leak memory. > > Perhaps the /// SOME CODE HERE.... is doing something that prevents the matrix from being actually freed. PETSc uses reference counting on its objects so if another object keeps a reference to the matrix then the memory of the matrix will not be freed until the reference count drops back to zero. For example if a KSP has a reference to the matrix and the KSP has not been completely freed the matrix memory will remain. > > We would need to see the full code to understand why the matrix is not being freed. > > Barry > > >> On Sep 24, 2021, at 10:08 AM, Medane TCHAKOROM wrote: >> >> Hello, >> >> I have problem with a code i'am working on. >> >> To illustrate my problem, here is an example: >> >> >> int main(int argc, char *argv[]) >> { >> >> PetscErrorCode ierr; >> >> ierr = PetscInitialize(&argc, &argv, (char *)0, NULL); >> if (ierr) >> return ierr; >> >> int i = 0; >> for (i = 0; i < 1; i++) >> { >> Mat A; >> ierr = MatCreate(PETSC_COMM_WORLD, &A); >> CHKERRQ(ierr); >> ierr = MatSetSizes(A, 16, 16, PETSC_DECIDE,PETSC_DECIDE); >> CHKERRQ(ierr); >> ierr = MatSetFromOptions(A); >> CHKERRQ(ierr); >> ierr = MatSetUp(A); >> CHKERRQ(ierr); >> >> >> ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); >> CHKERRQ(ierr); >> ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); >> CHKERRQ(ierr); >> >> >> >> /// SOME CODE HERE.... >> >> MatDestroy(&A); >> >> >> } >> >> >> FILE *fPtr; >> fPtr = fopen("petsc_dump_file.txt", "a"); >> PetscMallocDump(fPtr); >> fclose(fPtr); >> >> ierr = PetscFinalize(); >> CHKERRQ(ierr); >> >> return 0; >> } >> >> >> >> The problem is , in the loop, the memory consumption keep increasing till the end of the program. >> >> I checked memory leak with PetscMallocDump, and found out that the problem may be due to matrix creation. >> >> I'am new to Petsc and i don't know if i'am doing something wrong. Thanks >> >> >> M?dane >> From bsmith at petsc.dev Fri Sep 24 10:21:23 2021 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 24 Sep 2021 11:21:23 -0400 Subject: [petsc-users] Petsc memory consumption keep increasing in my loop In-Reply-To: <5bb0c397-4558-b6d9-8dd7-19226cb0009c@univ-fcomte.fr> References: <48da1142-7ab9-e549-dc15-63a74dc093b2@univ-fcomte.fr> <6D5C33D0-A02E-443F-8A7C-9ADBB89369CB@petsc.dev> <5bb0c397-4558-b6d9-8dd7-19226cb0009c@univ-fcomte.fr> Message-ID: <3A611848-D4C2-475A-94B2-C8066CBE1C21@petsc.dev> Ahh, the stuff you are seeing is just memory associated with the initialization of the matrix package; it is not the matrix memory (that is all freed). 
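One quick way to see this in the loop itself (a sketch, not part of the original example; it relies on PETSc's malloc logging being active, e.g. a debug build or -malloc_debug) is to print the current allocation right after MatDestroy(); from the second iteration on the reported number should stay flat:

PetscLogDouble mem;
ierr = PetscMallocGetCurrentUsage(&mem);CHKERRQ(ierr);
ierr = PetscPrintf(PETSC_COMM_WORLD, "after iteration %d: %g bytes currently allocated by PetscMalloc\n", i, mem);CHKERRQ(ierr);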
This memory used in the initialization is used only once and will not grow with more matrices. If you run the program with -malloc_dump then you will see nothing is printed since the memory used in the initialization is freed in PetscFinalize(). Barry > On Sep 24, 2021, at 10:31 AM, Medane TCHAKOROM wrote: > > Thanks Barry, > > I can't share the orginal code i'am working on unfortunately. > > But the example i wrote -- even if you do not that into account //SOME CODE HERE .. part -- give me , by using PetscMallocDump, some informations about memory that was not freed. > > Based on the example code i sent, i was expecting that PetscMallocDump give no output. > > M?dane > > > On 24/09/2021 16:13, Barry Smith wrote: >> The code you sent looks fine, it should not leak memory. >> >> Perhaps the /// SOME CODE HERE.... is doing something that prevents the matrix from being actually freed. PETSc uses reference counting on its objects so if another object keeps a reference to the matrix then the memory of the matrix will not be freed until the reference count drops back to zero. For example if a KSP has a reference to the matrix and the KSP has not been completely freed the matrix memory will remain. >> >> We would need to see the full code to understand why the matrix is not being freed. >> >> Barry >> >> >>> On Sep 24, 2021, at 10:08 AM, Medane TCHAKOROM wrote: >>> >>> Hello, >>> >>> I have problem with a code i'am working on. >>> >>> To illustrate my problem, here is an example: >>> >>> >>> int main(int argc, char *argv[]) >>> { >>> >>> PetscErrorCode ierr; >>> >>> ierr = PetscInitialize(&argc, &argv, (char *)0, NULL); >>> if (ierr) >>> return ierr; >>> >>> int i = 0; >>> for (i = 0; i < 1; i++) >>> { >>> Mat A; >>> ierr = MatCreate(PETSC_COMM_WORLD, &A); >>> CHKERRQ(ierr); >>> ierr = MatSetSizes(A, 16, 16, PETSC_DECIDE,PETSC_DECIDE); >>> CHKERRQ(ierr); >>> ierr = MatSetFromOptions(A); >>> CHKERRQ(ierr); >>> ierr = MatSetUp(A); >>> CHKERRQ(ierr); >>> >>> >>> ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); >>> CHKERRQ(ierr); >>> ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); >>> CHKERRQ(ierr); >>> >>> >>> >>> /// SOME CODE HERE.... >>> >>> MatDestroy(&A); >>> >>> >>> } >>> >>> >>> FILE *fPtr; >>> fPtr = fopen("petsc_dump_file.txt", "a"); >>> PetscMallocDump(fPtr); >>> fclose(fPtr); >>> >>> ierr = PetscFinalize(); >>> CHKERRQ(ierr); >>> >>> return 0; >>> } >>> >>> >>> >>> The problem is , in the loop, the memory consumption keep increasing till the end of the program. >>> >>> I checked memory leak with PetscMallocDump, and found out that the problem may be due to matrix creation. >>> >>> I'am new to Petsc and i don't know if i'am doing something wrong. Thanks >>> >>> >>> M?dane >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Fri Sep 24 10:38:22 2021 From: medane.tchakorom at univ-fcomte.fr (Medane TCHAKOROM) Date: Fri, 24 Sep 2021 17:38:22 +0200 Subject: [petsc-users] Petsc memory consumption keep increasing in my loop In-Reply-To: <3A611848-D4C2-475A-94B2-C8066CBE1C21@petsc.dev> References: <48da1142-7ab9-e549-dc15-63a74dc093b2@univ-fcomte.fr> <6D5C33D0-A02E-443F-8A7C-9ADBB89369CB@petsc.dev> <5bb0c397-4558-b6d9-8dd7-19226cb0009c@univ-fcomte.fr> <3A611848-D4C2-475A-94B2-C8066CBE1C21@petsc.dev> Message-ID: <83ea52a4-6804-c0fe-79a3-9dc1c2245a64@univ-fcomte.fr> Thank you for the precision, now i can eliminate this case, and try to find where my bug is coming from. M?dane On 24/09/2021 17:21, Barry Smith wrote: > > ? 
Ahh, the stuff you are seeing is just memory associated with the > initialization of the matrix package; it is not the matrix memory > (that is all freed). This memory used in the initialization is used > only once and will not grow with more matrices. > > ? If you run the program with -malloc_dump?then you will see nothing > is printed since the memory used in the initialization is freed in > PetscFinalize(). > > ? Barry > > >> On Sep 24, 2021, at 10:31 AM, Medane TCHAKOROM >> > > wrote: >> >> Thanks Barry, >> >> I can't share the orginal code i'am working on unfortunately. >> >> But the example i wrote -- even if you do not that into account >> //SOME CODE HERE .. part -- give me , by using PetscMallocDump, some >> informations about memory that was not freed. >> >> Based on the example code i sent, i was expecting that >> PetscMallocDump give no output. >> >> M?dane >> >> >> On 24/09/2021 16:13, Barry Smith wrote: >>> ?The code you sent looks fine, it should not leak memory. >>> >>> ?Perhaps the /// SOME CODE HERE.... is doing something that prevents >>> the matrix from being actually freed. PETSc uses reference counting >>> on its objects so if another object keeps a reference to the matrix >>> then the memory of the matrix will not be freed until the reference >>> count drops back to zero. For example if a KSP has a reference to >>> the matrix and the KSP has not been completely freed the matrix >>> memory will remain. >>> >>> ??We would need to see the full code to understand why the matrix is >>> not being freed. >>> >>> ??Barry >>> >>> >>>> On Sep 24, 2021, at 10:08 AM, Medane TCHAKOROM >>>> >>> > wrote: >>>> >>>> Hello, >>>> >>>> I have problem with a code i'am working on. >>>> >>>> To illustrate my problem, here is an example: >>>> >>>> >>>> int main(int argc, char *argv[]) >>>> { >>>> >>>> ????PetscErrorCode ierr; >>>> >>>> ????ierr = PetscInitialize(&argc, &argv, (char *)0, NULL); >>>> ????if (ierr) >>>> ????????return ierr; >>>> >>>> ????int i = 0; >>>> ????for (i = 0; i < 1; i++) >>>> ????{ >>>> ????????Mat A; >>>> ????????ierr = MatCreate(PETSC_COMM_WORLD, &A); >>>> ????????CHKERRQ(ierr); >>>> ????????ierr = MatSetSizes(A, 16, 16, PETSC_DECIDE,PETSC_DECIDE); >>>> ????????CHKERRQ(ierr); >>>> ????????ierr = MatSetFromOptions(A); >>>> ????????CHKERRQ(ierr); >>>> ????????ierr = MatSetUp(A); >>>> ????????CHKERRQ(ierr); >>>> >>>> >>>> ????????ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); >>>> ????????CHKERRQ(ierr); >>>> ????????ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); >>>> ????????CHKERRQ(ierr); >>>> >>>> >>>> >>>> ????????/// SOME CODE HERE.... >>>> >>>> ????????MatDestroy(&A); >>>> >>>> >>>> ????} >>>> >>>> >>>> ????FILE *fPtr; >>>> ????fPtr = fopen("petsc_dump_file.txt", "a"); >>>> ????PetscMallocDump(fPtr); >>>> ????fclose(fPtr); >>>> >>>> ????ierr = PetscFinalize(); >>>> ????CHKERRQ(ierr); >>>> >>>> ????return 0; >>>> } >>>> >>>> >>>> >>>> The problem is , in the loop, the memory consumption keep >>>> increasing till the end of the program. >>>> >>>> I checked memory leak with PetscMallocDump, and found out that the >>>> problem may be due to matrix creation. >>>> >>>> I'am new to Petsc and i don't know if i'am doing something wrong. >>>> Thanks >>>> >>>> >>>> M?dane >>>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Fri Sep 24 11:14:18 2021 From: jroman at dsic.upv.es (Jose E. 
Roman) Date: Fri, 24 Sep 2021 18:14:18 +0200 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> Message-ID: <4BB88AB3-410E-493C-9161-97775747936D@dsic.upv.es> If you do $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 then it is using an LU factorization (the default), which is fast. Use -eps_view to see which solver settings are you using. BiCGStab with block Jacobi does not work for you matrix, it exceeds the maximum 10000 iterations. So this is not viable unless you can find a better preconditioner for your problem. If not, just using EPS_SMALLEST_MAGNITUDE will be faster. Computing smallest magnitude eigenvalues is a difficult task. The most robust way is to compute a (parallel) LU factorization if you can afford it. A side note: don't add this to your source code #define PETSC_USE_COMPLEX 1 This define is taken from PETSc's include files, you should not mess with it. Instead, you probably want to add something like this AFTER #include : #if !defined(PETSC_USE_COMPLEX) #error "Requires complex scalars" #endif Jose > El 22 sept 2021, a las 19:38, Varun Hiremath escribi?: > > Hi Jose, > > Thank you, that explains it and my example code works now without specifying "-eps_target 0" in the command line. > > However, both the Krylov inexact shift-invert and JD solvers are struggling to converge for some of my actual problems. The issue seems to be related to non-symmetric general matrices. I have extracted one such matrix attached here as MatA.gz (size 100k), and have also included a short program that loads this matrix and then computes the smallest eigenvalues as I described earlier. > > For this matrix, if I compute the eigenvalues directly (without using the shell matrix) using shift-and-invert (as below) then it converges in less than a minute. > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > > However, if I use the shell matrix and use any of the preconditioned solvers JD or Krylov shift-invert (as shown below) with the same matrix as the preconditioner, then they struggle to converge. > $ ./acoustic_matrix_test.o -usejd 1 -deflate 1 > $ ./acoustic_matrix_test.o -sinvert 1 -deflate 1 > > Could you please check the attached code and suggest any changes in settings that might help with convergence for these kinds of matrices? I appreciate your help! > > Thanks, > Varun > > On Tue, Sep 21, 2021 at 11:14 AM Jose E. Roman wrote: > I will have a look at your code when I have more time. Meanwhile, I am answering 3) below... > > > El 21 sept 2021, a las 0:23, Varun Hiremath escribi?: > > > > Hi Jose, > > > > Sorry, it took me a while to test these settings in the new builds. I am getting good improvement in performance using the preconditioned solvers, so thanks for the suggestions! But I have some questions related to the usage. > > > > We are using SLEPc to solve the acoustic modal eigenvalue problem. Attached is a simple standalone program that computes acoustic modes in a simple rectangular box. This program illustrates the general setup I am using, though here the shell matrix and the preconditioner matrix are the same, while in my actual program the shell matrix computes A*x without explicitly forming A, and the preconditioner is a 0th order approximation of A. 
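For reference, a rough source-code sketch of the robust configuration recommended above (shift-and-invert at target 0 with a direct inner solve, i.e. what -st_type sinvert -eps_target 0 does when the matrix is available explicitly). Only a sketch: eps and A are assumed to exist already, and the MUMPS option mentioned in the comment assumes that package is installed.

ST  st;
KSP ksp;
PC  pc;

ierr = EPSSetOperators(eps, A, NULL);CHKERRQ(ierr);                   /* pass B as third argument for a generalized problem */
ierr = EPSSetWhichEigenpairs(eps, EPS_TARGET_MAGNITUDE);CHKERRQ(ierr);
ierr = EPSSetTarget(eps, 0.0);CHKERRQ(ierr);
ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
ierr = STSetType(st, STSINVERT);CHKERRQ(ierr);
ierr = STGetKSP(st, &ksp);CHKERRQ(ierr);
ierr = KSPSetType(ksp, KSPPREONLY);CHKERRQ(ierr);
ierr = KSPGetPC(ksp, &pc);CHKERRQ(ierr);
ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);                             /* parallel LU e.g. with -st_pc_factor_mat_solver_type mumps */
ierr = EPSSetFromOptions(eps);CHKERRQ(ierr);
ierr = EPSSolve(eps);CHKERRQ(ierr);

Running with -eps_view, as suggested above, confirms which ST/KSP/PC combination is actually in effect.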
> > > > In the attached program I have tested both > > 1) the Krylov-Schur with inexact shift-and-invert (implemented under the option sinvert); > > 2) the JD solver with preconditioner (implemented under the option usejd) > > > > Both the solvers seem to work decently, compared to no preconditioning. This is how I run the two solvers (for a mesh size of 1600x400): > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 -eps_target 0 > > Both finish in about ~10 minutes on my system in serial. JD seems to be slightly faster and more accurate (for the imaginary part of eigenvalue). > > The program also runs in parallel using mpiexec. I use complex builds, as in my main program the matrix can be complex. > > > > Now here are my questions: > > 1) For this particular problem type, could you please check if these are the best settings that one could use? I have tried different combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI seems to work the best in serial and parallel. > > > > 2) When I tested these settings in my main program, for some reason the JD solver was not converging. After further testing, I found the issue was related to the setting of "-eps_target 0". I have included "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to passing "-eps_target 0" from the command line, but that doesn't seem to be the case. For instance, if I run the attached program without "-eps_target 0" in the command line then it doesn't converge. > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > > the above finishes in about 10 minutes > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is included in the code > > > > This only seems to affect the JD solver, not the Krylov shift-and-invert (-sinvert 1) option. So is there any difference between passing "-eps_target 0" from the command line vs using "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line arguments in my actual program, so need to set everything internally. > > > > 3) Also, another minor related issue. While using the inexact shift-and-invert option, I was running into the following error: > > > > "" > > Missing or incorrect user input > > Shift-and-invert requires a target 'which' (see EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 -eps_target_magnitude > > "" > > > > I already have the below two lines in the code: > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > EPSSetTarget(eps,0.0); > > > > so shouldn't these be enough? If I comment out the first line "EPSSetWhichEigenpairs", then the code works fine. > > You should either do > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > without shift-and-invert or > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > EPSSetTarget(eps,0.0); > > with shift-and-invert. The latter can also be used without shift-and-invert (e.g. in JD). > > I have to check, but a possible explanation why in your comment above (2) the command-line option -eps_target 0 works differently is that it also sets -eps_target_magnitude if omitted, so to be equivalent in source code you have to call both > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > EPSSetTarget(eps,0.0); > > Jose > > > I have some more questions regarding setting the preconditioner for a quadratic eigenvalue problem, which I will ask in a follow-up email. 
> > > > Thanks for your help! > > > > -Varun > > > > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath wrote: > > Thank you very much for these suggestions! We are currently using version 3.12, so I'll try to update to the latest version and try your suggestions. Let me get back to you, thanks! > > > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman wrote: > > Then I would try Davidson methods https://doi.org/10.1145/2543696 > > You can also try Krylov-Schur with "inexact" shift-and-invert, for instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the users manual. > > > > In both cases, you have to pass matrix A in the call to EPSSetOperators() and the preconditioner matrix via STSetPreconditionerMat() - note this function was introduced in version 3.15. > > > > Jose > > > > > > > > > El 1 jul 2021, a las 13:36, Varun Hiremath escribi?: > > > > > > Thanks. I actually do have a 1st order approximation of matrix A, that I can explicitly compute and also invert. Can I use that matrix as preconditioner to speed things up? Is there some example that explains how to setup and call SLEPc for this scenario? > > > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. Roman wrote: > > > For smallest real parts one could adapt ex34.c, but it is going to be costly https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html > > > Also, if eigenvalues are clustered around the origin, convergence may still be very slow. > > > > > > It is a tough problem, unless you are able to compute a good preconditioner of A (no need to compute the exact inverse). > > > > > > Jose > > > > > > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath escribi?: > > > > > > > > I'm solving for the smallest eigenvalues in magnitude. Though is it cheaper to solve smallest in real part, as that might also work in my case? Thanks for your help. > > > > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman wrote: > > > > Smallest eigenvalue in magnitude or real part? > > > > > > > > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath escribi?: > > > > > > > > > > Sorry, no both A and B are general sparse matrices (non-hermitian). So is there anything else I could try? > > > > > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman wrote: > > > > > Is the problem symmetric (GHEP)? In that case, you can try LOBPCG on the pair (A,B). But this will likely be slow as well, unless you can provide a good preconditioner. > > > > > > > > > > Jose > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath escribi?: > > > > > > > > > > > > Hi All, > > > > > > > > > > > > I am trying to compute the smallest eigenvalues of a generalized system A*x= lambda*B*x. I don't explicitly know the matrix A (so I am using a shell matrix with a custom matmult function) however, the matrix B is explicitly known so I compute inv(B)*A within the shell matrix and solve inv(B)*A*x = lambda*x. > > > > > > > > > > > > To compute the smallest eigenvalues it is recommended to solve the inverted system, but since matrix A is not explicitly known I can't invert the system. Moreover, the size of the system can be really big, and with the default Krylov solver, it is extremely slow. So is there a better way for me to compute the smallest eigenvalues of this system? 
> > > > > > > > > > > > Thanks, > > > > > > Varun > > > > > > > > > > > > > > > > > > From varunhiremath at gmail.com Sat Sep 25 01:07:55 2021 From: varunhiremath at gmail.com (Varun Hiremath) Date: Fri, 24 Sep 2021 23:07:55 -0700 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: <4BB88AB3-410E-493C-9161-97775747936D@dsic.upv.es> References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> <4BB88AB3-410E-493C-9161-97775747936D@dsic.upv.es> Message-ID: Hi Jose, Thanks for checking my code and providing suggestions. In my particular case, I don't know the matrix A explicitly, I compute A*x in a matrix-free way within a shell matrix, so I can't use any of the direct factorization methods. But just a question regarding your suggestion to compute a (parallel) LU factorization. In our work, we do use MUMPS to compute the parallel factorization. For solving the generalized problem, A*x = lambda*B*x, we are computing inv(B)*A*x within a shell matrix, where factorization of B is computed using MUMPS. (We don't call MUMPS through SLEPc as we have our own MPI wrapper and other user settings to handle.) So for the preconditioning, instead of using the iterative solvers, can I provide a shell matrix that computes inv(P)*x corrections (where P is the preconditioner matrix) using MUMPS direct solver? And yes, thanks, #define PETSC_USE_COMPLEX 1 is not needed, it works without it. Regards, Varun On Fri, Sep 24, 2021 at 9:14 AM Jose E. Roman wrote: > If you do > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > then it is using an LU factorization (the default), which is fast. > > Use -eps_view to see which solver settings are you using. > > BiCGStab with block Jacobi does not work for you matrix, it exceeds the > maximum 10000 iterations. So this is not viable unless you can find a > better preconditioner for your problem. If not, just using > EPS_SMALLEST_MAGNITUDE will be faster. > > Computing smallest magnitude eigenvalues is a difficult task. The most > robust way is to compute a (parallel) LU factorization if you can afford it. > > > A side note: don't add this to your source code > #define PETSC_USE_COMPLEX 1 > This define is taken from PETSc's include files, you should not mess with > it. Instead, you probably want to add something like this AFTER #include > : > #if !defined(PETSC_USE_COMPLEX) > #error "Requires complex scalars" > #endif > > Jose > > > > El 22 sept 2021, a las 19:38, Varun Hiremath > escribi?: > > > > Hi Jose, > > > > Thank you, that explains it and my example code works now without > specifying "-eps_target 0" in the command line. > > > > However, both the Krylov inexact shift-invert and JD solvers are > struggling to converge for some of my actual problems. The issue seems to > be related to non-symmetric general matrices. I have extracted one such > matrix attached here as MatA.gz (size 100k), and have also included a short > program that loads this matrix and then computes the smallest eigenvalues > as I described earlier. > > > > For this matrix, if I compute the eigenvalues directly (without using > the shell matrix) using shift-and-invert (as below) then it converges in > less than a minute. 
> > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > > > > However, if I use the shell matrix and use any of the preconditioned > solvers JD or Krylov shift-invert (as shown below) with the same matrix as > the preconditioner, then they struggle to converge. > > $ ./acoustic_matrix_test.o -usejd 1 -deflate 1 > > $ ./acoustic_matrix_test.o -sinvert 1 -deflate 1 > > > > Could you please check the attached code and suggest any changes in > settings that might help with convergence for these kinds of matrices? I > appreciate your help! > > > > Thanks, > > Varun > > > > On Tue, Sep 21, 2021 at 11:14 AM Jose E. Roman > wrote: > > I will have a look at your code when I have more time. Meanwhile, I am > answering 3) below... > > > > > El 21 sept 2021, a las 0:23, Varun Hiremath > escribi?: > > > > > > Hi Jose, > > > > > > Sorry, it took me a while to test these settings in the new builds. I > am getting good improvement in performance using the preconditioned > solvers, so thanks for the suggestions! But I have some questions related > to the usage. > > > > > > We are using SLEPc to solve the acoustic modal eigenvalue problem. > Attached is a simple standalone program that computes acoustic modes in a > simple rectangular box. This program illustrates the general setup I am > using, though here the shell matrix and the preconditioner matrix are the > same, while in my actual program the shell matrix computes A*x without > explicitly forming A, and the preconditioner is a 0th order approximation > of A. > > > > > > In the attached program I have tested both > > > 1) the Krylov-Schur with inexact shift-and-invert (implemented under > the option sinvert); > > > 2) the JD solver with preconditioner (implemented under the option > usejd) > > > > > > Both the solvers seem to work decently, compared to no > preconditioning. This is how I run the two solvers (for a mesh size of > 1600x400): > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > -eps_target 0 > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 > -eps_target 0 > > > Both finish in about ~10 minutes on my system in serial. JD seems to > be slightly faster and more accurate (for the imaginary part of eigenvalue). > > > The program also runs in parallel using mpiexec. I use complex builds, > as in my main program the matrix can be complex. > > > > > > Now here are my questions: > > > 1) For this particular problem type, could you please check if these > are the best settings that one could use? I have tried different > combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI > seems to work the best in serial and parallel. > > > > > > 2) When I tested these settings in my main program, for some reason > the JD solver was not converging. After further testing, I found the issue > was related to the setting of "-eps_target 0". I have included > "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to > passing "-eps_target 0" from the command line, but that doesn't seem to be > the case. For instance, if I run the attached program without "-eps_target > 0" in the command line then it doesn't converge. 
> > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > -eps_target 0 > > > the above finishes in about 10 minutes > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > > > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is > included in the code > > > > > > This only seems to affect the JD solver, not the Krylov > shift-and-invert (-sinvert 1) option. So is there any difference between > passing "-eps_target 0" from the command line vs using > "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line > arguments in my actual program, so need to set everything internally. > > > > > > 3) Also, another minor related issue. While using the inexact > shift-and-invert option, I was running into the following error: > > > > > > "" > > > Missing or incorrect user input > > > Shift-and-invert requires a target 'which' (see > EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 > -eps_target_magnitude > > > "" > > > > > > I already have the below two lines in the code: > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > EPSSetTarget(eps,0.0); > > > > > > so shouldn't these be enough? If I comment out the first line > "EPSSetWhichEigenpairs", then the code works fine. > > > > You should either do > > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > > without shift-and-invert or > > > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > EPSSetTarget(eps,0.0); > > > > with shift-and-invert. The latter can also be used without > shift-and-invert (e.g. in JD). > > > > I have to check, but a possible explanation why in your comment above > (2) the command-line option -eps_target 0 works differently is that it also > sets -eps_target_magnitude if omitted, so to be equivalent in source code > you have to call both > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > EPSSetTarget(eps,0.0); > > > > Jose > > > > > I have some more questions regarding setting the preconditioner for a > quadratic eigenvalue problem, which I will ask in a follow-up email. > > > > > > Thanks for your help! > > > > > > -Varun > > > > > > > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath > wrote: > > > Thank you very much for these suggestions! We are currently using > version 3.12, so I'll try to update to the latest version and try your > suggestions. Let me get back to you, thanks! > > > > > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman wrote: > > > Then I would try Davidson methods https://doi.org/10.1145/2543696 > > > You can also try Krylov-Schur with "inexact" shift-and-invert, for > instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the > users manual. > > > > > > In both cases, you have to pass matrix A in the call to > EPSSetOperators() and the preconditioner matrix via > STSetPreconditionerMat() - note this function was introduced in version > 3.15. > > > > > > Jose > > > > > > > > > > > > > El 1 jul 2021, a las 13:36, Varun Hiremath > escribi?: > > > > > > > > Thanks. I actually do have a 1st order approximation of matrix A, > that I can explicitly compute and also invert. Can I use that matrix as > preconditioner to speed things up? Is there some example that explains how > to setup and call SLEPc for this scenario? > > > > > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. 
Roman > wrote: > > > > For smallest real parts one could adapt ex34.c, but it is going to > be costly > https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html > > > > Also, if eigenvalues are clustered around the origin, convergence > may still be very slow. > > > > > > > > It is a tough problem, unless you are able to compute a good > preconditioner of A (no need to compute the exact inverse). > > > > > > > > Jose > > > > > > > > > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > I'm solving for the smallest eigenvalues in magnitude. Though is > it cheaper to solve smallest in real part, as that might also work in my > case? Thanks for your help. > > > > > > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman > wrote: > > > > > Smallest eigenvalue in magnitude or real part? > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > > > Sorry, no both A and B are general sparse matrices > (non-hermitian). So is there anything else I could try? > > > > > > > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman > wrote: > > > > > > Is the problem symmetric (GHEP)? In that case, you can try > LOBPCG on the pair (A,B). But this will likely be slow as well, unless you > can provide a good preconditioner. > > > > > > > > > > > > Jose > > > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > I am trying to compute the smallest eigenvalues of a > generalized system A*x= lambda*B*x. I don't explicitly know the matrix A > (so I am using a shell matrix with a custom matmult function) however, the > matrix B is explicitly known so I compute inv(B)*A within the shell matrix > and solve inv(B)*A*x = lambda*x. > > > > > > > > > > > > > > To compute the smallest eigenvalues it is recommended to solve > the inverted system, but since matrix A is not explicitly known I can't > invert the system. Moreover, the size of the system can be really big, and > with the default Krylov solver, it is extremely slow. So is there a better > way for me to compute the smallest eigenvalues of this system? > > > > > > > > > > > > > > Thanks, > > > > > > > Varun > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Sat Sep 25 01:12:16 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Sat, 25 Sep 2021 08:12:16 +0200 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> <4BB88AB3-410E-493C-9161-97775747936D@dsic.upv.es> Message-ID: <32B34038-7E1A-42CA-A55D-9AF9D41D1697@dsic.upv.es> Yes, you can use PCMAT https://petsc.org/release/docs/manualpages/PC/PCMAT.html then pass a preconditioner matrix that performs the inverse via a shell matrix. > El 25 sept 2021, a las 8:07, Varun Hiremath escribi?: > > Hi Jose, > > Thanks for checking my code and providing suggestions. > > In my particular case, I don't know the matrix A explicitly, I compute A*x in a matrix-free way within a shell matrix, so I can't use any of the direct factorization methods. 
But just a question regarding your suggestion to compute a (parallel) LU factorization. In our work, we do use MUMPS to compute the parallel factorization. For solving the generalized problem, A*x = lambda*B*x, we are computing inv(B)*A*x within a shell matrix, where factorization of B is computed using MUMPS. (We don't call MUMPS through SLEPc as we have our own MPI wrapper and other user settings to handle.) > > So for the preconditioning, instead of using the iterative solvers, can I provide a shell matrix that computes inv(P)*x corrections (where P is the preconditioner matrix) using MUMPS direct solver? > > And yes, thanks, #define PETSC_USE_COMPLEX 1 is not needed, it works without it. > > Regards, > Varun > > On Fri, Sep 24, 2021 at 9:14 AM Jose E. Roman wrote: > If you do > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > then it is using an LU factorization (the default), which is fast. > > Use -eps_view to see which solver settings are you using. > > BiCGStab with block Jacobi does not work for you matrix, it exceeds the maximum 10000 iterations. So this is not viable unless you can find a better preconditioner for your problem. If not, just using EPS_SMALLEST_MAGNITUDE will be faster. > > Computing smallest magnitude eigenvalues is a difficult task. The most robust way is to compute a (parallel) LU factorization if you can afford it. > > > A side note: don't add this to your source code > #define PETSC_USE_COMPLEX 1 > This define is taken from PETSc's include files, you should not mess with it. Instead, you probably want to add something like this AFTER #include : > #if !defined(PETSC_USE_COMPLEX) > #error "Requires complex scalars" > #endif > > Jose > > > > El 22 sept 2021, a las 19:38, Varun Hiremath escribi?: > > > > Hi Jose, > > > > Thank you, that explains it and my example code works now without specifying "-eps_target 0" in the command line. > > > > However, both the Krylov inexact shift-invert and JD solvers are struggling to converge for some of my actual problems. The issue seems to be related to non-symmetric general matrices. I have extracted one such matrix attached here as MatA.gz (size 100k), and have also included a short program that loads this matrix and then computes the smallest eigenvalues as I described earlier. > > > > For this matrix, if I compute the eigenvalues directly (without using the shell matrix) using shift-and-invert (as below) then it converges in less than a minute. > > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > > > > However, if I use the shell matrix and use any of the preconditioned solvers JD or Krylov shift-invert (as shown below) with the same matrix as the preconditioner, then they struggle to converge. > > $ ./acoustic_matrix_test.o -usejd 1 -deflate 1 > > $ ./acoustic_matrix_test.o -sinvert 1 -deflate 1 > > > > Could you please check the attached code and suggest any changes in settings that might help with convergence for these kinds of matrices? I appreciate your help! > > > > Thanks, > > Varun > > > > On Tue, Sep 21, 2021 at 11:14 AM Jose E. Roman wrote: > > I will have a look at your code when I have more time. Meanwhile, I am answering 3) below... > > > > > El 21 sept 2021, a las 0:23, Varun Hiremath escribi?: > > > > > > Hi Jose, > > > > > > Sorry, it took me a while to test these settings in the new builds. I am getting good improvement in performance using the preconditioned solvers, so thanks for the suggestions! But I have some questions related to the usage. 
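The PCMAT route suggested above -- wrap the application of inv(P) in a shell matrix and let the preconditioner be a plain MatMult with that shell -- might be sketched as follows. Here a PETSc LU factorization stands in for the external MUMPS wrapper mentioned in the thread, and the names (PrecCtx, AttachShellPrec) are illustrative assumptions rather than the poster's actual code:

#include <slepceps.h>

typedef struct {
  KSP innersolve;   /* direct solve with P (stand-in for an external MUMPS solve) */
} PrecCtx;

/* MatMult of the shell "preconditioner matrix": y = inv(P)*x */
static PetscErrorCode MatMult_ApplyPinv(Mat M, Vec x, Vec y)
{
  PrecCtx        *ctx;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(M, &ctx);CHKERRQ(ierr);
  ierr = KSPSolve(ctx->innersolve, x, y);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Build the shell inverse of P and attach it to the EPS/ST with PCMAT
   (cleanup of ctx and Pinv is omitted here for brevity). */
PetscErrorCode AttachShellPrec(EPS eps, Mat P)
{
  PrecCtx        *ctx;
  PC             pc;
  KSP            stksp;
  ST             st;
  Mat            Pinv;
  PetscInt       m, n, M, N;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscNew(&ctx);CHKERRQ(ierr);
  ierr = KSPCreate(PetscObjectComm((PetscObject)P), &ctx->innersolve);CHKERRQ(ierr);
  ierr = KSPSetOperators(ctx->innersolve, P, P);CHKERRQ(ierr);
  ierr = KSPSetType(ctx->innersolve, KSPPREONLY);CHKERRQ(ierr);
  ierr = KSPGetPC(ctx->innersolve, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCLU);CHKERRQ(ierr);   /* MUMPS could be selected via PCFactorSetMatSolverType */

  ierr = MatGetLocalSize(P, &m, &n);CHKERRQ(ierr);
  ierr = MatGetSize(P, &M, &N);CHKERRQ(ierr);
  ierr = MatCreateShell(PetscObjectComm((PetscObject)P), m, n, M, N, ctx, &Pinv);CHKERRQ(ierr);
  ierr = MatShellSetOperation(Pinv, MATOP_MULT, (void (*)(void))MatMult_ApplyPinv);CHKERRQ(ierr);

  /* Use Pinv as the preconditioner matrix and apply it with PCMAT, i.e. the
     "preconditioner application" is just a MatMult with this shell. */
  ierr = EPSGetST(eps, &st);CHKERRQ(ierr);
  ierr = STSetPreconditionerMat(st, Pinv);CHKERRQ(ierr);
  ierr = STGetKSP(st, &stksp);CHKERRQ(ierr);
  ierr = KSPGetPC(stksp, &pc);CHKERRQ(ierr);
  ierr = PCSetType(pc, PCMAT);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}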
> > > > > > We are using SLEPc to solve the acoustic modal eigenvalue problem. Attached is a simple standalone program that computes acoustic modes in a simple rectangular box. This program illustrates the general setup I am using, though here the shell matrix and the preconditioner matrix are the same, while in my actual program the shell matrix computes A*x without explicitly forming A, and the preconditioner is a 0th order approximation of A. > > > > > > In the attached program I have tested both > > > 1) the Krylov-Schur with inexact shift-and-invert (implemented under the option sinvert); > > > 2) the JD solver with preconditioner (implemented under the option usejd) > > > > > > Both the solvers seem to work decently, compared to no preconditioning. This is how I run the two solvers (for a mesh size of 1600x400): > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 -eps_target 0 > > > Both finish in about ~10 minutes on my system in serial. JD seems to be slightly faster and more accurate (for the imaginary part of eigenvalue). > > > The program also runs in parallel using mpiexec. I use complex builds, as in my main program the matrix can be complex. > > > > > > Now here are my questions: > > > 1) For this particular problem type, could you please check if these are the best settings that one could use? I have tried different combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI seems to work the best in serial and parallel. > > > > > > 2) When I tested these settings in my main program, for some reason the JD solver was not converging. After further testing, I found the issue was related to the setting of "-eps_target 0". I have included "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to passing "-eps_target 0" from the command line, but that doesn't seem to be the case. For instance, if I run the attached program without "-eps_target 0" in the command line then it doesn't converge. > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > > > the above finishes in about 10 minutes > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > > > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is included in the code > > > > > > This only seems to affect the JD solver, not the Krylov shift-and-invert (-sinvert 1) option. So is there any difference between passing "-eps_target 0" from the command line vs using "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line arguments in my actual program, so need to set everything internally. > > > > > > 3) Also, another minor related issue. While using the inexact shift-and-invert option, I was running into the following error: > > > > > > "" > > > Missing or incorrect user input > > > Shift-and-invert requires a target 'which' (see EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 -eps_target_magnitude > > > "" > > > > > > I already have the below two lines in the code: > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > EPSSetTarget(eps,0.0); > > > > > > so shouldn't these be enough? If I comment out the first line "EPSSetWhichEigenpairs", then the code works fine. > > > > You should either do > > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > > without shift-and-invert or > > > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > EPSSetTarget(eps,0.0); > > > > with shift-and-invert. 
The latter can also be used without shift-and-invert (e.g. in JD). > > > > I have to check, but a possible explanation why in your comment above (2) the command-line option -eps_target 0 works differently is that it also sets -eps_target_magnitude if omitted, so to be equivalent in source code you have to call both > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > EPSSetTarget(eps,0.0); > > > > Jose > > > > > I have some more questions regarding setting the preconditioner for a quadratic eigenvalue problem, which I will ask in a follow-up email. > > > > > > Thanks for your help! > > > > > > -Varun > > > > > > > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath wrote: > > > Thank you very much for these suggestions! We are currently using version 3.12, so I'll try to update to the latest version and try your suggestions. Let me get back to you, thanks! > > > > > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman wrote: > > > Then I would try Davidson methods https://doi.org/10.1145/2543696 > > > You can also try Krylov-Schur with "inexact" shift-and-invert, for instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the users manual. > > > > > > In both cases, you have to pass matrix A in the call to EPSSetOperators() and the preconditioner matrix via STSetPreconditionerMat() - note this function was introduced in version 3.15. > > > > > > Jose > > > > > > > > > > > > > El 1 jul 2021, a las 13:36, Varun Hiremath escribi?: > > > > > > > > Thanks. I actually do have a 1st order approximation of matrix A, that I can explicitly compute and also invert. Can I use that matrix as preconditioner to speed things up? Is there some example that explains how to setup and call SLEPc for this scenario? > > > > > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. Roman wrote: > > > > For smallest real parts one could adapt ex34.c, but it is going to be costly https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html > > > > Also, if eigenvalues are clustered around the origin, convergence may still be very slow. > > > > > > > > It is a tough problem, unless you are able to compute a good preconditioner of A (no need to compute the exact inverse). > > > > > > > > Jose > > > > > > > > > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath escribi?: > > > > > > > > > > I'm solving for the smallest eigenvalues in magnitude. Though is it cheaper to solve smallest in real part, as that might also work in my case? Thanks for your help. > > > > > > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman wrote: > > > > > Smallest eigenvalue in magnitude or real part? > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath escribi?: > > > > > > > > > > > > Sorry, no both A and B are general sparse matrices (non-hermitian). So is there anything else I could try? > > > > > > > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman wrote: > > > > > > Is the problem symmetric (GHEP)? In that case, you can try LOBPCG on the pair (A,B). But this will likely be slow as well, unless you can provide a good preconditioner. > > > > > > > > > > > > Jose > > > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath escribi?: > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > I am trying to compute the smallest eigenvalues of a generalized system A*x= lambda*B*x. 
I don't explicitly know the matrix A (so I am using a shell matrix with a custom matmult function) however, the matrix B is explicitly known so I compute inv(B)*A within the shell matrix and solve inv(B)*A*x = lambda*x. > > > > > > > > > > > > > > To compute the smallest eigenvalues it is recommended to solve the inverted system, but since matrix A is not explicitly known I can't invert the system. Moreover, the size of the system can be really big, and with the default Krylov solver, it is extremely slow. So is there a better way for me to compute the smallest eigenvalues of this system? > > > > > > > > > > > > > > Thanks, > > > > > > > Varun > > > > > > > > > > > > > > > > > > > > > > > > > > From varunhiremath at gmail.com Sat Sep 25 01:50:38 2021 From: varunhiremath at gmail.com (Varun Hiremath) Date: Fri, 24 Sep 2021 23:50:38 -0700 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: <32B34038-7E1A-42CA-A55D-9AF9D41D1697@dsic.upv.es> References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> <4BB88AB3-410E-493C-9161-97775747936D@dsic.upv.es> <32B34038-7E1A-42CA-A55D-9AF9D41D1697@dsic.upv.es> Message-ID: Ok, great! I will give that a try, thanks for your help! On Fri, Sep 24, 2021 at 11:12 PM Jose E. Roman wrote: > Yes, you can use PCMAT > https://petsc.org/release/docs/manualpages/PC/PCMAT.html then pass a > preconditioner matrix that performs the inverse via a shell matrix. > > > El 25 sept 2021, a las 8:07, Varun Hiremath > escribi?: > > > > Hi Jose, > > > > Thanks for checking my code and providing suggestions. > > > > In my particular case, I don't know the matrix A explicitly, I compute > A*x in a matrix-free way within a shell matrix, so I can't use any of the > direct factorization methods. But just a question regarding your suggestion > to compute a (parallel) LU factorization. In our work, we do use MUMPS to > compute the parallel factorization. For solving the generalized problem, > A*x = lambda*B*x, we are computing inv(B)*A*x within a shell matrix, where > factorization of B is computed using MUMPS. (We don't call MUMPS through > SLEPc as we have our own MPI wrapper and other user settings to handle.) > > > > So for the preconditioning, instead of using the iterative solvers, can > I provide a shell matrix that computes inv(P)*x corrections (where P is the > preconditioner matrix) using MUMPS direct solver? > > > > And yes, thanks, #define PETSC_USE_COMPLEX 1 is not needed, it works > without it. > > > > Regards, > > Varun > > > > On Fri, Sep 24, 2021 at 9:14 AM Jose E. Roman > wrote: > > If you do > > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > > then it is using an LU factorization (the default), which is fast. > > > > Use -eps_view to see which solver settings are you using. > > > > BiCGStab with block Jacobi does not work for you matrix, it exceeds the > maximum 10000 iterations. So this is not viable unless you can find a > better preconditioner for your problem. If not, just using > EPS_SMALLEST_MAGNITUDE will be faster. > > > > Computing smallest magnitude eigenvalues is a difficult task. The most > robust way is to compute a (parallel) LU factorization if you can afford it. > > > > > > A side note: don't add this to your source code > > #define PETSC_USE_COMPLEX 1 > > This define is taken from PETSc's include files, you should not mess > with it. 
Instead, you probably want to add something like this AFTER > #include : > > #if !defined(PETSC_USE_COMPLEX) > > #error "Requires complex scalars" > > #endif > > > > Jose > > > > > > > El 22 sept 2021, a las 19:38, Varun Hiremath > escribi?: > > > > > > Hi Jose, > > > > > > Thank you, that explains it and my example code works now without > specifying "-eps_target 0" in the command line. > > > > > > However, both the Krylov inexact shift-invert and JD solvers are > struggling to converge for some of my actual problems. The issue seems to > be related to non-symmetric general matrices. I have extracted one such > matrix attached here as MatA.gz (size 100k), and have also included a short > program that loads this matrix and then computes the smallest eigenvalues > as I described earlier. > > > > > > For this matrix, if I compute the eigenvalues directly (without using > the shell matrix) using shift-and-invert (as below) then it converges in > less than a minute. > > > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > > > > > > However, if I use the shell matrix and use any of the preconditioned > solvers JD or Krylov shift-invert (as shown below) with the same matrix as > the preconditioner, then they struggle to converge. > > > $ ./acoustic_matrix_test.o -usejd 1 -deflate 1 > > > $ ./acoustic_matrix_test.o -sinvert 1 -deflate 1 > > > > > > Could you please check the attached code and suggest any changes in > settings that might help with convergence for these kinds of matrices? I > appreciate your help! > > > > > > Thanks, > > > Varun > > > > > > On Tue, Sep 21, 2021 at 11:14 AM Jose E. Roman > wrote: > > > I will have a look at your code when I have more time. Meanwhile, I am > answering 3) below... > > > > > > > El 21 sept 2021, a las 0:23, Varun Hiremath > escribi?: > > > > > > > > Hi Jose, > > > > > > > > Sorry, it took me a while to test these settings in the new builds. > I am getting good improvement in performance using the preconditioned > solvers, so thanks for the suggestions! But I have some questions related > to the usage. > > > > > > > > We are using SLEPc to solve the acoustic modal eigenvalue problem. > Attached is a simple standalone program that computes acoustic modes in a > simple rectangular box. This program illustrates the general setup I am > using, though here the shell matrix and the preconditioner matrix are the > same, while in my actual program the shell matrix computes A*x without > explicitly forming A, and the preconditioner is a 0th order approximation > of A. > > > > > > > > In the attached program I have tested both > > > > 1) the Krylov-Schur with inexact shift-and-invert (implemented under > the option sinvert); > > > > 2) the JD solver with preconditioner (implemented under the option > usejd) > > > > > > > > Both the solvers seem to work decently, compared to no > preconditioning. This is how I run the two solvers (for a mesh size of > 1600x400): > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > -eps_target 0 > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 > -eps_target 0 > > > > Both finish in about ~10 minutes on my system in serial. JD seems to > be slightly faster and more accurate (for the imaginary part of eigenvalue). > > > > The program also runs in parallel using mpiexec. I use complex > builds, as in my main program the matrix can be complex. 
> > > > > > > > Now here are my questions: > > > > 1) For this particular problem type, could you please check if these > are the best settings that one could use? I have tried different > combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI > seems to work the best in serial and parallel. > > > > > > > > 2) When I tested these settings in my main program, for some reason > the JD solver was not converging. After further testing, I found the issue > was related to the setting of "-eps_target 0". I have included > "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to > passing "-eps_target 0" from the command line, but that doesn't seem to be > the case. For instance, if I run the attached program without "-eps_target > 0" in the command line then it doesn't converge. > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > -eps_target 0 > > > > the above finishes in about 10 minutes > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > > > > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is > included in the code > > > > > > > > This only seems to affect the JD solver, not the Krylov > shift-and-invert (-sinvert 1) option. So is there any difference between > passing "-eps_target 0" from the command line vs using > "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line > arguments in my actual program, so need to set everything internally. > > > > > > > > 3) Also, another minor related issue. While using the inexact > shift-and-invert option, I was running into the following error: > > > > > > > > "" > > > > Missing or incorrect user input > > > > Shift-and-invert requires a target 'which' (see > EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 > -eps_target_magnitude > > > > "" > > > > > > > > I already have the below two lines in the code: > > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > > EPSSetTarget(eps,0.0); > > > > > > > > so shouldn't these be enough? If I comment out the first line > "EPSSetWhichEigenpairs", then the code works fine. > > > > > > You should either do > > > > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > > > > without shift-and-invert or > > > > > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > > EPSSetTarget(eps,0.0); > > > > > > with shift-and-invert. The latter can also be used without > shift-and-invert (e.g. in JD). > > > > > > I have to check, but a possible explanation why in your comment above > (2) the command-line option -eps_target 0 works differently is that it also > sets -eps_target_magnitude if omitted, so to be equivalent in source code > you have to call both > > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > > EPSSetTarget(eps,0.0); > > > > > > Jose > > > > > > > I have some more questions regarding setting the preconditioner for > a quadratic eigenvalue problem, which I will ask in a follow-up email. > > > > > > > > Thanks for your help! > > > > > > > > -Varun > > > > > > > > > > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath < > varunhiremath at gmail.com> wrote: > > > > Thank you very much for these suggestions! We are currently using > version 3.12, so I'll try to update to the latest version and try your > suggestions. Let me get back to you, thanks! > > > > > > > > On Thu, Jul 1, 2021, 4:45 AM Jose E. 
Roman > wrote: > > > > Then I would try Davidson methods https://doi.org/10.1145/2543696 > > > > You can also try Krylov-Schur with "inexact" shift-and-invert, for > instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the > users manual. > > > > > > > > In both cases, you have to pass matrix A in the call to > EPSSetOperators() and the preconditioner matrix via > STSetPreconditionerMat() - note this function was introduced in version > 3.15. > > > > > > > > Jose > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 13:36, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > Thanks. I actually do have a 1st order approximation of matrix A, > that I can explicitly compute and also invert. Can I use that matrix as > preconditioner to speed things up? Is there some example that explains how > to setup and call SLEPc for this scenario? > > > > > > > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. Roman > wrote: > > > > > For smallest real parts one could adapt ex34.c, but it is going to > be costly > https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html > > > > > Also, if eigenvalues are clustered around the origin, convergence > may still be very slow. > > > > > > > > > > It is a tough problem, unless you are able to compute a good > preconditioner of A (no need to compute the exact inverse). > > > > > > > > > > Jose > > > > > > > > > > > > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > > > I'm solving for the smallest eigenvalues in magnitude. Though is > it cheaper to solve smallest in real part, as that might also work in my > case? Thanks for your help. > > > > > > > > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman > wrote: > > > > > > Smallest eigenvalue in magnitude or real part? > > > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > > > > > Sorry, no both A and B are general sparse matrices > (non-hermitian). So is there anything else I could try? > > > > > > > > > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman < > jroman at dsic.upv.es> wrote: > > > > > > > Is the problem symmetric (GHEP)? In that case, you can try > LOBPCG on the pair (A,B). But this will likely be slow as well, unless you > can provide a good preconditioner. > > > > > > > > > > > > > > Jose > > > > > > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath < > varunhiremath at gmail.com> escribi?: > > > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > > > I am trying to compute the smallest eigenvalues of a > generalized system A*x= lambda*B*x. I don't explicitly know the matrix A > (so I am using a shell matrix with a custom matmult function) however, the > matrix B is explicitly known so I compute inv(B)*A within the shell matrix > and solve inv(B)*A*x = lambda*x. > > > > > > > > > > > > > > > > To compute the smallest eigenvalues it is recommended to > solve the inverted system, but since matrix A is not explicitly known I > can't invert the system. Moreover, the size of the system can be really > big, and with the default Krylov solver, it is extremely slow. So is there > a better way for me to compute the smallest eigenvalues of this system? > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Varun > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From nathan.wukie at us.af.mil Mon Sep 27 10:59:03 2021 From: nathan.wukie at us.af.mil (WUKIE, NATHAN A DR-02 USAF AFMC AFRL/RQVC) Date: Mon, 27 Sep 2021 15:59:03 +0000 Subject: [petsc-users] Interaction between petsc4py and application Fortran library Message-ID: How should petsc initialization be handled for a python application utilizing petsc4py and a Fortran library application also using petsc? The petsc documentation states that PetscInitializeFortran "should be called soon AFTER the call to PetscInitialize() if one is using a C main program that calls Fortran routines that in turn call PETSc routines". Does petsc4py.init(...) call PetscInitializeFortran? Is it permissible to call PetscInitializeFortran from the fortran library application itself? Or must PetscInitializeFortran be called from the C main program? Thank you, Nathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Sep 27 12:54:46 2021 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 27 Sep 2021 13:54:46 -0400 Subject: [petsc-users] Interaction between petsc4py and application Fortran library In-Reply-To: References: Message-ID: <8F63EBC4-7FF8-4388-92E5-B618A3E288D1@petsc.dev> Nathan, Yes, you can call PetscInitializeFortran() from your Fortran library. Barry > On Sep 27, 2021, at 11:59 AM, WUKIE, NATHAN A DR-02 USAF AFMC AFRL/RQVC via petsc-users wrote: > > How should petsc initialization be handled for a python application utilizing petsc4py and a Fortran library application also using petsc? > > The petsc documentation states that PetscInitializeFortran "should be called soon AFTER the call to PetscInitialize () if one is using a C main program that calls Fortran routines that in turn call PETSc routines". Does petsc4py.init(...) call PetscInitializeFortran? Is it permissible to call PetscInitializeFortran from the fortran library application itself? Or must PetscInitializeFortran be called from the C main program? > > Thank you, > Nathan -------------- next part -------------- An HTML attachment was scrubbed... URL: From liyiyang30 at gmail.com Mon Sep 27 20:22:11 2021 From: liyiyang30 at gmail.com (Yiyang Li) Date: Mon, 27 Sep 2021 18:22:11 -0700 Subject: [petsc-users] Turn off CUDA Devices information Message-ID: Hello, I have CUDA aware MPI, and I have upgraded from PETSc 3.12 to PETSc 3.15.4 and petsc4py 3.15.4. Now, when I call PETSc.KSP().solve(..., ...) The information of GPU is always printed to stdout by every MPI rank, like CUDA version: v 11040 CUDA Devices: 0 : Quadro P4000 6 1 Global memory: 8105 mb Shared memory: 48 kb Constant memory: 64 kb Block registers: 65536 CUDA version: v 11040 CUDA Devices: 0 : Quadro P4000 6 1 Global memory: 8105 mb Shared memory: 48 kb Constant memory: 64 kb Block registers: 6553 ... I wonder if there is an option to turn that off? I have tried including -cuda_device NONE in command options, but that did not work. Best regards, Yiyang -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Mon Sep 27 20:43:07 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Mon, 27 Sep 2021 20:43:07 -0500 (CDT) Subject: [petsc-users] Turn off CUDA Devices information In-Reply-To: References: Message-ID: <5e127072-8cc3-b41a-5e9-9e498cde85fb@mcs.anl.gov> Do you have petsc built with superlu_dist? Satish On Mon, 27 Sep 2021, Yiyang Li wrote: > Hello, > > I have CUDA aware MPI, and I have upgraded from PETSc 3.12 to PETSc 3.15.4 > and petsc4py 3.15.4. 
> > Now, when I call > > PETSc.KSP().solve(..., ...) > > The information of GPU is always printed to stdout by every MPI rank, like > > CUDA version: v 11040 > CUDA Devices: > > 0 : Quadro P4000 6 1 > Global memory: 8105 mb > Shared memory: 48 kb > Constant memory: 64 kb > Block registers: 65536 > > CUDA version: v 11040 > CUDA Devices: > > 0 : Quadro P4000 6 1 > Global memory: 8105 mb > Shared memory: 48 kb > Constant memory: 64 kb > Block registers: 6553 > > ... > > I wonder if there is an option to turn that off? > I have tried including > > -cuda_device NONE > > in command options, but that did not work. > > Best regards, > Yiyang > From varunhiremath at gmail.com Tue Sep 28 00:50:56 2021 From: varunhiremath at gmail.com (Varun Hiremath) Date: Mon, 27 Sep 2021 22:50:56 -0700 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> <4BB88AB3-410E-493C-9161-97775747936D@dsic.upv.es> <32B34038-7E1A-42CA-A55D-9AF9D41D1697@dsic.upv.es> Message-ID: Hi Jose, I implemented the LU factorized preconditioner and tested it using PREONLY + LU, but that actually is converging to the wrong eigenvalues, compared to just using BICGS + BJACOBI, or simply computing EPS_SMALLEST_MAGNITUDE without any preconditioning. My preconditioning matrix is only a 1st order approximation, and the off-diagonal terms are not very accurate, so I'm guessing this is why the LU factorization doesn't help much? Nonetheless, using BICGS + BJACOBI with slightly relaxed tolerances seems to be working fine. I now want to test the same preconditioning idea for a quadratic problem. I am solving a quadratic equation similar to Eqn.(5.1) in the SLEPc manual: (K + lambda*C + lambda^2*M)*x = 0, I don't use the PEP package directly, but solve this by linearizing similar to Eqn.(5.3) and calling EPS. Without explicitly forming the full matrix, I just use the block matrix structure as explained in the below example and that works nicely for my case: https://slepc.upv.es/documentation/current/src/eps/tutorials/ex9.c.html In my case, K is not explicitly known, and for linear problems, where C = 0, I am using a 1st order approximation of K as the preconditioner. Now could you please tell me if there is a way to conveniently set the preconditioner for the quadratic problem, which will be of the form [-K 0; 0 I]? Note that K is constructed in parallel (the rows are distributed), so I wasn't sure how to construct this preconditioner matrix which will be compatible with the shell matrix structure that I'm using to define the MatMult function as in ex9. Thanks, Varun On Fri, Sep 24, 2021 at 11:50 PM Varun Hiremath wrote: > Ok, great! I will give that a try, thanks for your help! > > On Fri, Sep 24, 2021 at 11:12 PM Jose E. Roman wrote: > >> Yes, you can use PCMAT >> https://petsc.org/release/docs/manualpages/PC/PCMAT.html then pass a >> preconditioner matrix that performs the inverse via a shell matrix. >> >> > El 25 sept 2021, a las 8:07, Varun Hiremath >> escribi?: >> > >> > Hi Jose, >> > >> > Thanks for checking my code and providing suggestions. >> > >> > In my particular case, I don't know the matrix A explicitly, I compute >> A*x in a matrix-free way within a shell matrix, so I can't use any of the >> direct factorization methods. But just a question regarding your suggestion >> to compute a (parallel) LU factorization. 
In our work, we do use MUMPS to >> compute the parallel factorization. For solving the generalized problem, >> A*x = lambda*B*x, we are computing inv(B)*A*x within a shell matrix, where >> factorization of B is computed using MUMPS. (We don't call MUMPS through >> SLEPc as we have our own MPI wrapper and other user settings to handle.) >> > >> > So for the preconditioning, instead of using the iterative solvers, can >> I provide a shell matrix that computes inv(P)*x corrections (where P is the >> preconditioner matrix) using MUMPS direct solver? >> > >> > And yes, thanks, #define PETSC_USE_COMPLEX 1 is not needed, it works >> without it. >> > >> > Regards, >> > Varun >> > >> > On Fri, Sep 24, 2021 at 9:14 AM Jose E. Roman >> wrote: >> > If you do >> > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 >> > then it is using an LU factorization (the default), which is fast. >> > >> > Use -eps_view to see which solver settings are you using. >> > >> > BiCGStab with block Jacobi does not work for you matrix, it exceeds the >> maximum 10000 iterations. So this is not viable unless you can find a >> better preconditioner for your problem. If not, just using >> EPS_SMALLEST_MAGNITUDE will be faster. >> > >> > Computing smallest magnitude eigenvalues is a difficult task. The most >> robust way is to compute a (parallel) LU factorization if you can afford it. >> > >> > >> > A side note: don't add this to your source code >> > #define PETSC_USE_COMPLEX 1 >> > This define is taken from PETSc's include files, you should not mess >> with it. Instead, you probably want to add something like this AFTER >> #include : >> > #if !defined(PETSC_USE_COMPLEX) >> > #error "Requires complex scalars" >> > #endif >> > >> > Jose >> > >> > >> > > El 22 sept 2021, a las 19:38, Varun Hiremath >> escribi?: >> > > >> > > Hi Jose, >> > > >> > > Thank you, that explains it and my example code works now without >> specifying "-eps_target 0" in the command line. >> > > >> > > However, both the Krylov inexact shift-invert and JD solvers are >> struggling to converge for some of my actual problems. The issue seems to >> be related to non-symmetric general matrices. I have extracted one such >> matrix attached here as MatA.gz (size 100k), and have also included a short >> program that loads this matrix and then computes the smallest eigenvalues >> as I described earlier. >> > > >> > > For this matrix, if I compute the eigenvalues directly (without using >> the shell matrix) using shift-and-invert (as below) then it converges in >> less than a minute. >> > > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 >> > > >> > > However, if I use the shell matrix and use any of the preconditioned >> solvers JD or Krylov shift-invert (as shown below) with the same matrix as >> the preconditioner, then they struggle to converge. >> > > $ ./acoustic_matrix_test.o -usejd 1 -deflate 1 >> > > $ ./acoustic_matrix_test.o -sinvert 1 -deflate 1 >> > > >> > > Could you please check the attached code and suggest any changes in >> settings that might help with convergence for these kinds of matrices? I >> appreciate your help! >> > > >> > > Thanks, >> > > Varun >> > > >> > > On Tue, Sep 21, 2021 at 11:14 AM Jose E. Roman >> wrote: >> > > I will have a look at your code when I have more time. Meanwhile, I >> am answering 3) below... 
>> > > >> > > > El 21 sept 2021, a las 0:23, Varun Hiremath < >> varunhiremath at gmail.com> escribi?: >> > > > >> > > > Hi Jose, >> > > > >> > > > Sorry, it took me a while to test these settings in the new builds. >> I am getting good improvement in performance using the preconditioned >> solvers, so thanks for the suggestions! But I have some questions related >> to the usage. >> > > > >> > > > We are using SLEPc to solve the acoustic modal eigenvalue problem. >> Attached is a simple standalone program that computes acoustic modes in a >> simple rectangular box. This program illustrates the general setup I am >> using, though here the shell matrix and the preconditioner matrix are the >> same, while in my actual program the shell matrix computes A*x without >> explicitly forming A, and the preconditioner is a 0th order approximation >> of A. >> > > > >> > > > In the attached program I have tested both >> > > > 1) the Krylov-Schur with inexact shift-and-invert (implemented >> under the option sinvert); >> > > > 2) the JD solver with preconditioner (implemented under the option >> usejd) >> > > > >> > > > Both the solvers seem to work decently, compared to no >> preconditioning. This is how I run the two solvers (for a mesh size of >> 1600x400): >> > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 >> -eps_target 0 >> > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 >> -eps_target 0 >> > > > Both finish in about ~10 minutes on my system in serial. JD seems >> to be slightly faster and more accurate (for the imaginary part of >> eigenvalue). >> > > > The program also runs in parallel using mpiexec. I use complex >> builds, as in my main program the matrix can be complex. >> > > > >> > > > Now here are my questions: >> > > > 1) For this particular problem type, could you please check if >> these are the best settings that one could use? I have tried different >> combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI >> seems to work the best in serial and parallel. >> > > > >> > > > 2) When I tested these settings in my main program, for some reason >> the JD solver was not converging. After further testing, I found the issue >> was related to the setting of "-eps_target 0". I have included >> "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to >> passing "-eps_target 0" from the command line, but that doesn't seem to be >> the case. For instance, if I run the attached program without "-eps_target >> 0" in the command line then it doesn't converge. >> > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 >> -eps_target 0 >> > > > the above finishes in about 10 minutes >> > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 >> > > > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is >> included in the code >> > > > >> > > > This only seems to affect the JD solver, not the Krylov >> shift-and-invert (-sinvert 1) option. So is there any difference between >> passing "-eps_target 0" from the command line vs using >> "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line >> arguments in my actual program, so need to set everything internally. >> > > > >> > > > 3) Also, another minor related issue. 
While using the inexact >> shift-and-invert option, I was running into the following error: >> > > > >> > > > "" >> > > > Missing or incorrect user input >> > > > Shift-and-invert requires a target 'which' (see >> EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 >> -eps_target_magnitude >> > > > "" >> > > > >> > > > I already have the below two lines in the code: >> > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); >> > > > EPSSetTarget(eps,0.0); >> > > > >> > > > so shouldn't these be enough? If I comment out the first line >> "EPSSetWhichEigenpairs", then the code works fine. >> > > >> > > You should either do >> > > >> > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); >> > > >> > > without shift-and-invert or >> > > >> > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); >> > > EPSSetTarget(eps,0.0); >> > > >> > > with shift-and-invert. The latter can also be used without >> shift-and-invert (e.g. in JD). >> > > >> > > I have to check, but a possible explanation why in your comment above >> (2) the command-line option -eps_target 0 works differently is that it also >> sets -eps_target_magnitude if omitted, so to be equivalent in source code >> you have to call both >> > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); >> > > EPSSetTarget(eps,0.0); >> > > >> > > Jose >> > > >> > > > I have some more questions regarding setting the preconditioner for >> a quadratic eigenvalue problem, which I will ask in a follow-up email. >> > > > >> > > > Thanks for your help! >> > > > >> > > > -Varun >> > > > >> > > > >> > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath < >> varunhiremath at gmail.com> wrote: >> > > > Thank you very much for these suggestions! We are currently using >> version 3.12, so I'll try to update to the latest version and try your >> suggestions. Let me get back to you, thanks! >> > > > >> > > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman >> wrote: >> > > > Then I would try Davidson methods https://doi.org/10.1145/2543696 >> > > > You can also try Krylov-Schur with "inexact" shift-and-invert, for >> instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the >> users manual. >> > > > >> > > > In both cases, you have to pass matrix A in the call to >> EPSSetOperators() and the preconditioner matrix via >> STSetPreconditionerMat() - note this function was introduced in version >> 3.15. >> > > > >> > > > Jose >> > > > >> > > > >> > > > >> > > > > El 1 jul 2021, a las 13:36, Varun Hiremath < >> varunhiremath at gmail.com> escribi?: >> > > > > >> > > > > Thanks. I actually do have a 1st order approximation of matrix A, >> that I can explicitly compute and also invert. Can I use that matrix as >> preconditioner to speed things up? Is there some example that explains how >> to setup and call SLEPc for this scenario? >> > > > > >> > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. Roman >> wrote: >> > > > > For smallest real parts one could adapt ex34.c, but it is going >> to be costly >> https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html >> > > > > Also, if eigenvalues are clustered around the origin, convergence >> may still be very slow. >> > > > > >> > > > > It is a tough problem, unless you are able to compute a good >> preconditioner of A (no need to compute the exact inverse). >> > > > > >> > > > > Jose >> > > > > >> > > > > >> > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath < >> varunhiremath at gmail.com> escribi?: >> > > > > > >> > > > > > I'm solving for the smallest eigenvalues in magnitude. 
Though >> is it cheaper to solve smallest in real part, as that might also work in my >> case? Thanks for your help. >> > > > > > >> > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman >> wrote: >> > > > > > Smallest eigenvalue in magnitude or real part? >> > > > > > >> > > > > > >> > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath < >> varunhiremath at gmail.com> escribi?: >> > > > > > > >> > > > > > > Sorry, no both A and B are general sparse matrices >> (non-hermitian). So is there anything else I could try? >> > > > > > > >> > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman < >> jroman at dsic.upv.es> wrote: >> > > > > > > Is the problem symmetric (GHEP)? In that case, you can try >> LOBPCG on the pair (A,B). But this will likely be slow as well, unless you >> can provide a good preconditioner. >> > > > > > > >> > > > > > > Jose >> > > > > > > >> > > > > > > >> > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath < >> varunhiremath at gmail.com> escribi?: >> > > > > > > > >> > > > > > > > Hi All, >> > > > > > > > >> > > > > > > > I am trying to compute the smallest eigenvalues of a >> generalized system A*x= lambda*B*x. I don't explicitly know the matrix A >> (so I am using a shell matrix with a custom matmult function) however, the >> matrix B is explicitly known so I compute inv(B)*A within the shell matrix >> and solve inv(B)*A*x = lambda*x. >> > > > > > > > >> > > > > > > > To compute the smallest eigenvalues it is recommended to >> solve the inverted system, but since matrix A is not explicitly known I >> can't invert the system. Moreover, the size of the system can be really >> big, and with the default Krylov solver, it is extremely slow. So is there >> a better way for me to compute the smallest eigenvalues of this system? >> > > > > > > > >> > > > > > > > Thanks, >> > > > > > > > Varun >> > > > > > > >> > > > > > >> > > > > >> > > > >> > > > >> > > >> > > >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.tardieu at edf.fr Tue Sep 28 07:34:23 2021 From: nicolas.tardieu at edf.fr (TARDIEU Nicolas) Date: Tue, 28 Sep 2021 12:34:23 +0000 Subject: [petsc-users] GMRES Breakdown Message-ID: <1632832463882.43878@edf.fr> Dear PETSc Team, We, code_aster's development team, are using PETSc in our application for years now, mainly for the KSP et PC features. We run tests cases every week without problems. After upgrading from PETSc 3.12.3 to 3.15.3, a test case using GMRES failed with KSP_DIVERGED_BREAKDOWN. By rolling back to 3.12.3, we have checked that the upgrade is the origin of the failure. We know this can occur with GMRES but it is the first time we are facing this situation. Is it a bug ? If yes, how can we help to fix it ? If no, what can we do ? Could the flexible version of GMRES be an alternative ? Best regards, Nicolas -- Nicolas Tardieu Ing, PhD Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. 
Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Sep 28 07:45:28 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 28 Sep 2021 14:45:28 +0200 Subject: [petsc-users] GMRES Breakdown In-Reply-To: <1632832463882.43878@edf.fr> References: <1632832463882.43878@edf.fr> Message-ID: <1EA10A4B-9FBC-460A-9FBC-F01CF9D506D1@dsic.upv.es> Now there is a breakdown tolerance https://petsc.org/main/docs/manualpages/KSP/KSPGMRESSetBreakdownTolerance.html You can try changing it. Generally, when upgrading you should check the list of changes https://petsc.org/release/docs/changes/ Jose > El 28 sept 2021, a las 14:34, TARDIEU Nicolas via petsc-users escribi?: > > Dear PETSc Team, > > We, code_aster's development team, are using PETSc in our application for years now, mainly for the KSP et PC features. We run tests cases every week without problems. > After upgrading from PETSc 3.12.3 to 3.15.3, a test case using GMRES failed with KSP_DIVERGED_BREAKDOWN. By rolling back to 3.12.3, we have checked that the upgrade is the origin of the failure. > > We know this can occur with GMRES but it is the first time we are facing this situation. > Is it a bug ? If yes, how can we help to fix it ? If no, what can we do ? Could the flexible version of GMRES be an alternative ? > > Best regards, > Nicolas > -- > Nicolas Tardieu > Ing, PhD > > Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. > Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. > Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. 
> ____________________________________________________ > This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. > If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. > E-mail communication cannot be guaranteed to be timely secure, error or virus-free. From knepley at gmail.com Tue Sep 28 07:54:11 2021 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 28 Sep 2021 08:54:11 -0400 Subject: [petsc-users] GMRES Breakdown In-Reply-To: <1632832463882.43878@edf.fr> References: <1632832463882.43878@edf.fr> Message-ID: On Tue, Sep 28, 2021 at 8:39 AM TARDIEU Nicolas via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear PETSc Team, > > We, code_aster's development team, are using PETSc in our application for > years now, mainly for the KSP et PC features. We run tests cases every week > without problems. > After upgrading from PETSc 3.12.3 to 3.15.3, a test case using GMRES > failed with KSP_DIVERGED_BREAKDOWN. By rolling back to 3.12.3, we have > checked that the upgrade is the origin of the failure. > > We know this can occur with GMRES but it is the first time we are facing > this situation. > > Is it a bug ? If yes, how can we help to fix it ? If no, what can we do ? > Could the flexible version of GMRES be an alternative ? > This is not a bug, but we did change the operation. We now check whether there is a significant residual spike after a restart. We want to let the user know that this likely means that the solver is inappropriate for the problem. As Jose notes, you can turn off this behavior. Thanks, Matt > Best regards, > Nicolas > -- > *Nicolas Tardieu* > *Ing, PhD* > > > Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont > ?tablis ? l'intention exclusive des destinataires et les informations qui y > figurent sont strictement confidentielles. Toute utilisation de ce Message > non conforme ? sa destination, toute diffusion ou toute publication totale > ou partielle, est interdite sauf autorisation expresse. > > Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de > le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou > partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de > votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace > sur quelque support que ce soit. Nous vous remercions ?galement d'en > avertir imm?diatement l'exp?diteur par retour du message. > > Il est impossible de garantir que les communications par messagerie > ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute > erreur ou virus. > ____________________________________________________ > > This message and any attachments (the 'Message') are intended solely for > the addressees. The information contained in this Message is confidential. > Any use of information contained in this Message not in accord with its > purpose, any dissemination or disclosure, either whole or partial, is > prohibited except formal approval. > > If you are not the addressee, you may not copy, forward, disclose or use > any part of it. 
If you have received this message in error, please delete > it and all copies from your system and notify the sender immediately by > return message. > > E-mail communication cannot be guaranteed to be timely secure, error or > virus-free. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From nicolas.tardieu at edf.fr Tue Sep 28 08:29:06 2021 From: nicolas.tardieu at edf.fr (TARDIEU Nicolas) Date: Tue, 28 Sep 2021 13:29:06 +0000 Subject: [petsc-users] GMRES Breakdown In-Reply-To: References: <1632832463882.43878@edf.fr>, Message-ID: <1632835746671.87455@edf.fr> Dear Jose and Matt, I thank you very much for your super-fast answers. And I apologize for not having checked the list of changes. Best regards, Nicolas -- Nicolas Tardieu Ing?nieur Chercheur Groupe Dynamique des Equipements - T6B EDF - R&D Dpt ERMES nicolas.tardieu at edf.fr T?l. : 01 78 19 37 49 ________________________________ De : knepley at gmail.com Envoy? : mardi 28 septembre 2021 14:54 ? : TARDIEU Nicolas Cc : petsc-users at mcs.anl.gov Objet : Re: [petsc-users] GMRES Breakdown On Tue, Sep 28, 2021 at 8:39 AM TARDIEU Nicolas via petsc-users > wrote: Dear PETSc Team, We, code_aster's development team, are using PETSc in our application for years now, mainly for the KSP et PC features. We run tests cases every week without problems. After upgrading from PETSc 3.12.3 to 3.15.3, a test case using GMRES failed with KSP_DIVERGED_BREAKDOWN. By rolling back to 3.12.3, we have checked that the upgrade is the origin of the failure. We know this can occur with GMRES but it is the first time we are facing this situation. Is it a bug ? If yes, how can we help to fix it ? If no, what can we do ? Could the flexible version of GMRES be an alternative ? This is not a bug, but we did change the operation. We now check whether there is a significant residual spike after a restart. We want to let the user know that this likely means that the solver is inappropriate for the problem. As Jose notes, you can turn off this behavior. Thanks, Matt Best regards, Nicolas -- Nicolas Tardieu Ing, PhD Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. 
Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ Ce message et toutes les pi?ces jointes (ci-apr?s le 'Message') sont ?tablis ? l'intention exclusive des destinataires et les informations qui y figurent sont strictement confidentielles. Toute utilisation de ce Message non conforme ? sa destination, toute diffusion ou toute publication totale ou partielle, est interdite sauf autorisation expresse. Si vous n'?tes pas le destinataire de ce Message, il vous est interdit de le copier, de le faire suivre, de le divulguer ou d'en utiliser tout ou partie. Si vous avez re?u ce Message par erreur, merci de le supprimer de votre syst?me, ainsi que toutes ses copies, et de n'en garder aucune trace sur quelque support que ce soit. Nous vous remercions ?galement d'en avertir imm?diatement l'exp?diteur par retour du message. Il est impossible de garantir que les communications par messagerie ?lectronique arrivent en temps utile, sont s?curis?es ou d?nu?es de toute erreur ou virus. ____________________________________________________ This message and any attachments (the 'Message') are intended solely for the addressees. The information contained in this Message is confidential. Any use of information contained in this Message not in accord with its purpose, any dissemination or disclosure, either whole or partial, is prohibited except formal approval. If you are not the addressee, you may not copy, forward, disclose or use any part of it. If you have received this message in error, please delete it and all copies from your system and notify the sender immediately by return message. E-mail communication cannot be guaranteed to be timely secure, error or virus-free. -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Tue Sep 28 09:55:52 2021 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Tue, 28 Sep 2021 14:55:52 +0000 Subject: [petsc-users] %T (percent time in this phase) Message-ID: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> Hello, I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. Thanks! Karthik. This email and any attachments are intended solely for the use of the named recipients. 
If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ksp_ex45_N511_cpu_6.txt URL: From jroman at dsic.upv.es Tue Sep 28 10:09:37 2021 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 28 Sep 2021 17:09:37 +0200 Subject: [petsc-users] SLEPc: smallest eigenvalues In-Reply-To: References: <179BDB69-1EC0-4334-A964-ABE29E33EFF8@dsic.upv.es> <5B1750B3-E05F-45D7-929B-A5CF816B4A75@dsic.upv.es> <7031EC8B-A238-45AD-B4C2-FA8988022864@dsic.upv.es> <6B968AE2-8325-4E20-B94A-16ECDD0FBA90@dsic.upv.es> <4BB88AB3-410E-493C-9161-97775747936D@dsic.upv.es> <32B34038-7E1A-42CA-A55D-9AF9D41D1697@dsic.upv.es> Message-ID: <4FC17DE7-B910-43D8-9EC5-816285FD52F4@dsic.upv.es> > El 28 sept 2021, a las 7:50, Varun Hiremath escribi?: > > Hi Jose, > > I implemented the LU factorized preconditioner and tested it using PREONLY + LU, but that actually is converging to the wrong eigenvalues, compared to just using BICGS + BJACOBI, or simply computing EPS_SMALLEST_MAGNITUDE without any preconditioning. My preconditioning matrix is only a 1st order approximation, and the off-diagonal terms are not very accurate, so I'm guessing this is why the LU factorization doesn't help much? Nonetheless, using BICGS + BJACOBI with slightly relaxed tolerances seems to be working fine. If your PCMAT is not an exact inverse, then you have to iterate, i.e. not use KSPPREONLY but KSPBCGS or another. > > I now want to test the same preconditioning idea for a quadratic problem. I am solving a quadratic equation similar to Eqn.(5.1) in the SLEPc manual: > (K + lambda*C + lambda^2*M)*x = 0, > I don't use the PEP package directly, but solve this by linearizing similar to Eqn.(5.3) and calling EPS. Without explicitly forming the full matrix, I just use the block matrix structure as explained in the below example and that works nicely for my case: > https://slepc.upv.es/documentation/current/src/eps/tutorials/ex9.c.html Using PEP is generally recommended. The default solver TOAR is memory-efficient and performs less computation than a trivial linearization. In addition, PEP allows you to do scaling, which is often very important to get accurate results in some problems, depending on conditioning. In your case K is a shell matrix, so things may not be trivial. If I am not wrong, you should be able to use STSetPreconditionerMat() for a PEP, where the preconditioner in this case should be built to approximate Q(sigma), where Q(.) is the quadratic polynomial and sigma is the target. > > In my case, K is not explicitly known, and for linear problems, where C = 0, I am using a 1st order approximation of K as the preconditioner. Now could you please tell me if there is a way to conveniently set the preconditioner for the quadratic problem, which will be of the form [-K 0; 0 I]? 
Note that K is constructed in parallel (the rows are distributed), so I wasn't sure how to construct this preconditioner matrix which will be compatible with the shell matrix structure that I'm using to define the MatMult function as in ex9. The shell matrix of ex9.c interleaves the local parts of the first block and the second block. In other words, a process' local part consists of the local rows of the first block followed by the local rows of the second block. In your case, the local rows of K followed by the local rows of the identity (appropriately padded with zeros). Jose > > Thanks, > Varun > > On Fri, Sep 24, 2021 at 11:50 PM Varun Hiremath wrote: > Ok, great! I will give that a try, thanks for your help! > > On Fri, Sep 24, 2021 at 11:12 PM Jose E. Roman wrote: > Yes, you can use PCMAT https://petsc.org/release/docs/manualpages/PC/PCMAT.html then pass a preconditioner matrix that performs the inverse via a shell matrix. > > > El 25 sept 2021, a las 8:07, Varun Hiremath escribi?: > > > > Hi Jose, > > > > Thanks for checking my code and providing suggestions. > > > > In my particular case, I don't know the matrix A explicitly, I compute A*x in a matrix-free way within a shell matrix, so I can't use any of the direct factorization methods. But just a question regarding your suggestion to compute a (parallel) LU factorization. In our work, we do use MUMPS to compute the parallel factorization. For solving the generalized problem, A*x = lambda*B*x, we are computing inv(B)*A*x within a shell matrix, where factorization of B is computed using MUMPS. (We don't call MUMPS through SLEPc as we have our own MPI wrapper and other user settings to handle.) > > > > So for the preconditioning, instead of using the iterative solvers, can I provide a shell matrix that computes inv(P)*x corrections (where P is the preconditioner matrix) using MUMPS direct solver? > > > > And yes, thanks, #define PETSC_USE_COMPLEX 1 is not needed, it works without it. > > > > Regards, > > Varun > > > > On Fri, Sep 24, 2021 at 9:14 AM Jose E. Roman wrote: > > If you do > > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > > then it is using an LU factorization (the default), which is fast. > > > > Use -eps_view to see which solver settings are you using. > > > > BiCGStab with block Jacobi does not work for you matrix, it exceeds the maximum 10000 iterations. So this is not viable unless you can find a better preconditioner for your problem. If not, just using EPS_SMALLEST_MAGNITUDE will be faster. > > > > Computing smallest magnitude eigenvalues is a difficult task. The most robust way is to compute a (parallel) LU factorization if you can afford it. > > > > > > A side note: don't add this to your source code > > #define PETSC_USE_COMPLEX 1 > > This define is taken from PETSc's include files, you should not mess with it. Instead, you probably want to add something like this AFTER #include : > > #if !defined(PETSC_USE_COMPLEX) > > #error "Requires complex scalars" > > #endif > > > > Jose > > > > > > > El 22 sept 2021, a las 19:38, Varun Hiremath escribi?: > > > > > > Hi Jose, > > > > > > Thank you, that explains it and my example code works now without specifying "-eps_target 0" in the command line. > > > > > > However, both the Krylov inexact shift-invert and JD solvers are struggling to converge for some of my actual problems. The issue seems to be related to non-symmetric general matrices. 
I have extracted one such matrix attached here as MatA.gz (size 100k), and have also included a short program that loads this matrix and then computes the smallest eigenvalues as I described earlier. > > > > > > For this matrix, if I compute the eigenvalues directly (without using the shell matrix) using shift-and-invert (as below) then it converges in less than a minute. > > > $ ./acoustic_matrix_test.o -shell 0 -st_type sinvert -deflate 1 > > > > > > However, if I use the shell matrix and use any of the preconditioned solvers JD or Krylov shift-invert (as shown below) with the same matrix as the preconditioner, then they struggle to converge. > > > $ ./acoustic_matrix_test.o -usejd 1 -deflate 1 > > > $ ./acoustic_matrix_test.o -sinvert 1 -deflate 1 > > > > > > Could you please check the attached code and suggest any changes in settings that might help with convergence for these kinds of matrices? I appreciate your help! > > > > > > Thanks, > > > Varun > > > > > > On Tue, Sep 21, 2021 at 11:14 AM Jose E. Roman wrote: > > > I will have a look at your code when I have more time. Meanwhile, I am answering 3) below... > > > > > > > El 21 sept 2021, a las 0:23, Varun Hiremath escribi?: > > > > > > > > Hi Jose, > > > > > > > > Sorry, it took me a while to test these settings in the new builds. I am getting good improvement in performance using the preconditioned solvers, so thanks for the suggestions! But I have some questions related to the usage. > > > > > > > > We are using SLEPc to solve the acoustic modal eigenvalue problem. Attached is a simple standalone program that computes acoustic modes in a simple rectangular box. This program illustrates the general setup I am using, though here the shell matrix and the preconditioner matrix are the same, while in my actual program the shell matrix computes A*x without explicitly forming A, and the preconditioner is a 0th order approximation of A. > > > > > > > > In the attached program I have tested both > > > > 1) the Krylov-Schur with inexact shift-and-invert (implemented under the option sinvert); > > > > 2) the JD solver with preconditioner (implemented under the option usejd) > > > > > > > > Both the solvers seem to work decently, compared to no preconditioning. This is how I run the two solvers (for a mesh size of 1600x400): > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -sinvert 1 -deflate 1 -eps_target 0 > > > > Both finish in about ~10 minutes on my system in serial. JD seems to be slightly faster and more accurate (for the imaginary part of eigenvalue). > > > > The program also runs in parallel using mpiexec. I use complex builds, as in my main program the matrix can be complex. > > > > > > > > Now here are my questions: > > > > 1) For this particular problem type, could you please check if these are the best settings that one could use? I have tried different combinations of KSP/PC types e.g. GMRES, GAMG, etc, but BCGSL + BJACOBI seems to work the best in serial and parallel. > > > > > > > > 2) When I tested these settings in my main program, for some reason the JD solver was not converging. After further testing, I found the issue was related to the setting of "-eps_target 0". I have included "EPSSetTarget(eps,0.0);" in the program and I assumed this is equivalent to passing "-eps_target 0" from the command line, but that doesn't seem to be the case. 
For instance, if I run the attached program without "-eps_target 0" in the command line then it doesn't converge. > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 -eps_target 0 > > > > the above finishes in about 10 minutes > > > > $ ./acoustic_box_test.o -nx 1600 -ny 400 -usejd 1 -deflate 1 > > > > the above doesn't converge even though "EPSSetTarget(eps,0.0);" is included in the code > > > > > > > > This only seems to affect the JD solver, not the Krylov shift-and-invert (-sinvert 1) option. So is there any difference between passing "-eps_target 0" from the command line vs using "EPSSetTarget(eps,0.0);" in the code? I cannot pass any command line arguments in my actual program, so need to set everything internally. > > > > > > > > 3) Also, another minor related issue. While using the inexact shift-and-invert option, I was running into the following error: > > > > > > > > "" > > > > Missing or incorrect user input > > > > Shift-and-invert requires a target 'which' (see EPSSetWhichEigenpairs), for instance -st_type sinvert -eps_target 0 -eps_target_magnitude > > > > "" > > > > > > > > I already have the below two lines in the code: > > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > > EPSSetTarget(eps,0.0); > > > > > > > > so shouldn't these be enough? If I comment out the first line "EPSSetWhichEigenpairs", then the code works fine. > > > > > > You should either do > > > > > > EPSSetWhichEigenpairs(eps,EPS_SMALLEST_MAGNITUDE); > > > > > > without shift-and-invert or > > > > > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > > EPSSetTarget(eps,0.0); > > > > > > with shift-and-invert. The latter can also be used without shift-and-invert (e.g. in JD). > > > > > > I have to check, but a possible explanation why in your comment above (2) the command-line option -eps_target 0 works differently is that it also sets -eps_target_magnitude if omitted, so to be equivalent in source code you have to call both > > > EPSSetWhichEigenpairs(eps,EPS_TARGET_MAGNITUDE); > > > EPSSetTarget(eps,0.0); > > > > > > Jose > > > > > > > I have some more questions regarding setting the preconditioner for a quadratic eigenvalue problem, which I will ask in a follow-up email. > > > > > > > > Thanks for your help! > > > > > > > > -Varun > > > > > > > > > > > > On Thu, Jul 1, 2021 at 5:01 AM Varun Hiremath wrote: > > > > Thank you very much for these suggestions! We are currently using version 3.12, so I'll try to update to the latest version and try your suggestions. Let me get back to you, thanks! > > > > > > > > On Thu, Jul 1, 2021, 4:45 AM Jose E. Roman wrote: > > > > Then I would try Davidson methods https://doi.org/10.1145/2543696 > > > > You can also try Krylov-Schur with "inexact" shift-and-invert, for instance, with preconditioned BiCGStab or GMRES, see section 3.4.1 of the users manual. > > > > > > > > In both cases, you have to pass matrix A in the call to EPSSetOperators() and the preconditioner matrix via STSetPreconditionerMat() - note this function was introduced in version 3.15. > > > > > > > > Jose > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 13:36, Varun Hiremath escribi?: > > > > > > > > > > Thanks. I actually do have a 1st order approximation of matrix A, that I can explicitly compute and also invert. Can I use that matrix as preconditioner to speed things up? Is there some example that explains how to setup and call SLEPc for this scenario? > > > > > > > > > > On Thu, Jul 1, 2021, 4:29 AM Jose E. 
Roman wrote: > > > > > For smallest real parts one could adapt ex34.c, but it is going to be costly https://slepc.upv.es/documentation/current/src/eps/tutorials/ex36.c.html > > > > > Also, if eigenvalues are clustered around the origin, convergence may still be very slow. > > > > > > > > > > It is a tough problem, unless you are able to compute a good preconditioner of A (no need to compute the exact inverse). > > > > > > > > > > Jose > > > > > > > > > > > > > > > > El 1 jul 2021, a las 13:23, Varun Hiremath escribi?: > > > > > > > > > > > > I'm solving for the smallest eigenvalues in magnitude. Though is it cheaper to solve smallest in real part, as that might also work in my case? Thanks for your help. > > > > > > > > > > > > On Thu, Jul 1, 2021, 4:08 AM Jose E. Roman wrote: > > > > > > Smallest eigenvalue in magnitude or real part? > > > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:58, Varun Hiremath escribi?: > > > > > > > > > > > > > > Sorry, no both A and B are general sparse matrices (non-hermitian). So is there anything else I could try? > > > > > > > > > > > > > > On Thu, Jul 1, 2021 at 2:43 AM Jose E. Roman wrote: > > > > > > > Is the problem symmetric (GHEP)? In that case, you can try LOBPCG on the pair (A,B). But this will likely be slow as well, unless you can provide a good preconditioner. > > > > > > > > > > > > > > Jose > > > > > > > > > > > > > > > > > > > > > > El 1 jul 2021, a las 11:37, Varun Hiremath escribi?: > > > > > > > > > > > > > > > > Hi All, > > > > > > > > > > > > > > > > I am trying to compute the smallest eigenvalues of a generalized system A*x= lambda*B*x. I don't explicitly know the matrix A (so I am using a shell matrix with a custom matmult function) however, the matrix B is explicitly known so I compute inv(B)*A within the shell matrix and solve inv(B)*A*x = lambda*x. > > > > > > > > > > > > > > > > To compute the smallest eigenvalues it is recommended to solve the inverted system, but since matrix A is not explicitly known I can't invert the system. Moreover, the size of the system can be really big, and with the default Krylov solver, it is extremely slow. So is there a better way for me to compute the smallest eigenvalues of this system? > > > > > > > > > > > > > > > > Thanks, > > > > > > > > Varun > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > From bantingl at myumanitoba.ca Tue Sep 28 10:33:37 2021 From: bantingl at myumanitoba.ca (Lucas Banting) Date: Tue, 28 Sep 2021 15:33:37 +0000 Subject: [petsc-users] Using MATLAB on Windows with PETSc on WSL Message-ID: Hello, My overall goal is to send a sparse matrix to PETSc (in WSL)from MATLAB (in Windows) so I can use SLEPc for some eigenvalue routines, and send those eigenvectors back to MATLAB, as the MATLAB eigs() struggles with my matrix and I was looking to experiment with different eigenvalue algorithms. I was trying to configure PETSc on Windows Subsystem for Linux (WSL2). Configuring without '--with-matlab' works fine. I tried the configure command: ./configure --with-scalar-type=complex --with-openblas-dir=~/software/OpenBLAS/ --with-matlab-dir=/mnt/e/Program\ Files/MATLAB/R2020b I was wondering if there is a fundamental reason this configure won't work, or if it is just the space in 'Program Files' that is breaking the configure command. I think the only thing configuring with MATLAB is 'sopen' and 'sclose'? Is it possible I could just remake these on my own for a WSL compatible version? 
Thanks, Lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Sep 28 10:48:03 2021 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 28 Sep 2021 11:48:03 -0400 Subject: [petsc-users] Using MATLAB on Windows with PETSc on WSL In-Reply-To: References: Message-ID: <23C2EF26-9EC8-42BA-A731-049221A3ECF5@petsc.dev> > On Sep 28, 2021, at 11:33 AM, Lucas Banting wrote: > > Hello, > > My overall goal is to send a sparse matrix to PETSc (in WSL)from MATLAB (in Windows) so I can use SLEPc for some eigenvalue routines, and send those eigenvectors back to MATLAB, as the MATLAB eigs() struggles with my matrix and I was looking to experiment with different eigenvalue algorithms. > > I was trying to configure PETSc on Windows Subsystem for Linux (WSL2). Configuring without '--with-matlab' works fine. I tried the configure command: > > ./configure --with-scalar-type=complex --with-openblas-dir=~/software/OpenBLAS/ > --with-matlab-dir=/mnt/e/Program\ Files/MATLAB/R2020b > > I was wondering if there is a fundamental reason this configure won't work, or if it is just the space in 'Program Files' that is breaking the configure command. We cannot tell without the configure.log file. Note you may be able to use the shorten MS DOS directory names which do not have spaces for that directory? > > I think the only thing configuring with MATLAB is 'sopen' and 'sclose'? Yes > Is it possible I could just remake these on my own for a WSL compatible version? The building of sopen and sclose is something you can do directly by adjusting the makefile by hand to link the correct files. > > Thanks, > > Lucas -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Sep 28 10:56:11 2021 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 28 Sep 2021 11:56:11 -0400 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> Message-ID: <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> > On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI wrote: > > Hello, > > I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. > For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. Barry > Thanks! > Karthik. > > This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. 
UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Sep 28 10:58:40 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 28 Sep 2021 10:58:40 -0500 (CDT) Subject: [petsc-users] Using MATLAB on Windows with PETSc on WSL In-Reply-To: <23C2EF26-9EC8-42BA-A731-049221A3ECF5@petsc.dev> References: <23C2EF26-9EC8-42BA-A731-049221A3ECF5@petsc.dev> Message-ID: <405a4beb-afa-9673-e7d3-72c845981e18@mcs.anl.gov> Well matlab as you say is on windows side - and WSL is basically linux. One can invoke binaries from the other side - but obj-files/libraries [wrt compilers, linking] won't work as far as I know. And the MEX targets require compilers/liners to be functional - so I don't think this will work. And we don't have experience with using matlab natively on windows anyway.. Satish ---- sread: -@${MATLAB_MEX} -g GCC='${CC}' CC='${PCC}' CFLAGS='${COPTFLAGS} ${CC_FLAGS} ${CCPPFLAGS}' LDFLAGS='${PETSC_EXTERNAL_LIB_BASIC}' sread.c bread.c -@${RM} -f sread.o bread.o -@${MV} sread.mex* ${PETSC_DIR}/${PETSC_ARCH}/lib/petsc/matlab On Tue, 28 Sep 2021, Barry Smith wrote: > > > > On Sep 28, 2021, at 11:33 AM, Lucas Banting wrote: > > > > Hello, > > > > My overall goal is to send a sparse matrix to PETSc (in WSL)from MATLAB (in Windows) so I can use SLEPc for some eigenvalue routines, and send those eigenvectors back to MATLAB, as the MATLAB eigs() struggles with my matrix and I was looking to experiment with different eigenvalue algorithms. > > > > I was trying to configure PETSc on Windows Subsystem for Linux (WSL2). Configuring without '--with-matlab' works fine. I tried the configure command: > > > > ./configure --with-scalar-type=complex --with-openblas-dir=~/software/OpenBLAS/ > > --with-matlab-dir=/mnt/e/Program\ Files/MATLAB/R2020b > > > > I was wondering if there is a fundamental reason this configure won't work, or if it is just the space in 'Program Files' that is breaking the configure command. > > We cannot tell without the configure.log file. Note you may be able to use the shorten MS DOS directory names which do not have spaces for that directory? > > > > > I think the only thing configuring with MATLAB is 'sopen' and 'sclose'? > > Yes > > > Is it possible I could just remake these on my own for a WSL compatible version? > > The building of sopen and sclose is something you can do directly by adjusting the makefile by hand to link the correct files. > > > > Thanks, > > > > Lucas > > From karthikeyan.chockalingam at stfc.ac.uk Tue Sep 28 11:11:28 2021 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Tue, 28 Sep 2021 16:11:28 +0000 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> Message-ID: <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> Thanks for Barry for your response. I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. 
However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. Best, Karthik. From: Barry Smith Date: Tuesday, 28 September 2021 at 16:56 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: Hello, I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. Barry Thanks! Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: From liyiyang30 at gmail.com Tue Sep 28 12:44:33 2021 From: liyiyang30 at gmail.com (Yiyang Li) Date: Tue, 28 Sep 2021 10:44:33 -0700 Subject: [petsc-users] Turn off CUDA Devices information In-Reply-To: <5e127072-8cc3-b41a-5e9-9e498cde85fb@mcs.anl.gov> References: <5e127072-8cc3-b41a-5e9-9e498cde85fb@mcs.anl.gov> Message-ID: Yes, I do have superlu_dist built with petsc. The command I used for launching simulation is mpiexec --mca btl self,vader,tcp -np 4 python3 .../main.py ./input_ls -pc_type lu -pc_factor_mat_solver_type superlu_dist -pc_asm_type basic -cuda_device NONE On Mon, Sep 27, 2021 at 6:43 PM Satish Balay wrote: > Do you have petsc built with superlu_dist? > > Satish > > On Mon, 27 Sep 2021, Yiyang Li wrote: > > > Hello, > > > > I have CUDA aware MPI, and I have upgraded from PETSc 3.12 to PETSc > 3.15.4 > > and petsc4py 3.15.4. > > > > Now, when I call > > > > PETSc.KSP().solve(..., ...) 
> > > > The information of GPU is always printed to stdout by every MPI rank, > like > > > > CUDA version: v 11040 > > CUDA Devices: > > > > 0 : Quadro P4000 6 1 > > Global memory: 8105 mb > > Shared memory: 48 kb > > Constant memory: 64 kb > > Block registers: 65536 > > > > CUDA version: v 11040 > > CUDA Devices: > > > > 0 : Quadro P4000 6 1 > > Global memory: 8105 mb > > Shared memory: 48 kb > > Constant memory: 64 kb > > Block registers: 6553 > > > > ... > > > > I wonder if there is an option to turn that off? > > I have tried including > > > > -cuda_device NONE > > > > in command options, but that did not work. > > > > Best regards, > > Yiyang > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay at mcs.anl.gov Tue Sep 28 13:04:29 2021 From: balay at mcs.anl.gov (Satish Balay) Date: Tue, 28 Sep 2021 13:04:29 -0500 (CDT) Subject: [petsc-users] Turn off CUDA Devices information In-Reply-To: References: <5e127072-8cc3-b41a-5e9-9e498cde85fb@mcs.anl.gov> Message-ID: <46ccd3ee-eb48-ca59-e7b7-25c924f9247b@mcs.anl.gov> This verbose message comes from superlu_dist (when built with cuda) I'm not sure how to disable it [without going into the code and commenting out the code that does this] balay at sb /home/balay/git-repo/github/superlu_dist (maint=) $ git grep 'CUDA version' SRC/cublas_utils.c: printf("CUDA version: v %d\n",CUDART_VERSION); Satish On Tue, 28 Sep 2021, Yiyang Li wrote: > Yes, I do have superlu_dist built with petsc. > The command I used for launching simulation is > > mpiexec --mca btl self,vader,tcp > -np 4 python3 .../main.py ./input_ls > -pc_type lu -pc_factor_mat_solver_type superlu_dist > -pc_asm_type basic -cuda_device NONE > > On Mon, Sep 27, 2021 at 6:43 PM Satish Balay wrote: > > > Do you have petsc built with superlu_dist? > > > > Satish > > > > On Mon, 27 Sep 2021, Yiyang Li wrote: > > > > > Hello, > > > > > > I have CUDA aware MPI, and I have upgraded from PETSc 3.12 to PETSc > > 3.15.4 > > > and petsc4py 3.15.4. > > > > > > Now, when I call > > > > > > PETSc.KSP().solve(..., ...) > > > > > > The information of GPU is always printed to stdout by every MPI rank, > > like > > > > > > CUDA version: v 11040 > > > CUDA Devices: > > > > > > 0 : Quadro P4000 6 1 > > > Global memory: 8105 mb > > > Shared memory: 48 kb > > > Constant memory: 64 kb > > > Block registers: 65536 > > > > > > CUDA version: v 11040 > > > CUDA Devices: > > > > > > 0 : Quadro P4000 6 1 > > > Global memory: 8105 mb > > > Shared memory: 48 kb > > > Constant memory: 64 kb > > > Block registers: 6553 > > > > > > ... > > > > > > I wonder if there is an option to turn that off? > > > I have tried including > > > > > > -cuda_device NONE > > > > > > in command options, but that did not work. > > > > > > Best regards, > > > Yiyang > > > > > > > > From liyiyang30 at gmail.com Tue Sep 28 13:15:29 2021 From: liyiyang30 at gmail.com (Yiyang Li) Date: Tue, 28 Sep 2021 11:15:29 -0700 Subject: [petsc-users] Turn off CUDA Devices information In-Reply-To: <46ccd3ee-eb48-ca59-e7b7-25c924f9247b@mcs.anl.gov> References: <5e127072-8cc3-b41a-5e9-9e498cde85fb@mcs.anl.gov> <46ccd3ee-eb48-ca59-e7b7-25c924f9247b@mcs.anl.gov> Message-ID: Alright, that explains why I can't find information on petsc website about how to turn that off. Thank you Satish for your hint, I will figure that out. 
Best, Yiyang On Tue, Sep 28, 2021 at 11:04 AM Satish Balay wrote: > This verbose message comes from superlu_dist (when built with cuda) > > I'm not sure how to disable it [without going into the code and commenting > out the code that does this] > > balay at sb /home/balay/git-repo/github/superlu_dist (maint=) > $ git grep 'CUDA version' > SRC/cublas_utils.c: printf("CUDA version: v %d\n",CUDART_VERSION); > > > Satish > > On Tue, 28 Sep 2021, Yiyang Li wrote: > > > Yes, I do have superlu_dist built with petsc. > > The command I used for launching simulation is > > > > mpiexec --mca btl self,vader,tcp > > -np 4 python3 .../main.py ./input_ls > > -pc_type lu -pc_factor_mat_solver_type superlu_dist > > -pc_asm_type basic -cuda_device NONE > > > > On Mon, Sep 27, 2021 at 6:43 PM Satish Balay wrote: > > > > > Do you have petsc built with superlu_dist? > > > > > > Satish > > > > > > On Mon, 27 Sep 2021, Yiyang Li wrote: > > > > > > > Hello, > > > > > > > > I have CUDA aware MPI, and I have upgraded from PETSc 3.12 to PETSc > > > 3.15.4 > > > > and petsc4py 3.15.4. > > > > > > > > Now, when I call > > > > > > > > PETSc.KSP().solve(..., ...) > > > > > > > > The information of GPU is always printed to stdout by every MPI rank, > > > like > > > > > > > > CUDA version: v 11040 > > > > CUDA Devices: > > > > > > > > 0 : Quadro P4000 6 1 > > > > Global memory: 8105 mb > > > > Shared memory: 48 kb > > > > Constant memory: 64 kb > > > > Block registers: 65536 > > > > > > > > CUDA version: v 11040 > > > > CUDA Devices: > > > > > > > > 0 : Quadro P4000 6 1 > > > > Global memory: 8105 mb > > > > Shared memory: 48 kb > > > > Constant memory: 64 kb > > > > Block registers: 6553 > > > > > > > > ... > > > > > > > > I wonder if there is an option to turn that off? > > > > I have tried including > > > > > > > > -cuda_device NONE > > > > > > > > in command options, but that did not work. > > > > > > > > Best regards, > > > > Yiyang > > > > > > > > > > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Sep 28 13:18:56 2021 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 28 Sep 2021 14:18:56 -0400 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> Message-ID: <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> > On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI wrote: > > Thanks for Barry for your response. > > I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. > However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. If you do not call KSPSetUp() separately from KSPSolve() then its time is included with KSPSolve(). PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used. So yes the answer is not totally satisfying. The one thing I would recommend is to not call KSPSetUp() directly and then KSPSolve() will always include the total time of the solve plus all setup time. PCApply will contain all the time to apply the preconditioner but may also include some setup time. 
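For reference, a small self-contained sketch (not taken from the attached runs) of one way to separate setup time from solve time in the log output, using user-defined log stages; the matrix here is a toy 1D Laplacian and the program and stage names are illustrative. Calling KSPSetUp() inside its own stage moves as much of PCSetUp() as possible into that stage; whatever the preconditioner defers, as described above, is still charged to the solve stage.

  #include <petscksp.h>

  int main(int argc, char **argv)
  {
    PetscErrorCode ierr;
    PetscLogStage  stageSetup, stageSolve;
    Mat            A;
    Vec            x, b;
    KSP            ksp;
    PetscInt       i, Istart, Iend, n = 100;

    ierr = PetscInitialize(&argc, &argv, NULL, NULL);if (ierr) return ierr;

    /* Small 1D Laplacian as a stand-in for the application matrix */
    ierr = MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 2, NULL, &A);CHKERRQ(ierr);
    ierr = MatGetOwnershipRange(A, &Istart, &Iend);CHKERRQ(ierr);
    for (i = Istart; i < Iend; i++) {
      if (i > 0)     { ierr = MatSetValue(A, i, i - 1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
      if (i < n - 1) { ierr = MatSetValue(A, i, i + 1, -1.0, INSERT_VALUES);CHKERRQ(ierr); }
      ierr = MatSetValue(A, i, i, 2.0, INSERT_VALUES);CHKERRQ(ierr);
    }
    ierr = MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);CHKERRQ(ierr);
    ierr = MatCreateVecs(A, &x, &b);CHKERRQ(ierr);
    ierr = VecSet(b, 1.0);CHKERRQ(ierr);

    ierr = KSPCreate(PETSC_COMM_WORLD, &ksp);CHKERRQ(ierr);
    ierr = KSPSetOperators(ksp, A, A);CHKERRQ(ierr);
    ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);

    ierr = PetscLogStageRegister("Solver setup", &stageSetup);CHKERRQ(ierr);
    ierr = PetscLogStageRegister("Solver solve", &stageSolve);CHKERRQ(ierr);

    ierr = PetscLogStagePush(stageSetup);CHKERRQ(ierr);
    ierr = KSPSetUp(ksp);CHKERRQ(ierr);       /* whatever setup can be done early is logged in this stage */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

    ierr = PetscLogStagePush(stageSolve);CHKERRQ(ierr);
    ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr); /* deferred setup plus the iterations are logged here */
    ierr = PetscLogStagePop();CHKERRQ(ierr);

    ierr = KSPDestroy(&ksp);CHKERRQ(ierr);
    ierr = VecDestroy(&x);CHKERRQ(ierr);
    ierr = VecDestroy(&b);CHKERRQ(ierr);
    ierr = MatDestroy(&A);CHKERRQ(ierr);
    ierr = PetscFinalize();
    return ierr;
  }

Running such a program with, say, -ksp_type cg -pc_type bjacobi -log_view (or the older -log_summary) prints a per-stage breakdown, so KSPSetUp/PCSetUp in the setup stage and KSPSolve/PCApply in the solve stage can be read off separately.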
Barry > > Best, > Karthik. > > > > > From: Barry Smith > Date: Tuesday, 28 September 2021 at 16:56 > To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) > > > > > On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: > > Hello, > > I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. > > > For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. > > It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. > > Barry > > > > > Thanks! > Karthik. > > This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Wed Sep 29 04:51:49 2021 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Wed, 29 Sep 2021 09:51:49 +0000 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> Message-ID: <4588B16F-528E-4869-BF87-FF5716D0A1FE@stfc.ac.uk> That was helpful. I would like to provide some additional details of my run on cpus and gpus. Please find the following attachments: 1. graph.pdf a plot showing overall time and various petsc events. 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary 3. 
ksp_ex45_N511_gpu_2.txt data file of the log_summary I used the following petsc options for cpu mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi -ksp_monitor and for gpus mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor to run the following problem https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html From the above code, I see is there no individual function called KSPSetUp(), so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, kSPSetComputeOperators all are timed together as KSPSetUp. For this example, is KSPSetUp time and KSPSolve time mutually exclusive? In your response you said that ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used.? I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for this particular preconditioner (bjacobi) how can I tell how they are timed? I am hoping to time KSP solving and preconditioning mutually exclusively. Kind regards, Karthik. From: Barry Smith Date: Tuesday, 28 September 2021 at 19:19 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" Cc: "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI > wrote: Thanks for Barry for your response. I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. If you do not call KSPSetUp() separately from KSPSolve() then its time is included with KSPSolve(). PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used. So yes the answer is not totally satisfying. The one thing I would recommend is to not call KSPSetUp() directly and then KSPSolve() will always include the total time of the solve plus all setup time. PCApply will contain all the time to apply the preconditioner but may also include some setup time. Barry Best, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 16:56 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: Hello, I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. 
It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. Barry Thanks! Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ksp_ex45_N511_cpu_6.txt URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: graph.pdf Type: application/pdf Size: 97687 bytes Desc: graph.pdf URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: ksp_ex45_N511_gpu_2.txt URL: From knepley at gmail.com Wed Sep 29 04:58:05 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Sep 2021 05:58:05 -0400 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: <4588B16F-528E-4869-BF87-FF5716D0A1FE@stfc.ac.uk> References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> <4588B16F-528E-4869-BF87-FF5716D0A1FE@stfc.ac.uk> Message-ID: On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalingam at stfc.ac.uk> wrote: > That was helpful. I would like to provide some additional details of my > run on cpus and gpus. Please find the following attachments: > > > > 1. graph.pdf a plot showing overall time and various petsc events. > 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary > 3. ksp_ex45_N511_gpu_2.txt data file of the log_summary > > > > I used the following petsc options for cpu > > > > mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi > -ksp_monitor > > > > and for gpus > > > > mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type > bjacobi -ksp_monitor > > > > to run the following problem > > > > https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html > > > > From the above code, I see is there no individual function called KSPSetUp(), > so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, > kSPSetComputeOperators all are timed together as KSPSetUp. For this > example, is KSPSetUp time and KSPSolve time mutually exclusive? > No, KSPSetUp() will be contained in KSPSolve() if it is called automatically. > In your response you said that > > > > ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it > depends on how much of the preconditioner construction can take place > early, so depends exactly on the preconditioner used.? 
> > > > I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for > this particular preconditioner (bjacobi) how can I tell how they are timed? > They are all inside KSPSolve(). If you have a preconditioned linear solve, the oreconditioning happens during the iteration. So an iteration would mostly consist of MatMult + PCApply, with some vector work. > I am hoping to time KSP solving and preconditioning mutually exclusively. > I am not sure that concept makes sense here. See above. Thanks, Matt > > > Kind regards, > > Karthik. > > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 19:19 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Thanks for Barry for your response. > > > > I was just benchmarking the problem with various preconditioner on cpu and > gpu. I understand, it is not possible to get mutually exclusive timing. > > However, can you tell if KSPSolve time includes both PCSetup and PCApply? > And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp > and PCApply. > > > > If you do not call KSPSetUp() separately from KSPSolve() then its time > is included with KSPSolve(). > > > > PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends > on how much of the preconditioner construction can take place early, so > depends exactly on the preconditioner used. > > > > So yes the answer is not totally satisfying. The one thing I would > recommend is to not call KSPSetUp() directly and then KSPSolve() will > always include the total time of the solve plus all setup time. PCApply > will contain all the time to apply the preconditioner but may also include > some setup time. > > > > Barry > > > > > > Best, > > Karthik. > > > > > > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 16:56 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > > On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Hello, > > > > I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson > problem. I noticed from the output from using the flag -log_summary that > for various events their respective %T (percent time in this phase) do not > add up to 100 but rather exceeds 100. So, I gather there is some overlap > among these events. I am primarily looking at the events KSPSetUp, > KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive > %T or Time for these individual events? I have attached the log_summary > output file from my run for your reference. > > > > > > For nested solvers it is tricky to get the times to be mutually > exclusive because some parts of the building of the preconditioner is for > some preconditioners delayed until the solve has started. > > > > It looks like you are using the default preconditioner options which for > this example are taking more or less no time since so many iterations are > needed. It is best to use -pc_type mg to use geometric multigrid on this > problem. > > > > Barry > > > > > > > > > Thanks! > > Karthik. 
> > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Wed Sep 29 05:24:46 2021 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Wed, 29 Sep 2021 10:24:46 +0000 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> <4588B16F-528E-4869-BF87-FF5716D0A1FE@stfc.ac.uk> Message-ID: Thank you Mathew. Now, it is all making sense to me. From data file ksp_ex45_N511_gpu_2.txt KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). However, you said ?So an iteration would mostly consist of MatMult + PCApply, with some vector work? The MalMult event is 4 %. How does this event figure into the above equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? Best, Karthik. From: Matthew Knepley Date: Wednesday, 29 September 2021 at 10:58 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" Cc: Barry Smith , "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI > wrote: That was helpful. I would like to provide some additional details of my run on cpus and gpus. Please find the following attachments: 1. graph.pdf a plot showing overall time and various petsc events. 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary 3. ksp_ex45_N511_gpu_2.txt data file of the log_summary I used the following petsc options for cpu mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi -ksp_monitor and for gpus mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor to run the following problem https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html From the above code, I see is there no individual function called KSPSetUp(), so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, kSPSetComputeOperators all are timed together as KSPSetUp. For this example, is KSPSetUp time and KSPSolve time mutually exclusive? No, KSPSetUp() will be contained in KSPSolve() if it is called automatically. 
In your response you said that ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used.? I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for this particular preconditioner (bjacobi) how can I tell how they are timed? They are all inside KSPSolve(). If you have a preconditioned linear solve, the oreconditioning happens during the iteration. So an iteration would mostly consist of MatMult + PCApply, with some vector work. I am hoping to time KSP solving and preconditioning mutually exclusively. I am not sure that concept makes sense here. See above. Thanks, Matt Kind regards, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 19:19 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI > wrote: Thanks for Barry for your response. I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. If you do not call KSPSetUp() separately from KSPSolve() then its time is included with KSPSolve(). PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used. So yes the answer is not totally satisfying. The one thing I would recommend is to not call KSPSetUp() directly and then KSPSolve() will always include the total time of the solve plus all setup time. PCApply will contain all the time to apply the preconditioner but may also include some setup time. Barry Best, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 16:56 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: Hello, I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. Barry Thanks! Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. 
UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 29 05:57:47 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Sep 2021 06:57:47 -0400 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> <4588B16F-528E-4869-BF87-FF5716D0A1FE@stfc.ac.uk> Message-ID: On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalingam at stfc.ac.uk> wrote: > Thank you Mathew. Now, it is all making sense to me. > > > > From data file ksp_ex45_N511_gpu_2.txt > > > > KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). > > > > However, you said ?So an iteration would mostly consist of MatMult + > PCApply, with some vector work? > 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than one process and using Block-Jacobi . Half the time is spent in the solve (53%) KSPSetUp 2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 KSPSolve 1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 4.4e+03 53100100100 95 53100100100 96 10881 11730 1022 6.40e+03 1021 8.17e-03 100 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which is all setup of the individual blocks, and this is all used by the numerical ILU factorization. PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 6.93e+03 0 0.00e+00 0 MatLUFactorNum 1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 63 0 2 6.93e+03 0 0.00e+00 0 MatILUFactorSym 1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 3) The preconditioner application takes 37% of the time, which is all solving the factors and recorded in MatSolve(). Matrix multiplication takes 4%. PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 0 0 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100 MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 MatMult 341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 2.0e+00 4 34100100 0 4 34100100 0 54801 66441 2 5.86e+03 0 0.00e+00 100 4) The significant vector time is all in norms (11%) since they are really slow on the GPU. 
VecNorm 342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 6.8e+02 11 5 0 0 15 11 5 0 0 15 1466 196884 0 0.00e+00 342 2.74e-03 100 VecTDot 680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 1.4e+03 1 10 0 0 29 1 10 0 0 29 106079 133922 0 0.00e+00 680 5.44e-03 100 VecAXPY 681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 56728 58367 682 5.34e+02 0 0.00e+00 100 VecAYPX 339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 34136 34153 339 2.71e-03 0 0.00e+00 100 So the solve time is: 53% ~ 37% + 4% + 11% and the setup time is about 16%. I was wrong about the SetUp time being included, as it is outside the event: https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852 It looks like the remainder of the time (23%) is spent preallocating the matrix. Thanks, Matt The MalMult event is 4 %. How does this event figure into the above > equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? > > > > Best, > > Karthik. > > > > *From: *Matthew Knepley > *Date: *Wednesday, 29 September 2021 at 10:58 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , "petsc-users at mcs.anl.gov" < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > That was helpful. I would like to provide some additional details of my > run on cpus and gpus. Please find the following attachments: > > > > 1. graph.pdf a plot showing overall time and various petsc events. > 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary > 3. ksp_ex45_N511_gpu_2.txt data file of the log_summary > > > > I used the following petsc options for cpu > > > > mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi > -ksp_monitor > > > > and for gpus > > > > mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type > bjacobi -ksp_monitor > > > > to run the following problem > > > > https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html > > > > From the above code, I see is there no individual function called KSPSetUp(), > so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, > kSPSetComputeOperators all are timed together as KSPSetUp. For this > example, is KSPSetUp time and KSPSolve time mutually exclusive? > > > > No, KSPSetUp() will be contained in KSPSolve() if it is called > automatically. > > > > In your response you said that > > > > ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it > depends on how much of the preconditioner construction can take place > early, so depends exactly on the preconditioner used.? > > > > I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for > this particular preconditioner (bjacobi) how can I tell how they are timed? > > > > They are all inside KSPSolve(). If you have a preconditioned linear solve, > the oreconditioning happens during the iteration. So an iteration would > mostly > > consist of MatMult + PCApply, with some vector work. > > > > I am hoping to time KSP solving and preconditioning mutually exclusively. > > > > I am not sure that concept makes sense here. See above. > > > > Thanks, > > > > Matt > > > > > > Kind regards, > > Karthik. 
> > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 19:19 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Thanks for Barry for your response. > > > > I was just benchmarking the problem with various preconditioner on cpu and > gpu. I understand, it is not possible to get mutually exclusive timing. > > However, can you tell if KSPSolve time includes both PCSetup and PCApply? > And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp > and PCApply. > > > > If you do not call KSPSetUp() separately from KSPSolve() then its time > is included with KSPSolve(). > > > > PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends > on how much of the preconditioner construction can take place early, so > depends exactly on the preconditioner used. > > > > So yes the answer is not totally satisfying. The one thing I would > recommend is to not call KSPSetUp() directly and then KSPSolve() will > always include the total time of the solve plus all setup time. PCApply > will contain all the time to apply the preconditioner but may also include > some setup time. > > > > Barry > > > > > > Best, > > Karthik. > > > > > > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 16:56 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Hello, > > > > I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson > problem. I noticed from the output from using the flag -log_summary that > for various events their respective %T (percent time in this phase) do not > add up to 100 but rather exceeds 100. So, I gather there is some overlap > among these events. I am primarily looking at the events KSPSetUp, > KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive > %T or Time for these individual events? I have attached the log_summary > output file from my run for your reference. > > > > > > For nested solvers it is tricky to get the times to be mutually > exclusive because some parts of the building of the preconditioner is for > some preconditioners delayed until the solve has started. > > > > It looks like you are using the default preconditioner options which for > this example are taking more or less no time since so many iterations are > needed. It is best to use -pc_type mg to use geometric multigrid on this > problem. > > > > Barry > > > > > > > > Thanks! > > Karthik. > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. 
UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Wed Sep 29 07:46:24 2021 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 29 Sep 2021 12:46:24 +0000 Subject: [petsc-users] Disconnected domains and Poisson equation Message-ID: Good morning, I want to solve the Poisson equation on a 3D domain with 2 non-connected sub-domains. I am using FGMRES+GAMG and I have no problem if the two sub-domains see a Dirichlet boundary condition each. On the same domain I would like to solve the Poisson equation imposing periodic boundary condition in one direction and homogenous Neumann boundary conditions in the other two directions. The two sub-domains are symmetric with respect to the separation between them and the operator discretization and the right hand side are symmetric as well. It would be nice to have the same solution in both the sub-domains. Setting the null space to the constant, the solver converges to a solution having the same gradients in both sub-domains but different values. Am I doing some wrong with the null space? I'm not setting a block matrix (one block for each sub-domain), should I? I tested the null space against the matrix using MatNullSpaceTest and the answer is true. Can I do something more to have a symmetric solution as outcome of the solver? Thank you in advance for any comments and hints. Best regards, Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Wed Sep 29 07:58:42 2021 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 29 Sep 2021 12:58:42 +0000 Subject: [petsc-users] FGMRES and BCGS Message-ID: Good Morning, I usually solve a non-symmetric discretization of the Poisson equation using GAMG+FGMRES. In the last days I tried to use BCGS in place of FGMRES, still using GAMG as preconditioner. No problem in finding the solution but I'm experiencing something I didn't expect. The test case is a 25 millions cells domain with Dirichlet and Neumann boundary conditions. Both the solvers are able to solve the problem with an increasing number of MPI processes, but: * FGMRES is about 25% faster than BCGS for all the processes number * Both solvers have the same scalability from 48 to 384 processes * Both solvers almost use the same amount of memory (FGMRES use a restart=30) Am I wrong expecting less memory consumption and more performance from BCGS with respect to FGMRES? Thank you in advance for any help. Best regards, Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Wed Sep 29 08:58:49 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 29 Sep 2021 09:58:49 -0400 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: References: Message-ID: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> The problem actually has a two dimensional null space; constant on each domain but possibly different constants. I think you need to build the MatNullSpace by explicitly constructing two vectors, one with 0 on one domain and constant value on the other and one with 0 on the other domain and constant on the first. Separate note: why use FGMRES instead of just GMRES? If the problem is linear and the preconditioner is linear (no GMRES inside the smoother) then you can just use GMRES and it will save a little space/work and be conceptually clearer. Barry > On Sep 29, 2021, at 8:46 AM, Marco Cisternino wrote: > > Good morning, > I want to solve the Poisson equation on a 3D domain with 2 non-connected sub-domains. > I am using FGMRES+GAMG and I have no problem if the two sub-domains see a Dirichlet boundary condition each. > On the same domain I would like to solve the Poisson equation imposing periodic boundary condition in one direction and homogenous Neumann boundary conditions in the other two directions. The two sub-domains are symmetric with respect to the separation between them and the operator discretization and the right hand side are symmetric as well. It would be nice to have the same solution in both the sub-domains. > Setting the null space to the constant, the solver converges to a solution having the same gradients in both sub-domains but different values. > Am I doing some wrong with the null space? I?m not setting a block matrix (one block for each sub-domain), should I? > I tested the null space against the matrix using MatNullSpaceTest and the answer is true. Can I do something more to have a symmetric solution as outcome of the solver? > Thank you in advance for any comments and hints. > > Best regards, > > Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Wed Sep 29 09:18:53 2021 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Wed, 29 Sep 2021 14:18:53 +0000 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> <4588B16F-528E-4869-BF87-FF5716D0A1FE@stfc.ac.uk> Message-ID: <8CD0BC94-1C5A-48B7-93B3-F5C467CAC1E0@stfc.ac.uk> Thank you! Just to summarize KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 100 % You didn?t happen to mention how MatCUSPARSSolAnl is accounted for? Am I right in accounting for it as above? MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 Finally, I believe the vector events, VecNorn, VecTDot, VecAXPY, and VecAYPX are mutually exclusive? Best, Karthik. From: Matthew Knepley Date: Wednesday, 29 September 2021 at 11:58 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" Cc: Barry Smith , "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI > wrote: Thank you Mathew. Now, it is all making sense to me. 
From data file ksp_ex45_N511_gpu_2.txt KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). However, you said ?So an iteration would mostly consist of MatMult + PCApply, with some vector work? 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than one process and using Block-Jacobi . Half the time is spent in the solve (53%) KSPSetUp 2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 KSPSolve 1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 4.4e+03 53100100100 95 53100100100 96 10881 11730 1022 6.40e+03 1021 8.17e-03 100 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which is all setup of the individual blocks, and this is all used by the numerical ILU factorization. PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 6.93e+03 0 0.00e+00 0 MatLUFactorNum 1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 63 0 2 6.93e+03 0 0.00e+00 0 MatILUFactorSym 1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 3) The preconditioner application takes 37% of the time, which is all solving the factors and recorded in MatSolve(). Matrix multiplication takes 4%. PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 0 0 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100 MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 MatMult 341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 2.0e+00 4 34100100 0 4 34100100 0 54801 66441 2 5.86e+03 0 0.00e+00 100 4) The significant vector time is all in norms (11%) since they are really slow on the GPU. VecNorm 342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 6.8e+02 11 5 0 0 15 11 5 0 0 15 1466 196884 0 0.00e+00 342 2.74e-03 100 VecTDot 680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 1.4e+03 1 10 0 0 29 1 10 0 0 29 106079 133922 0 0.00e+00 680 5.44e-03 100 VecAXPY 681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 56728 58367 682 5.34e+02 0 0.00e+00 100 VecAYPX 339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 34136 34153 339 2.71e-03 0 0.00e+00 100 So the solve time is: 53% ~ 37% + 4% + 11% and the setup time is about 16%. I was wrong about the SetUp time being included, as it is outside the event: https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852 It looks like the remainder of the time (23%) is spent preallocating the matrix. Thanks, Matt The MalMult event is 4 %. How does this event figure into the above equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? Best, Karthik. From: Matthew Knepley > Date: Wednesday, 29 September 2021 at 10:58 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: Barry Smith >, "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI > wrote: That was helpful. I would like to provide some additional details of my run on cpus and gpus. Please find the following attachments: 1. graph.pdf a plot showing overall time and various petsc events. 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary 3. 
ksp_ex45_N511_gpu_2.txt data file of the log_summary I used the following petsc options for cpu mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi -ksp_monitor and for gpus mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor to run the following problem https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html From the above code, I see is there no individual function called KSPSetUp(), so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, kSPSetComputeOperators all are timed together as KSPSetUp. For this example, is KSPSetUp time and KSPSolve time mutually exclusive? No, KSPSetUp() will be contained in KSPSolve() if it is called automatically. In your response you said that ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used.? I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for this particular preconditioner (bjacobi) how can I tell how they are timed? They are all inside KSPSolve(). If you have a preconditioned linear solve, the oreconditioning happens during the iteration. So an iteration would mostly consist of MatMult + PCApply, with some vector work. I am hoping to time KSP solving and preconditioning mutually exclusively. I am not sure that concept makes sense here. See above. Thanks, Matt Kind regards, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 19:19 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI > wrote: Thanks for Barry for your response. I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. If you do not call KSPSetUp() separately from KSPSolve() then its time is included with KSPSolve(). PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used. So yes the answer is not totally satisfying. The one thing I would recommend is to not call KSPSetUp() directly and then KSPSolve() will always include the total time of the solve plus all setup time. PCApply will contain all the time to apply the preconditioner but may also include some setup time. Barry Best, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 16:56 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: Hello, I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. 
Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. Barry Thanks! Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 29 10:29:05 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Sep 2021 11:29:05 -0400 Subject: [petsc-users] %T (percent time in this phase) In-Reply-To: <8CD0BC94-1C5A-48B7-93B3-F5C467CAC1E0@stfc.ac.uk> References: <20E5B029-43D3-493C-873E-EB8F8CD92E08@stfc.ac.uk> <00A59A5B-7093-4FF1-9712-D0E6296E61D6@petsc.dev> <64B8653D-6E4C-4F6D-AA7F-C1A6A7693B75@stfc.ac.uk> <9123E727-A05A-4614-B90B-852EE0088895@petsc.dev> <4588B16F-528E-4869-BF87-FF5716D0A1FE@stfc.ac.uk> <8CD0BC94-1C5A-48B7-93B3-F5C467CAC1E0@stfc.ac.uk> Message-ID: On Wed, Sep 29, 2021 at 10:18 AM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalingam at stfc.ac.uk> wrote: > Thank you! > > > > Just to summarize > > > > KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl > (9%) ~ 100 % > > > > You didn?t happen to mention how MatCUSPARSSolAnl is accounted for? Am I > right in accounting for it as above? > I am not sure.I thought it might be the GPU part of MatSolve(). I will have to look in the code. I am not as familiar with the GPU part. > MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > > > > Finally, I believe the vector events, VecNorn, VecTDot, VecAXPY, and > VecAYPX are mutually exclusive? > Yes. Thanks, Matt > Best, > > > > Karthik. 
> > > > *From: *Matthew Knepley > *Date: *Wednesday, 29 September 2021 at 11:58 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , "petsc-users at mcs.anl.gov" < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > Thank you Mathew. Now, it is all making sense to me. > > > > From data file ksp_ex45_N511_gpu_2.txt > > > > KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). > > > > However, you said ?So an iteration would mostly consist of MatMult + > PCApply, with some vector work? > > > > 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than > one process and using Block-Jacobi . Half the time is spent in the solve > (53%) > > > > KSPSetUp 2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 > > KSPSolve 1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 4.4e+03 53100100100 95 53100100100 96 10881 11730 1022 6.40e+03 1021 8.17e-03 100 > > > > 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which > is all setup of the individual blocks, and this is all used by the > numerical ILU factorization. > > > > PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 > 0 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 > 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 > 6.93e+03 0 0.00e+00 0 > > MatLUFactorNum 1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 63 0 2 6.93e+03 0 0.00e+00 0 > > MatILUFactorSym 1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > > > > 3) The preconditioner application takes 37% of the time, which is all > solving the factors and recorded in MatSolve(). Matrix multiplication takes > 4%. > > > > PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 > 0 0 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100 > > MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 > > MatMult 341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 2.0e+00 4 34100100 0 4 34100100 0 54801 66441 2 5.86e+03 0 0.00e+00 100 > > > > 4) The significant vector time is all in norms (11%) since they are really > slow on the GPU. > > > > VecNorm 342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 6.8e+02 11 5 0 0 15 11 5 0 0 15 1466 196884 0 0.00e+00 342 2.74e-03 100 > > VecTDot 680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 1.4e+03 1 10 0 0 29 1 10 0 0 29 106079 133922 0 0.00e+00 680 5.44e-03 100 > > VecAXPY 681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 56728 58367 682 5.34e+02 0 0.00e+00 100 > > VecAYPX 339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 34136 34153 339 2.71e-03 0 0.00e+00 100 > > > > So the solve time is: > > > > 53% ~ 37% + 4% + 11% > > > > and the setup time is about 16%. I was wrong about the SetUp time being > included, as it is outside the event: > > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852 > > > > It looks like the remainder of the time (23%) is spent preallocating the > matrix. > > > > Thanks, > > > > Matt > > > > The MalMult event is 4 %. 
How does this event figure into the above > equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? > > > > Best, > > Karthik. > > > > *From: *Matthew Knepley > *Date: *Wednesday, 29 September 2021 at 10:58 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , "petsc-users at mcs.anl.gov" < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > That was helpful. I would like to provide some additional details of my > run on cpus and gpus. Please find the following attachments: > > > > 1. graph.pdf a plot showing overall time and various petsc events. > 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary > 3. ksp_ex45_N511_gpu_2.txt data file of the log_summary > > > > I used the following petsc options for cpu > > > > mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi > -ksp_monitor > > > > and for gpus > > > > mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type > bjacobi -ksp_monitor > > > > to run the following problem > > > > https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html > > > > From the above code, I see is there no individual function called KSPSetUp(), > so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, > kSPSetComputeOperators all are timed together as KSPSetUp. For this > example, is KSPSetUp time and KSPSolve time mutually exclusive? > > > > No, KSPSetUp() will be contained in KSPSolve() if it is called > automatically. > > > > In your response you said that > > > > ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it > depends on how much of the preconditioner construction can take place > early, so depends exactly on the preconditioner used.? > > > > I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for > this particular preconditioner (bjacobi) how can I tell how they are timed? > > > > They are all inside KSPSolve(). If you have a preconditioned linear solve, > the oreconditioning happens during the iteration. So an iteration would > mostly > > consist of MatMult + PCApply, with some vector work. > > > > I am hoping to time KSP solving and preconditioning mutually exclusively. > > > > I am not sure that concept makes sense here. See above. > > > > Thanks, > > > > Matt > > > > > > Kind regards, > > Karthik. > > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 19:19 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Thanks for Barry for your response. > > > > I was just benchmarking the problem with various preconditioner on cpu and > gpu. I understand, it is not possible to get mutually exclusive timing. > > However, can you tell if KSPSolve time includes both PCSetup and PCApply? > And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp > and PCApply. > > > > If you do not call KSPSetUp() separately from KSPSolve() then its time > is included with KSPSolve(). 
> > > > PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends > on how much of the preconditioner construction can take place early, so > depends exactly on the preconditioner used. > > > > So yes the answer is not totally satisfying. The one thing I would > recommend is to not call KSPSetUp() directly and then KSPSolve() will > always include the total time of the solve plus all setup time. PCApply > will contain all the time to apply the preconditioner but may also include > some setup time. > > > > Barry > > > > > > Best, > > Karthik. > > > > > > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 16:56 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Hello, > > > > I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson > problem. I noticed from the output from using the flag -log_summary that > for various events their respective %T (percent time in this phase) do not > add up to 100 but rather exceeds 100. So, I gather there is some overlap > among these events. I am primarily looking at the events KSPSetUp, > KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive > %T or Time for these individual events? I have attached the log_summary > output file from my run for your reference. > > > > > > For nested solvers it is tricky to get the times to be mutually > exclusive because some parts of the building of the preconditioner is for > some preconditioners delayed until the solve has started. > > > > It looks like you are using the default preconditioner options which for > this example are taking more or less no time since so many iterations are > needed. It is best to use -pc_type mg to use geometric multigrid on this > problem. > > > > Barry > > > > > > > > Thanks! > > Karthik. > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Wed Sep 29 10:53:37 2021 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 29 Sep 2021 15:53:37 +0000 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> References: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> Message-ID: Thank you Barry for the quick reply. About the null space: I already tried what you suggest, building 2 Vec (constants) with 0 and 1 chosen by sub-domain, normalizing them and setting the null space like this MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,nconstants,constants,&nullspace); The solution is slightly different in values but it is still different in the two sub-domains. About the solver: I tried BCGS, GMRES and FGMRES. The linear system is a pressure system in a navier-stokes solver and only solving with FGMRES makes the CFD stable, with BCGS and GMRES the CFD solution diverges. Moreover, in the same case but with a single domain, CFD solution is stable using all the solvers, but FGMRES converges in much less iterations than the others. Marco Cisternino From: Barry Smith Sent: mercoled? 29 settembre 2021 15:59 To: Marco Cisternino Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Disconnected domains and Poisson equation The problem actually has a two dimensional null space; constant on each domain but possibly different constants. I think you need to build the MatNullSpace by explicitly constructing two vectors, one with 0 on one domain and constant value on the other and one with 0 on the other domain and constant on the first. Separate note: why use FGMRES instead of just GMRES? If the problem is linear and the preconditioner is linear (no GMRES inside the smoother) then you can just use GMRES and it will save a little space/work and be conceptually clearer. Barry On Sep 29, 2021, at 8:46 AM, Marco Cisternino > wrote: Good morning, I want to solve the Poisson equation on a 3D domain with 2 non-connected sub-domains. I am using FGMRES+GAMG and I have no problem if the two sub-domains see a Dirichlet boundary condition each. On the same domain I would like to solve the Poisson equation imposing periodic boundary condition in one direction and homogenous Neumann boundary conditions in the other two directions. The two sub-domains are symmetric with respect to the separation between them and the operator discretization and the right hand side are symmetric as well. It would be nice to have the same solution in both the sub-domains. Setting the null space to the constant, the solver converges to a solution having the same gradients in both sub-domains but different values. Am I doing some wrong with the null space? I?m not setting a block matrix (one block for each sub-domain), should I? I tested the null space against the matrix using MatNullSpaceTest and the answer is true. Can I do something more to have a symmetric solution as outcome of the solver? Thank you in advance for any comments and hints. Best regards, Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From marco.cisternino at optimad.it Wed Sep 29 10:59:00 2021 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Wed, 29 Sep 2021 15:59:00 +0000 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: References: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> Message-ID: For sake of completeness, explicitly building the null space using a vector per sub-domain makes the CFD runs using BCGS and GMRES more stable, but still slower than FGMRES. I had divergence using BCGS and GMRES setting the null space with only one constant. Thanks Marco Cisternino From: Marco Cisternino Sent: mercoledì 29 settembre 2021 17:54 To: Barry Smith Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Disconnected domains and Poisson equation Thank you Barry for the quick reply. About the null space: I already tried what you suggest, building 2 Vec (constants) with 0 and 1 chosen by sub-domain, normalizing them and setting the null space like this MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,nconstants,constants,&nullspace); The solution is slightly different in values but it is still different in the two sub-domains. About the solver: I tried BCGS, GMRES and FGMRES. The linear system is a pressure system in a Navier-Stokes solver and only solving with FGMRES makes the CFD stable; with BCGS and GMRES the CFD solution diverges. Moreover, in the same case but with a single domain, the CFD solution is stable using all the solvers, but FGMRES converges in many fewer iterations than the others. Marco Cisternino From: Barry Smith > Sent: mercoledì 29 settembre 2021 15:59 To: Marco Cisternino > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Disconnected domains and Poisson equation The problem actually has a two-dimensional null space; constant on each domain but possibly different constants. I think you need to build the MatNullSpace by explicitly constructing two vectors, one with 0 on one domain and constant value on the other and one with 0 on the other domain and constant on the first. Separate note: why use FGMRES instead of just GMRES? If the problem is linear and the preconditioner is linear (no GMRES inside the smoother) then you can just use GMRES and it will save a little space/work and be conceptually clearer. Barry On Sep 29, 2021, at 8:46 AM, Marco Cisternino > wrote: Good morning, I want to solve the Poisson equation on a 3D domain with 2 non-connected sub-domains. I am using FGMRES+GAMG and I have no problem if the two sub-domains see a Dirichlet boundary condition each. On the same domain I would like to solve the Poisson equation imposing periodic boundary conditions in one direction and homogeneous Neumann boundary conditions in the other two directions. The two sub-domains are symmetric with respect to the separation between them and the operator discretization and the right hand side are symmetric as well. It would be nice to have the same solution in both the sub-domains. Setting the null space to the constant, the solver converges to a solution having the same gradients in both sub-domains but different values. Am I doing something wrong with the null space? I'm not setting a block matrix (one block for each sub-domain), should I? I tested the null space against the matrix using MatNullSpaceTest and the answer is true. Can I do something more to have a symmetric solution as outcome of the solver? Thank you in advance for any comments and hints. Best regards, Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed...
URL: From bsmith at petsc.dev Wed Sep 29 11:33:40 2021 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 29 Sep 2021 12:33:40 -0400 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: References: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> Message-ID: <10EA28EF-AD98-4F59-A78D-7DE3D4B585DE@petsc.dev> > On Sep 29, 2021, at 11:59 AM, Marco Cisternino wrote: > > For sake of completeness, explicitly building the null space using a vector per sub-domain make s the CFD runs using BCGS and GMRES more stable, but still slower than FGMRES. Something is strange. Please run with -ksp_view and send the output on the solver details. > I had divergence using BCGS and GMRES setting the null space with only one constant. > Thanks > > Marco Cisternino > > From: Marco Cisternino > Sent: mercoled? 29 settembre 2021 17:54 > To: Barry Smith > Cc: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Disconnected domains and Poisson equation > > Thank you Barry for the quick reply. > About the null space: I already tried what you suggest, building 2 Vec (constants) with 0 and 1 chosen by sub-domain, normalizing them and setting the null space like this > MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,nconstants,constants,&nullspace); > The solution is slightly different in values but it is still different in the two sub-domains. > About the solver: I tried BCGS, GMRES and FGMRES. The linear system is a pressure system in a navier-stokes solver and only solving with FGMRES makes the CFD stable, with BCGS and GMRES the CFD solution diverges. Moreover, in the same case but with a single domain, CFD solution is stable using all the solvers, but FGMRES converges in much less iterations than the others. > > Marco Cisternino > > From: Barry Smith > > Sent: mercoled? 29 settembre 2021 15:59 > To: Marco Cisternino > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Disconnected domains and Poisson equation > > > The problem actually has a two dimensional null space; constant on each domain but possibly different constants. I think you need to build the MatNullSpace by explicitly constructing two vectors, one with 0 on one domain and constant value on the other and one with 0 on the other domain and constant on the first. > > Separate note: why use FGMRES instead of just GMRES? If the problem is linear and the preconditioner is linear (no GMRES inside the smoother) then you can just use GMRES and it will save a little space/work and be conceptually clearer. > > Barry > > > On Sep 29, 2021, at 8:46 AM, Marco Cisternino > wrote: > > Good morning, > I want to solve the Poisson equation on a 3D domain with 2 non-connected sub-domains. > I am using FGMRES+GAMG and I have no problem if the two sub-domains see a Dirichlet boundary condition each. > On the same domain I would like to solve the Poisson equation imposing periodic boundary condition in one direction and homogenous Neumann boundary conditions in the other two directions. The two sub-domains are symmetric with respect to the separation between them and the operator discretization and the right hand side are symmetric as well. It would be nice to have the same solution in both the sub-domains. > Setting the null space to the constant, the solver converges to a solution having the same gradients in both sub-domains but different values. > Am I doing some wrong with the null space? I?m not setting a block matrix (one block for each sub-domain), should I? 
> I tested the null space against the matrix using MatNullSpaceTest and the answer is true. Can I do something more to have a symmetric solution as outcome of the solver? > Thank you in advance for any comments and hints. > > Best regards, > > Marco Cisternino > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 29 14:09:44 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Sep 2021 15:09:44 -0400 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: References: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> Message-ID: On Wed, Sep 29, 2021 at 11:53 AM Marco Cisternino < marco.cisternino at optimad.it> wrote: > Thank you Barry for the quick reply. > > About the null space: I already tried what you suggest, building 2 Vec > (constants) with 0 and 1 chosen by sub-domain, normalizing them and setting > the null space like this > > > MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,nconstants,constants,&nullspace); > > The solution is slightly different in values but it is still different in > the two sub-domains. > > About the solver: I tried BCGS, GMRES and FGMRES. The linear system is a > pressure system in a navier-stokes solver and only solving with FGMRES > makes the CFD stable, with BCGS and GMRES the CFD solution diverges. > Moreover, in the same case but with a single domain, CFD solution is stable > using all the solvers, but FGMRES converges in much less iterations than > the others. > I think this means something is wrong with the implementation. FGMRES is the same as GMRES _if_ the preconditioner is a linear operator. The fact that they are different means that your preconditioner is nonlinear. Is this what you expect? Thanks, Matt > Marco Cisternino > > > > *From:* Barry Smith > *Sent:* mercoled? 29 settembre 2021 15:59 > *To:* Marco Cisternino > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Disconnected domains and Poisson equation > > > > > > The problem actually has a two dimensional null space; constant on each > domain but possibly different constants. I think you need to build the > MatNullSpace by explicitly constructing two vectors, one with 0 on one > domain and constant value on the other and one with 0 on the other domain > and constant on the first. > > > > Separate note: why use FGMRES instead of just GMRES? If the problem is > linear and the preconditioner is linear (no GMRES inside the smoother) then > you can just use GMRES and it will save a little space/work and be > conceptually clearer. > > > > Barry > > > > On Sep 29, 2021, at 8:46 AM, Marco Cisternino > wrote: > > > > Good morning, > > I want to solve the Poisson equation on a 3D domain with 2 non-connected > sub-domains. > > I am using FGMRES+GAMG and I have no problem if the two sub-domains see a > Dirichlet boundary condition each. > > On the same domain I would like to solve the Poisson equation imposing > periodic boundary condition in one direction and homogenous Neumann > boundary conditions in the other two directions. The two sub-domains are > symmetric with respect to the separation between them and the operator > discretization and the right hand side are symmetric as well. It would be > nice to have the same solution in both the sub-domains. > > Setting the null space to the constant, the solver converges to a solution > having the same gradients in both sub-domains but different values. > > Am I doing some wrong with the null space? 
I'm not setting a block matrix > (one block for each sub-domain), should I? > > I tested the null space against the matrix using MatNullSpaceTest and the > answer is true. Can I do something more to have a symmetric solution as > outcome of the solver? > > Thank you in advance for any comments and hints. > > > > Best regards, > > > > Marco Cisternino > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Wed Sep 29 16:18:46 2021 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Wed, 29 Sep 2021 17:18:46 -0400 Subject: [petsc-users] Is it possible to keep track of original elements # after a call to DMPlexDistribute ? In-Reply-To: <7236c736-6066-1ba3-55b1-60782d8e754f@giref.ulaval.ca> References: <7236c736-6066-1ba3-55b1-60782d8e754f@giref.ulaval.ca> Message-ID: Hi, I come back with _almost_ the original question: I would like to add an integer (*our* original element number, not the PETSc one) to each element of the DMPlex I create with DMPlexBuildFromCellListParallel. I would like this integer to be distributed in the same way DMPlexDistribute distributes the mesh. Is it possible to do this? Thanks, Eric On 2021-07-14 1:18 p.m., Eric Chamberland wrote: > Hi, > > I want to use DMPlexDistribute from PETSc for computing the overlap > and playing with the different partitioners supported. > > However, after calling DMPlexDistribute, I noticed the elements are > renumbered and then the original number is lost. > > What would be the best way to keep track of the element renumbering? > > a) Adding an optional parameter to let the user retrieve a vector or > "IS" giving the old number? > > b) Adding a DMLabel (seems like a false good idea) > > c) Other idea? > > Of course, I don't want to lose performance because of this > "mapping"... > > Thanks, > > Eric > -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Université Laval (418) 656-2131 poste 41 22 42 From rthirumalaisam1857 at sdsu.edu Wed Sep 29 16:37:12 2021 From: rthirumalaisam1857 at sdsu.edu (Ramakrishnan Thirumalaisamy) Date: Wed, 29 Sep 2021 14:37:12 -0700 Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system Message-ID: Hi all, I am trying to solve the Helmholtz equation for temperature T: (C I + Div D grad) T = f in IBAMR, in which C gives the spatially varying diagonal entries and D is the spatially varying diffusion coefficient. I use a matrix-free solver with a matrix-based PETSc preconditioner. For the matrix-free solver, I use GMRES, and for the matrix-based preconditioner, I use a Richardson KSP with Jacobi as the preconditioner. As the simulation progresses, the iterations start to increase. To understand the cause, I set D to be zero, which results in a diagonal system: C T = f. This should result in convergence within a single iteration, but I get convergence in 3 iterations. Residual norms for temperature_ solve.
0 KSP preconditioned resid norm 4.590811647875e-02 true resid norm 2.406067589273e+09 ||r(i)||/||b|| 4.455533946945e-05 1 KSP preconditioned resid norm 2.347767895880e-06 true resid norm 1.210763896685e+05 ||r(i)||/||b|| 2.242081505717e-09 2 KSP preconditioned resid norm 1.245406571896e-10 true resid norm 6.328828824310e+00 ||r(i)||/||b|| 1.171966730978e-13 Linear temperature_ solve converged due to CONVERGED_RTOL iterations 2 To verify that I am indeed solving a diagonal system I printed the PETSc matrix from the preconditioner and viewed it in Matlab. It indeed shows it to be a diagonal system. Attached is the plot of the spy command on the printed matrix. The matrix in binary form is also attached. My understanding is that because the C coefficient is varying in 4 orders of magnitude, i.e., Max(C)/Min(C) ~ 10^4, the matrix is poorly scaled. When I rescale my matrix by 1/C then the system converges in 1 iteration as expected. Is my understanding correct, and that scaling 1/C should be done even for a diagonal system? When D is non-zero, then scaling by 1/C seems to be very inconvenient as D is stored as side-centered data for the matrix free solver. In the case that I do not scale my equations by 1/C, is there some solver setting that improves the convergence rate? (With D as non-zero, I have also tried gmres as the ksp solver in the matrix-based preconditioner to get better performance, but it didn't matter much.) Thanks, Ramakrishnan Thirumalaisamy San Diego State University. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Temperature_fill.pdf Type: application/pdf Size: 56939 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: matrix_temperature Type: application/octet-stream Size: 262160 bytes Desc: not available URL: From jed at jedbrown.org Wed Sep 29 17:28:28 2021 From: jed at jedbrown.org (Jed Brown) Date: Wed, 29 Sep 2021 16:28:28 -0600 Subject: [petsc-users] FGMRES and BCGS In-Reply-To: References: Message-ID: <87a6jvc8z7.fsf@jedbrown.org> It is not surprising. BCGS uses less memory for the Krylov vectors, but that might be a small fraction of the total memory used (considering your matrix and GAMG). FGMRES(30) needs 60 work vectors (2 per iteration). If you're using a linear (non-iterative) preconditioner, then you don't need a flexible method -- plain GMRES should be fine. FGMRES uses the unpreconditioned norm, which you can also get via -ksp_type gmres -ksp_norm_type unpreconditioned. This classic paper shows that for any class of nonsymmetric Krylov method, there are matrices in which that method outperforms every other method by at least sqrt(N). https://epubs.siam.org/doi/10.1137/0613049 Marco Cisternino writes: > Good Morning, > I usually solve a non-symmetric discretization of the Poisson equation using GAMG+FGMRES. > In the last days I tried to use BCGS in place of FGMRES, still using GAMG as preconditioner. > No problem in finding the solution but I'm experiencing something I didn't expect. > The test case is a 25 millions cells domain with Dirichlet and Neumann boundary conditions. 
> Both the solvers are able to solve the problem with an increasing number of MPI processes, but: > > * FGMRES is about 25% faster than BCGS for all process counts > * Both solvers have the same scalability from 48 to 384 processes > * Both solvers use almost the same amount of memory (FGMRES uses restart=30) > Am I wrong to expect less memory consumption and more performance from BCGS with respect to FGMRES? > Thank you in advance for any help. > > Best regards, > Marco Cisternino From hzhang at mcs.anl.gov Wed Sep 29 17:39:19 2021 From: hzhang at mcs.anl.gov (Zhang, Hong) Date: Wed, 29 Sep 2021 22:39:19 +0000 Subject: [petsc-users] FGMRES and BCGS In-Reply-To: <87a6jvc8z7.fsf@jedbrown.org> References: <87a6jvc8z7.fsf@jedbrown.org> Message-ID: See https://doc.comsol.com/5.5/doc/com.comsol.help.comsol/comsol_ref_solver.27.123.html The Iterative Solvers - COMSOL Multiphysics The relaxation factor ω to some extent controls the stability and convergence properties of a numerical solver by shifting its eigenvalue spectrum. The optimal value for the relaxation factor can improve convergence significantly - for example, for SOR when used as a solver. However, the optimal choice is typically a subtle task with arbitrary complexity. doc.comsol.com We use BCGS for constant memory, while GMRES without restart requires increasing memory but has more predictable convergence than BCGS. Hong ________________________________ From: petsc-users on behalf of Jed Brown Sent: Wednesday, September 29, 2021 5:28 PM To: Marco Cisternino ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] FGMRES and BCGS It is not surprising. BCGS uses less memory for the Krylov vectors, but that might be a small fraction of the total memory used (considering your matrix and GAMG). FGMRES(30) needs 60 work vectors (2 per iteration). If you're using a linear (non-iterative) preconditioner, then you don't need a flexible method -- plain GMRES should be fine. FGMRES uses the unpreconditioned norm, which you can also get via -ksp_type gmres -ksp_norm_type unpreconditioned. This classic paper shows that for any class of nonsymmetric Krylov method, there are matrices in which that method outperforms every other method by at least sqrt(N). https://epubs.siam.org/doi/10.1137/0613049 Marco Cisternino writes: > Good Morning, > I usually solve a non-symmetric discretization of the Poisson equation using GAMG+FGMRES. > In the last few days I tried to use BCGS in place of FGMRES, still using GAMG as the preconditioner. > No problem in finding the solution but I'm experiencing something I didn't expect. > The test case is a 25 million cell domain with Dirichlet and Neumann boundary conditions. > Both the solvers are able to solve the problem with an increasing number of MPI processes, but: > > * FGMRES is about 25% faster than BCGS for all process counts > * Both solvers have the same scalability from 48 to 384 processes > * Both solvers use almost the same amount of memory (FGMRES uses restart=30) > Am I wrong to expect less memory consumption and more performance from BCGS with respect to FGMRES? > Thank you in advance for any help. > > Best regards, > Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed...
URL: From knepley at gmail.com Wed Sep 29 17:58:46 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Sep 2021 18:58:46 -0400 Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system In-Reply-To: References: Message-ID: On Wed, Sep 29, 2021 at 6:03 PM Ramakrishnan Thirumalaisamy < rthirumalaisam1857 at sdsu.edu> wrote: > Hi all, > > I am trying to solve the Helmholtz equation for temperature T: > > (C I + Div D grad) T = f > > in IBAMR, in which C is the spatially varying diagonal entries, and D is > the spatially varying diffusion coefficient. I use a matrix-free solver > with matrix-based PETSc preconditioner. For the matrix-free solver, I use > gmres solver and for the matrix based preconditioner, I use Richardson ksp > + Jacobi as a preconditioner. As the simulation progresses, the iterations > start to increase. To understand the cause, I set D to be zero, which > results in a diagonal system: > > C T = f. > > This should result in convergence within a single iteration, but I get > convergence in 3 iterations. > > Residual norms for temperature_ solve. > > 0 KSP preconditioned resid norm 4.590811647875e-02 true resid norm > 2.406067589273e+09 ||r(i)||/||b|| 4.455533946945e-05 > > 1 KSP preconditioned resid norm 2.347767895880e-06 true resid norm > 1.210763896685e+05 ||r(i)||/||b|| 2.242081505717e-09 > > 2 KSP preconditioned resid norm 1.245406571896e-10 true resid norm > 6.328828824310e+00 ||r(i)||/||b|| 1.171966730978e-13 > > Linear temperature_ solve converged due to CONVERGED_RTOL iterations 2 > Several things look off here: 1) Your true residual norm is 2.4e9, but r_0/b is 4.4e-5. That seems to indicate that ||b|| is 1e14. Is this true? 2) Your preconditioned residual is 11 orders of magnitude less than the true residual. This usually indicates that the system is near singular. 3) The disparity above does not seem possible if C only has elements ~ 1e4. The preconditioner consistently has norm around 1e-11. 4) Using numbers that large can be a problem. You lose precision, so that you really only have 3-4 correct digits each time, as you see above. It appears to be doing iterative refinement with a very low precision solve. Thanks, Matt > To verify that I am indeed solving a diagonal system I printed the PETSc > matrix from the preconditioner and viewed it in Matlab. It indeed shows it > to be a diagonal system. Attached is the plot of the spy command on the > printed matrix. The matrix in binary form is also attached. > > My understanding is that because the C coefficient is varying in 4 orders > of magnitude, i.e., Max(C)/Min(C) ~ 10^4, the matrix is poorly scaled. When > I rescale my matrix by 1/C then the system converges in 1 iteration as > expected. Is my understanding correct, and that scaling 1/C should be done > even for a diagonal system? > > When D is non-zero, then scaling by 1/C seems to be very inconvenient as D > is stored as side-centered data for the matrix free solver. > > In the case that I do not scale my equations by 1/C, is there some solver > setting that improves the convergence rate? (With D as non-zero, I have > also tried gmres as the ksp solver in the matrix-based preconditioner to > get better performance, but it didn't matter much.) > > > Thanks, > Ramakrishnan Thirumalaisamy > San Diego State University. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Sep 29 18:39:08 2021 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 29 Sep 2021 19:39:08 -0400 Subject: [petsc-users] Is it possible to keep track of original elements # after a call to DMPlexDistribute ? In-Reply-To: References: <7236c736-6066-1ba3-55b1-60782d8e754f@giref.ulaval.ca> Message-ID: On Wed, Sep 29, 2021 at 5:18 PM Eric Chamberland < Eric.Chamberland at giref.ulaval.ca> wrote: > Hi, > > I come back with _almost_ the original question: > > I would like to add an integer information (*our* original element > number, not petsc one) on each element of the DMPlex I create with > DMPlexBuildFromCellListParallel. > > I would like this interger to be distribruted by or the same way > DMPlexDistribute distribute the mesh. > > Is it possible to do this? > I think we already have support for what you want. If you call https://petsc.org/main/docs/manualpages/DM/DMSetUseNatural.html before DMPlexDistribute(), it will compute a PetscSF encoding the global to natural map. You can get it with https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexGetGlobalToNaturalSF.html and use it with https://petsc.org/main/docs/manualpages/DMPLEX/DMPlexGlobalToNaturalBegin.html Is this sufficient? Thanks, Matt > Thanks, > > Eric > > On 2021-07-14 1:18 p.m., Eric Chamberland wrote: > > Hi, > > > > I want to use DMPlexDistribute from PETSc for computing overlapping > > and play with the different partitioners supported. > > > > However, after calling DMPlexDistribute, I noticed the elements are > > renumbered and then the original number is lost. > > > > What would be the best way to keep track of the element renumbering? > > > > a) Adding an optional parameter to let the user retrieve a vector or > > "IS" giving the old number? > > > > b) Adding a DMLabel (seems a wrong good solution) > > > > c) Other idea? > > > > Of course, I don't want to loose performances with the need of this > > "mapping"... > > > > Thanks, > > > > Eric > > > -- > Eric Chamberland, ing., M. Ing > Professionnel de recherche > GIREF/Universit? Laval > (418) 656-2131 poste 41 22 42 > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From rthirumalaisam1857 at sdsu.edu Wed Sep 29 19:31:05 2021 From: rthirumalaisam1857 at sdsu.edu (Ramakrishnan Thirumalaisamy) Date: Wed, 29 Sep 2021 17:31:05 -0700 Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system In-Reply-To: References: Message-ID: > > Several things look off here: > > 1) Your true residual norm is 2.4e9, but r_0/b is 4.4e-5. That seems to > indicate that ||b|| is 1e14. Is this true? > Yes. ||b|| is 1e14. We have verified that. > > 2) Your preconditioned residual is 11 orders of magnitude less than the > true residual. This usually indicates that the system is near singular. > The system is diagonal, as shown in Temperature_fill.pdf. So we don't think it is singular unless we are missing something very obvious. Diagonal elements range from 8.8e6 to 5.62896e+10, while off-diagonal terms are 0, as shown in the spy plot. > > 3) The disparity above does not seem possible if C only has elements ~ > 1e4. 
The preconditioner consistently has norm around 1e-11. > The value of C in the Helmholtz system is computed as : *C = rho*specific_heat/dt* in which dt = 5e-5, specific_heat ~10^3 and rho ranges from 0.4 to 2700. Hence, C ranges from 8.8e6 to 5.62896e10. Max_diagonal(C)/Min_diagonal(C) ~ 10^4. > > 4) Using numbers that large can be a problem. You lose precision, so that > you really only have 3-4 correct digits each time, as you see above. It > appears to be > doing iterative refinement with a very low precision solve. > Indeed the numbers are large because C = rho*specific_heat/dt. On Wed, Sep 29, 2021 at 3:58 PM Matthew Knepley wrote: > On Wed, Sep 29, 2021 at 6:03 PM Ramakrishnan Thirumalaisamy < > rthirumalaisam1857 at sdsu.edu> wrote: > >> Hi all, >> >> I am trying to solve the Helmholtz equation for temperature T: >> >> (C I + Div D grad) T = f >> >> in IBAMR, in which C is the spatially varying diagonal entries, and D is >> the spatially varying diffusion coefficient. I use a matrix-free solver >> with matrix-based PETSc preconditioner. For the matrix-free solver, I use >> gmres solver and for the matrix based preconditioner, I use Richardson ksp >> + Jacobi as a preconditioner. As the simulation progresses, the iterations >> start to increase. To understand the cause, I set D to be zero, which >> results in a diagonal system: >> >> C T = f. >> >> This should result in convergence within a single iteration, but I get >> convergence in 3 iterations. >> >> Residual norms for temperature_ solve. >> >> 0 KSP preconditioned resid norm 4.590811647875e-02 true resid norm >> 2.406067589273e+09 ||r(i)||/||b|| 4.455533946945e-05 >> >> 1 KSP preconditioned resid norm 2.347767895880e-06 true resid norm >> 1.210763896685e+05 ||r(i)||/||b|| 2.242081505717e-09 >> >> 2 KSP preconditioned resid norm 1.245406571896e-10 true resid norm >> 6.328828824310e+00 ||r(i)||/||b|| 1.171966730978e-13 >> >> Linear temperature_ solve converged due to CONVERGED_RTOL iterations 2 >> > > Several things look off here: > > 1) Your true residual norm is 2.4e9, but r_0/b is 4.4e-5. That seems to > indicate that ||b|| is 1e14. Is this true? > > 2) Your preconditioned residual is 11 orders of magnitude less than the > true residual. This usually indicates that the system is near singular. > > 3) The disparity above does not seem possible if C only has elements ~ > 1e4. The preconditioner consistently has norm around 1e-11. > > 4) Using numbers that large can be a problem. You lose precision, so that > you really only have 3-4 correct digits each time, as you see above. It > appears to be > doing iterative refinement with a very low precision solve. > > Thanks, > > Matt > > >> To verify that I am indeed solving a diagonal system I printed the PETSc >> matrix from the preconditioner and viewed it in Matlab. It indeed shows it >> to be a diagonal system. Attached is the plot of the spy command on the >> printed matrix. The matrix in binary form is also attached. >> >> My understanding is that because the C coefficient is varying in 4 orders >> of magnitude, i.e., Max(C)/Min(C) ~ 10^4, the matrix is poorly scaled. When >> I rescale my matrix by 1/C then the system converges in 1 iteration as >> expected. Is my understanding correct, and that scaling 1/C should be done >> even for a diagonal system? >> >> When D is non-zero, then scaling by 1/C seems to be very inconvenient as >> D is stored as side-centered data for the matrix free solver. 
>> >> In the case that I do not scale my equations by 1/C, is there some solver >> setting that improves the convergence rate? (With D as non-zero, I have >> also tried gmres as the ksp solver in the matrix-based preconditioner to >> get better performance, but it didn't matter much.) >> >> >> Thanks, >> Ramakrishnan Thirumalaisamy >> San Diego State University. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marco.cisternino at optimad.it Thu Sep 30 03:16:01 2021 From: marco.cisternino at optimad.it (Marco Cisternino) Date: Thu, 30 Sep 2021 08:16:01 +0000 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: <10EA28EF-AD98-4F59-A78D-7DE3D4B585DE@petsc.dev> References: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> <10EA28EF-AD98-4F59-A78D-7DE3D4B585DE@petsc.dev> Message-ID: Hello Barry. This is the output of ksp_view using fgmres and gamg. It has to be said that the solution of the linear system should be a zero values field. As you can see both unpreconditioned residual and r/b converge at this iteration of the CFD solver. During the time integration of the CFD, I can observe pressure linear solver residuals behaving in a different way: unpreconditioned residual stil converges but r/b stalls. After the output of ksp_view I add the output of ksp_monitor_true_residual for one of these iteration where r/b stalls. Thanks, KSP Object: 1 MPI processes type: fgmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=100, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. right preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI processes type: gamg type is MULTIPLICATIVE, levels=4 cycles=v Cycles per PCApply=1 Using externally compute Galerkin coarse grid matrices GAMG specific options Threshold for dropping small values in graph on each level = 0.02 0.02 Threshold scaling factor for each level not specified = 1. AGG specific options Symmetric graph true Number of levels to square graph 1 Number smoothing steps 0 Coarse grid solver -- level ------------------------------- KSP Object: (mg_coarse_) 1 MPI processes type: preonly maximum iterations=10000, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_coarse_) 1 MPI processes type: bjacobi number of blocks = 1 Local solve is same for all blocks, in the following KSP and PC objects: KSP Object: (mg_coarse_sub_) 1 MPI processes type: preonly maximum iterations=1, initial guess is zero tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
left preconditioning using DEFAULT norm type for convergence test PC Object: (mg_coarse_sub_) 1 MPI processes type: lu PC has not been set up so information may be incomplete out-of-place factorization tolerance for zero pivot 2.22045e-14 using diagonal shift on blocks to prevent zero pivot [INBLOCKS] matrix ordering: nd linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=18, cols=18 total: nonzeros=104, allocated nonzeros=104 total number of mallocs used during MatSetValues calls =0 not using I-node routines linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=18, cols=18 total: nonzeros=104, allocated nonzeros=104 total number of mallocs used during MatSetValues calls =0 not using I-node routines Down solver (pre-smoother) on level 1 ------------------------------- KSP Object: (mg_levels_1_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0., max = 0. eigenvalues estimate via gmres min 0., max 0. eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_1_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_1_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=67, cols=67 total: nonzeros=675, allocated nonzeros=675 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 2 ------------------------------- KSP Object: (mg_levels_2_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0., max = 0. eigenvalues estimate via gmres min 0., max 0. eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_2_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_2_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=348, cols=348 total: nonzeros=3928, allocated nonzeros=3928 total number of mallocs used during MatSetValues calls =0 not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) Down solver (pre-smoother) on level 3 ------------------------------- KSP Object: (mg_levels_3_) 1 MPI processes type: chebyshev eigenvalue estimates used: min = 0., max = 0. 
eigenvalues estimate via gmres min 0., max 0. eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] KSP Object: (mg_levels_3_esteig_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement happy breakdown tolerance 1e-30 maximum iterations=10, initial guess is zero tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using DEFAULT norm type for convergence test estimating eigenvalues using noisy right hand side maximum iterations=2, nonzero initial guess tolerances: relative=1e-05, absolute=1e-50, divergence=10000. left preconditioning using NONE norm type for convergence test PC Object: (mg_levels_3_) 1 MPI processes type: sor type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=3584, cols=3584 total: nonzeros=23616, allocated nonzeros=23616 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines Up solver (post-smoother) same as down solver (pre-smoother) linear system matrix = precond matrix: Mat Object: 1 MPI processes type: seqaij rows=3584, cols=3584 total: nonzeros=23616, allocated nonzeros=23616 total number of mallocs used during MatSetValues calls =0 has attached null space not using I-node routines Pressure system has reached convergence in 0 iterations with reason 3. 0 KSP unpreconditioned resid norm 4.798763170703e-16 true resid norm 4.798763170703e-16 ||r(i)||/||b|| 1.000000000000e+00 0 KSP Residual norm 4.798763170703e-16 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00 1 KSP unpreconditioned resid norm 1.648749109132e-17 true resid norm 1.648749109132e-17 ||r(i)||/||b|| 3.435779284125e-02 1 KSP Residual norm 1.648749109132e-17 % max 9.561792537103e-01 min 9.561792537103e-01 max/min 1.000000000000e+00 2 KSP unpreconditioned resid norm 4.737880600040e-19 true resid norm 4.737880600040e-19 ||r(i)||/||b|| 9.873128619820e-04 2 KSP Residual norm 4.737880600040e-19 % max 9.828636644296e-01 min 9.293131521763e-01 max/min 1.057623753767e+00 3 KSP unpreconditioned resid norm 2.542212716830e-20 true resid norm 2.542212716830e-20 ||r(i)||/||b|| 5.297641551371e-05 3 KSP Residual norm 2.542212716830e-20 % max 9.933572357920e-01 min 9.158303248850e-01 max/min 1.084652046127e+00 4 KSP unpreconditioned resid norm 6.614510286263e-21 true resid norm 6.614510286269e-21 ||r(i)||/||b|| 1.378378146822e-05 4 KSP Residual norm 6.614510286263e-21 % max 9.950912550705e-01 min 6.296575800237e-01 max/min 1.580368896747e+00 5 KSP unpreconditioned resid norm 1.981505525281e-22 true resid norm 1.981505525272e-22 ||r(i)||/||b|| 4.129200493513e-07 5 KSP Residual norm 1.981505525281e-22 % max 9.984097962703e-01 min 5.316259535293e-01 max/min 1.878030577029e+00 Linear solve converged due to CONVERGED_RTOL iterations 5 Ksp_monitor_true_residual output for stalling r/b CFD iteration 0 KSP unpreconditioned resid norm 9.010260489109e-14 true resid norm 9.010260489109e-14 ||r(i)||/||b|| 2.021559024868e+00 0 KSP Residual norm 9.010260489109e-14 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00 1 KSP unpreconditioned resid norm 4.918108339808e-15 true resid norm 4.918171792537e-15 ||r(i)||/||b|| 1.103450292594e-01 1 KSP Residual norm 4.918108339808e-15 % max 9.566256813737e-01 min 9.566256813737e-01 max/min 1.000000000000e+00 2 KSP unpreconditioned resid norm 1.443599554690e-15 true resid norm 
1.444867143493e-15 ||r(i)||/||b|| 3.241731154382e-02 2 KSP Residual norm 1.443599554690e-15 % max 9.614019380614e-01 min 7.360950481750e-01 max/min 1.306083963538e+00 3 KSP unpreconditioned resid norm 6.623206616803e-16 true resid norm 6.654132553541e-16 ||r(i)||/||b|| 1.492933720678e-02 3 KSP Residual norm 6.623206616803e-16 % max 9.764112945239e-01 min 4.911485418014e-01 max/min 1.988016274960e+00 4 KSP unpreconditioned resid norm 6.551896936698e-16 true resid norm 6.646157296305e-16 ||r(i)||/||b|| 1.491144376933e-02 4 KSP Residual norm 6.551896936698e-16 % max 9.883425885532e-01 min 1.461270778833e-01 max/min 6.763582786091e+00 5 KSP unpreconditioned resid norm 6.222297644887e-16 true resid norm 1.720560536914e-15 ||r(i)||/||b|| 3.860282047823e-02 5 KSP Residual norm 6.222297644887e-16 % max 1.000409371755e+00 min 4.989767363560e-03 max/min 2.004921870829e+02 6 KSP unpreconditioned resid norm 6.496945794974e-17 true resid norm 2.031914800253e-14 ||r(i)||/||b|| 4.558842341106e-01 6 KSP Residual norm 6.496945794974e-17 % max 1.004914985753e+00 min 1.459258738706e-03 max/min 6.886475709192e+02 7 KSP unpreconditioned resid norm 1.965237342540e-17 true resid norm 1.684522207337e-14 ||r(i)||/||b|| 3.779425772373e-01 7 KSP Residual norm 1.965237342540e-17 % max 1.005737762541e+00 min 1.452603803766e-03 max/min 6.923689446035e+02 8 KSP unpreconditioned resid norm 1.627718951285e-17 true resid norm 1.958642967520e-14 ||r(i)||/||b|| 4.394448276241e-01 8 KSP Residual norm 1.627718951285e-17 % max 1.006364278765e+00 min 1.452081813014e-03 max/min 6.930492963590e+02 9 KSP unpreconditioned resid norm 1.616577677764e-17 true resid norm 2.019110946644e-14 ||r(i)||/||b|| 4.530115373837e-01 9 KSP Residual norm 1.616577677764e-17 % max 1.006648747131e+00 min 1.452031376577e-03 max/min 6.932692801059e+02 10 KSP unpreconditioned resid norm 1.285788988203e-17 true resid norm 2.065082694477e-14 ||r(i)||/||b|| 4.633258453698e-01 10 KSP Residual norm 1.285788988203e-17 % max 1.007469033514e+00 min 1.433291867068e-03 max/min 7.029057072477e+02 11 KSP unpreconditioned resid norm 5.490854431580e-19 true resid norm 1.798071628891e-14 ||r(i)||/||b|| 4.034187394623e-01 11 KSP Residual norm 5.490854431580e-19 % max 1.008058905554e+00 min 1.369401685301e-03 max/min 7.361309076612e+02 12 KSP unpreconditioned resid norm 1.371754802104e-20 true resid norm 1.965688920064e-14 ||r(i)||/||b|| 4.410256708163e-01 12 KSP Residual norm 1.371754802104e-20 % max 1.008409402214e+00 min 1.369243011779e-03 max/min 7.364721919624e+02 Linear solve converged due to CONVERGED_RTOL iterations 12 Marco Cisternino From: Barry Smith Sent: mercoled? 29 settembre 2021 18:34 To: Marco Cisternino Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Disconnected domains and Poisson equation On Sep 29, 2021, at 11:59 AM, Marco Cisternino > wrote: For sake of completeness, explicitly building the null space using a vector per sub-domain make s the CFD runs using BCGS and GMRES more stable, but still slower than FGMRES. Something is strange. Please run with -ksp_view and send the output on the solver details. I had divergence using BCGS and GMRES setting the null space with only one constant. Thanks Marco Cisternino From: Marco Cisternino Sent: mercoled? 29 settembre 2021 17:54 To: Barry Smith > Cc: petsc-users at mcs.anl.gov Subject: RE: [petsc-users] Disconnected domains and Poisson equation Thank you Barry for the quick reply. 
About the null space: I already tried what you suggest, building 2 Vec (constants) with 0 and 1 chosen by sub-domain, normalizing them and setting the null space like this MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,nconstants,constants,&nullspace); The solution is slightly different in values but it is still different in the two sub-domains. About the solver: I tried BCGS, GMRES and FGMRES. The linear system is a pressure system in a navier-stokes solver and only solving with FGMRES makes the CFD stable, with BCGS and GMRES the CFD solution diverges. Moreover, in the same case but with a single domain, CFD solution is stable using all the solvers, but FGMRES converges in much less iterations than the others. Marco Cisternino From: Barry Smith > Sent: mercoled? 29 settembre 2021 15:59 To: Marco Cisternino > Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Disconnected domains and Poisson equation The problem actually has a two dimensional null space; constant on each domain but possibly different constants. I think you need to build the MatNullSpace by explicitly constructing two vectors, one with 0 on one domain and constant value on the other and one with 0 on the other domain and constant on the first. Separate note: why use FGMRES instead of just GMRES? If the problem is linear and the preconditioner is linear (no GMRES inside the smoother) then you can just use GMRES and it will save a little space/work and be conceptually clearer. Barry On Sep 29, 2021, at 8:46 AM, Marco Cisternino > wrote: Good morning, I want to solve the Poisson equation on a 3D domain with 2 non-connected sub-domains. I am using FGMRES+GAMG and I have no problem if the two sub-domains see a Dirichlet boundary condition each. On the same domain I would like to solve the Poisson equation imposing periodic boundary condition in one direction and homogenous Neumann boundary conditions in the other two directions. The two sub-domains are symmetric with respect to the separation between them and the operator discretization and the right hand side are symmetric as well. It would be nice to have the same solution in both the sub-domains. Setting the null space to the constant, the solver converges to a solution having the same gradients in both sub-domains but different values. Am I doing some wrong with the null space? I?m not setting a block matrix (one block for each sub-domain), should I? I tested the null space against the matrix using MatNullSpaceTest and the answer is true. Can I do something more to have a symmetric solution as outcome of the solver? Thank you in advance for any comments and hints. Best regards, Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed... URL: From michael.wick.1980 at gmail.com Thu Sep 30 04:55:50 2021 From: michael.wick.1980 at gmail.com (Michael Wick) Date: Thu, 30 Sep 2021 17:55:50 +0800 Subject: [petsc-users] pass a member function to MatShellSetOperation Message-ID: Hi: I want to have the shell matrix-vector multiplication written as a class member function and pass it to the shell matrix via MatShellSetOperation. MatShellSetOperation(A, MATOP_MULT, (void (*)(void))(&Global_Assem::MyMatMult)); Perhaps I have a wrong understanding of function pointers, and I am constantly getting warnings that say I cannot convert a member function to a void type. The warning indeed makes sense to me, as the function pointer passed in the above manner is independent of an instance. 
Perhaps there are other ways of passing a member function that I don't know of. If you know how to address this, I would appreciate it a lot! Thanks, Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From praveen at gmx.net Thu Sep 30 05:10:57 2021 From: praveen at gmx.net (Praveen C) Date: Thu, 30 Sep 2021 15:40:57 +0530 Subject: [petsc-users] pass a member function to MatShellSetOperation In-Reply-To: References: Message-ID: <63546E7A-36E3-440C-80EF-7B38A2B27071@gmx.net> I have used something like this in similar situation auto MatMult = [this](?args?) { this->MyMatMult(?args?); }; Then pass MatMult to petsc. this refers to the class Global_Assem and we are assuming you are inside this class when doing the above. best praveen > On 30-Sep-2021, at 3:25 PM, Michael Wick wrote: > > Hi: > > I want to have the shell matrix-vector multiplication written as a class member function and pass it to the shell matrix via MatShellSetOperation. > > MatShellSetOperation(A, MATOP_MULT, (void (*)(void))(&Global_Assem::MyMatMult)); > > Perhaps I have a wrong understanding of function pointers, and I am constantly getting warnings that say I cannot convert a member function to a void type. The warning indeed makes sense to me, as the function pointer passed in the above manner is independent of an instance. Perhaps there are other ways of passing a member function that I don't know of. If you know how to address this, I would appreciate it a lot! > > Thanks, > > Mike -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 30 05:52:17 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 30 Sep 2021 06:52:17 -0400 Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system In-Reply-To: References: Message-ID: On Wed, Sep 29, 2021 at 8:31 PM Ramakrishnan Thirumalaisamy < rthirumalaisam1857 at sdsu.edu> wrote: > Several things look off here: >> >> 1) Your true residual norm is 2.4e9, but r_0/b is 4.4e-5. That seems to >> indicate that ||b|| is 1e14. Is this true? >> > Yes. ||b|| is 1e14. We have verified that. > >> >> 2) Your preconditioned residual is 11 orders of magnitude less than the >> true residual. This usually indicates that the system is near singular. >> > The system is diagonal, as shown in Temperature_fill.pdf. So we don't > think it is singular unless we are missing something very obvious. > Diagonal elements range from 8.8e6 to 5.62896e+10, while off-diagonal terms > are 0, as shown in the spy plot. > >> >> 3) The disparity above does not seem possible if C only has elements ~ >> 1e4. The preconditioner consistently has norm around 1e-11. >> > The value of C in the Helmholtz system is computed as : *C = > rho*specific_heat/dt* in which dt = 5e-5, specific_heat ~10^3 and rho > ranges from 0.4 to 2700. Hence, C ranges from 8.8e6 to 5.62896e10. > Max_diagonal(C)/Min_diagonal(C) ~ 10^4. > >> >> 4) Using numbers that large can be a problem. You lose precision, so that >> you really only have 3-4 correct digits each time, as you see above. It >> appears to be >> doing iterative refinement with a very low precision solve. >> > Indeed the numbers are large because C = rho*specific_heat/dt. > If you want to solve systems accurately, you should non-dimensionalize the system prior to discretization. This would mean that your C and b have elements in the [1, D] range, where D is the dynamic range of your problem, say 1e4, rather than these huge numbers you have now. 
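For reference, here is a minimal sketch (not taken from the thread; the variable names and setup are purely illustrative) of the 1/C row scaling discussed above, assuming an assembled preconditioner matrix A, a right-hand side Vec b, a Vec c that already holds the diagonal coefficients C = rho*specific_heat/dt, and a PetscErrorCode ierr in scope:

  Vec cinv;
  ierr = VecDuplicate(c, &cinv);CHKERRQ(ierr);
  ierr = VecCopy(c, cinv);CHKERRQ(ierr);
  ierr = VecReciprocal(cinv);CHKERRQ(ierr);             /* cinv_i = 1/C_i */
  ierr = MatDiagonalScale(A, cinv, NULL);CHKERRQ(ierr); /* scale the rows of A by 1/C */
  ierr = VecPointwiseMult(b, cinv, b);CHKERRQ(ierr);    /* scale the right-hand side the same way */
  ierr = VecDestroy(&cinv);CHKERRQ(ierr);

Non-dimensionalizing rho, specific_heat and dt before assembly, as suggested above, achieves the same effect without touching the assembled matrix, since the entries then already sit in a modest range.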
Thanks, Matt > On Wed, Sep 29, 2021 at 3:58 PM Matthew Knepley wrote: > >> On Wed, Sep 29, 2021 at 6:03 PM Ramakrishnan Thirumalaisamy < >> rthirumalaisam1857 at sdsu.edu> wrote: >> >>> Hi all, >>> >>> I am trying to solve the Helmholtz equation for temperature T: >>> >>> (C I + Div D grad) T = f >>> >>> in IBAMR, in which C is the spatially varying diagonal entries, and D is >>> the spatially varying diffusion coefficient. I use a matrix-free solver >>> with matrix-based PETSc preconditioner. For the matrix-free solver, I use >>> gmres solver and for the matrix based preconditioner, I use Richardson ksp >>> + Jacobi as a preconditioner. As the simulation progresses, the iterations >>> start to increase. To understand the cause, I set D to be zero, which >>> results in a diagonal system: >>> >>> C T = f. >>> >>> This should result in convergence within a single iteration, but I get >>> convergence in 3 iterations. >>> >>> Residual norms for temperature_ solve. >>> >>> 0 KSP preconditioned resid norm 4.590811647875e-02 true resid norm >>> 2.406067589273e+09 ||r(i)||/||b|| 4.455533946945e-05 >>> >>> 1 KSP preconditioned resid norm 2.347767895880e-06 true resid norm >>> 1.210763896685e+05 ||r(i)||/||b|| 2.242081505717e-09 >>> >>> 2 KSP preconditioned resid norm 1.245406571896e-10 true resid norm >>> 6.328828824310e+00 ||r(i)||/||b|| 1.171966730978e-13 >>> >>> Linear temperature_ solve converged due to CONVERGED_RTOL iterations 2 >>> >> >> Several things look off here: >> >> 1) Your true residual norm is 2.4e9, but r_0/b is 4.4e-5. That seems to >> indicate that ||b|| is 1e14. Is this true? >> > >> 2) Your preconditioned residual is 11 orders of magnitude less than the >> true residual. This usually indicates that the system is near singular. >> >> 3) The disparity above does not seem possible if C only has elements ~ >> 1e4. The preconditioner consistently has norm around 1e-11. >> >> 4) Using numbers that large can be a problem. You lose precision, so that >> you really only have 3-4 correct digits each time, as you see above. It >> appears to be >> doing iterative refinement with a very low precision solve. >> >> Thanks, >> >> Matt >> >> >>> To verify that I am indeed solving a diagonal system I printed the PETSc >>> matrix from the preconditioner and viewed it in Matlab. It indeed shows it >>> to be a diagonal system. Attached is the plot of the spy command on the >>> printed matrix. The matrix in binary form is also attached. >>> >>> My understanding is that because the C coefficient is varying in 4 >>> orders of magnitude, i.e., Max(C)/Min(C) ~ 10^4, the matrix is poorly >>> scaled. When I rescale my matrix by 1/C then the system converges in 1 >>> iteration as expected. Is my understanding correct, and that scaling 1/C >>> should be done even for a diagonal system? >>> >>> When D is non-zero, then scaling by 1/C seems to be very inconvenient as >>> D is stored as side-centered data for the matrix free solver. >>> >>> In the case that I do not scale my equations by 1/C, is there some >>> solver setting that improves the convergence rate? (With D as non-zero, I >>> have also tried gmres as the ksp solver in the matrix-based preconditioner >>> to get better performance, but it didn't matter much.) >>> >>> >>> Thanks, >>> Ramakrishnan Thirumalaisamy >>> San Diego State University. >>> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which their >> experiments lead. 
>> -- Norbert Wiener >> >> https://www.cse.buffalo.edu/~knepley/ >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From berend.vanwachem at ovgu.de Thu Sep 30 06:02:13 2021 From: berend.vanwachem at ovgu.de (Berend van Wachem) Date: Thu, 30 Sep 2021 13:02:13 +0200 Subject: [petsc-users] DMView and DMLoad In-Reply-To: <056E066F-D596-4254-A44A-60BFFD30FE82@erdw.ethz.ch> References: <56ce2135-9757-4292-e33b-c7eea8fb7b2e@ovgu.de> <056E066F-D596-4254-A44A-60BFFD30FE82@erdw.ethz.ch> Message-ID: <45d209e2-ecab-ead7-7229-a819736b91df@ovgu.de> Dear Vaclav, Lawrence, following your example, we have managed to save the DM with a wrapped Vector in h5 format (PETSC_VIEWER_HDF5_PETSC) with: DMPlexTopologyView(dm, viewer); DMClone(dm, &sdm); ... DMPlexSectionView(dm, viewer, sdm); DMGetLocalVector(sdm, &vec); ... DMPlexLocalVectorView(dm, viewer, sdm, vec); The problem comes with the loading of the "DM+Vec.h5" with: DMCreate(PETSC_COMM_WORLD, &dm); DMSetType(dm, DMPLEX); ... DMPlexTopologyLoad(dm, viewer, &sfO); ... PetscSFCompose(sfO, sfDist, &sf); ... DMClone(dm, &sdm); DMPlexSectionLoad(dm, viewer, sdm, sf, &globalDataSF, &localDataSF); DMGetLocalVector(sdm, &vec); ... DMPlexLocalVectorLoad(dm, H5Viewer, sdm, localDataSF, vec); The loaded DM is different to the one created with DMPlexCreateFromfile (for instance, no "coordinates" are recovered with the use of DMGetCoordinatesLocal). This conflicts with our code, which relies on features of the DM as delivered by the DMPlexCreateFromfile function. We have also noticed that the "DM+Vec.h5" can not be loaded directly with DMPlexCreateFromfile because it contains only the groups "topology" and "topologies" while the groups "geometry" and "labels" are missing (and probably other conflicts). Is this something which can be changed? We would need to reload a DM similar to the one created with DMPlexCreateFromfile. Best regards, Berend. On 9/22/21 8:59 PM, Hapla Vaclav wrote: > To avoid confusions here, Berend seems to be specifically demanding XDMF > (PETSC_VIEWER_HDF5_XDMF). The stuff we are now working on is parallel > checkpointing in our own HDF5 format?(PETSC_VIEWER_HDF5_PETSC), I will > make a series of MRs on this topic in the following days. > > For XDMF, we are specifically missing the ability to write/load DMLabels > properly. XDMF uses specific cell-local numbering for faces for > specification of face sets, and face-local numbering for specification > of edge sets, which is not great wrt DMPlex design. And ParaView doesn't > show any of these properly so it's hard to debug. Matt, we should talk > about this soon. > > Berend, for now, could you just load the mesh initially from XDMF and > then use our PETSC_VIEWER_HDF5_PETSC format for subsequent saving/loading? > > Thanks, > > Vaclav > >> On 17 Sep 2021, at 15:46, Lawrence Mitchell > > wrote: >> >> Hi Berend, >> >>> On 14 Sep 2021, at 12:23, Matthew Knepley >> > wrote: >>> >>> On Tue, Sep 14, 2021 at 5:15 AM Berend van Wachem >>> > wrote: >>> Dear PETSc-team, >>> >>> We are trying to save and load distributed DMPlex and its associated >>> physical fields (created with DMCreateGlobalVector) ?(Uvelocity, >>> VVelocity, ?...) in HDF5_XDMF format. 
To achieve this, we do the >>> following: >>> >>> 1) save in the same xdmf.h5 file: >>> DMView( DM ????????, H5_XDMF_Viewer ); >>> VecView( UVelocity, H5_XDMF_Viewer ); >>> >>> 2) load the dm: >>> DMPlexCreateFromfile(PETSC_COMM_WORLD, Filename, PETSC_TRUE, DM); >>> >>> 3) load the physical field: >>> VecLoad( UVelocity, H5_XDMF_Viewer ); >>> >>> There are no errors in the execution, but the loaded DM is distributed >>> differently to the original one, which results in the incorrect >>> placement of the values of the physical fields (UVelocity etc.) in the >>> domain. >>> >>> This approach is used to restart the simulation with the last saved DM. >>> Is there something we are missing, or there exists alternative routes to >>> this goal? Can we somehow get the IS of the redistribution, so we can >>> re-distribute the vector data as well? >>> >>> Many thanks, best regards, >>> >>> Hi Berend, >>> >>> We are in the midst of rewriting this. We want to support saving >>> multiple meshes, with fields attached to each, >>> and preserving the discretization (section) information, and allowing >>> us to load up on a different number of >>> processes. We plan to be done by October. Vaclav and I are doing this >>> in collaboration with Koki Sagiyama, >>> David Ham, and Lawrence Mitchell from the Firedrake team. >> >> The core load/save cycle functionality is now in PETSc main. So if >> you're using main rather than a release, you can get access to it now. >> This section of the manual shows an example of how to do >> thingshttps://petsc.org/main/docs/manual/dmplex/#saving-and-loading-data-with-hdf5 >> >> >> Let us know if things aren't clear! >> >> Thanks, >> >> Lawrence > From mfadams at lbl.gov Thu Sep 30 06:06:06 2021 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 30 Sep 2021 07:06:06 -0400 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: References: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> <10EA28EF-AD98-4F59-A78D-7DE3D4B585DE@petsc.dev> Message-ID: * Do we understand: type: chebyshev eigenvalue estimates used: *min = 0., max = 0.* eigenvalues estimate via gmres *min 0., max 0.* * Is this Poisson solver unsymmetric? * Does this problem start off converging and then evolve and then start stagnating? The eigen estimates may need to be recalculated. Also Chebyshev is problematic for unsymmetric matrices. Hypre does better with unsymmetric matrices. Mark On Thu, Sep 30, 2021 at 4:16 AM Marco Cisternino < marco.cisternino at optimad.it> wrote: > Hello Barry. > > This is the output of ksp_view using fgmres and gamg. It has to be said > that the solution of the linear system should be a zero values field. As > you can see both unpreconditioned residual and r/b converge at this > iteration of the CFD solver. During the time integration of the CFD, I can > observe pressure linear solver residuals behaving in a different way: > unpreconditioned residual stil converges but r/b stalls. After the output > of ksp_view I add the output of ksp_monitor_true_residual for one of these > iteration where r/b stalls. > Thanks, > > > > KSP Object: 1 MPI processes > > type: fgmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=100, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> > right preconditioning > > using UNPRECONDITIONED norm type for convergence test > > PC Object: 1 MPI processes > > type: gamg > > type is MULTIPLICATIVE, levels=4 cycles=v > > Cycles per PCApply=1 > > Using externally compute Galerkin coarse grid matrices > > GAMG specific options > > Threshold for dropping small values in graph on each level = > 0.02 0.02 > > Threshold scaling factor for each level not specified = 1. > > AGG specific options > > Symmetric graph true > > Number of levels to square graph 1 > > Number smoothing steps 0 > > Coarse grid solver -- level ------------------------------- > > KSP Object: (mg_coarse_) 1 MPI processes > > type: preonly > > maximum iterations=10000, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (mg_coarse_) 1 MPI processes > > type: bjacobi > > number of blocks = 1 > > Local solve is same for all blocks, in the following KSP and PC > objects: > > KSP Object: (mg_coarse_sub_) 1 MPI processes > > type: preonly > > maximum iterations=1, initial guess is zero > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using DEFAULT norm type for convergence test > > PC Object: (mg_coarse_sub_) 1 MPI processes > > type: lu > > PC has not been set up so information may be incomplete > > out-of-place factorization > > tolerance for zero pivot 2.22045e-14 > > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > > matrix ordering: nd > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=18, cols=18 > > total: nonzeros=104, allocated nonzeros=104 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=18, cols=18 > > total: nonzeros=104, allocated nonzeros=104 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Down solver (pre-smoother) on level 1 ------------------------------- > > KSP Object: (mg_levels_1_) 1 MPI processes > > type: chebyshev > > eigenvalue estimates used: min = 0., max = 0. > > eigenvalues estimate via gmres min 0., max 0. > > eigenvalues estimated using gmres with translations [0. 0.1; 0. > 1.1] > > KSP Object: (mg_levels_1_esteig_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using DEFAULT norm type for convergence test > > estimating eigenvalues using noisy right hand side > > maximum iterations=2, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (mg_levels_1_) 1 MPI processes > > type: sor > > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. 
> > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=67, cols=67 > > total: nonzeros=675, allocated nonzeros=675 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 2 ------------------------------- > > KSP Object: (mg_levels_2_) 1 MPI processes > > type: chebyshev > > eigenvalue estimates used: min = 0., max = 0. > > eigenvalues estimate via gmres min 0., max 0. > > eigenvalues estimated using gmres with translations [0. 0.1; 0. > 1.1] > > KSP Object: (mg_levels_2_esteig_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using DEFAULT norm type for convergence test > > estimating eigenvalues using noisy right hand side > > maximum iterations=2, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (mg_levels_2_) 1 MPI processes > > type: sor > > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=348, cols=348 > > total: nonzeros=3928, allocated nonzeros=3928 > > total number of mallocs used during MatSetValues calls =0 > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > Down solver (pre-smoother) on level 3 ------------------------------- > > KSP Object: (mg_levels_3_) 1 MPI processes > > type: chebyshev > > eigenvalue estimates used: min = 0., max = 0. > > eigenvalues estimate via gmres min 0., max 0. > > eigenvalues estimated using gmres with translations [0. 0.1; 0. > 1.1] > > KSP Object: (mg_levels_3_esteig_) 1 MPI processes > > type: gmres > > restart=30, using Classical (unmodified) Gram-Schmidt > Orthogonalization with no iterative refinement > > happy breakdown tolerance 1e-30 > > maximum iterations=10, initial guess is zero > > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > > left preconditioning > > using DEFAULT norm type for convergence test > > estimating eigenvalues using noisy right hand side > > maximum iterations=2, nonzero initial guess > > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > > left preconditioning > > using NONE norm type for convergence test > > PC Object: (mg_levels_3_) 1 MPI processes > > type: sor > > type = local_symmetric, iterations = 1, local iterations = 1, > omega = 1. > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=3584, cols=3584 > > total: nonzeros=23616, allocated nonzeros=23616 > > total number of mallocs used during MatSetValues calls =0 > > has attached null space > > not using I-node routines > > Up solver (post-smoother) same as down solver (pre-smoother) > > linear system matrix = precond matrix: > > Mat Object: 1 MPI processes > > type: seqaij > > rows=3584, cols=3584 > > total: nonzeros=23616, allocated nonzeros=23616 > > total number of mallocs used during MatSetValues calls =0 > > has attached null space > > not using I-node routines > > Pressure system has reached convergence in 0 iterations with reason 3. 
> > 0 KSP unpreconditioned resid norm 4.798763170703e-16 true resid norm > 4.798763170703e-16 ||r(i)||/||b|| 1.000000000000e+00 > > 0 KSP Residual norm 4.798763170703e-16 % max 1.000000000000e+00 min > 1.000000000000e+00 max/min 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 1.648749109132e-17 true resid norm > 1.648749109132e-17 ||r(i)||/||b|| 3.435779284125e-02 > > 1 KSP Residual norm 1.648749109132e-17 % max 9.561792537103e-01 min > 9.561792537103e-01 max/min 1.000000000000e+00 > > 2 KSP unpreconditioned resid norm 4.737880600040e-19 true resid norm > 4.737880600040e-19 ||r(i)||/||b|| 9.873128619820e-04 > > 2 KSP Residual norm 4.737880600040e-19 % max 9.828636644296e-01 min > 9.293131521763e-01 max/min 1.057623753767e+00 > > 3 KSP unpreconditioned resid norm 2.542212716830e-20 true resid norm > 2.542212716830e-20 ||r(i)||/||b|| 5.297641551371e-05 > > 3 KSP Residual norm 2.542212716830e-20 % max 9.933572357920e-01 min > 9.158303248850e-01 max/min 1.084652046127e+00 > > 4 KSP unpreconditioned resid norm 6.614510286263e-21 true resid norm > 6.614510286269e-21 ||r(i)||/||b|| 1.378378146822e-05 > > 4 KSP Residual norm 6.614510286263e-21 % max 9.950912550705e-01 min > 6.296575800237e-01 max/min 1.580368896747e+00 > > 5 KSP unpreconditioned resid norm 1.981505525281e-22 true resid norm > 1.981505525272e-22 ||r(i)||/||b|| 4.129200493513e-07 > > 5 KSP Residual norm 1.981505525281e-22 % max 9.984097962703e-01 min > 5.316259535293e-01 max/min 1.878030577029e+00 > > Linear solve converged due to CONVERGED_RTOL iterations 5 > > > > Ksp_monitor_true_residual output for stalling r/b CFD iteration > 0 KSP unpreconditioned resid norm 9.010260489109e-14 true resid norm > 9.010260489109e-14 ||r(i)||/||b|| 2.021559024868e+00 > > 0 KSP Residual norm 9.010260489109e-14 % max 1.000000000000e+00 min > 1.000000000000e+00 max/min 1.000000000000e+00 > > 1 KSP unpreconditioned resid norm 4.918108339808e-15 true resid norm > 4.918171792537e-15 ||r(i)||/||b|| 1.103450292594e-01 > > 1 KSP Residual norm 4.918108339808e-15 % max 9.566256813737e-01 min > 9.566256813737e-01 max/min 1.000000000000e+00 > > 2 KSP unpreconditioned resid norm 1.443599554690e-15 true resid norm > 1.444867143493e-15 ||r(i)||/||b|| 3.241731154382e-02 > > 2 KSP Residual norm 1.443599554690e-15 % max 9.614019380614e-01 min > 7.360950481750e-01 max/min 1.306083963538e+00 > > 3 KSP unpreconditioned resid norm 6.623206616803e-16 true resid norm > 6.654132553541e-16 ||r(i)||/||b|| 1.492933720678e-02 > > 3 KSP Residual norm 6.623206616803e-16 % max 9.764112945239e-01 min > 4.911485418014e-01 max/min 1.988016274960e+00 > > 4 KSP unpreconditioned resid norm 6.551896936698e-16 true resid norm > 6.646157296305e-16 ||r(i)||/||b|| 1.491144376933e-02 > > 4 KSP Residual norm 6.551896936698e-16 % max 9.883425885532e-01 min > 1.461270778833e-01 max/min 6.763582786091e+00 > > 5 KSP unpreconditioned resid norm 6.222297644887e-16 true resid norm > 1.720560536914e-15 ||r(i)||/||b|| 3.860282047823e-02 > > 5 KSP Residual norm 6.222297644887e-16 % max 1.000409371755e+00 min > 4.989767363560e-03 max/min 2.004921870829e+02 > > 6 KSP unpreconditioned resid norm 6.496945794974e-17 true resid norm > 2.031914800253e-14 ||r(i)||/||b|| 4.558842341106e-01 > > 6 KSP Residual norm 6.496945794974e-17 % max 1.004914985753e+00 min > 1.459258738706e-03 max/min 6.886475709192e+02 > > 7 KSP unpreconditioned resid norm 1.965237342540e-17 true resid norm > 1.684522207337e-14 ||r(i)||/||b|| 3.779425772373e-01 > > 7 KSP Residual norm 1.965237342540e-17 % max 1.005737762541e+00 
min > 1.452603803766e-03 max/min 6.923689446035e+02 > > 8 KSP unpreconditioned resid norm 1.627718951285e-17 true resid norm > 1.958642967520e-14 ||r(i)||/||b|| 4.394448276241e-01 > > 8 KSP Residual norm 1.627718951285e-17 % max 1.006364278765e+00 min > 1.452081813014e-03 max/min 6.930492963590e+02 > > 9 KSP unpreconditioned resid norm 1.616577677764e-17 true resid norm > 2.019110946644e-14 ||r(i)||/||b|| 4.530115373837e-01 > > 9 KSP Residual norm 1.616577677764e-17 % max 1.006648747131e+00 min > 1.452031376577e-03 max/min 6.932692801059e+02 > > 10 KSP unpreconditioned resid norm 1.285788988203e-17 true resid norm > 2.065082694477e-14 ||r(i)||/||b|| 4.633258453698e-01 > > 10 KSP Residual norm 1.285788988203e-17 % max 1.007469033514e+00 min > 1.433291867068e-03 max/min 7.029057072477e+02 > > 11 KSP unpreconditioned resid norm 5.490854431580e-19 true resid norm > 1.798071628891e-14 ||r(i)||/||b|| 4.034187394623e-01 > > 11 KSP Residual norm 5.490854431580e-19 % max 1.008058905554e+00 min > 1.369401685301e-03 max/min 7.361309076612e+02 > > 12 KSP unpreconditioned resid norm 1.371754802104e-20 true resid norm > 1.965688920064e-14 ||r(i)||/||b|| 4.410256708163e-01 > > 12 KSP Residual norm 1.371754802104e-20 % max 1.008409402214e+00 min > 1.369243011779e-03 max/min 7.364721919624e+02 > > Linear solve converged due to CONVERGED_RTOL iterations 12 > > > > > > > > Marco Cisternino > > > > *From:* Barry Smith > *Sent:* mercoled? 29 settembre 2021 18:34 > *To:* Marco Cisternino > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Disconnected domains and Poisson equation > > > > > > > > On Sep 29, 2021, at 11:59 AM, Marco Cisternino < > marco.cisternino at optimad.it> wrote: > > > > For sake of completeness, explicitly building the null space using a > vector per sub-domain make s the CFD runs using BCGS and GMRES more stable, > but still slower than FGMRES. > > > > Something is strange. Please run with -ksp_view and send the output on > the solver details. > > > > I had divergence using BCGS and GMRES setting the null space with only one > constant. > > Thanks > > > > Marco Cisternino > > > > *From:* Marco Cisternino > *Sent:* mercoled? 29 settembre 2021 17:54 > *To:* Barry Smith > *Cc:* petsc-users at mcs.anl.gov > *Subject:* RE: [petsc-users] Disconnected domains and Poisson equation > > > > Thank you Barry for the quick reply. > > About the null space: I already tried what you suggest, building 2 Vec > (constants) with 0 and 1 chosen by sub-domain, normalizing them and setting > the null space like this > > > MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,nconstants,constants,&nullspace); > > The solution is slightly different in values but it is still different in > the two sub-domains. > > About the solver: I tried BCGS, GMRES and FGMRES. The linear system is a > pressure system in a navier-stokes solver and only solving with FGMRES > makes the CFD stable, with BCGS and GMRES the CFD solution diverges. > Moreover, in the same case but with a single domain, CFD solution is stable > using all the solvers, but FGMRES converges in much less iterations than > the others. > > > > Marco Cisternino > > > > *From:* Barry Smith > *Sent:* mercoled? 29 settembre 2021 15:59 > *To:* Marco Cisternino > *Cc:* petsc-users at mcs.anl.gov > *Subject:* Re: [petsc-users] Disconnected domains and Poisson equation > > > > > > The problem actually has a two dimensional null space; constant on each > domain but possibly different constants. 
I think you need to build the > MatNullSpace by explicitly constructing two vectors, one with 0 on one > domain and constant value on the other and one with 0 on the other domain > and constant on the first. > > > > Separate note: why use FGMRES instead of just GMRES? If the problem is > linear and the preconditioner is linear (no GMRES inside the smoother) then > you can just use GMRES and it will save a little space/work and be > conceptually clearer. > > > > Barry > > > > On Sep 29, 2021, at 8:46 AM, Marco Cisternino > wrote: > > > > Good morning, > > I want to solve the Poisson equation on a 3D domain with 2 non-connected > sub-domains. > > I am using FGMRES+GAMG and I have no problem if the two sub-domains see a > Dirichlet boundary condition each. > > On the same domain I would like to solve the Poisson equation imposing > periodic boundary condition in one direction and homogenous Neumann > boundary conditions in the other two directions. The two sub-domains are > symmetric with respect to the separation between them and the operator > discretization and the right hand side are symmetric as well. It would be > nice to have the same solution in both the sub-domains. > > Setting the null space to the constant, the solver converges to a solution > having the same gradients in both sub-domains but different values. > > Am I doing some wrong with the null space? I?m not setting a block matrix > (one block for each sub-domain), should I? > > I tested the null space against the matrix using MatNullSpaceTest and the > answer is true. Can I do something more to have a symmetric solution as > outcome of the solver? > > Thank you in advance for any comments and hints. > > > > Best regards, > > > > Marco Cisternino > > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 30 06:22:20 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 30 Sep 2021 07:22:20 -0400 Subject: [petsc-users] pass a member function to MatShellSetOperation In-Reply-To: <63546E7A-36E3-440C-80EF-7B38A2B27071@gmx.net> References: <63546E7A-36E3-440C-80EF-7B38A2B27071@gmx.net> Message-ID: That is the new way to do it. The other way to do it is to have it be a static member function, so that it does not take "this". Thanks, Matt On Thu, Sep 30, 2021 at 6:11 AM Praveen C wrote: > I have used something like this in similar situation > > auto MatMult = [this](?args?) > { > this->MyMatMult(?args?); > }; > > > Then pass MatMult to petsc. > > *this* refers to the class Global_Assem and we are assuming you are > inside this class when doing the above. > > best > praveen > > On 30-Sep-2021, at 3:25 PM, Michael Wick > wrote: > > Hi: > > I want to have the shell matrix-vector multiplication written as a class > member function and pass it to the shell matrix via MatShellSetOperation. > > MatShellSetOperation(A, MATOP_MULT, (void > (*)(void))(&Global_Assem::MyMatMult)); > > Perhaps I have a wrong understanding of function pointers, and I am > constantly getting warnings that say I cannot convert a member function to > a void type. The warning indeed makes sense to me, as the function pointer > passed in the above manner is independent of an instance. Perhaps there are > other ways of passing a member function that I don't know of. If you know > how to address this, I would appreciate it a lot! 
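For reference, a minimal sketch of the context/static-member approach suggested above: the assembler object is stored as the shell matrix context, and a static member function with the plain C callback signature forwards to the member function. The class and member names (Global_Assem, MyMatMult) follow the thread; the helper CreateShellMat and the sizes m, n, M, N are illustrative only, not part of the original code.

#include <petscmat.h>

class Global_Assem {
public:
  PetscErrorCode MyMatMult(Vec x, Vec y);   /* does the real work, uses member data */

  /* trampoline with the signature MATOP_MULT expects */
  static PetscErrorCode MatMultShell(Mat A, Vec x, Vec y)
  {
    Global_Assem  *self;
    PetscErrorCode ierr = MatShellGetContext(A, &self);CHKERRQ(ierr); /* recover "this" */
    return self->MyMatMult(x, y);
  }
};

PetscErrorCode CreateShellMat(Global_Assem *assem, PetscInt m, PetscInt n,
                              PetscInt M, PetscInt N, Mat *A)
{
  PetscErrorCode ierr;
  ierr = MatCreateShell(PETSC_COMM_WORLD, m, n, M, N, assem, A);CHKERRQ(ierr);
  ierr = MatShellSetOperation(*A, MATOP_MULT,
                              (void (*)(void))&Global_Assem::MatMultShell);CHKERRQ(ierr);
  return 0;
}

Note that a lambda can be passed this way only if it captures nothing; a lambda that captures this, like the one quoted earlier in this thread, cannot decay to the plain function pointer that MatShellSetOperation() takes, which is why the context plus static (or free) function route is the usual workaround.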
> > Thanks, > > Mike > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Thu Sep 30 07:50:08 2021 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Thu, 30 Sep 2021 12:50:08 +0000 Subject: [petsc-users] (percent time in this phase) Message-ID: When comparing the MatSolve data for GPU MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 and CPU MatSolve 352 1.0 1.3553e+02 1.0 1.02e+11 1.0 0.0e+00 0.0e+00 0.0e+00 35 34 0 0 0 35 34 0 0 0 4489 the time spent is almost the same for this preconditioner. Look like MatCUSPARSSolAnl is called only twice (since I am running on two cores) mpirun -n 2 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor So would it be fair to assume MatCUSPARSSolAnl is not accounted for in MatSolve and it is an exclusive event? KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 100 % Best, Karthik. From: Matthew Knepley Date: Wednesday, 29 September 2021 at 16:29 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" Cc: Barry Smith , "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 10:18 AM Karthikeyan Chockalingam - STFC UKRI > wrote: Thank you! Just to summarize KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 100 % You didn?t happen to mention how MatCUSPARSSolAnl is accounted for? Am I right in accounting for it as above? I am not sure.I thought it might be the GPU part of MatSolve(). I will have to look in the code. I am not as familiar with the GPU part. MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 Finally, I believe the vector events, VecNorn, VecTDot, VecAXPY, and VecAYPX are mutually exclusive? Yes. Thanks, Matt Best, Karthik. From: Matthew Knepley > Date: Wednesday, 29 September 2021 at 11:58 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: Barry Smith >, "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI > wrote: Thank you Mathew. Now, it is all making sense to me. From data file ksp_ex45_N511_gpu_2.txt KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). However, you said ?So an iteration would mostly consist of MatMult + PCApply, with some vector work? 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than one process and using Block-Jacobi . Half the time is spent in the solve (53%) KSPSetUp 2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 KSPSolve 1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 4.4e+03 53100100100 95 53100100100 96 10881 11730 1022 6.40e+03 1021 8.17e-03 100 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which is all setup of the individual blocks, and this is all used by the numerical ILU factorization. 
PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 6.93e+03 0 0.00e+00 0 MatLUFactorNum 1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 63 0 2 6.93e+03 0 0.00e+00 0 MatILUFactorSym 1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 3) The preconditioner application takes 37% of the time, which is all solving the factors and recorded in MatSolve(). Matrix multiplication takes 4%. PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 0 0 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100 MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 MatMult 341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 2.0e+00 4 34100100 0 4 34100100 0 54801 66441 2 5.86e+03 0 0.00e+00 100 4) The significant vector time is all in norms (11%) since they are really slow on the GPU. VecNorm 342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 6.8e+02 11 5 0 0 15 11 5 0 0 15 1466 196884 0 0.00e+00 342 2.74e-03 100 VecTDot 680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 1.4e+03 1 10 0 0 29 1 10 0 0 29 106079 133922 0 0.00e+00 680 5.44e-03 100 VecAXPY 681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 56728 58367 682 5.34e+02 0 0.00e+00 100 VecAYPX 339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 34136 34153 339 2.71e-03 0 0.00e+00 100 So the solve time is: 53% ~ 37% + 4% + 11% and the setup time is about 16%. I was wrong about the SetUp time being included, as it is outside the event: https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852 It looks like the remainder of the time (23%) is spent preallocating the matrix. Thanks, Matt The MalMult event is 4 %. How does this event figure into the above equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? Best, Karthik. From: Matthew Knepley > Date: Wednesday, 29 September 2021 at 10:58 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: Barry Smith >, "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI > wrote: That was helpful. I would like to provide some additional details of my run on cpus and gpus. Please find the following attachments: 1. graph.pdf a plot showing overall time and various petsc events. 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary 3. ksp_ex45_N511_gpu_2.txt data file of the log_summary I used the following petsc options for cpu mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi -ksp_monitor and for gpus mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor to run the following problem https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html From the above code, I see is there no individual function called KSPSetUp(), so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, kSPSetComputeOperators all are timed together as KSPSetUp. For this example, is KSPSetUp time and KSPSolve time mutually exclusive? 
No, KSPSetUp() will be contained in KSPSolve() if it is called automatically. In your response you said that ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used.? I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for this particular preconditioner (bjacobi) how can I tell how they are timed? They are all inside KSPSolve(). If you have a preconditioned linear solve, the oreconditioning happens during the iteration. So an iteration would mostly consist of MatMult + PCApply, with some vector work. I am hoping to time KSP solving and preconditioning mutually exclusively. I am not sure that concept makes sense here. See above. Thanks, Matt Kind regards, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 19:19 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI > wrote: Thanks for Barry for your response. I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. If you do not call KSPSetUp() separately from KSPSolve() then its time is included with KSPSolve(). PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used. So yes the answer is not totally satisfying. The one thing I would recommend is to not call KSPSetUp() directly and then KSPSolve() will always include the total time of the solve plus all setup time. PCApply will contain all the time to apply the preconditioner but may also include some setup time. Barry Best, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 16:56 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: Hello, I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. Barry Thanks! Karthik. This email and any attachments are intended solely for the use of the named recipients. 
If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Sep 30 07:52:03 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 30 Sep 2021 08:52:03 -0400 Subject: [petsc-users] (percent time in this phase) In-Reply-To: References: Message-ID: On Thu, Sep 30, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI < karthikeyan.chockalingam at stfc.ac.uk> wrote: > When comparing the MatSolve data for > > > > GPU > > > > MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 > 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 > 0.00e+00 100 > > MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > > > > and CPU > > > > MatSolve 352 1.0 1.3553e+02 1.0 1.02e+11 1.0 0.0e+00 0.0e+00 > 0.0e+00 35 34 0 0 0 35 34 0 0 0 4489 > > > > the time spent is almost the same for this preconditioner. Look like > MatCUSPARSSolAnl is called only *twice* (since I am running on two cores) > > > > mpirun -n 2 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type > bjacobi -ksp_monitor > > > > So would it be fair to assume MatCUSPARSSolAnl is *not *accounted for in > MatSolve and it is an exclusive event? > Looks like that. Thanks Matt > KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) > ~ 100 % > > > > Best, > > Karthik. > > > > > > *From: *Matthew Knepley > *Date: *Wednesday, 29 September 2021 at 16:29 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , "petsc-users at mcs.anl.gov" < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > On Wed, Sep 29, 2021 at 10:18 AM Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > Thank you! > > > > Just to summarize > > > > KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) > ~ 100 % > > > > You didn?t happen to mention how MatCUSPARSSolAnl is accounted for? Am I > right in accounting for it as above? > > > > I am not sure.I thought it might be the GPU part of MatSolve(). I will > have to look in the code. I am not as familiar with the GPU part. 
> > > > MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 > 0.00e+00 0 > > > > Finally, I believe the vector events, VecNorn, VecTDot, VecAXPY, and > VecAYPX are mutually exclusive? > > > > Yes. > > > > Thanks, > > > > Matt > > > > Best, > > > > Karthik. > > > > *From: *Matthew Knepley > *Date: *Wednesday, 29 September 2021 at 11:58 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , "petsc-users at mcs.anl.gov" < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > Thank you Mathew. Now, it is all making sense to me. > > > > From data file ksp_ex45_N511_gpu_2.txt > > > > KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). > > > > However, you said ?So an iteration would mostly consist of MatMult + > PCApply, with some vector work? > > > > 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than > one process and using Block-Jacobi . Half the time is spent in the solve > (53%) > > > > KSPSetUp 2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 > > KSPSolve 1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 4.4e+03 53100100100 95 53100100100 96 10881 11730 1022 6.40e+03 1021 8.17e-03 100 > > > > 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which > is all setup of the individual blocks, and this is all used by the > numerical ILU factorization. > > > > PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 > 0 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 > 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 > 6.93e+03 0 0.00e+00 0 > > MatLUFactorNum 1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 63 0 2 6.93e+03 0 0.00e+00 0 > > MatILUFactorSym 1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > > > > 3) The preconditioner application takes 37% of the time, which is all > solving the factors and recorded in MatSolve(). Matrix multiplication takes > 4%. > > > > PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 > 0 0 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100 > > MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 > > MatMult 341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 2.0e+00 4 34100100 0 4 34100100 0 54801 66441 2 5.86e+03 0 0.00e+00 100 > > > > 4) The significant vector time is all in norms (11%) since they are really > slow on the GPU. > > > > VecNorm 342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 6.8e+02 11 5 0 0 15 11 5 0 0 15 1466 196884 0 0.00e+00 342 2.74e-03 100 > > VecTDot 680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 1.4e+03 1 10 0 0 29 1 10 0 0 29 106079 133922 0 0.00e+00 680 5.44e-03 100 > > VecAXPY 681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 56728 58367 682 5.34e+02 0 0.00e+00 100 > > VecAYPX 339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 34136 34153 339 2.71e-03 0 0.00e+00 100 > > > > So the solve time is: > > > > 53% ~ 37% + 4% + 11% > > > > and the setup time is about 16%. 
I was wrong about the SetUp time being > included, as it is outside the event: > > > > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852 > > > > It looks like the remainder of the time (23%) is spent preallocating the > matrix. > > > > Thanks, > > > > Matt > > > > The MalMult event is 4 %. How does this event figure into the above > equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? > > > > Best, > > Karthik. > > > > *From: *Matthew Knepley > *Date: *Wednesday, 29 September 2021 at 10:58 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *Barry Smith , "petsc-users at mcs.anl.gov" < > petsc-users at mcs.anl.gov> > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > That was helpful. I would like to provide some additional details of my > run on cpus and gpus. Please find the following attachments: > > > > 1. graph.pdf a plot showing overall time and various petsc events. > 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary > 3. ksp_ex45_N511_gpu_2.txt data file of the log_summary > > > > I used the following petsc options for cpu > > > > mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi > -ksp_monitor > > > > and for gpus > > > > mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z > 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type > bjacobi -ksp_monitor > > > > to run the following problem > > > > https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html > > > > From the above code, I see is there no individual function called KSPSetUp(), > so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, > kSPSetComputeOperators all are timed together as KSPSetUp. For this > example, is KSPSetUp time and KSPSolve time mutually exclusive? > > > > No, KSPSetUp() will be contained in KSPSolve() if it is called > automatically. > > > > In your response you said that > > > > ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it > depends on how much of the preconditioner construction can take place > early, so depends exactly on the preconditioner used.? > > > > I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for > this particular preconditioner (bjacobi) how can I tell how they are timed? > > > > They are all inside KSPSolve(). If you have a preconditioned linear solve, > the oreconditioning happens during the iteration. So an iteration would > mostly > > consist of MatMult + PCApply, with some vector work. > > > > I am hoping to time KSP solving and preconditioning mutually exclusively. > > > > I am not sure that concept makes sense here. See above. > > > > Thanks, > > > > Matt > > > > > > Kind regards, > > Karthik. > > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 19:19 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Thanks for Barry for your response. > > > > I was just benchmarking the problem with various preconditioner on cpu and > gpu. 
I understand, it is not possible to get mutually exclusive timing. > > However, can you tell if KSPSolve time includes both PCSetup and PCApply? > And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp > and PCApply. > > > > If you do not call KSPSetUp() separately from KSPSolve() then its time > is included with KSPSolve(). > > > > PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends > on how much of the preconditioner construction can take place early, so > depends exactly on the preconditioner used. > > > > So yes the answer is not totally satisfying. The one thing I would > recommend is to not call KSPSetUp() directly and then KSPSolve() will > always include the total time of the solve plus all setup time. PCApply > will contain all the time to apply the preconditioner but may also include > some setup time. > > > > Barry > > > > > > Best, > > Karthik. > > > > > > > > > > *From: *Barry Smith > *Date: *Tuesday, 28 September 2021 at 16:56 > *To: *"Chockalingam, Karthikeyan (STFC,DL,HC)" < > karthikeyan.chockalingam at stfc.ac.uk> > *Cc: *"petsc-users at mcs.anl.gov" > *Subject: *Re: [petsc-users] %T (percent time in this phase) > > > > > > > > On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI < > karthikeyan.chockalingam at stfc.ac.uk> wrote: > > > > Hello, > > > > I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson > problem. I noticed from the output from using the flag -log_summary that > for various events their respective %T (percent time in this phase) do not > add up to 100 but rather exceeds 100. So, I gather there is some overlap > among these events. I am primarily looking at the events KSPSetUp, > KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive > %T or Time for these individual events? I have attached the log_summary > output file from my run for your reference. > > > > > > For nested solvers it is tricky to get the times to be mutually > exclusive because some parts of the building of the preconditioner is for > some preconditioners delayed until the solve has started. > > > > It looks like you are using the default preconditioner options which for > this example are taking more or less no time since so many iterations are > needed. It is best to use -pc_type mg to use geometric multigrid on this > problem. > > > > Barry > > > > > > > > Thanks! > > Karthik. > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://www.cse.buffalo.edu/~knepley/ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From s6hsbran at uni-bonn.de Thu Sep 30 08:39:00 2021 From: s6hsbran at uni-bonn.de (Hannes Phil Niklas Brandt) Date: Thu, 30 Sep 2021 15:39:00 +0200 Subject: [petsc-users] Possibilities to VecScatter to a sparse Vector-Format Message-ID: Hello, I intend to compute a parallel Matrix-Vector-Product (via MPI) and therefore would like to scatter the entries of the input MPI-Vec v to a local vector containing all entries relevant to the current process. To achieve this I tried defining a VecScatter, which scatters from v to a sequential Vec v_seq (each process has it's own version of v_seq). However, storing v_seq (which has one entry for each global row, thus containing a large amount of zero-entries) may demand too much storage space (in comparison to my data-sparse Matrix-Storage-Format). I am interested in possibilities to scatter v to a sparse Vec-type to avoid storing unnecessary large amounts of zero-entries. Is there a sparse Vector format in Petsc compatible to the VecScatter procedure or is there another efficient way to compute Matrix-Vector-Products without usinglarge amounts of storage space on each process? Best Regards Hannes Brandt -------------- next part -------------- An HTML attachment was scrubbed... URL: From karthikeyan.chockalingam at stfc.ac.uk Thu Sep 30 08:41:29 2021 From: karthikeyan.chockalingam at stfc.ac.uk (Karthikeyan Chockalingam - STFC UKRI) Date: Thu, 30 Sep 2021 13:41:29 +0000 Subject: [petsc-users] (percent time in this phase) In-Reply-To: References: Message-ID: <6295C9A3-0EC7-4D6A-8F62-88EC8651D207@stfc.ac.uk> Based on your feedback from yesterday. I was trying to breakdown KSPSolve. Please find the attached bar plot. The numbers are not adding up at least for GPUs. Your feedback from yesterday were based on T%. I plotted the time spend on each event, hoping that the cumulative sum would add up to KSPSolve time. Kind regards, Karthik. From: Matthew Knepley Date: Thursday, 30 September 2021 at 13:52 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" Cc: Barry Smith , "petsc-users at mcs.anl.gov" Subject: Re: [petsc-users] (percent time in this phase) On Thu, Sep 30, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI > wrote: When comparing the MatSolve data for GPU MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 and CPU MatSolve 352 1.0 1.3553e+02 1.0 1.02e+11 1.0 0.0e+00 0.0e+00 0.0e+00 35 34 0 0 0 35 34 0 0 0 4489 the time spent is almost the same for this preconditioner. 
Look like MatCUSPARSSolAnl is called only twice (since I am running on two cores) mpirun -n 2 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor So would it be fair to assume MatCUSPARSSolAnl is not accounted for in MatSolve and it is an exclusive event? Looks like that. Thanks Matt KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 100 % Best, Karthik. From: Matthew Knepley > Date: Wednesday, 29 September 2021 at 16:29 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: Barry Smith >, "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 10:18 AM Karthikeyan Chockalingam - STFC UKRI > wrote: Thank you! Just to summarize KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 100 % You didn?t happen to mention how MatCUSPARSSolAnl is accounted for? Am I right in accounting for it as above? I am not sure.I thought it might be the GPU part of MatSolve(). I will have to look in the code. I am not as familiar with the GPU part. MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 Finally, I believe the vector events, VecNorn, VecTDot, VecAXPY, and VecAYPX are mutually exclusive? Yes. Thanks, Matt Best, Karthik. From: Matthew Knepley > Date: Wednesday, 29 September 2021 at 11:58 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: Barry Smith >, "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI > wrote: Thank you Mathew. Now, it is all making sense to me. From data file ksp_ex45_N511_gpu_2.txt KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). However, you said ?So an iteration would mostly consist of MatMult + PCApply, with some vector work? 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than one process and using Block-Jacobi . Half the time is spent in the solve (53%) KSPSetUp 2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 KSPSolve 1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 4.4e+03 53100100100 95 53100100100 96 10881 11730 1022 6.40e+03 1021 8.17e-03 100 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which is all setup of the individual blocks, and this is all used by the numerical ILU factorization. PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 6.93e+03 0 0.00e+00 0 MatLUFactorNum 1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 63 0 2 6.93e+03 0 0.00e+00 0 MatILUFactorSym 1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 3) The preconditioner application takes 37% of the time, which is all solving the factors and recorded in MatSolve(). Matrix multiplication takes 4%. 
PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 0 0 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100 MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 MatMult 341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 2.0e+00 4 34100100 0 4 34100100 0 54801 66441 2 5.86e+03 0 0.00e+00 100 4) The significant vector time is all in norms (11%) since they are really slow on the GPU. VecNorm 342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 6.8e+02 11 5 0 0 15 11 5 0 0 15 1466 196884 0 0.00e+00 342 2.74e-03 100 VecTDot 680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 1.4e+03 1 10 0 0 29 1 10 0 0 29 106079 133922 0 0.00e+00 680 5.44e-03 100 VecAXPY 681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 56728 58367 682 5.34e+02 0 0.00e+00 100 VecAYPX 339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 34136 34153 339 2.71e-03 0 0.00e+00 100 So the solve time is: 53% ~ 37% + 4% + 11% and the setup time is about 16%. I was wrong about the SetUp time being included, as it is outside the event: https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852 It looks like the remainder of the time (23%) is spent preallocating the matrix. Thanks, Matt The MalMult event is 4 %. How does this event figure into the above equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? Best, Karthik. From: Matthew Knepley > Date: Wednesday, 29 September 2021 at 10:58 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: Barry Smith >, "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI > wrote: That was helpful. I would like to provide some additional details of my run on cpus and gpus. Please find the following attachments: 1. graph.pdf a plot showing overall time and various petsc events. 2. ksp_ex45_N511_cpu_6.txt data file of the log_summary 3. ksp_ex45_N511_gpu_2.txt data file of the log_summary I used the following petsc options for cpu mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi -ksp_monitor and for gpus mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor to run the following problem https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html From the above code, I see is there no individual function called KSPSetUp(), so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, kSPSetComputeOperators all are timed together as KSPSetUp. For this example, is KSPSetUp time and KSPSolve time mutually exclusive? No, KSPSetUp() will be contained in KSPSolve() if it is called automatically. In your response you said that ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used.? I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for this particular preconditioner (bjacobi) how can I tell how they are timed? They are all inside KSPSolve(). If you have a preconditioned linear solve, the oreconditioning happens during the iteration. So an iteration would mostly consist of MatMult + PCApply, with some vector work. 
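One way to get a coarse but non-overlapping breakdown, not used in this thread, is to bracket the setup and the solve in user-defined log stages; -log_view then reports each stage in its own section. A minimal sketch under that assumption follows (the stage names are illustrative, and, per the caveat above, preconditioners such as block Jacobi with ILU may still defer part of their setup into the solve stage):

#include <petscksp.h>

PetscErrorCode SolveWithStages(KSP ksp, Vec b, Vec x)
{
  PetscLogStage  setup_stage, solve_stage;
  PetscErrorCode ierr;

  ierr = PetscLogStageRegister("MySetUp", &setup_stage);CHKERRQ(ierr);
  ierr = PetscLogStageRegister("MySolve", &solve_stage);CHKERRQ(ierr);

  ierr = PetscLogStagePush(setup_stage);CHKERRQ(ierr);
  ierr = KSPSetUp(ksp);CHKERRQ(ierr);        /* whatever setup the PC allows up front */
  ierr = PetscLogStagePop();CHKERRQ(ierr);

  ierr = PetscLogStagePush(solve_stage);CHKERRQ(ierr);
  ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);  /* any deferred PC setup lands in this stage */
  ierr = PetscLogStagePop();CHKERRQ(ierr);
  return 0;
}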
I am hoping to time KSP solving and preconditioning mutually exclusively. I am not sure that concept makes sense here. See above. Thanks, Matt Kind regards, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 19:19 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI > wrote: Thanks for Barry for your response. I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. If you do not call KSPSetUp() separately from KSPSolve() then its time is included with KSPSolve(). PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used. So yes the answer is not totally satisfying. The one thing I would recommend is to not call KSPSetUp() directly and then KSPSolve() will always include the total time of the solve plus all setup time. PCApply will contain all the time to apply the preconditioner but may also include some setup time. Barry Best, Karthik. From: Barry Smith > Date: Tuesday, 28 September 2021 at 16:56 To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] %T (percent time in this phase) On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: Hello, I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. Barry Thanks! Karthik. This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: KSPSolve.pdf Type: application/pdf Size: 175716 bytes Desc: KSPSolve.pdf URL: From knepley at gmail.com Thu Sep 30 08:44:01 2021 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 30 Sep 2021 09:44:01 -0400 Subject: [petsc-users] Possibilities to VecScatter to a sparse Vector-Format In-Reply-To: References: Message-ID: On Thu, Sep 30, 2021 at 9:39 AM Hannes Phil Niklas Brandt < s6hsbran at uni-bonn.de> wrote: > Hello, > > > > I intend to compute a parallel Matrix-Vector-Product (via MPI) and > therefore would like to scatter the entries of the input MPI-Vec v to a > local vector containing all entries relevant to the current process. > > > > To achieve this I tried defining a VecScatter, which scatters from v to a > sequential Vec v_seq (each process has it's own version of v_seq). However, > storing v_seq (which has one entry for each global row, thus containing a > large amount of zero-entries) may demand too much storage space (in > comparison to my data-sparse Matrix-Storage-Format). > > > > I am interested in possibilities to scatter v to a sparse Vec-type to > avoid storing unnecessary large amounts of zero-entries. Is there a sparse > Vector format in Petsc compatible to the VecScatter procedure or is there > another efficient way to compute Matrix-Vector-Products without usinglarge > amounts of storage space on each process? > I think you misunderstand VecScatter. Parallel to sequential is one possibility, but also parallel-parallel, seq-parallel, etc. Second, you can give whatever indices you want into it. Thus you can index only a few places in a large array, or compact a sparse array into a contiguous one. I am not sure what other possibilities may exist. Thanks, Matt > Best Regards > > Hannes Brandt > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Sep 30 09:27:01 2021 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 30 Sep 2021 09:27:01 -0500 Subject: [petsc-users] Possibilities to VecScatter to a sparse Vector-Format In-Reply-To: References: Message-ID: On Thu, Sep 30, 2021 at 8:39 AM Hannes Phil Niklas Brandt < s6hsbran at uni-bonn.de> wrote: > Hello, > > > > > > I intend to compute a parallel Matrix-Vector-Product (via MPI) and > therefore would like to scatter the entries of the input MPI-Vec v to a > local vector containing all entries relevant to the current process. 
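To illustrate the point above that a VecScatter can pull an arbitrary subset of entries of a parallel Vec into a compact sequential Vec (rather than a full-length copy), a rough sketch follows; the index list is invented and would in practice come from the sparsity pattern of the locally owned matrix rows, and v and ierr are assumed to exist already:

PetscInt    needed[]  = {0, 17, 42};   /* invented global indices of the entries this rank needs */
PetscInt    nneeded   = 3;
Vec         v_compact;                 /* one slot per needed entry, no padding with zeros */
IS          is_from, is_to;
VecScatter  scatter;

ierr = VecCreateSeq(PETSC_COMM_SELF, nneeded, &v_compact);CHKERRQ(ierr);
ierr = ISCreateGeneral(PETSC_COMM_SELF, nneeded, needed, PETSC_COPY_VALUES, &is_from);CHKERRQ(ierr);
ierr = ISCreateStride(PETSC_COMM_SELF, nneeded, 0, 1, &is_to);CHKERRQ(ierr);
ierr = VecScatterCreate(v, is_from, v_compact, is_to, &scatter);CHKERRQ(ierr);
ierr = VecScatterBegin(scatter, v, v_compact, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);
ierr = VecScatterEnd(scatter, v, v_compact, INSERT_VALUES, SCATTER_FORWARD);CHKERRQ(ierr);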
> > > > To achieve this I tried defining a VecScatter, which scatters from v to a > sequential Vec v_seq (each process has it's own version of v_seq). However, > storing v_seq (which has one entry for each global row, thus containing a > large amount of zero-entries) may demand too much storage space (in > comparison to my data-sparse Matrix-Storage-Format). > What you said is exactly what petsc's MatMult does. It builds a VecScatter object (aij->Mvctx), and has a local vector (aij->lvec). It does not communicate or store unneeded remote entries. The code is at https://gitlab.com/petsc/petsc/-/blob/main/src/mat/impls/aij/mpi/mmaij.c#L9 > I am interested in possibilities to scatter v to a sparse Vec-type to > avoid storing unnecessary large amounts of zero-entries. Is there a sparse > Vector format in Petsc compatible to the VecScatter procedure or is there > another efficient way to compute Matrix-Vector-Products without usinglarge > amounts of storage space on each process? > > > > > > Best Regards > > Hannes Brandt > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Sep 30 09:32:30 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 30 Sep 2021 10:32:30 -0400 Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system In-Reply-To: References: Message-ID: > On Sep 29, 2021, at 5:37 PM, Ramakrishnan Thirumalaisamy wrote: > > Hi all, > > I am trying to solve the Helmholtz equation for temperature T: > > (C I + Div D grad) T = f > > in IBAMR, in which C is the spatially varying diagonal entries, and D is the spatially varying diffusion coefficient. I use a matrix-free solver with matrix-based PETSc preconditioner. For the matrix-free solver, I use gmres solver and for the matrix based preconditioner, I use Richardson ksp + Jacobi as a preconditioner. As the simulation progresses, the iterations start to increase. To understand the cause, I set D to be zero, which results in a diagonal system: > > C T = f. > > This should result in convergence within a single iteration, but I get convergence in 3 iterations. > > Residual norms for temperature_ solve. > 0 KSP preconditioned resid norm 4.590811647875e-02 true resid norm 2.406067589273e+09 ||r(i)||/||b|| 4.455533946945e-05 > 1 KSP preconditioned resid norm 2.347767895880e-06 true resid norm 1.210763896685e+05 ||r(i)||/||b|| 2.242081505717e-09 > 2 KSP preconditioned resid norm 1.245406571896e-10 true resid norm 6.328828824310e+00 ||r(i)||/||b|| 1.171966730978e-13 > Linear temperature_ solve converged due to CONVERGED_RTOL iterations 2 > What is the result of -ksp_view on the solve? The way you describe your implementation it does not sound like standard PETSc practice. With PETSc using a matrix-free operation mA and a matrix from which KSP will build the preconditioner A one uses KSPSetOperator(ksp,mA,A); and then just selects the preconditioner with -pc_type xxx For example to use Jacobi preconditioning one uses -pc_type jacobi (note that this only uses the diagonal of A, the rest of A is never used). If you wish to precondition mA by fully solving with the matrix A one can use -ksp_monitor_true_residual -pc_type ksp -ksp_ksp_type yyy -ksp_pc_type xxx -ksp_ksp_monitor_true_residual with, for example, yyy of richardson and xxx of jacobi Barry > To verify that I am indeed solving a diagonal system I printed the PETSc matrix from the preconditioner and viewed it in Matlab. It indeed shows it to be a diagonal system. 
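As a rough sketch of the arrangement Barry describes above (a shell operator for the Krylov method plus an assembled matrix from which the preconditioner is built), assuming ksp, b, x, the local/global sizes, a user context and a user MyMatMult routine already exist; note that the call is spelled KSPSetOperators() in current PETSc:

Mat mA, A;   /* mA: matrix-free (shell) operator, A: assembled matrix the PC is built from */

ierr = MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, N, N, user_ctx, &mA);CHKERRQ(ierr);
ierr = MatShellSetOperation(mA, MATOP_MULT, (void (*)(void))MyMatMult);CHKERRQ(ierr);

/* A is assembled elsewhere; with -pc_type jacobi only its diagonal is ever used */
ierr = KSPSetOperators(ksp, mA, A);CHKERRQ(ierr);
ierr = KSPSetFromOptions(ksp);CHKERRQ(ierr);   /* picks up -pc_type jacobi, -pc_type ksp, ... */
ierr = KSPSolve(ksp, b, x);CHKERRQ(ierr);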
Attached is the plot of the spy command on the printed matrix. The matrix in binary form is also attached. > > My understanding is that because the C coefficient is varying in 4 orders of magnitude, i.e., Max(C)/Min(C) ~ 10^4, the matrix is poorly scaled. When I rescale my matrix by 1/C then the system converges in 1 iteration as expected. Is my understanding correct, and that scaling 1/C should be done even for a diagonal system? > > When D is non-zero, then scaling by 1/C seems to be very inconvenient as D is stored as side-centered data for the matrix free solver. > > In the case that I do not scale my equations by 1/C, is there some solver setting that improves the convergence rate? (With D as non-zero, I have also tried gmres as the ksp solver in the matrix-based preconditioner to get better performance, but it didn't matter much.) > > > Thanks, > Ramakrishnan Thirumalaisamy > San Diego State University. > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Sep 30 09:39:25 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 30 Sep 2021 10:39:25 -0400 Subject: [petsc-users] Disconnected domains and Poisson equation In-Reply-To: References: <448CEBF7-5B16-4E1C-8D1D-9CC067BD38BB@petsc.dev> <10EA28EF-AD98-4F59-A78D-7DE3D4B585DE@petsc.dev> Message-ID: <3A2F7686-44AA-47A5-B996-461E057F4EC3@petsc.dev> It looks like the initial solution (guess) is to round-off the solution to the linear system 9.010260489109e-14 0 KSP unpreconditioned resid norm 9.010260489109e-14 true resid norm 9.010260489109e-14 ||r(i)||/||b|| 2.021559024868e+00 0 KSP Residual norm 9.010260489109e-14 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00 1 KSP unpreconditioned resid norm 4.918108339808e-15 true resid norm 4.918171792537e-15 ||r(i)||/||b|| 1.103450292594e-01 1 KSP Residual norm 4.918108339808e-15 % max 9.566256813737e-01 min 9.566256813737e-01 max/min 1.000000000000e+00 2 KSP unpreconditioned resid norm 1.443599554690e-15 true resid norm 1.444867143493e-15 ||r(i)||/||b|| 3.241731154382e-02 2 KSP Residual norm 1.443599554690e-15 % max 9.614019380614e-01 min 7.360950481750e-01 max/min 1.306083963538e+00 Thus the Krylov solver will not be able to improve the solution, it then gets stuck trying to improve the solution but cannot because of round off. In other words the algorithm has converged (even at the initial solution (guess) and should stop immediately. You can use -ksp_atol 1.e-12 to get it to stop immediately without iterating if the initial residual is less than 1e-12. Barry > On Sep 30, 2021, at 4:16 AM, Marco Cisternino wrote: > > Hello Barry. > This is the output of ksp_view using fgmres and gamg. It has to be said that the solution of the linear system should be a zero values field. As you can see both unpreconditioned residual and r/b converge at this iteration of the CFD solver. During the time integration of the CFD, I can observe pressure linear solver residuals behaving in a different way: unpreconditioned residual stil converges but r/b stalls. After the output of ksp_view I add the output of ksp_monitor_true_residual for one of these iteration where r/b stalls. > Thanks, > > KSP Object: 1 MPI processes > type: fgmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=100, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. 
> right preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI processes > type: gamg > type is MULTIPLICATIVE, levels=4 cycles=v > Cycles per PCApply=1 > Using externally compute Galerkin coarse grid matrices > GAMG specific options > Threshold for dropping small values in graph on each level = 0.02 0.02 > Threshold scaling factor for each level not specified = 1. > AGG specific options > Symmetric graph true > Number of levels to square graph 1 > Number smoothing steps 0 > Coarse grid solver -- level ------------------------------- > KSP Object: (mg_coarse_) 1 MPI processes > type: preonly > maximum iterations=10000, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_coarse_) 1 MPI processes > type: bjacobi > number of blocks = 1 > Local solve is same for all blocks, in the following KSP and PC objects: > KSP Object: (mg_coarse_sub_) 1 MPI processes > type: preonly > maximum iterations=1, initial guess is zero > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using DEFAULT norm type for convergence test > PC Object: (mg_coarse_sub_) 1 MPI processes > type: lu > PC has not been set up so information may be incomplete > out-of-place factorization > tolerance for zero pivot 2.22045e-14 > using diagonal shift on blocks to prevent zero pivot [INBLOCKS] > matrix ordering: nd > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=18, cols=18 > total: nonzeros=104, allocated nonzeros=104 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=18, cols=18 > total: nonzeros=104, allocated nonzeros=104 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Down solver (pre-smoother) on level 1 ------------------------------- > KSP Object: (mg_levels_1_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0., max = 0. > eigenvalues estimate via gmres min 0., max 0. > eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] > KSP Object: (mg_levels_1_esteig_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using DEFAULT norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_levels_1_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=67, cols=67 > total: nonzeros=675, allocated nonzeros=675 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 2 ------------------------------- > KSP Object: (mg_levels_2_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0., max = 0. > eigenvalues estimate via gmres min 0., max 0. 
> eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] > KSP Object: (mg_levels_2_esteig_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using DEFAULT norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_levels_2_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=348, cols=348 > total: nonzeros=3928, allocated nonzeros=3928 > total number of mallocs used during MatSetValues calls =0 > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > Down solver (pre-smoother) on level 3 ------------------------------- > KSP Object: (mg_levels_3_) 1 MPI processes > type: chebyshev > eigenvalue estimates used: min = 0., max = 0. > eigenvalues estimate via gmres min 0., max 0. > eigenvalues estimated using gmres with translations [0. 0.1; 0. 1.1] > KSP Object: (mg_levels_3_esteig_) 1 MPI processes > type: gmres > restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with no iterative refinement > happy breakdown tolerance 1e-30 > maximum iterations=10, initial guess is zero > tolerances: relative=1e-12, absolute=1e-50, divergence=10000. > left preconditioning > using DEFAULT norm type for convergence test > estimating eigenvalues using noisy right hand side > maximum iterations=2, nonzero initial guess > tolerances: relative=1e-05, absolute=1e-50, divergence=10000. > left preconditioning > using NONE norm type for convergence test > PC Object: (mg_levels_3_) 1 MPI processes > type: sor > type = local_symmetric, iterations = 1, local iterations = 1, omega = 1. > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=3584, cols=3584 > total: nonzeros=23616, allocated nonzeros=23616 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > Up solver (post-smoother) same as down solver (pre-smoother) > linear system matrix = precond matrix: > Mat Object: 1 MPI processes > type: seqaij > rows=3584, cols=3584 > total: nonzeros=23616, allocated nonzeros=23616 > total number of mallocs used during MatSetValues calls =0 > has attached null space > not using I-node routines > Pressure system has reached convergence in 0 iterations with reason 3. 
> 0 KSP unpreconditioned resid norm 4.798763170703e-16 true resid norm 4.798763170703e-16 ||r(i)||/||b|| 1.000000000000e+00 > 0 KSP Residual norm 4.798763170703e-16 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 1.648749109132e-17 true resid norm 1.648749109132e-17 ||r(i)||/||b|| 3.435779284125e-02 > 1 KSP Residual norm 1.648749109132e-17 % max 9.561792537103e-01 min 9.561792537103e-01 max/min 1.000000000000e+00 > 2 KSP unpreconditioned resid norm 4.737880600040e-19 true resid norm 4.737880600040e-19 ||r(i)||/||b|| 9.873128619820e-04 > 2 KSP Residual norm 4.737880600040e-19 % max 9.828636644296e-01 min 9.293131521763e-01 max/min 1.057623753767e+00 > 3 KSP unpreconditioned resid norm 2.542212716830e-20 true resid norm 2.542212716830e-20 ||r(i)||/||b|| 5.297641551371e-05 > 3 KSP Residual norm 2.542212716830e-20 % max 9.933572357920e-01 min 9.158303248850e-01 max/min 1.084652046127e+00 > 4 KSP unpreconditioned resid norm 6.614510286263e-21 true resid norm 6.614510286269e-21 ||r(i)||/||b|| 1.378378146822e-05 > 4 KSP Residual norm 6.614510286263e-21 % max 9.950912550705e-01 min 6.296575800237e-01 max/min 1.580368896747e+00 > 5 KSP unpreconditioned resid norm 1.981505525281e-22 true resid norm 1.981505525272e-22 ||r(i)||/||b|| 4.129200493513e-07 > 5 KSP Residual norm 1.981505525281e-22 % max 9.984097962703e-01 min 5.316259535293e-01 max/min 1.878030577029e+00 > Linear solve converged due to CONVERGED_RTOL iterations 5 > > Ksp_monitor_true_residual output for stalling r/b CFD iteration > 0 KSP unpreconditioned resid norm 9.010260489109e-14 true resid norm 9.010260489109e-14 ||r(i)||/||b|| 2.021559024868e+00 > 0 KSP Residual norm 9.010260489109e-14 % max 1.000000000000e+00 min 1.000000000000e+00 max/min 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 4.918108339808e-15 true resid norm 4.918171792537e-15 ||r(i)||/||b|| 1.103450292594e-01 > 1 KSP Residual norm 4.918108339808e-15 % max 9.566256813737e-01 min 9.566256813737e-01 max/min 1.000000000000e+00 > 2 KSP unpreconditioned resid norm 1.443599554690e-15 true resid norm 1.444867143493e-15 ||r(i)||/||b|| 3.241731154382e-02 > 2 KSP Residual norm 1.443599554690e-15 % max 9.614019380614e-01 min 7.360950481750e-01 max/min 1.306083963538e+00 > 3 KSP unpreconditioned resid norm 6.623206616803e-16 true resid norm 6.654132553541e-16 ||r(i)||/||b|| 1.492933720678e-02 > 3 KSP Residual norm 6.623206616803e-16 % max 9.764112945239e-01 min 4.911485418014e-01 max/min 1.988016274960e+00 > 4 KSP unpreconditioned resid norm 6.551896936698e-16 true resid norm 6.646157296305e-16 ||r(i)||/||b|| 1.491144376933e-02 > 4 KSP Residual norm 6.551896936698e-16 % max 9.883425885532e-01 min 1.461270778833e-01 max/min 6.763582786091e+00 > 5 KSP unpreconditioned resid norm 6.222297644887e-16 true resid norm 1.720560536914e-15 ||r(i)||/||b|| 3.860282047823e-02 > 5 KSP Residual norm 6.222297644887e-16 % max 1.000409371755e+00 min 4.989767363560e-03 max/min 2.004921870829e+02 > 6 KSP unpreconditioned resid norm 6.496945794974e-17 true resid norm 2.031914800253e-14 ||r(i)||/||b|| 4.558842341106e-01 > 6 KSP Residual norm 6.496945794974e-17 % max 1.004914985753e+00 min 1.459258738706e-03 max/min 6.886475709192e+02 > 7 KSP unpreconditioned resid norm 1.965237342540e-17 true resid norm 1.684522207337e-14 ||r(i)||/||b|| 3.779425772373e-01 > 7 KSP Residual norm 1.965237342540e-17 % max 1.005737762541e+00 min 1.452603803766e-03 max/min 6.923689446035e+02 > 8 KSP unpreconditioned resid norm 1.627718951285e-17 true resid 
norm 1.958642967520e-14 ||r(i)||/||b|| 4.394448276241e-01 > 8 KSP Residual norm 1.627718951285e-17 % max 1.006364278765e+00 min 1.452081813014e-03 max/min 6.930492963590e+02 > 9 KSP unpreconditioned resid norm 1.616577677764e-17 true resid norm 2.019110946644e-14 ||r(i)||/||b|| 4.530115373837e-01 > 9 KSP Residual norm 1.616577677764e-17 % max 1.006648747131e+00 min 1.452031376577e-03 max/min 6.932692801059e+02 > 10 KSP unpreconditioned resid norm 1.285788988203e-17 true resid norm 2.065082694477e-14 ||r(i)||/||b|| 4.633258453698e-01 > 10 KSP Residual norm 1.285788988203e-17 % max 1.007469033514e+00 min 1.433291867068e-03 max/min 7.029057072477e+02 > 11 KSP unpreconditioned resid norm 5.490854431580e-19 true resid norm 1.798071628891e-14 ||r(i)||/||b|| 4.034187394623e-01 > 11 KSP Residual norm 5.490854431580e-19 % max 1.008058905554e+00 min 1.369401685301e-03 max/min 7.361309076612e+02 > 12 KSP unpreconditioned resid norm 1.371754802104e-20 true resid norm 1.965688920064e-14 ||r(i)||/||b|| 4.410256708163e-01 > 12 KSP Residual norm 1.371754802104e-20 % max 1.008409402214e+00 min 1.369243011779e-03 max/min 7.364721919624e+02 > Linear solve converged due to CONVERGED_RTOL iterations 12 > > > > Marco Cisternino > > From: Barry Smith > > Sent: mercoled? 29 settembre 2021 18:34 > To: Marco Cisternino > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Disconnected domains and Poisson equation > > > > > On Sep 29, 2021, at 11:59 AM, Marco Cisternino > wrote: > > For sake of completeness, explicitly building the null space using a vector per sub-domain make s the CFD runs using BCGS and GMRES more stable, but still slower than FGMRES. > > Something is strange. Please run with -ksp_view and send the output on the solver details. > > > I had divergence using BCGS and GMRES setting the null space with only one constant. > Thanks > > Marco Cisternino > > From: Marco Cisternino > Sent: mercoled? 29 settembre 2021 17:54 > To: Barry Smith > > Cc: petsc-users at mcs.anl.gov > Subject: RE: [petsc-users] Disconnected domains and Poisson equation > > Thank you Barry for the quick reply. > About the null space: I already tried what you suggest, building 2 Vec (constants) with 0 and 1 chosen by sub-domain, normalizing them and setting the null space like this > MatNullSpaceCreate(PETSC_COMM_WORLD,PETSC_FALSE,nconstants,constants,&nullspace); > The solution is slightly different in values but it is still different in the two sub-domains. > About the solver: I tried BCGS, GMRES and FGMRES. The linear system is a pressure system in a navier-stokes solver and only solving with FGMRES makes the CFD stable, with BCGS and GMRES the CFD solution diverges. Moreover, in the same case but with a single domain, CFD solution is stable using all the solvers, but FGMRES converges in much less iterations than the others. > > Marco Cisternino > > From: Barry Smith > > Sent: mercoled? 29 settembre 2021 15:59 > To: Marco Cisternino > > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Disconnected domains and Poisson equation > > > The problem actually has a two dimensional null space; constant on each domain but possibly different constants. I think you need to build the MatNullSpace by explicitly constructing two vectors, one with 0 on one domain and constant value on the other and one with 0 on the other domain and constant on the first. > > Separate note: why use FGMRES instead of just GMRES? 
If the problem is linear and the preconditioner is linear (no GMRES inside the smoother) then you can just use GMRES and it will save a little space/work and be conceptually clearer. > > Barry > > > On Sep 29, 2021, at 8:46 AM, Marco Cisternino > wrote: > > Good morning, > I want to solve the Poisson equation on a 3D domain with 2 non-connected sub-domains. > I am using FGMRES+GAMG and I have no problem if the two sub-domains see a Dirichlet boundary condition each. > On the same domain I would like to solve the Poisson equation imposing periodic boundary condition in one direction and homogenous Neumann boundary conditions in the other two directions. The two sub-domains are symmetric with respect to the separation between them and the operator discretization and the right hand side are symmetric as well. It would be nice to have the same solution in both the sub-domains. > Setting the null space to the constant, the solver converges to a solution having the same gradients in both sub-domains but different values. > Am I doing some wrong with the null space? I?m not setting a block matrix (one block for each sub-domain), should I? > I tested the null space against the matrix using MatNullSpaceTest and the answer is true. Can I do something more to have a symmetric solution as outcome of the solver? > Thank you in advance for any comments and hints. > > Best regards, > > Marco Cisternino -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Sep 30 09:47:07 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 30 Sep 2021 10:47:07 -0400 Subject: [petsc-users] (percent time in this phase) In-Reply-To: <6295C9A3-0EC7-4D6A-8F62-88EC8651D207@stfc.ac.uk> References: <6295C9A3-0EC7-4D6A-8F62-88EC8651D207@stfc.ac.uk> Message-ID: <3B13EDB4-A22B-421B-9B5C-F95BA9CF9705@petsc.dev> The MatSolve is no better on the GPUs then on the CPU; while other parts of the computation seem to speed up nicely. What is the result of -ksp_view ? Are you using ILU(0) as the preconditioner, this will not solve well on the GPU, its solve is essentially sequential. You won't want to use ILU(0) in this way on GPUs. Barry > On Sep 30, 2021, at 9:41 AM, Karthikeyan Chockalingam - STFC UKRI wrote: > > Based on your feedback from yesterday. I was trying to breakdown KSPSolve. > Please find the attached bar plot. The numbers are not adding up at least for GPUs. > Your feedback from yesterday were based on T%. > I plotted the time spend on each event, hoping that the cumulative sum would add up to KSPSolve time. > > Kind regards, > Karthik. > > From: Matthew Knepley > Date: Thursday, 30 September 2021 at 13:52 > To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > Cc: Barry Smith , "petsc-users at mcs.anl.gov" > Subject: Re: [petsc-users] (percent time in this phase) > > On Thu, Sep 30, 2021 at 8:50 AM Karthikeyan Chockalingam - STFC UKRI > wrote: > When comparing the MatSolve data for > > GPU > > MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 > MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > > and CPU > > MatSolve 352 1.0 1.3553e+02 1.0 1.02e+11 1.0 0.0e+00 0.0e+00 0.0e+00 35 34 0 0 0 35 34 0 0 0 4489 > > the time spent is almost the same for this preconditioner. 
Look like MatCUSPARSSolAnl is called only twice (since I am running on two cores) > > mpirun -n 2 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor > > So would it be fair to assume MatCUSPARSSolAnl is not accounted for in MatSolve and it is an exclusive event? > > Looks like that. > > Thanks > > Matt > > KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 100 % > > Best, > Karthik. > > > From: Matthew Knepley > > Date: Wednesday, 29 September 2021 at 16:29 > To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > > Cc: Barry Smith >, "petsc-users at mcs.anl.gov " > > Subject: Re: [petsc-users] %T (percent time in this phase) > > On Wed, Sep 29, 2021 at 10:18 AM Karthikeyan Chockalingam - STFC UKRI > wrote: > Thank you! > > Just to summarize > > KSPSolve (53%) + PCSetup (16%) + DMCreateMat (23%) + MatCUSPARSSolAnl (9%) ~ 100 % > > You didn?t happen to mention how MatCUSPARSSolAnl is accounted for? Am I right in accounting for it as above? > > I am not sure.I thought it might be the GPU part of MatSolve(). I will have to look in the code. I am not as familiar with the GPU part. > > MatCUSPARSSolAnl 2 1.0 3.2338e+01 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 9 0 0 0 0 9 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > > Finally, I believe the vector events, VecNorn, VecTDot, VecAXPY, and VecAYPX are mutually exclusive? > > Yes. > > Thanks, > > Matt > > Best, > > Karthik. > > From: Matthew Knepley > > Date: Wednesday, 29 September 2021 at 11:58 > To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > > Cc: Barry Smith >, "petsc-users at mcs.anl.gov " > > Subject: Re: [petsc-users] %T (percent time in this phase) > > On Wed, Sep 29, 2021 at 6:24 AM Karthikeyan Chockalingam - STFC UKRI > wrote: > Thank you Mathew. Now, it is all making sense to me. > > From data file ksp_ex45_N511_gpu_2.txt > > KSPSolve (53%) + KSPSetup (0%) = PCSetup (16%) + PCApply (37%). > > However, you said ?So an iteration would mostly consist of MatMult + PCApply, with some vector work? > > 1) You do one solve, but 2 KSPSetUp()s. You must be running on more than one process and using Block-Jacobi . Half the time is spent in the solve (53%) > > KSPSetUp 2 1.0 5.3149e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 2.4e+01 0 0 0 0 1 0 0 0 0 1 0 0 0 0.00e+00 0 0.00e+00 0 > KSPSolve 1 1.0 1.5837e+02 1.1 8.63e+11 1.0 6.8e+02 2.1e+06 4.4e+03 53100100100 95 53100100100 96 10881 11730 1022 6.40e+03 1021 8.17e-03 100 > > > 2) The preconditioner look like BJacobi-ILU. The setup time is 16%, which is all setup of the individual blocks, and this is all used by the numerical ILU factorization. > > PCSetUp 2 1.0 4.9623e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 16 0 0 0 0 16 0 0 0 0 58 0 2 6.93e+03 0 0.00e+00 0 PCSetUpOnBlocks 1 1.0 4.9274e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 15 0 0 0 0 15 0 0 0 0 59 0 2 6.93e+03 0 0.00e+00 0 > MatLUFactorNum 1 1.0 4.6126e+01 1.3 1.45e+09 1.0 0.0e+00 0.0e+00 0.0e+00 14 0 0 0 0 14 0 0 0 0 63 0 2 6.93e+03 0 0.00e+00 0 > MatILUFactorSym 1 1.0 2.5110e+00 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0 0 0 0.00e+00 0 0.00e+00 0 > > 3) The preconditioner application takes 37% of the time, which is all solving the factors and recorded in MatSolve(). Matrix multiplication takes 4%. 
> > PCApply 341 1.0 1.3068e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 37 34 0 0 0 37 34 0 0 0 4516 4523 1 5.34e+02 0 0.00e+00 100 > MatSolve 341 1.0 1.3009e+02 1.6 2.96e+11 1.0 0.0e+00 0.0e+00 0.0e+00 36 34 0 0 0 36 34 0 0 0 4536 4538 1 5.34e+02 0 0.00e+00 100 > MatMult 341 1.0 1.0774e+01 1.1 2.96e+11 1.0 6.9e+02 2.1e+06 2.0e+00 4 34100100 0 4 34100100 0 54801 66441 2 5.86e+03 0 0.00e+00 100 > > 4) The significant vector time is all in norms (11%) since they are really slow on the GPU. > > > VecNorm 342 1.0 6.2261e+01129.9 4.57e+10 1.0 0.0e+00 0.0e+00 6.8e+02 11 5 0 0 15 11 5 0 0 15 1466 196884 0 0.00e+00 342 2.74e-03 100 > VecTDot 680 1.0 1.7107e+00 1.3 9.09e+10 1.0 0.0e+00 0.0e+00 1.4e+03 1 10 0 0 29 1 10 0 0 29 106079 133922 0 0.00e+00 680 5.44e-03 100 > VecAXPY 681 1.0 3.2036e+00 1.7 9.10e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 11 0 0 0 1 11 0 0 0 56728 58367 682 5.34e+02 0 0.00e+00 100 > VecAYPX 339 1.0 2.6502e+00 1.8 4.53e+10 1.0 0.0e+00 0.0e+00 0.0e+00 1 5 0 0 0 1 5 0 0 0 34136 34153 339 2.71e-03 0 0.00e+00 100 > > So the solve time is: > > 53% ~ 37% + 4% + 11% > > and the setup time is about 16%. I was wrong about the SetUp time being included, as it is outside the event: > > https://gitlab.com/petsc/petsc/-/blob/main/src/ksp/ksp/interface/itfunc.c#L852 > > It looks like the remainder of the time (23%) is spent preallocating the matrix. > > Thanks, > > Matt > > The MalMult event is 4 %. How does this event figure into the above equation; if preconditioning (MatMult + PCApply) is included in KSPSolve? > > Best, > Karthik. > > From: Matthew Knepley > > Date: Wednesday, 29 September 2021 at 10:58 > To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > > Cc: Barry Smith >, "petsc-users at mcs.anl.gov " > > Subject: Re: [petsc-users] %T (percent time in this phase) > > On Wed, Sep 29, 2021 at 5:52 AM Karthikeyan Chockalingam - STFC UKRI > wrote: > That was helpful. I would like to provide some additional details of my run on cpus and gpus. Please find the following attachments: > > graph.pdf a plot showing overall time and various petsc events. > ksp_ex45_N511_cpu_6.txt data file of the log_summary > ksp_ex45_N511_gpu_2.txt data file of the log_summary > > I used the following petsc options for cpu > > mpirun -n 6 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaij -dm_vec_type mpi -ksp_type cg -pc_type bjacobi -ksp_monitor > > and for gpus > > mpirun -n 1 ./ex45 -log_summary -da_grid_x 511 -da_grid_y 511 -da_grid_z 511 -dm_mat_type mpiaijcusparse -dm_vec_type mpicuda -ksp_type cg -pc_type bjacobi -ksp_monitor > > to run the following problem > > https://petsc.org/release/src/ksp/ksp/tutorials/ex45.c.html > > From the above code, I see is there no individual function called KSPSetUp(), so I gather KSPSetDM, KSPSetComputeInitialGuess, KSPSetComputeRHS, kSPSetComputeOperators all are timed together as KSPSetUp. For this example, is KSPSetUp time and KSPSolve time mutually exclusive? > > No, KSPSetUp() will be contained in KSPSolve() if it is called automatically. > > In your response you said that > > ?PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used.? > > I don?t see a explicit call to PCSetUp() or PCApply() in ex45; so for this particular preconditioner (bjacobi) how can I tell how they are timed? > > They are all inside KSPSolve(). If you have a preconditioned linear solve, the oreconditioning happens during the iteration. 
So an iteration would mostly > consist of MatMult + PCApply, with some vector work. > > I am hoping to time KSP solving and preconditioning mutually exclusively. > > I am not sure that concept makes sense here. See above. > > Thanks, > > Matt > > > Kind regards, > Karthik. > > > From: Barry Smith > > Date: Tuesday, 28 September 2021 at 19:19 > To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > > Cc: "petsc-users at mcs.anl.gov " > > Subject: Re: [petsc-users] %T (percent time in this phase) > > > > > On Sep 28, 2021, at 12:11 PM, Karthikeyan Chockalingam - STFC UKRI > wrote: > > Thanks for Barry for your response. > > I was just benchmarking the problem with various preconditioner on cpu and gpu. I understand, it is not possible to get mutually exclusive timing. > However, can you tell if KSPSolve time includes both PCSetup and PCApply? And if KSPSolve and KSPSetup are mutually exclusive? Likewise for PCSetUp and PCApply. > > If you do not call KSPSetUp() separately from KSPSolve() then its time is included with KSPSolve(). > > PCSetUp() time may be in KSPSetUp() or it maybe in PCApply() it depends on how much of the preconditioner construction can take place early, so depends exactly on the preconditioner used. > > So yes the answer is not totally satisfying. The one thing I would recommend is to not call KSPSetUp() directly and then KSPSolve() will always include the total time of the solve plus all setup time. PCApply will contain all the time to apply the preconditioner but may also include some setup time. > > Barry > > > Best, > Karthik. > > > > > From: Barry Smith > > Date: Tuesday, 28 September 2021 at 16:56 > To: "Chockalingam, Karthikeyan (STFC,DL,HC)" > > Cc: "petsc-users at mcs.anl.gov " > > Subject: Re: [petsc-users] %T (percent time in this phase) > > > > > On Sep 28, 2021, at 10:55 AM, Karthikeyan Chockalingam - STFC UKRI > wrote: > > Hello, > > I ran ex45 in the KPS tutorial, which is a 3D finite-difference Poisson problem. I noticed from the output from using the flag -log_summary that for various events their respective %T (percent time in this phase) do not add up to 100 but rather exceeds 100. So, I gather there is some overlap among these events. I am primarily looking at the events KSPSetUp, KSPSolve, PCSetUp and PCSolve. Is it possible to get a mutually exclusive %T or Time for these individual events? I have attached the log_summary output file from my run for your reference. > > > For nested solvers it is tricky to get the times to be mutually exclusive because some parts of the building of the preconditioner is for some preconditioners delayed until the solve has started. > > It looks like you are using the default preconditioner options which for this example are taking more or less no time since so many iterations are needed. It is best to use -pc_type mg to use geometric multigrid on this problem. > > Barry > > > > > Thanks! > Karthik. > > This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. 
UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. > > > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://www.cse.buffalo.edu/~knepley/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Sep 30 09:52:54 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 30 Sep 2021 10:52:54 -0400 Subject: [petsc-users] PETSc 3.16 release Message-ID: <8C74EED7-7C05-4E27-A2BD-B0B76F71B86B@petsc.dev> We are pleased to announce the release of PETSc version 3.16 at https://petsc.org/release/download/ A list of the major changes and updates can be found at https://petsc.org/release/docs/changes/316 The final update to petsc-3.15 i.e petsc-3.15.5 is also available We recommend upgrading to PETSc 3.16 soon. As always, please report problems to petsc-maint at mcs.anl.gov and ask questions at petsc-users at mcs.anl.gov This release includes contributions from Albert Cowie Alexei Colin Barry Smith Blaise Bourdin Carsten Uphoff Connor Ward Daniel Finn Daniel Shapero Erik Schnetter Fande Kong Hong Zhang Jacob Faibussowitsch Jed Brown Jeremy Tillay Joe Wallwork Joseph Pusztay Jose Roman Junchao Zhang Koki Sagiyama Kyle Gerard Felker Lawrence Mitchell Leila Ghaffari Lisandro Dalcin Mark Adams Martin Diehl Matthew Knepley Matt McGurn Moritz Huck Mr. Hong Zhang nathawani olivecha Pablo Brubeck Patrick Sanan pbrubeck Pierre Jolivet Richard Tran Mills Rylee Sundermann Sajid Ali Sam Reynolds Satish Balay Scott Kruger Stefano Zampini Toby Isaac Todd Munson Vaclav Hapla Yang Zongze Zhao Gang and bug reports/patches/proposed improvements received from Adrian Croucher Alexandre Halbach Bret K. Stanford Chonglin Zhang (@zhangchonglin) cleaf "Constantinescu, Emil M." Damian Marek Daniel Stone Danyang Su David Salac dazza simplythebest Drew Parsons (@RizzerAtGitLab) edgar at openmail.cc Emily Jakobs Emmanuel Ayala Eric Chamberland (@eric.chamberland) Fande Kong Getnet Betrie Haplav hg Iman Datta "Isaac, Tobin G" Jacob Faibussowitsch Jeremy Kozdon Jin Chen Junchao Zhang Lars Corbijn Lawrence Mitchell Lisandro Dalcin "Lundvick, Nick" Mark Adams Martin Diehl Matthew Otten Milan Pelletier Mr. Hong Zhang Nan Ding Pierre Jolivet Pieter Ghysels Qi Tang @qiuchangkai Rezgar Shakeri Rory Johnston Sam Fagbemi Saransh Saxena Sergio Bengoechea Stephen Jardin TAY wee-beng Victor Eijkhout Xiaoye S. Li Yang Liu As always, thanks for your support, Barry -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From aduarteg at utexas.edu Thu Sep 30 15:14:17 2021 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Thu, 30 Sep 2021 15:14:17 -0500 Subject: [petsc-users] PC shell destroy Message-ID: Good afternoon PETSC team, I am currently developing an application for PETSC in which I use my own preconditioner with a PCSHELL. I have successfully set all the functions and the performance of the preconditioner is good. I am using this PCSHELL within a TS object, and it is imperative that the objects in the PCSHELL context are freed every time since the memory requirements of those are large. I have set up the Preconditioner before the TS starts with the following block of code: ---------------------------------------------------------------------------------------------------- ierr = PCSetType(pc,PCSHELL);CHKERRQ(ierr); ierr = ShellPCCreate(&shell);CHKERRQ(ierr); ierr = PCShellSetApply(pc,MatrixFreePreconditioner);CHKERRQ(ierr); ierr = PCShellSetContext(pc,shell);CHKERRQ(ierr); ierr = PCShellSetDestroy(pc,ShellPCDestroy);CHKERRQ(ierr); ierr = PCShellSetName(pc,"MyPreconditioner");CHKERRQ(ierr); ierr = ShellPCSetUp(pc,da,0.0,dt,u,user);CHKERRQ(ierr); ierr = TSSetPreStep(ts,PreStep);CHKERRQ(ierr); ------------------------------------------------------------------------------------------------ The shell context is then updated by using the following code within the TSPreStep function: --------------------------------------------------------------------------- ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr); ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); // Get necessary objects from TS context TSGetTime(ts,&time); TSGetApplicationContext(ts,&user); TSGetSolution(ts,&X); TSGetTimeStep(ts,&dt); TSGetStepNumber(ts, &stepi); TSGetDM(ts,&da); tdt = time+dt; // Update preconditioner context with current values ierr = ShellPCSetUp(pc,da,tdt,dt,X,user);CHKERRQ(ierr); --------------------------------------------------------------------------- I have set up the necessary code in the function ShellPCDestroy to free the objects within this context, however I am unsure when/if this function is called automatically. Do I have to free the context myself after every step? How would I call the function myself? I am running out of memory after a few steps, and I think this shell context is the culprit. In addition to that, is it possible to get what is called the "ashift" dF/dU + a*dF/dU_t in this function from the TS object? https://petsc.org/release/docs/manualpages/TS/TSSetIJacobian.html I need it as an input for my preconditioner (currrently hardcoded for TSBEULER where ashift is always 1/dt). Thank you, -Alfredo -- Alfredo Duarte Graduate Research Assistant The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Thu Sep 30 16:32:57 2021 From: mfadams at lbl.gov (Mark Adams) Date: Thu, 30 Sep 2021 17:32:57 -0400 Subject: [petsc-users] PC shell destroy In-Reply-To: References: Message-ID: You can use PETSc functions to allocate and free memory and then run with -malloc_debug and you will get a printout of memory used and any unfreed memory. Mark On Thu, Sep 30, 2021 at 4:14 PM Alfredo J Duarte Gomez wrote: > Good afternoon PETSC team, > > I am currently developing an application for PETSC in which I use my own > preconditioner with a PCSHELL. > > I have successfully set all the functions and the performance of the > preconditioner is good. 
I am using this PCSHELL within a TS object, and it > is imperative that the objects in the PCSHELL context are freed every time > since the memory requirements of those are large. > > I have set up the Preconditioner before the TS starts with the following > block of code: > > ---------------------------------------------------------------------------------------------------- > > ierr = PCSetType(pc,PCSHELL);CHKERRQ(ierr); > ierr = ShellPCCreate(&shell);CHKERRQ(ierr); > ierr = PCShellSetApply(pc,MatrixFreePreconditioner);CHKERRQ(ierr); > ierr = PCShellSetContext(pc,shell);CHKERRQ(ierr); > ierr = PCShellSetDestroy(pc,ShellPCDestroy);CHKERRQ(ierr); > ierr = PCShellSetName(pc,"MyPreconditioner");CHKERRQ(ierr); > ierr = ShellPCSetUp(pc,da,0.0,dt,u,user);CHKERRQ(ierr); > ierr = TSSetPreStep(ts,PreStep);CHKERRQ(ierr); > > > ------------------------------------------------------------------------------------------------ > > The shell context is then updated by using the following code within the > TSPreStep function: > > --------------------------------------------------------------------------- > > ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr); > ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > > // Get necessary objects from TS context > TSGetTime(ts,&time); > TSGetApplicationContext(ts,&user); > TSGetSolution(ts,&X); > TSGetTimeStep(ts,&dt); > TSGetStepNumber(ts, &stepi); > TSGetDM(ts,&da); > > tdt = time+dt; > // Update preconditioner context with current values > ierr = ShellPCSetUp(pc,da,tdt,dt,X,user);CHKERRQ(ierr); > --------------------------------------------------------------------------- > > I have set up the necessary code in the function ShellPCDestroy to free > the objects within this context, however I am unsure when/if this function > is called automatically. Do I have to free the context myself after every > step? How would I call the function myself? > > I am running out of memory after a few steps, and I think this shell > context is the culprit. > > In addition to that, is it possible to get what is called the "ashift" dF/dU > + a*dF/dU_t in this function from the TS object? > > https://petsc.org/release/docs/manualpages/TS/TSSetIJacobian.html > > I need it as an input for my preconditioner (currrently hardcoded for > TSBEULER where ashift is always 1/dt). > > Thank you, > > -Alfredo > > -- > Alfredo Duarte > Graduate Research Assistant > The University of Texas at Austin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Sep 30 17:08:56 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 30 Sep 2021 18:08:56 -0400 Subject: [petsc-users] PC shell destroy In-Reply-To: References: Message-ID: <3F7565A5-E6BA-4C62-92ED-C5665BFC4B09@petsc.dev> Alfredo, I think the best approach for you to use is to have your own MATSHELL and your own PCSHELL. You will use your MATSHELL as the second matrix argument to TSSetIJacobian(). It should record the current x and the current shift. Your PCSHELL will then, in PCSetUp(), get access to the current x and the current shift from your MATSHELL and build itself. 
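In code, that pattern might look roughly like the sketch below (struct and routine names are invented, error handling trimmed, and mctx->X is assumed to have been created once with VecDuplicate()). TS hands the shift a and the current state to the IJacobian routine, which records them in the shell matrix context; the PCSHELL setup routine then reads them back through PCGetOperators():

typedef struct {
  PetscReal shift;   /* the "a" in dF/dU + a*dF/dU_t */
  Vec       X;       /* current state, created once with VecDuplicate() */
} ShellMatCtx;

PetscErrorCode MyIJacobian(TS ts, PetscReal t, Vec U, Vec Udot, PetscReal a, Mat J, Mat P, void *ctx)
{
  ShellMatCtx    *mctx;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = MatShellGetContext(P, &mctx);CHKERRQ(ierr);
  mctx->shift = a;                          /* no hardcoded 1/dt */
  ierr = VecCopy(U, mctx->X);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

PetscErrorCode MyPCSetUp(PC pc)
{
  Mat             A, P;
  ShellMatCtx    *mctx;
  PetscErrorCode  ierr;

  PetscFunctionBeginUser;
  ierr = PCGetOperators(pc, &A, &P);CHKERRQ(ierr);
  ierr = MatShellGetContext(P, &mctx);CHKERRQ(ierr);
  /* (re)build the preconditioner from mctx->shift and mctx->X here,
     destroying whatever was built on the previous call */
  PetscFunctionReturn(0);
}

/* once, during setup (here the same shell is used for both matrix arguments;
   the first could instead be a different, e.g. matrix-free, operator): */
ierr = MatCreateShell(PETSC_COMM_WORLD, nlocal, nlocal, N, N, &matctx, &P);CHKERRQ(ierr);
ierr = TSSetIJacobian(ts, P, P, MyIJacobian, NULL);CHKERRQ(ierr);
ierr = PCShellSetSetUp(pc, MyPCSetUp);CHKERRQ(ierr);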
In other words most of your > / Get necessary objects from TS context > TSGetTime(ts,&time); > TSGetApplicationContext(ts,&user); > TSGetSolution(ts,&X); > TSGetTimeStep(ts,&dt); > TSGetStepNumber(ts, &stepi); > TSGetDM(ts,&da); > > tdt = time+dt; > // Update preconditioner context with current values > ierr = ShellPCSetUp(pc,da,tdt,dt,X,user);CHKERRQ(ierr); code will disappear and you won't need to mess with the internals of the TS (getting current dt etc) at all. What you need is handed off to your TSSetIJacobian() function which will stick it into your MATSHELL. So nice and clean code. Regarding the PCDestroy() for your PCSHELL. It only gets called when the PC is finally destroyed which is when the TS is destroy. So if building your PC in PCSetUp() requires creating new objects you should destroy any previous ones when you create the new ones, hence "lost" objects won't persist in the code. Barry > On Sep 30, 2021, at 4:14 PM, Alfredo J Duarte Gomez wrote: > > Good afternoon PETSC team, > > I am currently developing an application for PETSC in which I use my own preconditioner with a PCSHELL. > > I have successfully set all the functions and the performance of the preconditioner is good. I am using this PCSHELL within a TS object, and it is imperative that the objects in the PCSHELL context are freed every time since the memory requirements of those are large. > > I have set up the Preconditioner before the TS starts with the following block of code: > ---------------------------------------------------------------------------------------------------- > > ierr = PCSetType(pc,PCSHELL);CHKERRQ(ierr); > ierr = ShellPCCreate(&shell);CHKERRQ(ierr); > ierr = PCShellSetApply(pc,MatrixFreePreconditioner);CHKERRQ(ierr); > ierr = PCShellSetContext(pc,shell);CHKERRQ(ierr); > ierr = PCShellSetDestroy(pc,ShellPCDestroy);CHKERRQ(ierr); > ierr = PCShellSetName(pc,"MyPreconditioner");CHKERRQ(ierr); > ierr = ShellPCSetUp(pc,da,0.0,dt,u,user);CHKERRQ(ierr); > ierr = TSSetPreStep(ts,PreStep);CHKERRQ(ierr); > > ------------------------------------------------------------------------------------------------ > > The shell context is then updated by using the following code within the TSPreStep function: > > --------------------------------------------------------------------------- > > ierr = TSGetSNES(ts,&snes);CHKERRQ(ierr); > ierr = SNESGetKSP(snes,&ksp);CHKERRQ(ierr); > ierr = KSPGetPC(ksp,&pc);CHKERRQ(ierr); > > // Get necessary objects from TS context > TSGetTime(ts,&time); > TSGetApplicationContext(ts,&user); > TSGetSolution(ts,&X); > TSGetTimeStep(ts,&dt); > TSGetStepNumber(ts, &stepi); > TSGetDM(ts,&da); > > tdt = time+dt; > // Update preconditioner context with current values > ierr = ShellPCSetUp(pc,da,tdt,dt,X,user);CHKERRQ(ierr); > --------------------------------------------------------------------------- > > I have set up the necessary code in the function ShellPCDestroy to free the objects within this context, however I am unsure when/if this function is called automatically. Do I have to free the context myself after every step? How would I call the function myself? > > I am running out of memory after a few steps, and I think this shell context is the culprit. > > In addition to that, is it possible to get what is called the "ashift" dF/dU + a*dF/dU_t in this function from the TS object? > > https://petsc.org/release/docs/manualpages/TS/TSSetIJacobian.html > > I need it as an input for my preconditioner (currrently hardcoded for TSBEULER where ashift is always 1/dt). 
> > Thank you, > > -Alfredo > > -- > Alfredo Duarte > Graduate Research Assistant > The University of Texas at Austin -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Thu Sep 30 17:16:38 2021 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Thu, 30 Sep 2021 15:16:38 -0700 Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system In-Reply-To: References: Message-ID: >> If you want to solve systems accurately, you should non-dimensionalize the system prior to discretization. This would mean that your C and b have elements in the [1, D] range, where D is the dynamic range of your problem, say 1e4, rather than these huge numbers you have now. @Matt: We have done non-dimensionalization and the diagonal matrix ranges from 1 to 1e4 now. Still it takes 4-5 iterations to converge for the non-dimensional diagonal matrix. The convergence trend is looking much better now, though: Residual norms for temperature_ solve. 0 KSP preconditioned resid norm 4.724547545716e-04 true resid norm 2.529423250889e+00 ||r(i)||/||b|| 4.397759655853e-05 1 KSP preconditioned resid norm 6.504853596318e-06 true resid norm 2.197130494439e-02 ||r(i)||/||b|| 3.820021755431e-07 2 KSP preconditioned resid norm 7.733420341215e-08 true resid norm 3.539290481432e-04 ||r(i)||/||b|| 6.153556501117e-09 3 KSP preconditioned resid norm 6.419092250844e-10 true resid norm 5.220398494466e-06 ||r(i)||/||b|| 9.076400273607e-11 4 KSP preconditioned resid norm 5.095955157158e-12 true resid norm 2.484163999489e-08 ||r(i)||/||b|| 4.319070053474e-13 5 KSP preconditioned resid norm 6.828200916501e-14 true resid norm 2.499229854610e-10 ||r(i)||/||b|| 4.345264170970e-15 Linear temperature_ solve converged due to CONVERGED_RTOL iterations 5 Only when all the equations are scaled individually the convergence is achieved in a single iteration. In the above, all equations are scaled using the same non-dimensional parameter. Do you think this is reasonable or do you expect the diagonal system to converge in a single iteration irrespective of the range of diagonal entries? @Barry: > > > What is the result of -ksp_view on the solve? > KSP Object: (temperature_) 1 MPI processes type: gmres restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with one step of iterative refinement when needed happy breakdown tolerance 1e-30 maximum iterations=1000, nonzero initial guess tolerances: relative=1e-12, absolute=1e-50, divergence=10000. left preconditioning using PRECONDITIONED norm type for convergence test PC Object: (temperature_) 1 MPI processes type: shell IEPSemiImplicitHierarchyIntegrator::helmholtz_precond::Temperature linear system matrix = precond matrix: Mat Object: 1 MPI processes type: shell rows=1, cols=1 > > The way you describe your implementation it does not sound like > standard PETSc practice. > Yes, we do it differently in IBAMR. Succinctly, the main solver is a matrix-free one, whereas the preconditioner is a FAC multigrid solver with its bottom solver formed on the coarsest level of AMR grid using PETSc (matrix-based KSP). In the above -ksp_view temperature_ is the matrix-free KSP solver and IEPSemiImplicitHierarchyIntegrator::helmholtz_precond is the FAC preconditioner. 
> > With PETSc using a matrix-free operation mA and a matrix from which KSP > will build the preconditioner A one uses KSPSetOperator(ksp,mA,A); and > then just selects the preconditioner with -pc_type xxx For example to use > Jacobi preconditioning one uses -pc_type jacobi (note that this only uses > the diagonal of A, the rest of A is never used). > We run -pc_type jacobi on the bottom solver of the FAC preconditioner. > > If you wish to precondition mA by fully solving with the matrix A one can > use -ksp_monitor_true_residual -pc_type ksp -ksp_ksp_type yyy -ksp_pc_type > xxx -ksp_ksp_monitor_true_residual with, for example, yyy of richardson > and xxx of jacobi > Yes, this is what we do. > > Barry > > > > > To verify that I am indeed solving a diagonal system I printed the PETSc > matrix from the preconditioner and viewed it in Matlab. It indeed shows it > to be a diagonal system. Attached is the plot of the spy command on the > printed matrix. The matrix in binary form is also attached. > > My understanding is that because the C coefficient is varying in 4 orders > of magnitude, i.e., Max(C)/Min(C) ~ 10^4, the matrix is poorly scaled. When > I rescale my matrix by 1/C then the system converges in 1 iteration as > expected. Is my understanding correct, and that scaling 1/C should be done > even for a diagonal system? > > When D is non-zero, then scaling by 1/C seems to be very inconvenient as D > is stored as side-centered data for the matrix free solver. > > In the case that I do not scale my equations by 1/C, is there some solver > setting that improves the convergence rate? (With D as non-zero, I have > also tried gmres as the ksp solver in the matrix-based preconditioner to > get better performance, but it didn't matter much.) > > > Thanks, > Ramakrishnan Thirumalaisamy > San Diego State University. > > > > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Sep 30 17:34:50 2021 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 30 Sep 2021 18:34:50 -0400 Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system In-Reply-To: References: Message-ID: <00A92945-C009-4A92-B7E2-909B1783CCF4@petsc.dev> > On Sep 30, 2021, at 6:16 PM, Amneet Bhalla wrote: > > > >> If you want to solve systems accurately, you should non-dimensionalize the system prior to discretization. This would mean that > your C and b have elements in the [1, D] range, where D is the dynamic range of your problem, say 1e4, rather than these huge > numbers you have now. > > @Matt: We have done non-dimensionalization and the diagonal matrix ranges from 1 to 1e4 now. Still it takes 4-5 iterations to converge for the non-dimensional diagonal matrix. The convergence trend is looking much better now, though: > > Residual norms for temperature_ solve. 
> 0 KSP preconditioned resid norm 4.724547545716e-04 true resid norm 2.529423250889e+00 ||r(i)||/||b|| 4.397759655853e-05
> 1 KSP preconditioned resid norm 6.504853596318e-06 true resid norm 2.197130494439e-02 ||r(i)||/||b|| 3.820021755431e-07
> 2 KSP preconditioned resid norm 7.733420341215e-08 true resid norm 3.539290481432e-04 ||r(i)||/||b|| 6.153556501117e-09
> 3 KSP preconditioned resid norm 6.419092250844e-10 true resid norm 5.220398494466e-06 ||r(i)||/||b|| 9.076400273607e-11
> 4 KSP preconditioned resid norm 5.095955157158e-12 true resid norm 2.484163999489e-08 ||r(i)||/||b|| 4.319070053474e-13
> 5 KSP preconditioned resid norm 6.828200916501e-14 true resid norm 2.499229854610e-10 ||r(i)||/||b|| 4.345264170970e-15
> Linear temperature_ solve converged due to CONVERGED_RTOL iterations 5
>
> Only when all the equations are scaled individually is convergence achieved in a single iteration. In the above, all equations are scaled using the same non-dimensional parameter. Do you think this is reasonable, or do you expect the diagonal system to converge in a single iteration irrespective of the range of the diagonal entries?

   For a diagonal system with this modest range of values Jacobi should converge in a single iteration.

   The output below is confusing: it is a system with 1 variable and should definitely converge in one iteration. I am concerned we may be talking apples and oranges here and your test may not be as simple as you think it is (with regard to the diagonal).

> @Barry:
>
>> What is the result of -ksp_view on the solve?
>
> KSP Object: (temperature_) 1 MPI processes
>   type: gmres
>     restart=30, using Classical (unmodified) Gram-Schmidt Orthogonalization with one step of iterative refinement when needed
>     happy breakdown tolerance 1e-30
>   maximum iterations=1000, nonzero initial guess
>   tolerances: relative=1e-12, absolute=1e-50, divergence=10000.
>   left preconditioning
>   using PRECONDITIONED norm type for convergence test
> PC Object: (temperature_) 1 MPI processes
>   type: shell
>     IEPSemiImplicitHierarchyIntegrator::helmholtz_precond::Temperature
>   linear system matrix = precond matrix:
>   Mat Object: 1 MPI processes
>     type: shell
>     rows=1, cols=1
>
>> The way you describe your implementation, it does not sound like standard PETSc practice.
>
> Yes, we do it differently in IBAMR. Succinctly, the main solver is a matrix-free one, whereas the preconditioner is a FAC multigrid solver with its bottom solver formed on the coarsest level of the AMR grid using PETSc (a matrix-based KSP).
>
> In the above -ksp_view output, temperature_ is the matrix-free KSP solver and IEPSemiImplicitHierarchyIntegrator::helmholtz_precond is the FAC preconditioner.
>
>> With PETSc using a matrix-free operation mA and a matrix A from which KSP will build the preconditioner, one uses KSPSetOperators(ksp,mA,A); and then just selects the preconditioner with -pc_type xxx. For example, to use Jacobi preconditioning one uses -pc_type jacobi (note that this only uses the diagonal of A; the rest of A is never used).
>
> We run -pc_type jacobi on the bottom solver of the FAC preconditioner.
>
>> If you wish to precondition mA by fully solving with the matrix A, one can use -ksp_monitor_true_residual -pc_type ksp -ksp_ksp_type yyy -ksp_pc_type xxx -ksp_ksp_monitor_true_residual with, for example, yyy of richardson and xxx of jacobi.
>
> Yes, this is what we do.
>
>> Barry
>
>> To verify that I am indeed solving a diagonal system I printed the PETSc matrix from the preconditioner and viewed it in Matlab.
>> It indeed shows it to be a diagonal system. Attached is the plot of the spy command on the printed matrix. The matrix in binary form is also attached.
>>
>> My understanding is that because the C coefficient varies over 4 orders of magnitude, i.e., Max(C)/Min(C) ~ 10^4, the matrix is poorly scaled. When I rescale my matrix by 1/C, the system converges in 1 iteration as expected. Is my understanding correct, and should the scaling by 1/C be done even for a diagonal system?
>>
>> When D is non-zero, scaling by 1/C seems to be very inconvenient, as D is stored as side-centered data for the matrix-free solver.
>>
>> In the case that I do not scale my equations by 1/C, is there some solver setting that improves the convergence rate? (With D non-zero, I have also tried gmres as the ksp solver in the matrix-based preconditioner to get better performance, but it didn't matter much.)
>>
>> Thanks,
>> Ramakrishnan Thirumalaisamy
>> San Diego State University.
>
> --
> --Amneet
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From mail2amneet at gmail.com Thu Sep 30 17:58:02 2021
From: mail2amneet at gmail.com (Amneet Bhalla)
Date: Thu, 30 Sep 2021 15:58:02 -0700
Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system
In-Reply-To: <00A92945-C009-4A92-B7E2-909B1783CCF4@petsc.dev>
References: <00A92945-C009-4A92-B7E2-909B1783CCF4@petsc.dev>
Message-ID: 

> For a diagonal system with this modest range of values Jacobi should converge in a single iteration.

This is what I wanted to confirm (and it matches my expectation). There could be a bug in the way we are setting up the linear operators in the preconditioner and the matrix-free solver. We need to do some debugging.

> (with regard to the diagonal).

We have printed the matrix and viewed it in Matlab. It is a diagonal matrix.
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From knepley at gmail.com Thu Sep 30 19:48:43 2021
From: knepley at gmail.com (Matthew Knepley)
Date: Thu, 30 Sep 2021 20:48:43 -0400
Subject: [petsc-users] Convergence rate for spatially varying Helmholtz system
In-Reply-To: 
References: <00A92945-C009-4A92-B7E2-909B1783CCF4@petsc.dev>
Message-ID: 

On Thu, Sep 30, 2021 at 6:58 PM Amneet Bhalla wrote:

>> For a diagonal system with this modest range of values Jacobi should converge in a single iteration.
>
> This is what I wanted to confirm (and it matches my expectation). There could be a bug in the way we are setting up the linear operators in the preconditioner and the matrix-free solver. We need to do some debugging.
>
>> (with regard to the diagonal).
>
> We have printed the matrix and viewed it in Matlab. It is a diagonal matrix.

Can you send us the matrix? This should definitely converge in 1 iteration now, so something I do not understand is going on. I will take any format you've got :)

  Thanks,

     Matt

-- 
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://www.cse.buffalo.edu/~knepley/
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
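For sharing the matrix, one convenient route is PETSc's binary format. Below is a minimal sketch (the helper name DumpMatrixBinary and the file name are made up for illustration) that writes a Mat through a binary viewer with MatView; the file can then be read back with MatLoad in another program.

/* Illustrative sketch: write a (preconditioner) matrix to PETSc binary format
   so it can be attached to an email and reloaded elsewhere with MatLoad. */
#include <petscmat.h>

PetscErrorCode DumpMatrixBinary(Mat A, const char filename[])
{
  PetscViewer    viewer;
  PetscErrorCode ierr;

  PetscFunctionBeginUser;
  ierr = PetscViewerBinaryOpen(PetscObjectComm((PetscObject)A), filename, FILE_MODE_WRITE, &viewer);CHKERRQ(ierr);
  ierr = MatView(A, viewer);CHKERRQ(ierr);
  ierr = PetscViewerDestroy(&viewer);CHKERRQ(ierr);
  PetscFunctionReturn(0);
}

/* Reading it back in another program:
 *
 *   Mat         A;
 *   PetscViewer viewer;
 *   MatCreate(PETSC_COMM_WORLD, &A);
 *   MatSetType(A, MATAIJ);
 *   PetscViewerBinaryOpen(PETSC_COMM_WORLD, "precond_mat.dat", FILE_MODE_READ, &viewer);
 *   MatLoad(A, viewer);
 *   PetscViewerDestroy(&viewer);
 */

If I remember correctly, the same binary file can also be inspected from MATLAB or Python with the PetscBinaryRead.m / PetscBinaryIO.py helpers that ship with PETSc, but please double-check their location against your installation.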