From aldo.bonfiglioli at unibas.it Sun Feb 2 14:59:12 2025
From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli)
Date: Sun, 2 Feb 2025 21:59:12 +0100
Subject: [petsc-users] markers
Message-ID: <5548e3b0-2c5a-40be-a734-c1e2a5084f38@unibas.it>

Dear all,
it is unclear to me what is being flagged with the "marker" flag when a mesh is created using DMPlexCreate. The 2D triangular mesh pictured in the enclosed pdf has the following features:

> DM Object: 2D plex 1 MPI process
> type: plex
> 2D plex in 2 dimensions:
> Number of 0-cells per rank: 12
> Number of 1-cells per rank: 23
> Number of 2-cells per rank: 12
> Labels:
> celltype: 3 strata with value/size (0 (12), 1 (23), 3 (12))
> depth: 3 strata with value/size (0 (12), 1 (23), 2 (12))
> marker: 1 strata with value/size (1 (20))
> Face Sets: 4 strata with value/size (1 (3), 2 (2), 3 (3), 4 (2))
>
i.e. 12 gridpoints, 23 edges and 12 triangular cells.

When I call DMGetStratumSize at stratum 0 to 2, this is what I get.

> CreateSectionAlternate DMGetStratumSize found 0 'marker' points @ depth 0 on PE# 0
> CreateSectionAlternate DMGetStratumSize found 20 'marker' points @ depth 1 on PE# 0
> CreateSectionAlternate DMGetStratumSize found 0 'marker' points @ depth 2 on PE# 0
>
Is the marker flagging boundary edges or boundary vertices (nodes)? In any case, why are there 20, instead of 10? Finally: I believe face sets refers to boundary faces, where each side of the square domain has been given a different flag. How do I access the face sets information?

Thanks,
Aldo

--
Dr. Aldo Bonfiglioli
Associate professor of Fluid Machines
Scuola di Ingegneria
Universita' della Basilicata
V.le dell'Ateneo lucano, 10 85100 Potenza ITALY
tel:+39.0971.205203 fax:+39.0971.205215
web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!c6_YYKL6k6FdMvXiZuhIJ2t5rbgB45aqicDR0WpLsZSgrs7NO0SMggi_L8KB2ZKd-w8JlcGQJTFm-uotq1RhZRDxD575sfToW2A$

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compressed_PDF_file.pdf
Type: application/pdf
Size: 69121 bytes
Desc: not available
URL:

From knepley at gmail.com Sun Feb 2 15:49:10 2025
From: knepley at gmail.com (Matthew Knepley)
Date: Sun, 2 Feb 2025 16:49:10 -0500
Subject: [petsc-users] markers
In-Reply-To: <5548e3b0-2c5a-40be-a734-c1e2a5084f38@unibas.it>
References: <5548e3b0-2c5a-40be-a734-c1e2a5084f38@unibas.it>
Message-ID:

On Sun, Feb 2, 2025 at 3:59 PM Aldo Bonfiglioli wrote:

> Dear all,
>
> it is unclear to me what is being flagged with the "marker" flag when a
> mesh is created using DMPlexCreate.
>
> The 2D triangular mesh pictured in the enclosed pdf has the following
> features:
>
> DM Object: 2D plex 1 MPI process
> type: plex
> 2D plex in 2 dimensions:
> Number of 0-cells per rank: 12
> Number of 1-cells per rank: 23
> Number of 2-cells per rank: 12
> Labels:
> celltype: 3 strata with value/size (0 (12), 1 (23), 3 (12))
> depth: 3 strata with value/size (0 (12), 1 (23), 2 (12))
> marker: 1 strata with value/size (1 (20))
> Face Sets: 4 strata with value/size (1 (3), 2 (2), 3 (3), 4 (2))
>
> i.e. 12 gridpoints, 23 edges and 12 triangular cells.
>
> When I call DMGetStratumSize at stratum 0 to 2, this is what I get.
> > CreateSectionAlternate DMGetStratumSize found 0 'marker' > points @ depth 0 on PE# 0 > CreateSectionAlternate DMGetStratumSize found 20 'marker' > points @ depth 1 on PE# 0 > CreateSectionAlternate DMGetStratumSize found 0 'marker' > points @ depth 2 on PE# 0 > > Is the marker flagging boundary edges or boundary vertices (nodes) ? In > any case, why are there 20, instead of 10? > By default the "marker" label marks all k-cells on the boundary. In this case it means 10 vertices + 10 edges = 20 points You can see what is in the label using -dm_view -dm_plex_view_labels marker with DMViewFromOptions(dm, NULL, "-dm_view") in your code. Finally: I believe face sets refers to boundary faces, where each side of > the square domain has been given a different flag. > > Yes. > How do I access the face sets information ? > DMLabel label; PetscCall(DMGetLabel(dm, "Face Sets", &label)); Thanks, Matt > Thanks, > > Aldo > > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Machines > Scuola di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Zs0PBHmkwRS6C-HyPNWsfCXfxTmHZ51FTqXs6A-ujrSlUUYpguXpO2Cg1tShryNL4k0RYpZVhUjgM7T7j789$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Zs0PBHmkwRS6C-HyPNWsfCXfxTmHZ51FTqXs6A-ujrSlUUYpguXpO2Cg1tShryNL4k0RYpZVhUjgM-avqGtk$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Daniel.Abele at dlr.de Sun Feb 2 12:09:17 2025 From: Daniel.Abele at dlr.de (Daniel.Abele at dlr.de) Date: Sun, 2 Feb 2025 18:09:17 +0000 Subject: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm) Message-ID: Hi, we are solving a time dependent problem with a single KSP in every time step. We are debating which convergence criterion to use. Is there general guidance around when to use one of the norms with initial residual (ksp_converged_use_initial_residual_norm or ksp_converged_use_min_initial_residual_norm) over the default norm? If I understand the formulas correctly, the initial residual norm "norm(b - A * x0)" (maybe add preconditioning) means that if you have a very good initial guess (as is often the case in time dependent problems if you can use the result if the last time step as initial guess), the norm is much stricter than the default norm "norm(b)". Is this meant as a way to control error accumulation over time? Or does it have some other purpose? Thanks and Regards, Daniel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Feb 3 10:16:36 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 3 Feb 2025 11:16:36 -0500 Subject: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm) In-Reply-To: References: Message-ID: <6E5AF340-30B3-4C37-965A-26E0638BA64B@petsc.dev> > On Feb 2, 2025, at 1:09?PM, Daniel.Abele--- via petsc-users wrote: > > Hi, > we are solving a time dependent problem with a single KSP in every time step. We are debating which convergence criterion to use. 
> Is there general guidance around when to use one of the norms with initial residual (ksp_converged_use_initial_residual_norm or ksp_converged_use_min_initial_residual_norm) over the default norm?
> If I understand the formulas correctly, the initial residual norm "norm(b - A * x0)" (maybe add preconditioning) means that if you have a very good initial guess (as is often the case in time dependent problems if you can use the result if the last time step as initial guess), the norm is much stricter than the default norm "norm(b)".

The above statement is correct.

> Is this meant as a way to control error accumulation over time? Or does it have some other purpose?
> Thanks and Regards,
> Daniel

For splitting-type methods, you want the error in the linear system solve, e, to be on the same order as the maximum error from the splitting, the explicit time-step discretization, and the implicit time-step discretization. So you need some estimate of that value. Now

    || e ||_2 < || B(b - Ax) ||_2 / \lambda_min(BA).

For this you need a handle on \lambda_min(BA), which you can obtain with Lanczos using -ksp_monitor_singular_values. So the convergence criterion should really depend on setting the -ksp_atol using the required bound on || e ||_2 and \lambda_min(BA). PETSc should provide this convergence test but no one has gotten around to adding it.
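A test along those lines can be wired in today with KSPSetConvergenceTest(). Below is a rough, untested sketch; the struct, the function name, and the numbers are purely illustrative, and the estimate of \lambda_min(BA) has to come from you, for example read off from an earlier solve with -ksp_monitor_singular_values.

#include <petscksp.h>

typedef struct {
  PetscReal tol_e;      /* required bound on || e ||_2      */
  PetscReal lambda_min; /* your estimate of \lambda_min(BA) */
} ErrBoundCtx;

/* Declare convergence once rnorm / lambda_min, an upper bound on the error
   || e ||_2, drops below the requested tolerance. For left-preconditioned
   methods (the GMRES default) rnorm is || B(b - Ax) ||_2, so a good initial
   guess can already satisfy this at iteration 0. */
static PetscErrorCode KSPConvergedErrorBound(KSP ksp, PetscInt it, PetscReal rnorm, KSPConvergedReason *reason, void *ctx)
{
  ErrBoundCtx *ec = (ErrBoundCtx *)ctx;

  PetscFunctionBeginUser;
  *reason = KSP_CONVERGED_ITERATING;
  if (rnorm / ec->lambda_min <= ec->tol_e) *reason = KSP_CONVERGED_ATOL;
  PetscFunctionReturn(PETSC_SUCCESS);
}

and, before KSPSolve(),

  ErrBoundCtx ec = {1e-8, 1e-2}; /* illustrative values; ec must outlive the solves */
  PetscCall(KSPSetConvergenceTest(ksp, KSPConvergedErrorBound, &ec, NULL));

Note that this replaces the default rtol/atol/dtol logic entirely, so only do it when the error bound is really what you want to control.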
Barry

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From anna.dalklint at solid.lth.se Tue Feb 4 06:32:33 2025
From: anna.dalklint at solid.lth.se (Anna Dalklint)
Date: Tue, 4 Feb 2025 12:32:33 +0000
Subject: Re: [petsc-users] Visualizing higher order finite element output in ParaView
In-Reply-To: <87a5b6lvb2.fsf@jedbrown.org>
References: <875xlxgumq.fsf@jedbrown.org> <87a5b6lvb2.fsf@jedbrown.org>
Message-ID:

Thank you, that worked!

Best,
Anna

From: Jed Brown
Date: Friday, 31 January 2025 at 18:58
To: Anna Dalklint , Matthew Knepley
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Visualizing higher order finite element output in ParaView

Anna Dalklint writes:
> I want to save e.g. the discretized displacement field obtained from a quasi-static non-linear finite element simulation using 10 node tetrahedral elements (i.e. which has edge dofs). As mentioned, I use PetscSection to add the additional dofs on edges. I have also written my own Newton solver, i.e. I do not use SNES. In conclusion, what I want is to be able to save the discretized displacement field in each outer iteration of the Newton loop (where I increase the pseudo-time, i.e. scaling of the load). I would then preferably be able to load a stack of these files (call them u001, u002, u003, ... for each 'load-step') and step in 'time' in ParaView.

Please use DMSetOutputSequenceNumber to record step number. You can either use one PetscViewer of type CGNS and call VecView in your loading loop or you can write a sequence of files by creating a new PetscViewer each time.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Daniel.Abele at dlr.de Tue Feb 4 12:01:22 2025
From: Daniel.Abele at dlr.de (Daniel.Abele at dlr.de)
Date: Tue, 4 Feb 2025 18:01:22 +0000
Subject: Re: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm)
In-Reply-To: <6E5AF340-30B3-4C37-965A-26E0638BA64B@petsc.dev>
References: <6E5AF340-30B3-4C37-965A-26E0638BA64B@petsc.dev>
Message-ID:

Hi Barry,
thanks for the reply. We are not using any operator splitting method. Can I take from your answer that the initial residual norm is not recommended then? We are solving a diffusion equation with FD and implicit time stepping (mostly Euler, sometimes Crank-Nicolson).

Regards,
Daniel

From: Barry Smith
Sent: Monday, 3 February 2025 17:17
To: Abele, Daniel
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm)

On Feb 2, 2025, at 1:09 PM, Daniel.Abele--- via petsc-users wrote:

Hi,
we are solving a time dependent problem with a single KSP in every time step. We are debating which convergence criterion to use. Is there general guidance around when to use one of the norms with initial residual (ksp_converged_use_initial_residual_norm or ksp_converged_use_min_initial_residual_norm) over the default norm? If I understand the formulas correctly, the initial residual norm "norm(b - A * x0)" (maybe add preconditioning) means that if you have a very good initial guess (as is often the case in time dependent problems if you can use the result if the last time step as initial guess), the norm is much stricter than the default norm "norm(b)".

The above statement is correct.

Is this meant as a way to control error accumulation over time? Or does it have some other purpose?
Thanks and Regards,
Daniel

For splitting-type methods, you want the error in the linear system solve, e, to be on the same order as the maximum error from the splitting, the explicit time-step discretization, and the implicit time-step discretization. So you need some estimate of that value. Now

    || e ||_2 < || B(b - Ax) ||_2 / \lambda_min(BA).

For this you need a handle on \lambda_min(BA), which you can obtain with Lanczos using -ksp_monitor_singular_values. So the convergence criterion should really depend on setting the -ksp_atol using the required bound on || e ||_2 and \lambda_min(BA). PETSc should provide this convergence test but no one has gotten around to adding it.

Barry

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matteo.semplice at uninsubria.it Wed Feb 5 11:26:50 2025
From: matteo.semplice at uninsubria.it (Matteo Semplice)
Date: Wed, 5 Feb 2025 18:26:50 +0100
Subject: [petsc-users] DMUninterpolate of periodic mesh
Message-ID:

Dear all,
I have updated a code of mine to PETSc 3.22; when trying to uninterpolate a periodic mesh I get the following error

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: No support for this operation for this object type
[0]PETSC ERROR: Missing local coordinate vector
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR:   Option left: name:-dm_view (no value) source: command line
[0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!faxBLw4JxiiCgvgXrNR1BSgniTuJMn0UxlQ7HDbFsTvh1NfjdD1Od-Ac3koxw3kaxHCmCf6MByHS5ty49s-1W9xW81DQIhYr8dp4Zw$ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.22.0, unknown
[0]PETSC ERROR: ./testDM with 1 MPI process(es) and PETSC_ARCH on signalkuppe by matteo Wed Feb
5 18:21:24 2025 [0]PETSC ERROR: Configure options: --prefix=/home/matteo/software/petscsaved/3.22-opt/ PETSC_DIR=/home/matteo/software/petsc --PETSC_ARCH=opt --with-debugging=0 --COPTFLAGS="-O3 -march=native -mtune=native -mavx2" --CXXOPTFLAGS="-O3 -march=native -mtune=native -mavx2" --FOPTFLAGS="-O3 -march=native -mtune=native -mavx2" --with-strict-petscerrorcode --download-hdf5 --download-ml --with-metis --with-parmetis --with-gmsh --with-triangle --with-zlib --with-p4est-dir=~/software/p4est/local/ [0]PETSC ERROR: #1 DMLocalizeCoordinates() at /home/matteo/software/petsc/src/dm/interface/dmperiodicity.c:368 [0]PETSC ERROR: #2 DMPlexCopy_Internal() at /home/matteo/software/petsc/src/dm/impls/plex/plexcreate.c:37 [0]PETSC ERROR: #3 DMPlexUninterpolate() at /home/matteo/software/petsc/src/dm/impls/plex/plexinterpolate.c:1892 [0]PETSC ERROR: #4 main() at ../src/testDM.cpp:32 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -dm_view (source: command line) [0]PETSC ERROR: -dm_view_early (source: command line) [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Basically I am creating a quad mesh on [0,1]x[0,1] with the command ? PetscCall( DMPlexCreateBoxMesh(MPI_COMM_WORLD, dim, simplex, faces, lower, upper, periodicity, interpolate, 0, PETSC_TRUE, &dmMesh) ); and then calling ? PetscCall( DMPlexUninterpolate(dmMesh, &dmMeshUnint) ); raises the error. This seems independent of the values that I pass for the new parameters localizationHeight and sparseLocalize. Do I need to change something else in addition to adding the new parameters in the DMPlexCreateBoxMesh call? Thanks in advance ??? Matteo From knepley at gmail.com Wed Feb 5 11:35:26 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Feb 2025 12:35:26 -0500 Subject: [petsc-users] DMUninterpolate of periodic mesh In-Reply-To: References: Message-ID: On Wed, Feb 5, 2025 at 12:27?PM Matteo Semplice via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear all > > I have updated a code of mine to Petsc3.22, when trying to > uninterpolate a periodic mesh and get the following error > This looks like a bug. I will fix it. Note that you can get what you want by passing interpolate = PETSC_FALSE Thanks, Matt > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Missing local coordinate vector > [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the > program crashed before usage or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-dm_view (no value) source: command > line > [0]PETSC ERROR: See > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!faxBLw4JxiiCgvgXrNR1BSgniTuJMn0UxlQ7HDbFsTvh1NfjdD1Od-Ac3koxw3kaxHCmCf6MByHS5ty49s-1W9xW81DQIhYr8dp4Zw$ > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.22.0, unknown > [0]PETSC ERROR: ./testDM with 1 MPI process(es) and PETSC_ARCH on > signalkuppe by matteo Wed Feb 5 18:21:24 2025 > [0]PETSC ERROR: Configure options: > --prefix=/home/matteo/software/petscsaved/3.22-opt/ > PETSC_DIR=/home/matteo/software/petsc --PETSC_ARCH=opt > --with-debugging=0 --COPTFLAGS="-O3 -march=native -mtune=native -mavx2" > --CXXOPTFLAGS="-O3 -march=native -mtune=native -mavx2" --FOPTFLAGS="-O3 > -march=native -mtune=native -mavx2" --with-strict-petscerrorcode > --download-hdf5 --download-ml --with-metis --with-parmetis --with-gmsh > --with-triangle --with-zlib --with-p4est-dir=~/software/p4est/local/ > [0]PETSC ERROR: #1 DMLocalizeCoordinates() at > /home/matteo/software/petsc/src/dm/interface/dmperiodicity.c:368 > [0]PETSC ERROR: #2 DMPlexCopy_Internal() at > /home/matteo/software/petsc/src/dm/impls/plex/plexcreate.c:37 > [0]PETSC ERROR: #3 DMPlexUninterpolate() at > /home/matteo/software/petsc/src/dm/impls/plex/plexinterpolate.c:1892 > [0]PETSC ERROR: #4 main() at ../src/testDM.cpp:32 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -dm_view (source: command line) > [0]PETSC ERROR: -dm_view_early (source: command line) > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > Basically I am creating a quad mesh on [0,1]x[0,1] with the command > > PetscCall( DMPlexCreateBoxMesh(MPI_COMM_WORLD, dim, simplex, faces, > lower, upper, periodicity, interpolate, 0, PETSC_TRUE, &dmMesh) ); > > and then calling > > PetscCall( DMPlexUninterpolate(dmMesh, &dmMeshUnint) ); > > raises the error. This seems independent of the values that I pass for > the new parameters localizationHeight and sparseLocalize. > > Do I need to change something else in addition to adding the new > parameters in the DMPlexCreateBoxMesh call? > > Thanks in advance > > Matteo > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YZMEed4K2IwyEQhBwJUypWI3aKs5Wmwawwo7BynU-0Vn_H-W4_qFcSC3h-JFkofKZsXlP63vBUj4owuXbY5T$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Fri Feb 7 04:05:56 2025 From: medane.tchakorom at univ-fcomte.fr (medane.tchakorom at univ-fcomte.fr) Date: Fri, 7 Feb 2025 11:05:56 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix Message-ID: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> Dear all, I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. 
---------------- PetscInt nlines = 8; // lines PetscInt ncols = 4; // columns PetscMPIInt rank; PetscMPIInt size; // Initialize PETSc PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); Mat R_full; Mat R_part; PetscInt idx_first_row = 0; PetscInt idx_one_plus_last_row = nlines / 2; PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); // Get sub matrix PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); // Add entries to sub matrix MatSetRandom(R_part, NULL); //View sub matrix PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); // Get array from sub matrix and print entries PetscScalar *buffer; PetscCall(MatDenseGetArray(R_part, &buffer)); PetscInt idx_end = (nlines/2) * ncols; for (int i = 0; i < idx_end; i++) { PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); } //Restore array to sub matrix PetscCall(MatDenseRestoreArray(R_part, &buffer)); // Restore sub matrix PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); // View the initial matrix PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); PetscCall(MatDestroy(&R_full)); PetscCall(PetscFinalize()); return 0; ---------------- Thanks Medane From pierre at joliv.et Fri Feb 7 04:34:36 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 7 Feb 2025 11:34:36 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> Message-ID: > On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: > > > Dear all, > > I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. What is incoherent? Everything looks OK to me. 
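In case the confusion is about the raw array you get from MatDenseGetArray(): the sub-matrix returned by MatDenseGetSubMatrix() shares the storage of the parent matrix, so the array is column-major with the leading dimension of the parent (with your sizes, 8 local rows on one process), not the 4 rows of the sub-matrix. A small, untested sketch that walks only the sub-matrix entries (same variable names as in your code):

  PetscInt           m, n, lda, i, j;
  const PetscScalar *a;

  PetscCall(MatGetLocalSize(R_part, &m, &n));
  PetscCall(MatDenseGetLDA(R_part, &lda)); /* lda = 8 here, the parent's local row count */
  PetscCall(MatDenseGetArrayRead(R_part, &a));
  for (j = 0; j < n; j++)
    for (i = 0; i < m; i++)
      PetscCall(PetscPrintf(PETSC_COMM_SELF, "R_part(%" PetscInt_FMT ",%" PetscInt_FMT ") = %g\n", i, j, (double)PetscRealPart(a[i + j * lda])));
  PetscCall(MatDenseRestoreArrayRead(R_part, &a));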
Thanks, Pierre > ---------------- > > PetscInt nlines = 8; // lines > PetscInt ncols = 4; // columns > PetscMPIInt rank; > PetscMPIInt size; > > // Initialize PETSc > PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); > PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); > PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); > > Mat R_full; > Mat R_part; > PetscInt idx_first_row = 0; > PetscInt idx_one_plus_last_row = nlines / 2; > PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); > > // Get sub matrix > PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); > // Add entries to sub matrix > MatSetRandom(R_part, NULL); > //View sub matrix > PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); > > // Get array from sub matrix and print entries > PetscScalar *buffer; > PetscCall(MatDenseGetArray(R_part, &buffer)); > PetscInt idx_end = (nlines/2) * ncols; > > for (int i = 0; i < idx_end; i++) > { > PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); > } > > //Restore array to sub matrix > PetscCall(MatDenseRestoreArray(R_part, &buffer)); > // Restore sub matrix > PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); > // View the initial matrix > PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); > > PetscCall(MatDestroy(&R_full)); > > PetscCall(PetscFinalize()); > return 0; > > ---------------- > > > Thanks > Medane From medane.tchakorom at univ-fcomte.fr Fri Feb 7 04:49:38 2025 From: medane.tchakorom at univ-fcomte.fr (medane.tchakorom at univ-fcomte.fr) Date: Fri, 7 Feb 2025 11:49:38 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> Message-ID: <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Re: Please find below the output from the previous code, running on only one processor. 
Mat Object: 1 MPI process type: seqdense 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 buffer[0] = 7.200320e-01 buffer[1] = 6.179397e-02 buffer[2] = 1.002234e-02 buffer[3] = 1.446393e-01 buffer[4] = 0.000000e+00 buffer[5] = 0.000000e+00 buffer[6] = 0.000000e+00 buffer[7] = 0.000000e+00 buffer[8] = 3.977778e-01 buffer[9] = 7.303659e-02 buffer[10] = 1.038663e-01 buffer[11] = 2.507804e-01 buffer[12] = 0.000000e+00 buffer[13] = 0.000000e+00 buffer[14] = 0.000000e+00 buffer[15] = 0.000000e+00 Mat Object: 1 MPI process type: seqdense 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 I was expecting to get in ?buffer?, only the data entries from R_part. Please, let me know if this is the excepted behavior and I?am missing something. Thanks, Medane > On 7 Feb 2025, at 11:34, Pierre Jolivet wrote: > > > >> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: >> >> >> Dear all, >> >> I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. > > What is incoherent? > Everything looks OK to me. 
> > Thanks, > Pierre > >> ---------------- >> >> PetscInt nlines = 8; // lines >> PetscInt ncols = 4; // columns >> PetscMPIInt rank; >> PetscMPIInt size; >> >> // Initialize PETSc >> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); >> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); >> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); >> >> Mat R_full; >> Mat R_part; >> PetscInt idx_first_row = 0; >> PetscInt idx_one_plus_last_row = nlines / 2; >> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); >> >> // Get sub matrix >> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); >> // Add entries to sub matrix >> MatSetRandom(R_part, NULL); >> //View sub matrix >> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); >> >> // Get array from sub matrix and print entries >> PetscScalar *buffer; >> PetscCall(MatDenseGetArray(R_part, &buffer)); >> PetscInt idx_end = (nlines/2) * ncols; >> >> for (int i = 0; i < idx_end; i++) >> { >> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); >> } >> >> //Restore array to sub matrix >> PetscCall(MatDenseRestoreArray(R_part, &buffer)); >> // Restore sub matrix >> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); >> // View the initial matrix >> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); >> >> PetscCall(MatDestroy(&R_full)); >> >> PetscCall(PetscFinalize()); >> return 0; >> >> ---------------- >> >> >> Thanks >> Medane > > From jroman at dsic.upv.es Fri Feb 7 05:15:33 2025 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 7 Feb 2025 11:15:33 +0000 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Message-ID: This is expected. For dense matrices, MatDenseGetSubMatrix() does not duplicate the memory. You should interpret the array as a two-dimensional column-major array: buffer[i+j*lda] where i,j are row and column indices, and lda can be obtained with MatDenseGetLDA(). Jose > El 7 feb 2025, a las 11:49, medane.tchakorom at univ-fcomte.fr escribi?: > > Re: > Please find below the output from the previous code, running on only one processor. 
> > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 > > buffer[0] = 7.200320e-01 > buffer[1] = 6.179397e-02 > buffer[2] = 1.002234e-02 > buffer[3] = 1.446393e-01 > buffer[4] = 0.000000e+00 > buffer[5] = 0.000000e+00 > buffer[6] = 0.000000e+00 > buffer[7] = 0.000000e+00 > buffer[8] = 3.977778e-01 > buffer[9] = 7.303659e-02 > buffer[10] = 1.038663e-01 > buffer[11] = 2.507804e-01 > buffer[12] = 0.000000e+00 > buffer[13] = 0.000000e+00 > buffer[14] = 0.000000e+00 > buffer[15] = 0.000000e+00 > > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > > I was expecting to get in ?buffer?, only the data entries from R_part. Please, let me know if this is the excepted behavior and I?am missing something. > > Thanks, > Medane > > > >> On 7 Feb 2025, at 11:34, Pierre Jolivet wrote: >> >> >> >>> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: >>> >>> >>> Dear all, >>> >>> I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. >> >> What is incoherent? >> Everything looks OK to me. 
>> >> Thanks, >> Pierre >> >>> ---------------- >>> >>> PetscInt nlines = 8; // lines >>> PetscInt ncols = 4; // columns >>> PetscMPIInt rank; >>> PetscMPIInt size; >>> >>> // Initialize PETSc >>> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); >>> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); >>> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); >>> >>> Mat R_full; >>> Mat R_part; >>> PetscInt idx_first_row = 0; >>> PetscInt idx_one_plus_last_row = nlines / 2; >>> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); >>> >>> // Get sub matrix >>> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); >>> // Add entries to sub matrix >>> MatSetRandom(R_part, NULL); >>> //View sub matrix >>> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); >>> >>> // Get array from sub matrix and print entries >>> PetscScalar *buffer; >>> PetscCall(MatDenseGetArray(R_part, &buffer)); >>> PetscInt idx_end = (nlines/2) * ncols; >>> >>> for (int i = 0; i < idx_end; i++) >>> { >>> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); >>> } >>> >>> //Restore array to sub matrix >>> PetscCall(MatDenseRestoreArray(R_part, &buffer)); >>> // Restore sub matrix >>> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); >>> // View the initial matrix >>> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); >>> >>> PetscCall(MatDestroy(&R_full)); >>> >>> PetscCall(PetscFinalize()); >>> return 0; >>> >>> ---------------- >>> >>> >>> Thanks >>> Medane >> >> > From knepley at gmail.com Fri Feb 7 08:22:58 2025 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 7 Feb 2025 09:22:58 -0500 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Message-ID: On Fri, Feb 7, 2025 at 8:20?AM medane.tchakorom at univ-fcomte.fr < medane.tchakorom at univ-fcomte.fr> wrote: > Re: > Please find below the output from the previous code, running on only one > processor. 
> > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 > 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 > 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 > 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 > 9.8905332488367748e-01 > > buffer[0] = 7.200320e-01 > buffer[1] = 6.179397e-02 > buffer[2] = 1.002234e-02 > buffer[3] = 1.446393e-01 > buffer[4] = 0.000000e+00 > buffer[5] = 0.000000e+00 > buffer[6] = 0.000000e+00 > buffer[7] = 0.000000e+00 > buffer[8] = 3.977778e-01 > buffer[9] = 7.303659e-02 > buffer[10] = 1.038663e-01 > buffer[11] = 2.507804e-01 > buffer[12] = 0.000000e+00 > buffer[13] = 0.000000e+00 > buffer[14] = 0.000000e+00 > buffer[15] = 0.000000e+00 > > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 > 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 > 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 > 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 > 9.8905332488367748e-01 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > > > I was expecting to get in ?buffer?, only the data entries from R_part. > Please, let me know if this is the excepted behavior and I?am missing > something. > As Jose already pointed out, SubMatrix() does not copy. It gives you a Mat front end to the same data, but with changed sizes. In this case, the LDA is 4, not 2, so when you iterate over the values, you skip over the ones you don't want. Thanks, Matt > Thanks, > Medane > > > > > On 7 Feb 2025, at 11:34, Pierre Jolivet wrote: > > > > > > > >> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: > >> > >> > >> Dear all, > >> > >> I have been experiencing incoherent data entries from this code below, > when printing the array. Maybe I?am doing something wrong. > > > > What is incoherent? > > Everything looks OK to me. 
> > > > Thanks, > > Pierre > > > >> ---------------- > >> > >> PetscInt nlines = 8; // lines > >> PetscInt ncols = 4; // columns > >> PetscMPIInt rank; > >> PetscMPIInt size; > >> > >> // Initialize PETSc > >> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); > >> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); > >> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); > >> > >> Mat R_full; > >> Mat R_part; > >> PetscInt idx_first_row = 0; > >> PetscInt idx_one_plus_last_row = nlines / 2; > >> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, > PETSC_DECIDE, nlines, ncols, NULL, &R_full)); > >> > >> // Get sub matrix > >> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, > idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); > >> // Add entries to sub matrix > >> MatSetRandom(R_part, NULL); > >> //View sub matrix > >> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); > >> > >> // Get array from sub matrix and print entries > >> PetscScalar *buffer; > >> PetscCall(MatDenseGetArray(R_part, &buffer)); > >> PetscInt idx_end = (nlines/2) * ncols; > >> > >> for (int i = 0; i < idx_end; i++) > >> { > >> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); > >> } > >> > >> //Restore array to sub matrix > >> PetscCall(MatDenseRestoreArray(R_part, &buffer)); > >> // Restore sub matrix > >> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); > >> // View the initial matrix > >> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); > >> > >> PetscCall(MatDestroy(&R_full)); > >> > >> PetscCall(PetscFinalize()); > >> return 0; > >> > >> ---------------- > >> > >> > >> Thanks > >> Medane > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!c2IunCO2oQ3I91jAU5GYm2XQbPfgQfcl0n_uf1fsjnqd7gGNf1YDMYee5YkTRcQAfGtUxSZxDS4kWs9Rs1qa$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Sat Feb 8 03:46:01 2025 From: medane.tchakorom at univ-fcomte.fr (medane.tchakorom at univ-fcomte.fr) Date: Sat, 8 Feb 2025 10:46:01 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Message-ID: <38CA73F0-1204-4909-8778-2C59BA41707B@univ-fcomte.fr> Dear petsc team, Thank you for all your answers. I really appreciate. Best regards, Medane > On 7 Feb 2025, at 15:22, Matthew Knepley wrote: > > On Fri, Feb 7, 2025 at 8:20?AM medane.tchakorom at univ-fcomte.fr > wrote: >> Re: >> Please find below the output from the previous code, running on only one processor. 
>> >> Mat Object: 1 MPI process >> type: seqdense >> 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 >> 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 >> 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 >> 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 >> >> buffer[0] = 7.200320e-01 >> buffer[1] = 6.179397e-02 >> buffer[2] = 1.002234e-02 >> buffer[3] = 1.446393e-01 >> buffer[4] = 0.000000e+00 >> buffer[5] = 0.000000e+00 >> buffer[6] = 0.000000e+00 >> buffer[7] = 0.000000e+00 >> buffer[8] = 3.977778e-01 >> buffer[9] = 7.303659e-02 >> buffer[10] = 1.038663e-01 >> buffer[11] = 2.507804e-01 >> buffer[12] = 0.000000e+00 >> buffer[13] = 0.000000e+00 >> buffer[14] = 0.000000e+00 >> buffer[15] = 0.000000e+00 >> >> Mat Object: 1 MPI process >> type: seqdense >> 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 >> 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 >> 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 >> 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> >> >> I was expecting to get in ?buffer?, only the data entries from R_part. Please, let me know if this is the excepted behavior and I?am missing something. > > As Jose already pointed out, SubMatrix() does not copy. It gives you a Mat front end to the same data, but with changed sizes. In this case, the LDA is 4, not 2, so when you iterate over the values, you skip over the ones you don't want. > > Thanks, > > Matt > >> Thanks, >> Medane >> >> >> >> > On 7 Feb 2025, at 11:34, Pierre Jolivet > wrote: >> > >> > >> > >> >> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: >> >> >> >> >> >> Dear all, >> >> >> >> I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. >> > >> > What is incoherent? >> > Everything looks OK to me. 
>> > >> > Thanks, >> > Pierre >> > >> >> ---------------- >> >> >> >> PetscInt nlines = 8; // lines >> >> PetscInt ncols = 4; // columns >> >> PetscMPIInt rank; >> >> PetscMPIInt size; >> >> >> >> // Initialize PETSc >> >> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); >> >> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); >> >> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); >> >> >> >> Mat R_full; >> >> Mat R_part; >> >> PetscInt idx_first_row = 0; >> >> PetscInt idx_one_plus_last_row = nlines / 2; >> >> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); >> >> >> >> // Get sub matrix >> >> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); >> >> // Add entries to sub matrix >> >> MatSetRandom(R_part, NULL); >> >> //View sub matrix >> >> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); >> >> >> >> // Get array from sub matrix and print entries >> >> PetscScalar *buffer; >> >> PetscCall(MatDenseGetArray(R_part, &buffer)); >> >> PetscInt idx_end = (nlines/2) * ncols; >> >> >> >> for (int i = 0; i < idx_end; i++) >> >> { >> >> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); >> >> } >> >> >> >> //Restore array to sub matrix >> >> PetscCall(MatDenseRestoreArray(R_part, &buffer)); >> >> // Restore sub matrix >> >> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); >> >> // View the initial matrix >> >> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); >> >> >> >> PetscCall(MatDestroy(&R_full)); >> >> >> >> PetscCall(PetscFinalize()); >> >> return 0; >> >> >> >> ---------------- >> >> >> >> >> >> Thanks >> >> Medane >> > >> > >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aW67hycVSXmRHgmoXNGV0fVjZ4HM7XloTvLw0b1d9peGDnJGYOm6nKgJPy53qErREjKHhwJybk0bAXiSQY7f9RreQX1Aik1VZBB7CIDA$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From antonio.ghidoni at unibs.it Thu Feb 13 07:15:51 2025 From: antonio.ghidoni at unibs.it (ANTONIO GHIDONI) Date: Thu, 13 Feb 2025 14:15:51 +0100 Subject: [petsc-users] Problem with VecGhostUpdateBegin Message-ID: <476488F7-FC0B-4AC4-9CA4-C53877C89AD5@unibs.it> Hello, I am using Petsc 3.30.2. When I trie to update a ghost vector, I obtain the following error: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Outstanding operation has not been completed [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MwayenCuQ$ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.20.2, Nov 30, 2023 [0]PETSC ERROR: ./main2d.out on a linux-intel named node1 by cfdlab Thu Feb 13 14:08:53 2025 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-pic COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 F90OPTFLAGS=-O3 --download-fblaslapack --download-mpich [0]PETSC ERROR: #1 PetscSFReset_Basic() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/impls/basic/sfbasic.c:93 [0]PETSC ERROR: #2 PetscSFReset() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:119 [0]PETSC ERROR: #3 PetscSFDestroy() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:237 [0]PETSC ERROR: #4 VecScatterDestroy() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/vscat.c:483 [0]PETSC ERROR: #5 VecDestroy_MPI() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/impls/mpi/pdvec.c:38 [0]PETSC ERROR: #6 VecDestroy() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/interface/vector.c:579 It?s a strange behavior because with other vectors the same routine works properly. My routine is as follows: INTEGER(4) :: nb, nt(:), jtg(nb), nblk, nlfr INTEGER(4) :: fr(nlfr+nb*nblk) Vec iv_fr INTEGER(4), ALLOCATABLE :: igh(:) INTEGER(4) :: ierr ALLOCATE (igh(nt(1))) DO it = 1, nt(1) igh(it) = jtg(it) -1 ENDDO CALL VecCreateGhostBlockWithArray (PETSC_COMM_WORLD,nblk,nlfr, & PETSC_DECIDE,nt(1),igh,fr,iv_fr,ierr) CALL VecGhostUpdateBegin(iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) CALL VecGhostUpdateEnd (iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) CALL VecDestroy (iv_fr,ierr) DEALLOCATE (igh) Any suggestion about this strange error? Antonio -- Informativa sulla Privacy:?https://urldefense.us/v3/__https://www.unibs.it/it/node/1452__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MzDu5UMRg$ From knepley at gmail.com Thu Feb 13 09:33:31 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Feb 2025 10:33:31 -0500 Subject: [petsc-users] Problem with VecGhostUpdateBegin In-Reply-To: <476488F7-FC0B-4AC4-9CA4-C53877C89AD5@unibs.it> References: <476488F7-FC0B-4AC4-9CA4-C53877C89AD5@unibs.it> Message-ID: On Thu, Feb 13, 2025 at 10:27?AM ANTONIO GHIDONI via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > I am using Petsc 3.30.2. When I trie to update a ghost vector, I obtain > the following error: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Outstanding operation has not been completed > [0]PETSC ERROR: See > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MwayenCuQ$ > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.20.2, Nov 30, 2023 > [0]PETSC ERROR: ./main2d.out on a linux-intel named node1 by cfdlab Thu > Feb 13 14:08:53 2025 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-pic COPTFLAGS=-O3 > CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 F90OPTFLAGS=-O3 --download-fblaslapack > --download-mpich > [0]PETSC ERROR: #1 PetscSFReset_Basic() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/impls/basic/sfbasic.c:93 > [0]PETSC ERROR: #2 PetscSFReset() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:119 > [0]PETSC ERROR: #3 PetscSFDestroy() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:237 > [0]PETSC ERROR: #4 VecScatterDestroy() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/vscat.c:483 > [0]PETSC ERROR: #5 VecDestroy_MPI() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/impls/mpi/pdvec.c:38 > [0]PETSC ERROR: #6 VecDestroy() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/interface/vector.c:579 > > > It?s a strange behavior because with other vectors the same routine works > properly. My routine is as follows: > > INTEGER(4) :: nb, nt(:), jtg(nb), nblk, nlfr > INTEGER(4) :: fr(nlfr+nb*nblk) > > Vec iv_fr > > INTEGER(4), ALLOCATABLE :: igh(:) > INTEGER(4) :: ierr > > > ALLOCATE (igh(nt(1))) > > DO it = 1, nt(1) > igh(it) = jtg(it) -1 > ENDDO > > CALL VecCreateGhostBlockWithArray (PETSC_COMM_WORLD,nblk,nlfr, & > PETSC_DECIDE,nt(1),igh,fr,iv_fr,ierr) > CALL VecGhostUpdateBegin(iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) > CALL VecGhostUpdateEnd (iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) > > CALL VecDestroy (iv_fr,ierr) > > DEALLOCATE (igh) > > Any suggestion about this strange error? > You have a Begin() somewhere without an End(). It is hard to say anything else without the code. Thanks, Matt > Antonio > > > -- > > > > > Informativa sulla Privacy: > https://urldefense.us/v3/__https://www.unibs.it/it/node/1452__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MzDu5UMRg$ > > < > https://urldefense.us/v3/__https://www.unibs.it/it/node/1452__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MzDu5UMRg$ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aHE-77So2SAK4bBUqZtJoEI5p6AYoobJOERwt-ejOAIW_O9tKt10N3EzIsYw3bc_PTiu6nip8M2siVtrPfIz$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Feb 14 07:28:07 2025 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 14 Feb 2025 08:28:07 -0500 Subject: [petsc-users] DOE STTR partner opportunity In-Reply-To: <1340996571.133008.1739538340313@mail.yahoo.com> References: <1544782710.24203.1736797499791.ref@mail.yahoo.com> <1544782710.24203.1736797499791@mail.yahoo.com> <858303897.206391.1736863756977@mail.yahoo.com> <1654280882.6193.1736889607440@mail.yahoo.com> <1661786321.7226111.1738901906542@mail.yahoo.com> <1789024741.10078199.1739394891478@mail.yahoo.com> <1340996571.133008.1739538340313@mail.yahoo.com> Message-ID: cc'ing Rich and petsc-users. * I tried putting "petsc example stiffness matrix" into ChatGPT and it actually looked fine, 1D Laplacian C code with instructions to build and run it. 
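Roughly the kind of program that prompt produces; a minimal, untested sketch (not the actual generated code) that assembles the 1D Laplacian stiffness matrix and solves A x = b with KSP, with the size and right-hand side chosen arbitrarily for illustration:

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, b;
  KSP      ksp;
  PetscInt i, n = 100, Istart, Iend;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* tridiagonal stiffness matrix of the 1D Laplacian (Dirichlet ends) */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 1, NULL, &A));
  PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
  for (i = Istart; i < Iend; i++) {
    PetscInt    col[3];
    PetscScalar v[3];
    PetscInt    nc = 0;
    if (i > 0)     { col[nc] = i - 1; v[nc] = -1.0; nc++; }
    col[nc] = i; v[nc] = 2.0; nc++;
    if (i < n - 1) { col[nc] = i + 1; v[nc] = -1.0; nc++; }
    PetscCall(MatSetValues(A, 1, &i, nc, col, v, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0)); /* unit load at every node */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); /* pick solver/preconditioner at run time */
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}

The real FEA examples in the tutorials below go much further (unstructured meshes, PetscFE, boundary conditions), but the Mat/Vec/KSP pattern is the same.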
* But we have many tutorials that do this and you can browse them to find one that looks best for your interests at https://urldefense.us/v3/__https://petsc.org/release/tutorials/__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V4GdZtUls$ Good luck, Mark On Fri, Feb 14, 2025 at 8:05?AM Debiprasad Panda wrote: > Mark, > > Can you send me any link for FEA example using PETSC which will generate > the stiffness matrix so that I can play with you for porting into our FPGA. > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately > notify dpanda68 at yahoo.com . Unintended transmission > shall not constitute waiver of any privilege.* > > > On Wednesday, February 12, 2025 at 03:14:51 PM CST, Debiprasad Panda < > dpanda68 at yahoo.com> wrote: > > > Richard, > > Thanks for the detailed email. It certainly explains the limitation at > this time. Your email also very clearly explains about what kind of > collaboration you extend to the partnering company. I had in my mind that > neither you or Mark or Todd will write the code for us. I thought there > might be junior scientists/programmers who works in your team will do the > bulk of the work under your supervision. Now I understand that you would > like to participate only in specific issues which is not readily available > or needs further development in PETSc. As you say some issues may arise > while using it and if so, you would like to participate in resolving such > issues either in a future DOE proposal or through a self-generated project > by PETSc community. Correct me if my understanding is not right. We did > work with university and consultants as sub-contractor in the past from > this organization, but not directly with research labs. Your email > certainly provides some guideline on the process and timeline. > > As you know we did implement a complete FEA analysis in FPGA and the speed > up is significant. However, that was partly hardcoded. Thats why looking > for an interface which is already tested and just need to be streamlined > with our workflow. I thought that having a complete example of our interest > in PETSC and implementing the same by part/full in FPGA will give us a good > handle to continue development in that direction. As I mentioned we can do > that task ourselves - we do have people who used the same workflow as I > provided in my email, but it was for a different application. The main > problem for small business like us is lack of funding. An SBIR/STTR funding > will be very helpful ton accomplish this ground research on FPGA PETSC > interface. > > I know time is short and certainly this transition time is making things > more complicated. > > Let's plan for the next round and I believe the solicitation will be out > in first week of June and the submission of the final proposal will be in > October 2025. I will contact you in June. > > > Anyway, we will proceed with our proposal with another partner this time. 
> > In the mean it will be helpful if Adam or any of you can send a link to > any existing FEA example so that we can play with it. > > Thanks again for all your time and email discussion. > > > > Regards. > > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately > notify dpanda68 at yahoo.com . Unintended transmission > shall not constitute waiver of any privilege.* > > > On Wednesday, February 12, 2025 at 02:07:03 PM CST, Mills, Richard Tran < > rtmills at anl.gov> wrote: > > > Hi Debiprasad, > > I apologize for being slow to get back to you; I am severely > over-committed at the moment, and keeping up with email (among other > things) has been extremely difficult. > > I am sorry to have to disappoint you, but I do not think that it will be > possible for me, Todd, or Mark to partner with you on the STTR call this > time around. Let me try to explain the two major reasons why. > > First: For staff at the DOE National Laboratories, it is very > time-consuming to get approvals to participate as a subcontractor for > something like an SBIR or STTR project. I receive a small amount of funding > from an SBIR project right now, and it literally took weeks to get that > proposal through all of the required approvals, including an "letter of > commitment" signed by the the Laboratory's Director of Sponsored Research, > as well as approval from our Contracting Officer at the DOE Site Office in > Chicago. There are many steps of the review to ensure that proposed work is > consistent with the DOE and Argonne missions, that it does not adversely > impact DOE work at Argonne, and that it is not in direct competition with > the private sector. The laboratory's guidance on this approval process > state that we should allow a minimum of 15 business days for this process, > but, with the current upheaval due to the transition to the new > Administration, I suspect that more than 15 business days would be > required. I also note that a reasonably close to complete draft of the > proposal is required to be submitted at the beginning of the approval > process, so you need to factor in time to develop the proposal ahead of the > approval window if you want to respond to a future STTR or SBIR call and > partner with a DOE Laboratory. > > Second: The breakdown of work that you are proposing isn't really aligned > what with laboratory research scientists like Todd, Mark, and I are > expected to do. We are primarily researchers, and our output is judged > similarly to that of a professor at an R1 university, except that we have > no teaching load and engage in some programmatic work. What you have > proposed is having us develop a complete finite-element analysis code to > some specification you provide, which we will then hand to you (before you > implement part or all of it using FPGAs). For this sort of arrangement, it > sounds like what you are looking for is scientific programmers who work on > contract. That is not the role that we play. 
We do research on > computational mathematics and its applications, and we develop software to > aid this research and to enable the broader computing community to benefit > from our research and perhaps collaborate with us on further developments. > This has led to a widely-used piece of software, PETSc, which provides > useful computational building blocks that many teams have used to build > finite-element analysis applications, but when teams have used PETSc for > such work and have teamed with us, it has very much been in a collaborative > research relationship: others are doing much of the development of their > FEM code, but we help them because, say, they are modeling systems with > very difficult nonlinearities, discontinuous jumps in material > coefficients, strangely stretched elements, etc., that cause problems for > simple algebraic solvers, so we collaborate with them on developing new > solver techniques that are amenable to their problems. > > It may make sense for you to partner with us or other members of the PETSc > team in the future, but I think you need to take some time to lay more of > the groundwork before a future funding call. You can experiment with > porting a PETSc-based FEM code using your FPGA approach without needing > anything from us right now: There are numerous finite-element example codes > provided with PETSc (Mark has written a few of them, and might be able to > recommend some good ones to start with). You could start by playing with > these examples and then try porting bits of them to FPGAs. As I said in an > earlier message, based on my limited experience with FPGAs, I suspect that > you will run into several technical challenges. When you have had a chance > to identify these challenges, then it might make sense to come back to the > PETSc team to describe some of them ? you can start by emailing petsc-maint > or petsc-users about this ? and perhaps eventually develop a proposal that > aims to address them in collaboration with the team. > > Apologies if I have had to disappoint you, and best of luck. Perhaps later > there will be good opportunities to partner with us in the future. I > encourage you to experiment some with PETSc to determine whether it is the > right software toolkit to use for your FPGA-targeted applications, and to > not be shy about asking on the PETSc user lists as you uncover issues as > you experiment. > > Best regards, > Richard > > ------------------------------ > *From:* Debiprasad Panda > *Sent:* Thursday, February 6, 2025 8:18 PM > *To:* Mills, Richard Tran > *Cc:* Mark Adams ; Munson, Todd > *Subject:* Re: DOE STTR partner opportunity > > This Message Is From an External Sender > This message came from outside your organization. > > Dear All, > > Amidst all these organizational and administrative changes, I have good > news to share that our LOI has been accepted by DOE and the final proposal > submission is due on 26th February 2025. The proposal is about an FEA > thermal analysis using PETSc and porting it to FPGA for its real time > simulation. > > Given a mechanical drawing of an object, in PETSC a mesh will be generated > and then a thermal problem will be formulated using FEA theory and boundary > condition to generate a global stiffness matrix in the form of Ax =B, which > will be eventually solved using linear or non-linear solver. In Phase I, we > will concentrate only on linear system and only the solver part will be > implemented in FPGA to demonstrate the real time operation in part. 
In > Phase II, the entire FEA problem formulation with non-linearity as well as > solver will be implemented in FPGA to have a complete real time solution. > > We went through PETSc libraries and one of our team members has used it > extensively during his PhD. The steps we would like to follow to formulate > a FEA problem, and its solution is described in the attached document. > > We would like you to partnering with us in this DOE project and your > responsibility will be to create this FEA thermal model in PETSc following > the steps in the given document and then run it in a PC/server and > collect the result. We will take the responsibility of implementing the > same in our FPGA solver. > > I was thinking to write this email for some time but kept on hold till the > formal acceptance of LOI in order to justify your time. > > Please go through the attached document and then let's follow up with a > zoom call sometime early next week per your convenience for discussing it > for any question you may have. > > Please acknowledge receiving this email so that I know our communication > is going through. > > I will look forward to collaborating with you. > > Regards. > > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > **. Unintended transmission shall not constitute > waiver of any privilege.* > > > On Tuesday, January 14, 2025 at 03:20:07 PM CST, Debiprasad Panda < > dpanda68 at yahoo.com> wrote: > > > Richard, Mark, Todd > > I am submitting the LOI without ANL at this time. It seems we can include > ANL as STTR partner while submitting the full proposal if things look good > from both sides. So, we may have about six weeks from now to understand the > project. Let's discuss it over a zoom call sometime this week. > > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > **. Unintended transmission shall not constitute > waiver of any privilege.* > > > On Tuesday, January 14, 2025 at 12:50:53 PM CST, Mills, Richard Tran < > rtmills at anl.gov> wrote: > > > Hi Debiprasad, > > Apologies for the delay in my reply; the past few days have been > especially busy ones due to some internal proposal deadlines I had to rush > to meet, on top of several other things. 
> > Your project sounds interesting, but, unfortunately, I don't think that > there is time before your LOI is due for me to understand your application, > discuss whether PETSc is appropriate for it, or how you would map any > implementation using PETSc to FPGA hardware. PETSc is an extremely > complicated piece of software and a lot of effort is required from > algorithm selection and parallel problem decomposition on down to details > of individual microkernels when bringing it to and optimizing it for new > kinds of computing architectures. (I spent roughly six years working with > several others on getting solid GPU support in PETSc, for instance.) I have > a little bit of familiarity with FPGAs from my time at ORNL and Intel, and > I think that enabling PETSc to make efficient use of FPGAs is going to be a > highly non-trivial (though interesting!) project. Are you familiar at all > with PETSc, and do you have a particular reason that you think it would be > helpful to your work? You might be better served by using a different piece > of software as a starting point, if you do not need things like the > distributed memory-parallel implementations or the advanced, composable > solvers and preconditioners. If you do have a particular need for things > that PETSc provides, perhaps I or others from the PETSc team could discuss > this with you with future opportunities in mind. Best of luck to you if you > do submit an STTR proposal this time. > > Sincerely, > Richard > > ------------------------------ > *From:* Debiprasad Panda > *Sent:* Tuesday, January 14, 2025 6:09 AM > *To:* Mills, Richard Tran > *Subject:* Re: DOE STTR partner opportunity > > This Message Is From an External Sender > This message came from outside your organization. > > Richard, > Hope you received my previous email. I will appreciate if you let me know > if you would like to participate in this STTR project or not. I know its a > short notice and I will understand if that is not sufficient to make it a > "GO". > > I will still have good amount time to create and upload an LOI. > > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > **. Unintended transmission shall not constitute > waiver of any privilege.* > > > On Monday, January 13, 2025 at 01:44:59 PM CST, Debiprasad Panda < > dpanda68 at yahoo.com> wrote: > > > Richard, > I got your contact from Todd Munson. We are a small business located in > greater Milwaukee and working on a one-stop real time simulator where we > can simulate a large grid along with IBR in real time. In addition, we can > conduct a thermal and structural FEA analysis in real time for up to 1-5 > Million grid points. 
> A new DOE solicitation is out where we can propose a one stop solution for > solar power IBR where we can model an IBR with very low step size (20-40ns) > for its real time simulation and also, we can calculate thermal loss > through semiconductor switches and then provide a thermal footprint of the > IBR in real time employing a FEA analysis. We have implemented a thermal > analysis of a heat sink using our proprietary FPGA implementation in real > time with 52000 nodes and can extend it upto 1-5M. I am wondering if you > would like to take part as RI for our STTR application where you can > formulate the FEA problem using PETSC or any other software and then we can > implement the same in FPGA for its real time implementation. If so, let me > know by COB today. We do not have much time - the LOI is due tomorrow 4:00 > PM central time, and the full proposal is due on 26th February. If you > would like we can have a quick call to discuss. At this time an email > consent will be fine and then we can discuss the detailed scopes and > deliverable in next couple of weeks. The STTR > > Let me know if you will be interested. > > > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > * > *. Unintended transmission shall not constitute waiver of any privilege. * > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Sat Feb 15 10:37:18 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Sat, 15 Feb 2025 17:37:18 +0100 Subject: [petsc-users] Advice on identifying boundary vertices Message-ID: <20efa212-964c-4a34-a417-9a888d2a47c5@unibas.it> Dear all, I am trying to identify the boundary vertices that belong to a given stratum of the Face sets. This is going to be used to prescribe Dirichlet-type bcs. I can select the boundary faces (edges in 2D) of a given stratum using PetscCall(DMLabelGetStratumIS(label, stratum, user%bndryfaces(i), ierr)) and that looks ok; I then try to identify the boundary vertices by looping over the points (edges in 2D/faces in 3D) of the face set of a given stratum and retrieve the vertices that make up each individual edge/face. For reasons I fail to understand, the above procedure fails to identify certain vertices (those circled in the enclosed pdf where different colours mark different ranks) in a parallel environment. Questions: 1. is there an available function that does what I am trying to do? I know that the boundary points can be found in the "marker" label, but I need to discriminate among Face Sets of different strata. 2. what might be wrong in the aforementioned approach? Thanks, Aldo -- Dr. 
Aldo Bonfiglioli Associate professor of Fluid Machines Scuola di Ingegneria Universita' della Basilicata V.le dell'Ateneo lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!YLL7mzBlK9H_sEdtXob1AVklf8cVLhT3NTDdExIZsdI3xMfNHBoLXr92BmYzCXOuJqtm2L6OU4DJKV_81AkjvQf-Kra-Ym8P0tU$ -------------- next part -------------- A non-text attachment was scrubbed... Name: 2025-02-15-Nota-14-02_annotated.pdf Type: application/pdf Size: 60693 bytes Desc: not available URL: From knepley at gmail.com Sat Feb 15 20:03:24 2025 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 15 Feb 2025 21:03:24 -0500 Subject: [petsc-users] Advice on identifying boundary vertices In-Reply-To: <20efa212-964c-4a34-a417-9a888d2a47c5@unibas.it> References: <20efa212-964c-4a34-a417-9a888d2a47c5@unibas.it> Message-ID: On Sat, Feb 15, 2025 at 11:40?AM Aldo Bonfiglioli < aldo.bonfiglioli at unibas.it> wrote: > Dear all, > > I am trying to identify the boundary vertices that belong to a given > stratum of the Face sets. This is going to be used to prescribe > Dirichlet-type bcs. > The label "Face Sets" is designed to only contain faces. > I can select the boundary faces (edges in 2D) of a given stratum using > PetscCall(DMLabelGetStratumIS(label, stratum, user%bndryfaces(i), ierr)) > and that looks ok; > Yes, you get the faces marked with some BC value. > I then try to identify the boundary vertices by looping over the points > (edges in 2D/faces in 3D) of the face set of a given stratum and > retrieve the vertices that make up each individual edge/face. > You can do that. You can also do something like DMLabelDuplicate(faceSets, &newLabel); DMPlexLabelComplete(dm, newLabel); which will put in all the points in the transitive closure (such as vertices). Then you can just loop over the points in the label, and check for vertices using DMPlexGetPointDepth(). > For reasons I fail to understand, the above procedure fails to identify > certain vertices (those circled in the enclosed pdf where different > colours mark different ranks) in a parallel environment. > I do this all the time, so it should not happen. If the above fails, can you send a small reproducer? Thanks, Matt > Questions: > > 1. is there an available function that does what I am trying to do? I > know that the boundary points can be found in the "marker" label, but I > need to discriminate among Face Sets of different strata. > > 2. what might be wrong in the aforementioned approach? > > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Machines > Scuola di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: > https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!YLL7mzBlK9H_sEdtXob1AVklf8cVLhT3NTDdExIZsdI3xMfNHBoLXr92BmYzCXOuJqtm2L6OU4DJKV_81AkjvQf-Kra-Ym8P0tU$ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cwtigEBvRpcpoLeV3aRNqP6fHwoAHN_2SW4Rjbh_ZqKelJ54Nvgncprg0QimeBjNfY5Ox-8qG6AzP5ZqCg-M$ -------------- next part -------------- An HTML attachment was scrubbed... 
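[Editorial addition, not part of the original messages: the sketch below spells out the completed-label approach Matt suggests above, gathering the vertices that belong to one "Face Sets" value. The function name GetFaceSetVertices, the use of PETSC_COMM_SELF, and the guard for an empty stratum are assumptions; only the routines named in the thread (DMLabelDuplicate, DMPlexLabelComplete, DMLabelGetStratumIS, DMPlexGetPointDepth) plus ordinary IS handling are used. Completing a duplicate rather than "Face Sets" itself keeps that label containing only faces, as described above.]

#include <petscdmplex.h>

static PetscErrorCode GetFaceSetVertices(DM dm, PetscInt value, IS *vertIS)
{
  DMLabel         faceSets, closure;
  IS              pointIS = NULL;
  const PetscInt *points;
  PetscInt       *verts, np, p, nv = 0;

  PetscFunctionBeginUser;
  PetscCall(DMGetLabel(dm, "Face Sets", &faceSets));
  PetscCall(DMLabelDuplicate(faceSets, &closure));
  PetscCall(DMPlexLabelComplete(dm, closure));  /* add edges/vertices in the closure of each marked face */
  PetscCall(DMLabelGetStratumIS(closure, value, &pointIS));
  if (!pointIS) {                               /* this rank holds no points with this value */
    PetscCall(ISCreateGeneral(PETSC_COMM_SELF, 0, NULL, PETSC_COPY_VALUES, vertIS));
    PetscCall(DMLabelDestroy(&closure));
    PetscFunctionReturn(PETSC_SUCCESS);
  }
  PetscCall(ISGetLocalSize(pointIS, &np));
  PetscCall(ISGetIndices(pointIS, &points));
  PetscCall(PetscMalloc1(np, &verts));
  for (p = 0; p < np; ++p) {
    PetscInt depth;
    PetscCall(DMPlexGetPointDepth(dm, points[p], &depth));
    if (depth == 0) verts[nv++] = points[p];    /* keep only the vertices */
  }
  PetscCall(ISRestoreIndices(pointIS, &points));
  PetscCall(ISDestroy(&pointIS));
  PetscCall(DMLabelDestroy(&closure));
  PetscCall(ISCreateGeneral(PETSC_COMM_SELF, nv, verts, PETSC_OWN_POINTER, vertIS));
  PetscFunctionReturn(PETSC_SUCCESS);
}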
URL: From silvia.preda at uninsubria.it Wed Feb 19 10:18:19 2025 From: silvia.preda at uninsubria.it (Preda Silvia) Date: Wed, 19 Feb 2025 16:18:19 +0000 Subject: [petsc-users] grid adaptivity with dmforest Message-ID: Hi, I'm using the toycode below to manage adaptivity using a DMForest, based on p4est. The code simply performs three adaptivity steps on a square grid [0,1]x[0,1], uniformly refined at level 2 at the beginning. The minimum and the maximum level of refinement are set to 1 and 5, respectively. During the adaptivity procedure, a quadrant is refined if its centroid is inside the circle of radius 0.25, centred in (0.5,0.5). I'm facing two main issues: * As far as I understood, when the label asks to refine a quadrant which is already at the maximum level, the adaption procedure errors out instead of ignoring the request. Thus I need to check the quadrant level before setting the adaptivity label. How can I get access to the quadrant-level of a cell? * Once the mesh is adapted, data need to be projected on the new mesh and in my method I'll need to write an elaborate ad-hoc procedure for that. How can I access the map that provides the correspondence between the indexes of outcoming quadrants and the indexes of incoming ones? Below is the code, to be run with these options: -dm_type p4est -dm_forest_topology brick -dm_p4est_brick_size 1,1 -dm_view vtk:brick.vtu -dm_forest_initial_refinement 2 -dm_forest_minimum_refinement 1 -dm_forest_maximum_refinement 5 static char help[] = "Create and view a forest mesh\n\n"; #include #include #include static PetscErrorCode CreateAdaptLabel(DM dm, DM dmConv, PetscInt *nAdaptLoc) { DMLabel adaptLabel; PetscInt cStart, cEnd, c; PetscFunctionBeginUser; PetscCall(DMGetCoordinatesLocalSetUp(dm)); PetscCall(DMCreateLabel(dm, "adaptLabel")); PetscCall(DMGetLabel(dm, "adaptLabel", &adaptLabel)); PetscCall(DMForestGetCellChart(dm, &cStart, &cEnd)); for (c = cStart; c < cEnd; ++c) { PetscReal centroid[3], volume, x, y; PetscCall(DMPlexComputeCellGeometryFVM(dmConv, c, &volume, centroid, NULL)); x = centroid[0]; y = centroid[1]; if (std::sqrt((x-0.5)*(x-0.5)+(y-0.5)*(y-0.5))<0.25) { PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_REFINE)); ++nAdaptLoc[0]; } else { PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_KEEP)); ++nAdaptLoc[1]; } } PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode ForestToPlex(DM *dm, DM *dmConv) { PetscFunctionBeginUser; PetscCall(DMConvert(*dm, DMPLEX, dmConv)); PetscCall(DMLocalizeCoordinates(*dmConv)); PetscCall(DMViewFromOptions(*dmConv, NULL, "-dm_conv_view")); PetscCall(DMPlexCheckCellShape(*dmConv, PETSC_FALSE, PETSC_DETERMINE)); PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode AdaptMesh(DM *dm) { DM dmCur = *dm; PetscBool hasLabel=PETSC_FALSE, adapt=PETSC_TRUE; PetscInt adaptIter=0, maxAdaptIter=3; PetscFunctionBeginUser; while (adapt) { DM dmAdapt; DMLabel adaptLabel; PetscInt nAdaptLoc[2]={0,0}, nAdapt[2]={0,0}; ++adaptIter; PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPT ITER %d\n",adaptIter)); DM dmConv; PetscCall(ForestToPlex(&dmCur,&dmConv)); PetscCall(CreateAdaptLabel(dmCur,dmConv,nAdaptLoc)); PetscCallMPI(MPIU_Allreduce(&nAdaptLoc, &nAdapt, 2, MPIU_INT, MPI_SUM, PetscObjectComm((PetscObject)dmCur))); PetscCall(DMGetLabel(dmCur, "adaptLabel", &adaptLabel)); PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to refine = %d\n",nAdapt[0])); PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to keep = %d\n",nAdapt[1])); if (nAdapt[0]) { PetscCall(DMAdaptLabel(dmCur, adaptLabel, &dmAdapt)); 
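      /* Editorial note, not in the original message: as reported above, this
         DMAdaptLabel call errors out when adaptLabel marks DM_ADAPT_REFINE on
         a cell that is already at -dm_forest_maximum_refinement, so the cell
         level would have to be checked while CreateAdaptLabel builds the
         label. */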
PetscCall(DMHasLabel(dmAdapt, "adaptLabel", &hasLabel)); PetscCall(DMDestroy(&dmCur)); PetscCall(DMViewFromOptions(dmAdapt, NULL, "-adapt_dm_view")); dmCur = dmAdapt; } //PetscCall(DMLabelDestroy(&adaptLabel)); PetscCall(DMDestroy(&dmConv)); if (adaptIter==maxAdaptIter) adapt=PETSC_FALSE; } *dm = dmCur; PetscFunctionReturn(PETSC_SUCCESS); } int main(int argc, char **argv) { DM dm; char typeString[256] = {'\0'}; PetscViewer viewer = NULL; PetscFunctionBeginUser; PetscCall(PetscInitialize(&argc, &argv, NULL, help)); PetscCall(DMCreate(PETSC_COMM_WORLD, &dm)); PetscCall(PetscStrncpy(typeString, DMFOREST, 256)); PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "DM Forest example options", NULL); PetscCall(PetscOptionsString("-dm_type", "The type of the dm", NULL, DMFOREST, typeString, sizeof(typeString), NULL)); PetscOptionsEnd(); PetscCall(PetscPrintf(PETSC_COMM_SELF,"\n ==== TOY CODE DMFOREST WITH AMR ====\n")); PetscCall(DMSetType(dm, (DMType)typeString)); PetscCall(DMSetFromOptions(dm)); PetscCall(DMSetUp(dm)); /* Adapt */ PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE STARTED\n")); PetscCall(AdaptMesh(&dm)); PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE ENDED\n\n")); PetscCall(DMViewFromOptions(dm, NULL, "-dm_view")); PetscCall(PetscViewerDestroy(&viewer)); PetscCall(DMDestroy(&dm)); PetscCall(PetscFinalize()); return 0; } Thanks a lot, Silvia -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 19 13:13:50 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 19 Feb 2025 14:13:50 -0500 Subject: [petsc-users] grid adaptivity with dmforest In-Reply-To: References: Message-ID: On Wed, Feb 19, 2025 at 11:18?AM Preda Silvia via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > > > I?m using the toycode below to manage adaptivity using a DMForest, based > on p4est. > > > > The code simply performs three adaptivity steps on a square grid > [0,1]x[0,1], uniformly refined at level 2 at the beginning. The minimum and > the maximum level of refinement are set to 1 and 5, respectively. During > the adaptivity procedure, a quadrant is refined if its centroid is inside > the circle of radius 0.25, centred in (0.5,0.5). > > > > I?m facing two main issues: > > - As far as I understood, when the label asks to refine a quadrant > which is already at the maximum level, the adaption procedure errors out > instead of ignoring the request. Thus I need to check the quadrant level > before setting the adaptivity label. How can I get access to the > quadrant-level of a cell? > - Once the mesh is adapted, data need to be projected on the new mesh > and in my method I?ll need to write an elaborate ad-hoc procedure for that. > How can I access the map that provides the correspondence between the > indexes of outcoming quadrants and the indexes of incoming ones? > > Toby understands this better than me. Toby, two questions: 1) Can we catch that error and just return? 2) Will DMProjectField() work in the case of multiple refinements like this? 
Thanks, Matt > Below is the code, to be run with these options: > > -dm_type p4est > > -dm_forest_topology brick > > -dm_p4est_brick_size 1,1 > > -dm_view vtk:brick.vtu > > -dm_forest_initial_refinement 2 > > -dm_forest_minimum_refinement 1 > > -dm_forest_maximum_refinement 5 > > > > static char help[] = "Create and view a forest mesh\n\n"; > > > > #include > > #include > > #include > > > > static PetscErrorCode CreateAdaptLabel(DM dm, DM dmConv, PetscInt > *nAdaptLoc) > > { > > DMLabel adaptLabel; > > PetscInt cStart, cEnd, c; > > > > PetscFunctionBeginUser; > > PetscCall(DMGetCoordinatesLocalSetUp(dm)); > > PetscCall(DMCreateLabel(dm, "adaptLabel")); > > PetscCall(DMGetLabel(dm, "adaptLabel", &adaptLabel)); > > PetscCall(DMForestGetCellChart(dm, &cStart, &cEnd)); > > for (c = cStart; c < cEnd; ++c) { > > PetscReal centroid[3], volume, x, y; > > PetscCall(DMPlexComputeCellGeometryFVM(dmConv, c, &volume, centroid, > NULL)); > > x = centroid[0]; > > y = centroid[1]; > > if (std::sqrt((x-0.5)*(x-0.5)+(y-0.5)*(y-0.5))<0.25) { > > PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_REFINE)); > > ++nAdaptLoc[0]; > > } else { > > PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_KEEP)); > > ++nAdaptLoc[1]; > > } > > } > > PetscFunctionReturn(PETSC_SUCCESS); > > } > > > > static PetscErrorCode ForestToPlex(DM *dm, DM *dmConv) > > { > > PetscFunctionBeginUser; > > PetscCall(DMConvert(*dm, DMPLEX, dmConv)); > > PetscCall(DMLocalizeCoordinates(*dmConv)); > > PetscCall(DMViewFromOptions(*dmConv, NULL, "-dm_conv_view")); > > PetscCall(DMPlexCheckCellShape(*dmConv, PETSC_FALSE, PETSC_DETERMINE)); > > PetscFunctionReturn(PETSC_SUCCESS); > > } > > > > static PetscErrorCode AdaptMesh(DM *dm) > > { > > DM dmCur = *dm; > > PetscBool hasLabel=PETSC_FALSE, adapt=PETSC_TRUE; > > PetscInt adaptIter=0, maxAdaptIter=3; > > > > PetscFunctionBeginUser; > > while (adapt) { > > DM dmAdapt; > > DMLabel adaptLabel; > > PetscInt nAdaptLoc[2]={0,0}, nAdapt[2]={0,0}; > > > > ++adaptIter; > > PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPT ITER %d\n",adaptIter)); > > > > DM dmConv; > > PetscCall(ForestToPlex(&dmCur,&dmConv)); > > PetscCall(CreateAdaptLabel(dmCur,dmConv,nAdaptLoc)); > > PetscCallMPI(MPIU_Allreduce(&nAdaptLoc, &nAdapt, 2, MPIU_INT, MPI_SUM, > PetscObjectComm((PetscObject)dmCur))); > > PetscCall(DMGetLabel(dmCur, "adaptLabel", &adaptLabel)); > > PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to refine = > %d\n",nAdapt[0])); > > PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to keep = > %d\n",nAdapt[1])); > > > > if (nAdapt[0]) { > > PetscCall(DMAdaptLabel(dmCur, adaptLabel, &dmAdapt)); > > PetscCall(DMHasLabel(dmAdapt, "adaptLabel", &hasLabel)); > > PetscCall(DMDestroy(&dmCur)); > > PetscCall(DMViewFromOptions(dmAdapt, NULL, "-adapt_dm_view")); > > dmCur = dmAdapt; > > } > > //PetscCall(DMLabelDestroy(&adaptLabel)); > > PetscCall(DMDestroy(&dmConv)); > > if (adaptIter==maxAdaptIter) adapt=PETSC_FALSE; > > } > > *dm = dmCur; > > PetscFunctionReturn(PETSC_SUCCESS); > > } > > > > int main(int argc, char **argv) > > { > > DM dm; > > char typeString[256] = {'\0'}; > > PetscViewer viewer = NULL; > > > > PetscFunctionBeginUser; > > PetscCall(PetscInitialize(&argc, &argv, NULL, help)); > > PetscCall(DMCreate(PETSC_COMM_WORLD, &dm)); > > PetscCall(PetscStrncpy(typeString, DMFOREST, 256)); > > PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "DM Forest example options", > NULL); > > PetscCall(PetscOptionsString("-dm_type", "The type of the dm", NULL, > DMFOREST, typeString, sizeof(typeString), NULL)); > > PetscOptionsEnd(); > > > > 
PetscCall(PetscPrintf(PETSC_COMM_SELF,"\n ==== TOY CODE DMFOREST WITH > AMR ====\n")); > > > > PetscCall(DMSetType(dm, (DMType)typeString)); > > PetscCall(DMSetFromOptions(dm)); > > PetscCall(DMSetUp(dm)); > > > > /* Adapt */ > > PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE STARTED\n")); > > PetscCall(AdaptMesh(&dm)); > > PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE ENDED\n\n")); > > > > PetscCall(DMViewFromOptions(dm, NULL, "-dm_view")); > > PetscCall(PetscViewerDestroy(&viewer)); > > > > PetscCall(DMDestroy(&dm)); > > PetscCall(PetscFinalize()); > > return 0; > > } > > > > > > Thanks a lot, > > > > Silvia > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZSCq5o18nPkr_fesXZsTE1dDngSGlwHv1_gJXKWrkgdaUqTlCeNxjF-VSyRK8bjgaHXIu9LJw-fH5fP0nQ8Z$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Wed Feb 19 17:17:49 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 19 Feb 2025 23:17:49 +0000 Subject: [petsc-users] kokkos and include flags Message-ID: Hi I'm trying to build my application code (which includes C and Fortran files) with a Makefile based off $PETSC_DIR/share/petsc/Makefile.basic.user by using the variables and rules defined in ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as well as another library, and hence I have to add some extra include statements pointing at the other library during compilation. Currently I have been doing: # Read in the petsc compile/linking variables and makefile rules include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules # Add the extra include files PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) which works very well, with the correct include flags from INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C files. If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by calling "make adv_1dk"), the extra flags are not included. If I instead call "make adv_1dk.kokkos", the rule for cxx files is instead triggered and correctly includes the include flags, but this just calls the c++ wrapper, rather than the nvcc_wrapper and therefore breaks when kokkos has been built with cuda (or hip, etc). Just wondering if there is something I have missed, from what I can tell the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. Thanks for all your help Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Wed Feb 19 18:32:08 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Wed, 19 Feb 2025 18:32:08 -0600 (CST) Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via PETSC_FC_INCLUDES]. I think kokkos compile targets [for *.kokkos.cxx sources] should pick up one of them. for ex: >>> CPPFLAGS = -Wall FPPFLAGS = -Wall CXXPPFLAGS = -Wall include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules ... 
<<< Satish On Wed, 19 Feb 2025, Steven Dargaville wrote: > Hi > > I'm trying to build my application code (which includes C and Fortran > files) with a Makefile based off $PETSC_DIR/share/petsc/Makefile.basic.user > by using the variables and rules defined in > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as well as > another library, and hence I have to add some extra include statements > pointing at the other library during compilation. Currently I have been > doing: > > # Read in the petsc compile/linking variables and makefile rules > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > # Add the extra include files > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) > > > which works very well, with the correct include flags from > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C > files. > > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by > calling "make adv_1dk"), the extra flags are not included. If I instead > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered and > correctly includes the include flags, but this just calls the c++ wrapper, > rather than the nvcc_wrapper and therefore breaks when kokkos has been > built with cuda (or hip, etc). > > Just wondering if there is something I have missed, from what I can tell > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. > > Thanks for all your help > Steven > From dargaville.steven at gmail.com Thu Feb 20 05:36:11 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Thu, 20 Feb 2025 11:36:11 +0000 Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: Thanks for the reply! I've tried that and again it doesn't seem to work for the kokkos files. I went a bit overboard and set every variable I could find but it doesn't seem to change the kokkos compilation, despite some of those flags definitely being present in the kokkos compile targets. CPPFLAGS = $(INCLUDE) FPPFLAGS = $(INCLUDE) CPPFLAGS = $(INCLUDE) CXXPPFLAGS = $(INCLUDE) CXXCPPFLAGS = $(INCLUDE) CUDAC_FLAGS = $(INCLUDE) HIPC_FLAGS = $(INCLUDE) SYCLC_FLAGS = $(INCLUDE) PETSC_CXXCPPFLAGS = $(INCLUDE) PETSC_CCPPFLAGS = $(INCLUDE) PETSC_FCPPFLAGS = $(INCLUDE) PETSC_CUDACPPFLAGS = $(INCLUDE) MPICXX_INCLUDES = $(INCLUDE) # Read in the petsc compile/linking variables and makefile rules include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules The strangest thing is if I echo the value of PETSC_KOKKOSCOMPILE_SINGLE before building, it seems to have the correct flags in it. 
# Build the tests build_tests: $(OUT) echo $(PETSC_KOKKOSCOMPILE_SINGLE) @for t in $(TEST_TARGETS); do \ $(MAKE) -C tests $$t; \ done for example the echo gives (where I've bolded the flags I need added): mpicxx -o .o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 -fPIC *-I/home/sdargavi/projects/PFLARE -Iinclude* but the actual command that is called when the build is happening is (which doesn't have the includes I need): mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 -fPIC -I/home/sdargavi/projects/dependencies/petsc-3.22.0/include -I/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/include adv_1dk.kokkos.cxx -L/home/sdargavi/projects/PFLARE/lib -lpflare -Wl,-rpath,/home/sdargavi/projects/PFLARE/lib:-Wl,-rpath,/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib -L/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran -L/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd -lflapack -lfblas -lparmetis -lmetis -lm -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lgfortran -lm -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lstdc++ -lquadmath -o adv_1dk On Thu, 20 Feb 2025 at 00:32, Satish Balay wrote: > Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via PETSC_FC_INCLUDES]. > > I think kokkos compile targets [for *.kokkos.cxx sources] should pick up > one of them. > > for ex: > > >>> > CPPFLAGS = -Wall > FPPFLAGS = -Wall > CXXPPFLAGS = -Wall > > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > ... > <<< > > Satish > > On Wed, 19 Feb 2025, Steven Dargaville wrote: > > > Hi > > > > I'm trying to build my application code (which includes C and Fortran > > files) with a Makefile based off > $PETSC_DIR/share/petsc/Makefile.basic.user > > by using the variables and rules defined in > > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as well > as > > another library, and hence I have to add some extra include statements > > pointing at the other library during compilation. Currently I have been > > doing: > > > > # Read in the petsc compile/linking variables and makefile rules > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > # Add the extra include files > > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) > > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) > > > > > > which works very well, with the correct include flags from > > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C > > files. 
> > > > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by > > calling "make adv_1dk"), the extra flags are not included. If I instead > > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered > and > > correctly includes the include flags, but this just calls the c++ > wrapper, > > rather than the nvcc_wrapper and therefore breaks when kokkos has been > > built with cuda (or hip, etc). > > > > Just wondering if there is something I have missed, from what I can tell > > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. > > > > Thanks for all your help > > Steven > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Thu Feb 20 05:38:44 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Thu, 20 Feb 2025 11:38:44 +0000 Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: My apologies, that does seem to have fixed the problem, the build rules in my tests were overriding the new variables. Thanks for your help in sorting that out! Steven On Thu, 20 Feb 2025 at 11:36, Steven Dargaville wrote: > Thanks for the reply! I've tried that and again it doesn't seem to work > for the kokkos files. I went a bit overboard and set every variable I could > find but it doesn't seem to change the kokkos compilation, despite some of > those flags definitely being present in the kokkos compile targets. > > CPPFLAGS = $(INCLUDE) > FPPFLAGS = $(INCLUDE) > CPPFLAGS = $(INCLUDE) > CXXPPFLAGS = $(INCLUDE) > CXXCPPFLAGS = $(INCLUDE) > CUDAC_FLAGS = $(INCLUDE) > HIPC_FLAGS = $(INCLUDE) > SYCLC_FLAGS = $(INCLUDE) > PETSC_CXXCPPFLAGS = $(INCLUDE) > PETSC_CCPPFLAGS = $(INCLUDE) > PETSC_FCPPFLAGS = $(INCLUDE) > PETSC_CUDACPPFLAGS = $(INCLUDE) > MPICXX_INCLUDES = $(INCLUDE) > > # Read in the petsc compile/linking variables and makefile rules > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > The strangest thing is if I echo the value of PETSC_KOKKOSCOMPILE_SINGLE > before building, it seems to have the correct flags in it. 
> > # Build the tests > build_tests: $(OUT) > echo $(PETSC_KOKKOSCOMPILE_SINGLE) > @for t in $(TEST_TARGETS); do \ > $(MAKE) -C tests $$t; \ > done > > for example the echo gives (where I've bolded the flags I need added): > > mpicxx -o .o -c -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > -fPIC *-I/home/sdargavi/projects/PFLARE -Iinclude* > > but the actual command that is called when the build is happening is > (which doesn't have the includes I need): > > mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings > -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi > -fstack-protector -g -O0 -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > -fPIC -I/home/sdargavi/projects/dependencies/petsc-3.22.0/include > -I/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/include > adv_1dk.kokkos.cxx -L/home/sdargavi/projects/PFLARE/lib -lpflare > -Wl,-rpath,/home/sdargavi/projects/PFLARE/lib:-Wl,-rpath,/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > -L/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > -L/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 > -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lkokkoskernels > -lkokkoscontainers -lkokkoscore -lkokkossimd -lflapack -lfblas -lparmetis > -lmetis -lm -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lgfortran -lm > -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lstdc++ -lquadmath -o > adv_1dk > > > > On Thu, 20 Feb 2025 at 00:32, Satish Balay wrote: > >> Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via >> PETSC_FC_INCLUDES]. >> >> I think kokkos compile targets [for *.kokkos.cxx sources] should pick up >> one of them. >> >> for ex: >> >> >>> >> CPPFLAGS = -Wall >> FPPFLAGS = -Wall >> CXXPPFLAGS = -Wall >> >> include ${PETSC_DIR}/lib/petsc/conf/variables >> include ${PETSC_DIR}/lib/petsc/conf/rules >> >> ... >> <<< >> >> Satish >> >> On Wed, 19 Feb 2025, Steven Dargaville wrote: >> >> > Hi >> > >> > I'm trying to build my application code (which includes C and Fortran >> > files) with a Makefile based off >> $PETSC_DIR/share/petsc/Makefile.basic.user >> > by using the variables and rules defined in >> > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as >> well as >> > another library, and hence I have to add some extra include statements >> > pointing at the other library during compilation. 
Currently I have been >> > doing: >> > >> > # Read in the petsc compile/linking variables and makefile rules >> > include ${PETSC_DIR}/lib/petsc/conf/variables >> > include ${PETSC_DIR}/lib/petsc/conf/rules >> > >> > # Add the extra include files >> > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) >> > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) >> > >> > >> > which works very well, with the correct include flags from >> > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C >> > files. >> > >> > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by >> > calling "make adv_1dk"), the extra flags are not included. If I instead >> > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered >> and >> > correctly includes the include flags, but this just calls the c++ >> wrapper, >> > rather than the nvcc_wrapper and therefore breaks when kokkos has been >> > built with cuda (or hip, etc). >> > >> > Just wondering if there is something I have missed, from what I can tell >> > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. >> > >> > Thanks for all your help >> > Steven >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Thu Feb 20 08:44:02 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 08:44:02 -0600 (CST) Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: <9432ed9e-384a-4860-62b9-b7fb569bf457@fastmail.org> I'm glad you have it working now! Thanks for the update! Satish On Thu, 20 Feb 2025, Steven Dargaville wrote: > My apologies, that does seem to have fixed the problem, the build rules in > my tests were overriding the new variables. > > Thanks for your help in sorting that out! > Steven > > On Thu, 20 Feb 2025 at 11:36, Steven Dargaville > wrote: > > > Thanks for the reply! I've tried that and again it doesn't seem to work > > for the kokkos files. I went a bit overboard and set every variable I could > > find but it doesn't seem to change the kokkos compilation, despite some of > > those flags definitely being present in the kokkos compile targets. > > > > CPPFLAGS = $(INCLUDE) > > FPPFLAGS = $(INCLUDE) > > CPPFLAGS = $(INCLUDE) > > CXXPPFLAGS = $(INCLUDE) > > CXXCPPFLAGS = $(INCLUDE) > > CUDAC_FLAGS = $(INCLUDE) > > HIPC_FLAGS = $(INCLUDE) > > SYCLC_FLAGS = $(INCLUDE) > > PETSC_CXXCPPFLAGS = $(INCLUDE) > > PETSC_CCPPFLAGS = $(INCLUDE) > > PETSC_FCPPFLAGS = $(INCLUDE) > > PETSC_CUDACPPFLAGS = $(INCLUDE) > > MPICXX_INCLUDES = $(INCLUDE) > > > > # Read in the petsc compile/linking variables and makefile rules > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > The strangest thing is if I echo the value of PETSC_KOKKOSCOMPILE_SINGLE > > before building, it seems to have the correct flags in it. 
> > > > # Build the tests > > build_tests: $(OUT) > > echo $(PETSC_KOKKOSCOMPILE_SINGLE) > > @for t in $(TEST_TARGETS); do \ > > $(MAKE) -C tests $$t; \ > > done > > > > for example the echo gives (where I've bolded the flags I need added): > > > > mpicxx -o .o -c -Wall -Wwrite-strings -Wno-strict-aliasing > > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > > -fPIC *-I/home/sdargavi/projects/PFLARE -Iinclude* > > > > but the actual command that is called when the build is happening is > > (which doesn't have the includes I need): > > > > mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings > > -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi > > -fstack-protector -g -O0 -Wall -Wwrite-strings -Wno-strict-aliasing > > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > > -fPIC -I/home/sdargavi/projects/dependencies/petsc-3.22.0/include > > -I/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/include > > adv_1dk.kokkos.cxx -L/home/sdargavi/projects/PFLARE/lib -lpflare > > -Wl,-rpath,/home/sdargavi/projects/PFLARE/lib:-Wl,-rpath,/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > > -L/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > > -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > > -L/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 > > -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lkokkoskernels > > -lkokkoscontainers -lkokkoscore -lkokkossimd -lflapack -lfblas -lparmetis > > -lmetis -lm -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > > -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lgfortran -lm > > -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lstdc++ -lquadmath -o > > adv_1dk > > > > > > > > On Thu, 20 Feb 2025 at 00:32, Satish Balay wrote: > > > >> Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via > >> PETSC_FC_INCLUDES]. > >> > >> I think kokkos compile targets [for *.kokkos.cxx sources] should pick up > >> one of them. > >> > >> for ex: > >> > >> >>> > >> CPPFLAGS = -Wall > >> FPPFLAGS = -Wall > >> CXXPPFLAGS = -Wall > >> > >> include ${PETSC_DIR}/lib/petsc/conf/variables > >> include ${PETSC_DIR}/lib/petsc/conf/rules > >> > >> ... > >> <<< > >> > >> Satish > >> > >> On Wed, 19 Feb 2025, Steven Dargaville wrote: > >> > >> > Hi > >> > > >> > I'm trying to build my application code (which includes C and Fortran > >> > files) with a Makefile based off > >> $PETSC_DIR/share/petsc/Makefile.basic.user > >> > by using the variables and rules defined in > >> > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as > >> well as > >> > another library, and hence I have to add some extra include statements > >> > pointing at the other library during compilation. 
Currently I have been > >> > doing: > >> > > >> > # Read in the petsc compile/linking variables and makefile rules > >> > include ${PETSC_DIR}/lib/petsc/conf/variables > >> > include ${PETSC_DIR}/lib/petsc/conf/rules > >> > > >> > # Add the extra include files > >> > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) > >> > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) > >> > > >> > > >> > which works very well, with the correct include flags from > >> > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C > >> > files. > >> > > >> > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by > >> > calling "make adv_1dk"), the extra flags are not included. If I instead > >> > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered > >> and > >> > correctly includes the include flags, but this just calls the c++ > >> wrapper, > >> > rather than the nvcc_wrapper and therefore breaks when kokkos has been > >> > built with cuda (or hip, etc). > >> > > >> > Just wondering if there is something I have missed, from what I can tell > >> > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. > >> > > >> > Thanks for all your help > >> > Steven > >> > > >> > >> > From schaferk at bellsouth.net Thu Feb 20 12:22:24 2025 From: schaferk at bellsouth.net (Michael Schaferkotter) Date: Thu, 20 Feb 2025 12:22:24 -0600 Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> Message-ID: <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> build petsc-3.20.3 with llvm, clang, clang++, gfortran CFLAGS='-std=c++11' CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' LDLIBS += -lstdc++ $PETSC_ARCH arch-linux-c-opt MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc CLANG = clang FC = gfortran Petsc libraries are built; /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* The configure is this: cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ --with-cc=clang \ --with-cxx=clang++ \ --with-fc=gfortran \ --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ --download-sowing \ --with-debugging=$(PETSC_DBG) \ --with-shared-libraries=1 \ CFLAGS='-std=c11' \ CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ LDFLAGS='-L$(LLVM_LIB)' \ LIBS='-lstdc++? 
\ --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) Here is the make: $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all Check-petsc is: $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test Here is the log file for test: make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' clang: error: linker command failed with exit code 1 (use -v to see invocation) make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) There are many errors of the ilk: std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() [lib]$ nm -A libpetsc.so | grep basic_ostringstream libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. Clearly something is amiss. Any ideas appreciated. Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Thu Feb 20 12:52:05 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 12:52:05 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> Message-ID: <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> Any particular reason to use these flags? What clang version? OS? 
Best if you can send build logs [perhaps to petsc-maint] Can you try a simpler build and see if it works: ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check or: ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check Satish On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > CFLAGS='-std=c++11' > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > LDLIBS += -lstdc++ > > $PETSC_ARCH arch-linux-c-opt > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > CLANG = clang > FC = gfortran > > > Petsc libraries are built; > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > The configure is this: > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > --with-cc=clang \ > --with-cxx=clang++ \ > --with-fc=gfortran \ > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > --download-sowing \ > --with-debugging=$(PETSC_DBG) \ > --with-shared-libraries=1 \ > CFLAGS='-std=c11' \ > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > LDFLAGS='-L$(LLVM_LIB)' \ > LIBS='-lstdc++? \ > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > Here is the make: > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > Check-petsc is: > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > Here is the log file for test: > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > clang: error: linker command failed with exit code 1 (use -v to see invocation) > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > There are many errors of the ilk: > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > libpetsc.so: U 
_ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > Clearly something is amiss. > > Any ideas appreciated. > > Michael > > > From balay.anl at fastmail.org Thu Feb 20 13:18:11 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 13:18:11 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> Message-ID: Actually, simpler: ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. Satish On Thu, 20 Feb 2025, Satish Balay wrote: > > Any particular reason to use these flags? What clang version? OS? > > Best if you can send build logs [perhaps to petsc-maint] > > Can you try a simpler build and see if it works: > > ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > or: > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > Satish > > On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > > > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > > > CFLAGS='-std=c++11' > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > > LDLIBS += -lstdc++ > > > > $PETSC_ARCH arch-linux-c-opt > > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > > CLANG = clang > > FC = gfortran > > > > > > Petsc libraries are built; > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > > > > The configure is this: > > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > > --with-cc=clang \ > > --with-cxx=clang++ \ > > --with-fc=gfortran \ > > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > > --download-sowing \ > > --with-debugging=$(PETSC_DBG) \ > > --with-shared-libraries=1 \ > > CFLAGS='-std=c11' \ > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > > LDFLAGS='-L$(LLVM_LIB)' \ > > LIBS='-lstdc++? 
\ > > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > > > > Here is the make: > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > > > > Check-petsc is: > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > > > Here is the log file for test: > > > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > > clang: error: linker command failed with exit code 1 (use -v to see invocation) > > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > > > > There are many errors of the ilk: > > > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > > > Clearly something is amiss. > > > > Any ideas appreciated. > > > > Michael > > > > > > > From balay.anl at fastmail.org Thu Feb 20 14:08:44 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 14:08:44 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> Message-ID: <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> Ok - I see this issue on CentOS [Stream/9]. What I have is: >>> [balay at frog petsc]$ clang --version clang version 19.1.7 (CentOS 19.1.7-1.el9) Target: x86_64-redhat-linux-gnu Thread model: posix InstalledDir: /usr/bin Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg [balay at frog petsc]$ gfortran --version GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
<<<< Now I build: >>> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check ********************************************************************************* clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' <<<< Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. >>>>>>>> [root at frog ~]# yum remove gcc-toolset-14-runtime Dependencies resolved. ================================================================================ Package Arch Version Repository Size ================================================================================ Removing: gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k Removing dependent packages: clang x86_64 19.1.7-1.el9 @appstream 181 k clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M Removing unused dependencies: clang-libs x86_64 19.1.7-1.el9 @appstream 413 M clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M libomp x86_64 19.1.7-1.el9 @appstream 1.9 M libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M Transaction Summary ================================================================================ Remove 12 Packages Freed space: 763 M Is this ok [y/N]: <<<<< So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. >>>> [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran <<<< Now retry build: >>> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check Running PETSc check examples to verify correct installation Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process Completed PETSc check examples [balay at frog petsc]$ <<<< Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! 
But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] >>>> [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH [balay at frog petsc]$ gfortran --version GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) Copyright (C) 2024 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 ========================================= Now to check if the libraries are working do: make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check ========================================= Running PETSc check examples to verify correct installation Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process Completed PETSc check examples [balay at frog petsc]$ <<<< So that worked! Satish On Thu, 20 Feb 2025, Satish Balay wrote: > Actually, simpler: > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld > > Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. > > Satish > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > > > Any particular reason to use these flags? What clang version? OS? 
> > > > Best if you can send build logs [perhaps to petsc-maint] > > > > Can you try a simpler build and see if it works: > > > > ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > or: > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > > Satish > > > > On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > > > > > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > > > > > CFLAGS='-std=c++11' > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > > > LDLIBS += -lstdc++ > > > > > > $PETSC_ARCH arch-linux-c-opt > > > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > > > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > > > CLANG = clang > > > FC = gfortran > > > > > > > > > Petsc libraries are built; > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > > > > > > > The configure is this: > > > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > > > --with-cc=clang \ > > > --with-cxx=clang++ \ > > > --with-fc=gfortran \ > > > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > > > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > > > --download-sowing \ > > > --with-debugging=$(PETSC_DBG) \ > > > --with-shared-libraries=1 \ > > > CFLAGS='-std=c11' \ > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > > > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > > > LDFLAGS='-L$(LLVM_LIB)' \ > > > LIBS='-lstdc++? 
\ > > > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > > > > > > > Here is the make: > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > > > > > > > Check-petsc is: > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > > > > > Here is the log file for test: > > > > > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > > > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > > > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > > > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > > > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > > > clang: error: linker command failed with exit code 1 (use -v to see invocation) > > > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > > > > > > > There are many errors of the ilk: > > > > > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > > > > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > > > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > > > > > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > > > > > Clearly something is amiss. > > > > > > Any ideas appreciated. > > > > > > Michael > > > > > > > > > > > > From balay.anl at fastmail.org Thu Feb 20 14:44:15 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 14:44:15 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> Message-ID: A couple of alternates (if mixing compiler versions can't be avoided): - don't need to use petsc from fortran: [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=0 --with-mpi=0 --download-f2cblaslapack && make && make check - don't use c++: [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=0 --with-fc=gfortran --with-mpi=0 && make && make check - add in v14 -lstdc++ location ahead in the search path - so that even when -lgfortran is found in v11, v14 -lstdc++ gets picked up correctly. 
[balay at frog petsc]$ ./configure LDFLAGS=-L/opt/rh/gcc-toolset-14/root/usr/lib/gcc/x86_64-redhat-linux/14/ --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check Satish On Thu, 20 Feb 2025, Satish Balay wrote: > Ok - I see this issue on CentOS [Stream/9]. > > What I have is: > >>> > [balay at frog petsc]$ clang --version > clang version 19.1.7 (CentOS 19.1.7-1.el9) > Target: x86_64-redhat-linux-gnu > Thread model: posix > InstalledDir: /usr/bin > Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg > [balay at frog petsc]$ gfortran --version > GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) > Copyright (C) 2021 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > <<<< > > Now I build: > >>> > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > ********************************************************************************* > clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 > /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' > > <<<< > > Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. > >>>>>>>> > [root at frog ~]# yum remove gcc-toolset-14-runtime > Dependencies resolved. 
> ================================================================================ > Package Arch Version Repository Size > ================================================================================ > Removing: > gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k > Removing dependent packages: > clang x86_64 19.1.7-1.el9 @appstream 181 k > clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M > gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M > Removing unused dependencies: > clang-libs x86_64 19.1.7-1.el9 @appstream 413 M > clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k > compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M > gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M > gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M > gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M > libomp x86_64 19.1.7-1.el9 @appstream 1.9 M > libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M > > Transaction Summary > ================================================================================ > Remove 12 Packages > > Freed space: 763 M > Is this ok [y/N]: > <<<<< > > So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. > >>>> > [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran > <<<< > > Now retry build: > >>> > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > Running PETSc check examples to verify correct installation > Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > Completed PETSc check examples > [balay at frog petsc]$ > <<<< > > Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] > >>>> > [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH > [balay at frog petsc]$ gfortran --version > GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) > Copyright (C) 2024 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 > ========================================= > Now to check if the libraries are working do: > make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check > ========================================= > Running PETSc check examples to verify correct installation > Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > Completed PETSc check examples > [balay at frog petsc]$ > <<<< > > So that worked! 
> > Satish > > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > Actually, simpler: > > > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld > > > > Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. > > > > Satish > > > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > > > > > > Any particular reason to use these flags? What clang version? OS? > > > > > > Best if you can send build logs [perhaps to petsc-maint] > > > > > > Can you try a simpler build and see if it works: > > > > > > ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > or: > > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > > > > Satish > > > > > > On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > > > > > > > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > > > > > > > CFLAGS='-std=c++11' > > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > > > > LDLIBS += -lstdc++ > > > > > > > > $PETSC_ARCH arch-linux-c-opt > > > > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > > > > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > > > > CLANG = clang > > > > FC = gfortran > > > > > > > > > > > > Petsc libraries are built; > > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > > > > > > > > > > The configure is this: > > > > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > > > > --with-cc=clang \ > > > > --with-cxx=clang++ \ > > > > --with-fc=gfortran \ > > > > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > > > > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > > > > --download-sowing \ > > > > --with-debugging=$(PETSC_DBG) \ > > > > --with-shared-libraries=1 \ > > > > CFLAGS='-std=c11' \ > > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > > > > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > > > > LDFLAGS='-L$(LLVM_LIB)' \ > > > > LIBS='-lstdc++? 
\ > > > > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > > > > > > > > > > Here is the make: > > > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > > > > > > > > > > Check-petsc is: > > > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > > > > > > > Here is the log file for test: > > > > > > > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > > > > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > > > > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > > > > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > > > > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > > > > clang: error: linker command failed with exit code 1 (use -v to see invocation) > > > > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > > > > > > > > > > There are many errors of the ilk: > > > > > > > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > > > > > > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > > > > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > > > > > > > > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > > > > > > > Clearly something is amiss. > > > > > > > > Any ideas appreciated. > > > > > > > > Michael > > > > > > > > > > > > > > > > > From schaferk at bellsouth.net Fri Feb 21 08:02:36 2025 From: schaferk at bellsouth.net (Michael Schaferkotter) Date: Fri, 21 Feb 2025 08:02:36 -0600 Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> Message-ID: <03DE8DD9-B858-4D21-A2A8-CBC3BC1BE9C4@bellsouth.net> Satish; Thank you for the masterful demonstration. One of the alternatives caught my eye: ?with-cxx=0 (I remember I had to do that ages ago on my macOS Darwin machine. I cleared *FLAGS and successfully completed make && make check as suggested; #======================================================================= It may be moot now. 
Here is the requested OS, compiler information: cat /etc/os-release NAME="Red Hat Enterprise Linux" VERSION="8.8 (Ootpa)" ID="rhel" ID_LIKE="fedora" VERSION_ID="8.8" PLATFORM_ID="platform:el8" PRETTY_NAME="Red Hat Enterprise Linux 8.8 (Ootpa)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos" HOME_URL="https://urldefense.us/v3/__https://www.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpV-0tjnQ$ " DOCUMENTATION_URL="https://urldefense.us/v3/__https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5Roh1jkm1w$ " BUG_REPORT_URL="https://urldefense.us/v3/__https://bugzilla.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpG5qdUaQ$ " REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8" REDHAT_BUGZILLA_PRODUCT_VERSION=8.8 REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" REDHAT_SUPPORT_PRODUCT_VERSION="8.8" clang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /tmp/build_release/bin Clang version: 20.0.0git LLVM version: LLVM version 20.0.0git (48d0ef1a07993139e1acf65910704255443103a5 C++ standard: Host target: x86_64-unknown-linux-gnu Supported targets: Registered Targets: x86 - 32-bit X86: Pentium-Pro and above x86-64 - 64-bit X86: EM64T and AMD64 flang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /tmp/build_release/bin Flang version: 20.0.0git Host target: x86_64-unknown-linux-gnu Fortran compiler: GNU Fortran (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4) Compiler path: /usr/bin/gfortran Version: -std= Assume that the input sources are for . Default flags: No default flags information available Target: x86_64-redhat-linux #======================================================================= Thank you again. > On Feb 20, 2025, at 2:44 PM, Satish Balay wrote: > > A couple of alternates (if mixing compiler versions can't be avoided): > > - don't need to use petsc from fortran: > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=0 --with-mpi=0 --download-f2cblaslapack && make && make check > > - don't use c++: > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=0 --with-fc=gfortran --with-mpi=0 && make && make check > > - add in v14 -lstdc++ location ahead in the search path - so that even when -lgfortran is found in v11, v14 -lstdc++ gets picked up correctly. > [balay at frog petsc]$ ./configure LDFLAGS=-L/opt/rh/gcc-toolset-14/root/usr/lib/gcc/x86_64-redhat-linux/14/ --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > Satish > > On Thu, 20 Feb 2025, Satish Balay wrote: > >> Ok - I see this issue on CentOS [Stream/9]. 
>> >> What I have is: >>>>> >> [balay at frog petsc]$ clang --version >> clang version 19.1.7 (CentOS 19.1.7-1.el9) >> Target: x86_64-redhat-linux-gnu >> Thread model: posix >> InstalledDir: /usr/bin >> Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg >> [balay at frog petsc]$ gfortran --version >> GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) >> Copyright (C) 2021 Free Software Foundation, Inc. >> This is free software; see the source for copying conditions. There is NO >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. >> <<<< >> >> Now I build: >>>>> >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check >> >> ********************************************************************************* >> clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 >> /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' >> >> <<<< >> >> Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. >>>>>>>>>> >> [root at frog ~]# yum remove gcc-toolset-14-runtime >> Dependencies resolved. >> ================================================================================ >> Package Arch Version Repository Size >> ================================================================================ >> Removing: >> gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k >> Removing dependent packages: >> clang x86_64 19.1.7-1.el9 @appstream 181 k >> clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M >> gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M >> Removing unused dependencies: >> clang-libs x86_64 19.1.7-1.el9 @appstream 413 M >> clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k >> compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M >> gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M >> gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M >> gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M >> libomp x86_64 19.1.7-1.el9 @appstream 1.9 M >> libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M >> >> Transaction Summary >> ================================================================================ >> Remove 12 Packages >> >> Freed space: 763 M >> Is this ok [y/N]: >> <<<<< >> >> So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. 
>>>>>> >> [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran >> <<<< >> >> Now retry build: >>>>> >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check >> >> Running PETSc check examples to verify correct installation >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process >> Completed PETSc check examples >> [balay at frog petsc]$ >> <<<< >> >> Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] >>>>>> >> [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH >> [balay at frog petsc]$ gfortran --version >> GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) >> Copyright (C) 2024 Free Software Foundation, Inc. >> This is free software; see the source for copying conditions. There is NO >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. >> >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check >> >> CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 >> ========================================= >> Now to check if the libraries are working do: >> make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check >> ========================================= >> Running PETSc check examples to verify correct installation >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process >> Completed PETSc check examples >> [balay at frog petsc]$ >> <<<< >> >> So that worked! >> >> Satish >> >> >> On Thu, 20 Feb 2025, Satish Balay wrote: >> >>> Actually, simpler: >>> >>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check >>> >>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld >>> >>> Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. >>> >>> Satish >>> >>> On Thu, 20 Feb 2025, Satish Balay wrote: >>> >>>> >>>> Any particular reason to use these flags? What clang version? OS? 
>>>> >>>> Best if you can send build logs [perhaps to petsc-maint] >>>> >>>> Can you try a simpler build and see if it works: >>>> >>>> ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check >>>> or: >>>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check >>>> >>>> Satish >>>> >>>> On Thu, 20 Feb 2025, Michael Schaferkotter wrote: >>>> >>>>> build petsc-3.20.3 with llvm, clang, clang++, gfortran >>>>> >>>>> CFLAGS='-std=c++11' >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' >>>>> LDLIBS += -lstdc++ >>>>> >>>>> $PETSC_ARCH arch-linux-c-opt >>>>> MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 >>>>> MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc >>>>> CLANG = clang >>>>> FC = gfortran >>>>> >>>>> >>>>> Petsc libraries are built; >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* >>>>> >>>>> >>>>> The configure is this: >>>>> cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ >>>>> --with-cc=clang \ >>>>> --with-cxx=clang++ \ >>>>> --with-fc=gfortran \ >>>>> --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ >>>>> --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ >>>>> --download-sowing \ >>>>> --with-debugging=$(PETSC_DBG) \ >>>>> --with-shared-libraries=1 \ >>>>> CFLAGS='-std=c11' \ >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ >>>>> CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ >>>>> LDFLAGS='-L$(LLVM_LIB)' \ >>>>> LIBS='-lstdc++? 
\ >>>>> --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) >>>>> >>>>> >>>>> Here is the make: >>>>> >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all >>>>> >>>>> >>>>> Check-petsc is: >>>>> >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test >>>>> >>>>> Here is the log file for test: >>>>> >>>>> make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' >>>>> /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" >>>>> Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 >>>>> CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o >>>>> CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' >>>>> clang: error: linker command failed with exit code 1 (use -v to see invocation) >>>>> make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) >>>>> >>>>> >>>>> There are many errors of the ilk: >>>>> >>>>> std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() >>>>> >>>>> [lib]$ nm -A libpetsc.so | grep basic_ostringstream >>>>> libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 >>>>> >>>>> >>>>> I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. >>>>> >>>>> Clearly something is amiss. >>>>> >>>>> Any ideas appreciated. >>>>> >>>>> Michael >>>>> >>>>> >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Fri Feb 21 09:38:57 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Fri, 21 Feb 2025 09:38:57 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <03DE8DD9-B858-4D21-A2A8-CBC3BC1BE9C4@bellsouth.net> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> <03DE8DD9-B858-4D21-A2A8-CBC3BC1BE9C4@bellsouth.net> Message-ID: <5d274e31-ee10-cbeb-93e2-1bdbf8decc74@fastmail.org> I'm glad you now have a working build! BTW: Since you have flang installed - a build with it - i.e. 
with clang/clang++/flang might also work [instead of clang/clang++/gfortran] However flang usage is still nascent (likely some fortran examples fail with it) - if you are using PETSc from fortran - a build with gfortran is the preferred option Satish On Fri, 21 Feb 2025, Michael Schaferkotter wrote: > Satish; > > Thank you for the masterful demonstration. > > One of the alternatives caught my eye: ?with-cxx=0 (I remember I had to do that ages ago on my macOS Darwin machine. > > I cleared *FLAGS and successfully completed make && make check as suggested; > > #======================================================================= > It may be moot now. Here is the requested OS, compiler information: > > cat /etc/os-release > > NAME="Red Hat Enterprise Linux" > VERSION="8.8 (Ootpa)" > ID="rhel" > ID_LIKE="fedora" > VERSION_ID="8.8" > PLATFORM_ID="platform:el8" > PRETTY_NAME="Red Hat Enterprise Linux 8.8 (Ootpa)" > ANSI_COLOR="0;31" > CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos" > HOME_URL="https://urldefense.us/v3/__https://www.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpV-0tjnQ$ " > DOCUMENTATION_URL="https://urldefense.us/v3/__https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5Roh1jkm1w$ " > BUG_REPORT_URL="https://urldefense.us/v3/__https://bugzilla.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpG5qdUaQ$ " > > REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8" > REDHAT_BUGZILLA_PRODUCT_VERSION=8.8 > REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" > REDHAT_SUPPORT_PRODUCT_VERSION="8.8" > > clang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) > Target: x86_64-unknown-linux-gnu > Thread model: posix > InstalledDir: /tmp/build_release/bin > Clang version: 20.0.0git > LLVM version: LLVM version 20.0.0git (48d0ef1a07993139e1acf65910704255443103a5 > C++ standard: > Host target: x86_64-unknown-linux-gnu > Supported targets: > Registered Targets: > x86 - 32-bit X86: Pentium-Pro and above > x86-64 - 64-bit X86: EM64T and AMD64 > > flang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) > Target: x86_64-unknown-linux-gnu > Thread model: posix > InstalledDir: /tmp/build_release/bin > Flang version: 20.0.0git > Host target: x86_64-unknown-linux-gnu > > > Fortran compiler: GNU Fortran (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4) > Compiler path: /usr/bin/gfortran > Version: > -std= Assume that the input sources are for . > Default flags: No default flags information available > Target: x86_64-redhat-linux > > #======================================================================= > > Thank you again. 
> > > On Feb 20, 2025, at 2:44 PM, Satish Balay wrote: > > > > A couple of alternates (if mixing compiler versions can't be avoided): > > > > - don't need to use petsc from fortran: > > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=0 --with-mpi=0 --download-f2cblaslapack && make && make check > > > > - don't use c++: > > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=0 --with-fc=gfortran --with-mpi=0 && make && make check > > > > - add in v14 -lstdc++ location ahead in the search path - so that even when -lgfortran is found in v11, v14 -lstdc++ gets picked up correctly. > > [balay at frog petsc]$ ./configure LDFLAGS=-L/opt/rh/gcc-toolset-14/root/usr/lib/gcc/x86_64-redhat-linux/14/ --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > > > Satish > > > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > >> Ok - I see this issue on CentOS [Stream/9]. > >> > >> What I have is: > >>>>> > >> [balay at frog petsc]$ clang --version > >> clang version 19.1.7 (CentOS 19.1.7-1.el9) > >> Target: x86_64-redhat-linux-gnu > >> Thread model: posix > >> InstalledDir: /usr/bin > >> Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg > >> [balay at frog petsc]$ gfortran --version > >> GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) > >> Copyright (C) 2021 Free Software Foundation, Inc. > >> This is free software; see the source for copying conditions. There is NO > >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > >> <<<< > >> > >> Now I build: > >>>>> > >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > >> > >> ********************************************************************************* > >> clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 > >> /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' > >> > >> <<<< > >> > >> Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. > >>>>>>>>>> > >> [root at frog ~]# yum remove gcc-toolset-14-runtime > >> Dependencies resolved. 
> >> ================================================================================ > >> Package Arch Version Repository Size > >> ================================================================================ > >> Removing: > >> gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k > >> Removing dependent packages: > >> clang x86_64 19.1.7-1.el9 @appstream 181 k > >> clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M > >> gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M > >> Removing unused dependencies: > >> clang-libs x86_64 19.1.7-1.el9 @appstream 413 M > >> clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k > >> compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M > >> gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M > >> gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M > >> gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M > >> libomp x86_64 19.1.7-1.el9 @appstream 1.9 M > >> libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M > >> > >> Transaction Summary > >> ================================================================================ > >> Remove 12 Packages > >> > >> Freed space: 763 M > >> Is this ok [y/N]: > >> <<<<< > >> > >> So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. > >>>>>> > >> [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran > >> <<<< > >> > >> Now retry build: > >>>>> > >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > >> > >> Running PETSc check examples to verify correct installation > >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > >> Completed PETSc check examples > >> [balay at frog petsc]$ > >> <<<< > >> > >> Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] > >>>>>> > >> [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH > >> [balay at frog petsc]$ gfortran --version > >> GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) > >> Copyright (C) 2024 Free Software Foundation, Inc. > >> This is free software; see the source for copying conditions. There is NO > >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > >> > >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > >> > >> CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 > >> ========================================= > >> Now to check if the libraries are working do: > >> make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check > >> ========================================= > >> Running PETSc check examples to verify correct installation > >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > >> Completed PETSc check examples > >> [balay at frog petsc]$ > >> <<<< > >> > >> So that worked! 
> >> > >> Satish > >> > >> > >> On Thu, 20 Feb 2025, Satish Balay wrote: > >> > >>> Actually, simpler: > >>> > >>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > >>> > >>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld > >>> > >>> Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. > >>> > >>> Satish > >>> > >>> On Thu, 20 Feb 2025, Satish Balay wrote: > >>> > >>>> > >>>> Any particular reason to use these flags? What clang version? OS? > >>>> > >>>> Best if you can send build logs [perhaps to petsc-maint] > >>>> > >>>> Can you try a simpler build and see if it works: > >>>> > >>>> ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > >>>> or: > >>>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > >>>> > >>>> Satish > >>>> > >>>> On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > >>>> > >>>>> build petsc-3.20.3 with llvm, clang, clang++, gfortran > >>>>> > >>>>> CFLAGS='-std=c++11' > >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > >>>>> LDLIBS += -lstdc++ > >>>>> > >>>>> $PETSC_ARCH arch-linux-c-opt > >>>>> MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > >>>>> MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > >>>>> CLANG = clang > >>>>> FC = gfortran > >>>>> > >>>>> > >>>>> Petsc libraries are built; > >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > >>>>> > >>>>> > >>>>> The configure is this: > >>>>> cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > >>>>> --with-cc=clang \ > >>>>> --with-cxx=clang++ \ > >>>>> --with-fc=gfortran \ > >>>>> --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > >>>>> --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > >>>>> --download-sowing \ > >>>>> --with-debugging=$(PETSC_DBG) \ > >>>>> --with-shared-libraries=1 \ > >>>>> CFLAGS='-std=c11' \ > >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > >>>>> CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > >>>>> LDFLAGS='-L$(LLVM_LIB)' \ > >>>>> LIBS='-lstdc++? 
\ > >>>>> --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > >>>>> > >>>>> > >>>>> Here is the make: > >>>>> > >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > >>>>> > >>>>> > >>>>> Check-petsc is: > >>>>> > >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > >>>>> > >>>>> Here is the log file for test: > >>>>> > >>>>> make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > >>>>> /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > >>>>> Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > >>>>> CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > >>>>> CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > >>>>> clang: error: linker command failed with exit code 1 (use -v to see invocation) > >>>>> make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > >>>>> > >>>>> > >>>>> There are many errors of the ilk: > >>>>> > >>>>> std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > >>>>> > >>>>> [lib]$ nm -A libpetsc.so | grep basic_ostringstream > >>>>> libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > >>>>> > >>>>> > >>>>> I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > >>>>> > >>>>> Clearly something is amiss. > >>>>> > >>>>> Any ideas appreciated. > >>>>> > >>>>> Michael > >>>>> > >>>>> > >>>>> > >>>> > > From eirik.hoydalsvik at sintef.no Mon Feb 24 07:41:09 2025 From: eirik.hoydalsvik at sintef.no (=?Windows-1252?Q?Eirik_Jaccheri_H=F8ydalsvik?=) Date: Mon, 24 Feb 2025 13:41:09 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM Message-ID: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? 
Best regards,

Eirik Høydalsvik
SINTEF ER/NTNU

Error message:

[Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created.
t 0 of 1 with dt = 0.2
0 TS dt 0.2 time 0.
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08
Traceback (most recent call last):
  File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in
    return_dict1d = get_tank_composition_1d(tank_params)
  File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d
    ts.solve(u=x)
  File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve
petsc4py.PETSc.Error: error code 91
[0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072
[0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440
[0] TSStep has failed due to DIVERGED_STEP_REJECTED

Options for solver:

COMM = PETSc.COMM_WORLD
da = PETSc.DMDA().create(
    dim=(N_vertical,),
    dof=3,
    stencil_type=PETSc.DMDA().StencilType.STAR,
    stencil_width=1,
    # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED,
)
x = da.createGlobalVec()
x_old = da.createGlobalVec()
f = da.createGlobalVec()
J = da.createMat()

rho_ref = rho_m[0]  # kg/m3
e_ref = e_m[0]  # J/mol
p_ref = p0  # Pa
x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten())
x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten())

optsDB = PETSc.Options()
optsDB["snes_lag_preconditioner_persists"] = False
optsDB["snes_lag_jacobian"] = 1
optsDB["snes_lag_jacobian_persists"] = False
optsDB["snes_lag_preconditioner"] = 1
optsDB["ksp_type"] = "gmres"  # "gmres" # gmres"
optsDB["pc_type"] = "ilu"  # "lu" # "ilu"
optsDB["snes_type"] = "newtonls"
optsDB["ksp_rtol"] = 1e-7
optsDB["ksp_atol"] = 1e-7
optsDB["ksp_max_it"] = 100
optsDB["snes_rtol"] = 1e-5
optsDB["snes_atol"] = 1e-5
optsDB["snes_stol"] = 1e-5
optsDB["snes_max_it"] = 100
optsDB["snes_mf"] = False
optsDB["ts_max_time"] = t_end
optsDB["ts_type"] = "beuler"  # "bdf" #
optsDB["ts_max_snes_failures"] = -1
optsDB["ts_monitor"] = ""
optsDB["ts_adapt_monitor"] = ""
# optsDB["snes_monitor"] = ""
# optsDB["ksp_monitor"] = ""
optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + 
self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 24 07:53:41 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Feb 2025 08:53:41 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > Yes, it speeds up the computation of the FD Jacobian. 
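To make the stencil point concrete, here is a minimal, generic sketch -- not your model; the grid size N, the name ifunction, and the toy right-hand side are made up purely for illustration -- of an IFunction on a 1D DMDA with dof=3 and stencil width 1, where every residual entry at cell i reads only x[i-1], x[i], and x[i+1]:

    # Hypothetical sketch: only the DMDA/ghost-update pattern matters here.
    from petsc4py import PETSc

    N = 100                                    # illustrative grid size
    da = PETSc.DMDA().create(dim=(N,), dof=3, stencil_width=1)
    xl = da.createLocalVec()                   # scratch vector holding ghost cells

    def ifunction(ts, t, X, Xdot, F):
        da.globalToLocal(X, xl)                # fill ghost cells of the local vector
        x = da.getVecArray(xl)                 # ghosted view of the state
        xdot = da.getVecArray(Xdot)            # time derivative (owned cells only)
        f = da.getVecArray(F)                  # residual (owned cells only)
        (i0, i1), = da.getRanges()             # locally owned index range
        for i in range(i0, i1):
            left = x[i - 1] if i > 0 else x[i]        # clamp at the physical boundary
            right = x[i + 1] if i < N - 1 else x[i]
            # F[i] touches only x[i-1], x[i], x[i+1], matching the declared stencil
            f[i] = xdot[i] + 2.0 * x[i] - left - right

If some F[i] also depends on values far from cell i (for example a boundary quantity computed from the state at the other end of the domain), the coloring misses those couplings, so the colored FD Jacobian differs from the brute-force one even though the residual itself is unchanged.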
Thanks, Matt > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > 
optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > 
rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] > > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # 
J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? > > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZxDCDT2_bjnnDIclXM4ZGttKBfwEZhFamWy_uuk1tJpgQYOZv6UzsOeafuUZ_Zln7sTsEo0uKwot1T29bxWQ$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eirik.hoydalsvik at sintef.no Mon Feb 24 07:56:20 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Mon, 24 Feb 2025 13:56:20 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: 1. Thank you for the quick answer, I think this sounds reasonable? 
Is there any way to compare the brute-force jacobian to the one computed using the coloring information? From: Matthew Knepley Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. 
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # 
ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], 
self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!eAK-pHiCaIieRkr0rA56mmobvaBQrGUzDnAK0WEafTKv_sK1uhYaYWFHIfy9HHlLuLFD0KExd3ovrR0DF4AFc6XURpvLDDW3jjA$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 24 08:00:28 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Feb 2025 09:00:28 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? > > The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. 
It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = 
PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N 
+ 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] > > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, 
self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? > > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > 
> > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dcoz7MDn7GnxkOOW8KrkFC-3TAKZKmVlbtBSOJqC2xpb8AuzPeBKTeVUS-nWxQhzkrKp4wQF9njzok2yPpno$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dcoz7MDn7GnxkOOW8KrkFC-3TAKZKmVlbtBSOJqC2xpb8AuzPeBKTeVUS-nWxQhzkrKp4wQF9njzok2yPpno$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eirik.hoydalsvik at sintef.no Mon Feb 24 07:35:01 2025 From: eirik.hoydalsvik at sintef.no (=?Windows-1252?Q?Eirik_Jaccheri_H=F8ydalsvik?=) Date: Mon, 24 Feb 2025 13:35:01 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM Message-ID: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. 
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # 
ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], 
self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
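# Note (descriptive comment, inferred from the code below): lz is the axial
# length of one grid cell; when tank_params["heat_transfer"] is enabled,
# T_wall and Q_wall are re-solved implicitly once per new TS time, and
# Q_wall later enters the energy residual f[:, 2] as a source term.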
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Feb 24 10:07:27 2025 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 24 Feb 2025 11:07:27 -0500 Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN Message-ID: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> Hello, We would like to make the libraries generated from PETSc compilation and installation more easily relocatable. Currently, we work around this limitation by using LD_LIBRARY_PATH in the environment and manually modifying the RPATH recorded in the libraries to remove it. Recently, we discovered an interesting approach: during the build process, we can set a relative RPATH using the $ORIGIN variable, which corresponds to the directory containing the library. This allows libpetsc.so dependencies to be referenced relatively instead of absolutely, making the library "movable" without requiring LD_LIBRARY_PATH modifications. We can also apply the same approach to our binaries. To avoid manually modifying the libraries after their creation, we were wondering if there is a way to configure PETSc to use a relative RPATH with $ORIGIN directly? I looked through PETSc's configuration options and files but couldn't find anything mentioning $ORIGIN, and very little related to RPATH. A SPACK newbie question: is this achievable with SPACK? Thanks in advance for your help! Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? 
Laval (418) 656-2131 poste 41 22 42 From balay.anl at fastmail.org Mon Feb 24 10:57:42 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Mon, 24 Feb 2025 10:57:42 -0600 (CST) Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN In-Reply-To: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> References: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> Message-ID: I see you are referring to https://urldefense.us/v3/__https://www.baeldung.com/linux/rpath-change-in-binary__;!!G_uCfscf7eWS!f7FgqvyBSVkObSDsRA2uRpI9SqxQm_45esiwkTgdj0eKkrGQn5C-j-bU-DR-T7stFHid-jUbqwu9_gXjsiU_4KPp7pg$ I suspect its easier to do this by "manually modifying the libraries after their creation" than fix up build tools to support it. configure accepts 'LIBS' option that can potentially be used - but I suspect the '$ORIGIN' might not survive different layers of shell escapes that might occur. ./configure LIBS=-Wl,-rpath,'$ORIGIN'/foo1 Satish On Mon, 24 Feb 2025, Eric Chamberland via petsc-users wrote: > Hello, > > We would like to make the libraries generated from PETSc compilation and > installation more easily relocatable. Currently, we work around this > limitation by using LD_LIBRARY_PATH in the environment and manually modifying > the RPATH recorded in the libraries to remove it. > > Recently, we discovered an interesting approach: during the build process, we > can set a relative RPATH using the $ORIGIN variable, which corresponds to the > directory containing the library. This allows libpetsc.so dependencies to be > referenced relatively instead of absolutely, making the library "movable" > without requiring LD_LIBRARY_PATH modifications. We can also apply the same > approach to our binaries. > > To avoid manually modifying the libraries after their creation, we were > wondering if there is a way to configure PETSc to use a relative RPATH with > $ORIGIN directly? > > I looked through PETSc's configuration options and files but couldn't find > anything mentioning $ORIGIN, and very little related to RPATH. > > A SPACK newbie question: is this achievable with SPACK? > > Thanks in advance for your help! > > Eric > > From jed at jedbrown.org Mon Feb 24 11:13:48 2025 From: jed at jedbrown.org (Jed Brown) Date: Mon, 24 Feb 2025 10:13:48 -0700 Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN In-Reply-To: References: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> Message-ID: <87y0xvgt9v.fsf@jedbrown.org> I think fixing up the build tools would be more reliable. We can't change paths for libraries that are not installed in the same directory as libpetsc.so. What if we just substituted (perhaps only in the Makefile) $ORIGIN for the $PETSC_DIR/$PETSC_ARCH/lib prefix? Satish Balay writes: > I see you are referring to https://urldefense.us/v3/__https://www.baeldung.com/linux/rpath-change-in-binary__;!!G_uCfscf7eWS!f7FgqvyBSVkObSDsRA2uRpI9SqxQm_45esiwkTgdj0eKkrGQn5C-j-bU-DR-T7stFHid-jUbqwu9_gXjsiU_4KPp7pg$ > > I suspect its easier to do this by "manually modifying the libraries after their creation" than fix up build tools to support it. > > configure accepts 'LIBS' option that can potentially be used - but I suspect the '$ORIGIN' might not survive different layers > of shell escapes that might occur. 
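For reference, the escaping concern is about keeping the literal string $ORIGIN intact all the way down to the linker. A rough sketch of the quoting that is usually needed (whether it survives configure's own layers is exactly the open question here):

    # shell: single quotes keep $ORIGIN from being expanded
    cc -o app app.o -L./lib -lfoo -Wl,-rpath,'$ORIGIN/../lib'
    # make: the dollar must be doubled so make passes a literal $ to the shell
    LDFLAGS += -Wl,-rpath,'$$ORIGIN/../lib'
    # check what was actually recorded in the binary
    readelf -d app | grep -iE 'rpath|runpath'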
> > ./configure LIBS=-Wl,-rpath,'$ORIGIN'/foo1 > > Satish > > On Mon, 24 Feb 2025, Eric Chamberland via petsc-users wrote: > >> Hello, >> >> We would like to make the libraries generated from PETSc compilation and >> installation more easily relocatable. Currently, we work around this >> limitation by using LD_LIBRARY_PATH in the environment and manually modifying >> the RPATH recorded in the libraries to remove it. >> >> Recently, we discovered an interesting approach: during the build process, we >> can set a relative RPATH using the $ORIGIN variable, which corresponds to the >> directory containing the library. This allows libpetsc.so dependencies to be >> referenced relatively instead of absolutely, making the library "movable" >> without requiring LD_LIBRARY_PATH modifications. We can also apply the same >> approach to our binaries. >> >> To avoid manually modifying the libraries after their creation, we were >> wondering if there is a way to configure PETSc to use a relative RPATH with >> $ORIGIN directly? >> >> I looked through PETSc's configuration options and files but couldn't find >> anything mentioning $ORIGIN, and very little related to RPATH. >> >> A SPACK newbie question: is this achievable with SPACK? >> >> Thanks in advance for your help! >> >> Eric >> >> From aduarteg at utexas.edu Mon Feb 24 14:09:49 2025 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Mon, 24 Feb 2025 14:09:49 -0600 Subject: [petsc-users] Interpreting flamegraph and profiling logs Message-ID: Good afternoon PETSC team, I am doing some profiling on one of my applications (petsc 3.15) using the command options: -log_view :performance.out -log_view :flame.out:ascii_flamegraph. I want to understand the specifics of the gnereated flamegraphs and I am attaching performance.out and flame.out to this email. I am assuming that the flamegraphs represent what is labeled "MPI Messages" in the regular output file (i.e. performance.out) instead of time or is it some other quantity? Since the "unit" of these flamegraphs is not time when I load into speedscope app and quantities most closely resemble the order of magnitude of the messages sent. My application being an unsteady solver, I can understand that in time percentage, the amount of time spent in SNESsolve is almost equal to the amount spent in TSStep (~1.63e02 s, see performance.out). However, that percentage is not represented in the flamegraph (97% vs 82% in flame.out). How would I interpret this difference in percentage? Thanks, Alfredo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: performance.out Type: application/octet-stream Size: 24069 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: flame.out Type: application/octet-stream Size: 14255 bytes Desc: not available URL: From knepley at gmail.com Mon Feb 24 14:56:09 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Feb 2025 15:56:09 -0500 Subject: [petsc-users] Interpreting flamegraph and profiling logs In-Reply-To: References: Message-ID: On Mon, Feb 24, 2025 at 3:10?PM Alfredo J Duarte Gomez wrote: > Good afternoon PETSC team, > > I am doing some profiling on one of my applications (petsc 3.15) using the > command options: > -log_view :performance.out > -log_view :flame.out:ascii_flamegraph. 
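As a side note on the output itself: flame.out written by the ascii_flamegraph format is folded-stack text, one line per call path such as "TSStep;SNESSolve;KSPSolve 123456", and the trailing number is a time-based weight rather than an MPI message count. A minimal sketch of listing the heaviest paths offline, assuming that layout and the file name used above:

    # rank the call paths recorded in flame.out by their weight
    rows = []
    with open("flame.out") as fh:
        for line in fh:
            path, _, weight = line.strip().rpartition(" ")
            if path:
                rows.append((float(weight), path))
    for weight, path in sorted(rows, reverse=True)[:10]:
        print(f"{weight:12.0f}  {path}")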
> > I want to understand the specifics of the gnereated flamegraphs and I am > attaching performance.out and flame.out to this email. > > I am assuming that the flamegraphs represent what is labeled "MPI > Messages" in the regular output file (i.e. performance.out) instead of time > or is it some other quantity? Since the "unit" of these flamegraphs is not > time when I load into speedscope app and quantities most closely resemble > the order of magnitude of the messages sent. > No. The flame graphs represent time. I forget what the normalization, but I think you will see that the ratios match the ratios in the performance.out file. > My application being an unsteady solver, I can understand that in time > percentage, the amount of time spent in SNESsolve is almost equal to the > amount spent in TSStep (~1.63e02 s, see performance.out). However, that > percentage is not represented in the flamegraph (97% vs 82% in flame.out). > How would I interpret this difference in percentage? > SNESSolve is nested in TSSolve, as you see in the flamegraph. Do you have multiple SNESolves? (I cannot look at the flamegraph right now) Thanks, Matt > Thanks, > > Alfredo > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!c_-aeuJAFXYfHQTColMAQdUWfWjw1ddAzBpk8EeqF_v9bfTtRQ8PKjSF8mYiYDWi3GkkWQa9991vePUQPPTm$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Feb 24 16:54:39 2025 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 24 Feb 2025 17:54:39 -0500 Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN In-Reply-To: <87y0xvgt9v.fsf@jedbrown.org> References: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> <87y0xvgt9v.fsf@jedbrown.org> Message-ID: <0e3764f5-872c-4f4e-b3e7-b922302570ca@giref.ulaval.ca> Hi Jed, Satish, Thank you both for your insights. I think Jed is right -- it seems this would require changes in PETSc's build tools and scripts. On our side, the CMake modification we made was along these lines: SET(CMAKE_SKIP_BUILD_RPATH? FALSE) SET(CMAKE_BUILD_WITH_INSTALL_RPATH TRUE) SET(CMAKE_INSTALL_RPATH "$ORIGIN/../lib/${LIB_BUILD_TYPE}") However, it's unclear to me how this approach would work for all of PETSc?s external packages. Can they also be relocated using $ORIGIN, or would they need a different solution? I suspect many libraries might not support this mechanism. In any case, you?ve answered my question -- thank you again for your help! Best, Eric On 2025-02-24 12:13, Jed Brown wrote: > I think fixing up the build tools would be more reliable. We can't change paths for libraries that are not installed in the same directory as libpetsc.so. What if we just substituted (perhaps only in the Makefile) $ORIGIN for the $PETSC_DIR/$PETSC_ARCH/lib prefix? > > Satish Balay writes: > >> I see you are referring to https://urldefense.us/v3/__https://www.baeldung.com/linux/rpath-change-in-binary__;!!G_uCfscf7eWS!f7FgqvyBSVkObSDsRA2uRpI9SqxQm_45esiwkTgdj0eKkrGQn5C-j-bU-DR-T7stFHid-jUbqwu9_gXjsiU_4KPp7pg$ >> >> I suspect its easier to do this by "manually modifying the libraries after their creation" than fix up build tools to support it. 
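A minimal sketch of that post-creation route, assuming patchelf is available; the install prefix and binary name are illustrative only, and since patchelf rewrites the ELF in place it is reasonable to try it on a copy first:

    # record a relative RPATH in the installed library and in an executable
    patchelf --set-rpath '$ORIGIN' $PREFIX/lib/libpetsc.so
    patchelf --set-rpath '$ORIGIN/../lib' $PREFIX/bin/myapp
    # confirm what is now recorded
    patchelf --print-rpath $PREFIX/lib/libpetsc.so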
>> >> configure accepts 'LIBS' option that can potentially be used - but I suspect the '$ORIGIN' might not survive different layers >> of shell escapes that might occur. >> >> ./configure LIBS=-Wl,-rpath,'$ORIGIN'/foo1 >> >> Satish >> >> On Mon, 24 Feb 2025, Eric Chamberland via petsc-users wrote: >> >>> Hello, >>> >>> We would like to make the libraries generated from PETSc compilation and >>> installation more easily relocatable. Currently, we work around this >>> limitation by using LD_LIBRARY_PATH in the environment and manually modifying >>> the RPATH recorded in the libraries to remove it. >>> >>> Recently, we discovered an interesting approach: during the build process, we >>> can set a relative RPATH using the $ORIGIN variable, which corresponds to the >>> directory containing the library. This allows libpetsc.so dependencies to be >>> referenced relatively instead of absolutely, making the library "movable" >>> without requiring LD_LIBRARY_PATH modifications. We can also apply the same >>> approach to our binaries. >>> >>> To avoid manually modifying the libraries after their creation, we were >>> wondering if there is a way to configure PETSc to use a relative RPATH with >>> $ORIGIN directly? >>> >>> I looked through PETSc's configuration options and files but couldn't find >>> anything mentioning $ORIGIN, and very little related to RPATH. >>> >>> A SPACK newbie question: is this achievable with SPACK? >>> >>> Thanks in advance for your help! >>> >>> Eric >>> >>> -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? Laval (418) 656-2131 poste 41 22 42 From eirik.hoydalsvik at sintef.no Tue Feb 25 02:19:10 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Tue, 25 Feb 2025 08:19:10 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: Thanks again for the quick response, I tried prining the jacobians with -ksp_view_mat as you suggested, with a system of only 3 cells (I am studying at a 1d problem). Printing the jacobian in the first timestep I got the two matrices attached at the end of this email. The jacobians are in general agreement, with some small diviations, like the final element of the matrix being 1.6e-5 in the sparse case and 3.7 In the full case. Questions: 1. Are differences on the order of 1e-5 expected when computing the jacobians in different ways? 2. Do you think these differences can be the cause of my problems? Any suggestions for furtner debugging strategies? Eirik ! sparse jacobian row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, -13.1186) (5, 1.3237e-08) row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, -1846.04) (5, 7.08762e-08) row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, -7023.46) (5, 6.48896e-06) row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, 13.1186) (8, 3.32334e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, 18178.2) (8, 1.61503e-05) ! 
full jacobian 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 From: Matthew Knepley Date: Monday, February 24, 2025 at 15:00 To: Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik > wrote: 1. Thank you for the quick answer, I think this sounds reasonable? Is there any way to compare the brute-force jacobian to the one computed using the coloring information? The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt 1. From: Matthew Knepley > Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. 
When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 
optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in 
enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YHklSzZKWeUyyqPeA2cK5USsTLG_tjjMs0innpBaQeO9qFz_YqnRf-SHAGLf2GZ-Ku35MtOoNkWCBhU44E22_6AQznSlT6sU48M$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YHklSzZKWeUyyqPeA2cK5USsTLG_tjjMs0innpBaQeO9qFz_YqnRf-SHAGLf2GZ-Ku35MtOoNkWCBhU44E22_6AQznSlT6sU48M$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 25 08:26:49 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Feb 2025 09:26:49 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > Thanks again for the quick response, > > I tried prining the jacobians with -ksp_view_mat as you suggested, with a > system of only 3 cells (I am studying at a 1d problem). Printing the > jacobian in the first timestep I got the two matrices attached at the end > of this email. 
The jacobians are in general agreement, with some small > diviations, like the final element of the matrix being 1.6e-5 in the sparse > case and 3.7 In the full case. > We usually expect to see single precision accuracy (1e-7), so this indicates that your condition number is high. If you use LU (-pc_type lu) to solve the linear system, do you get similar results? Thanks, Matt > Questions: > > 1. Are differences on the order of 1e-5 expected when computing the > jacobians in different ways? > > 2. Do you think these differences can be the cause of my problems? Any > suggestions for furtner debugging strategies? > > Eirik > > ! sparse jacobian > > row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, > -13.1186) (5, 1.3237e-08) > > row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, > -1846.04) (5, 7.08762e-08) > > row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, > -7023.46) (5, 6.48896e-06) > > row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, > -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, > -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) > > row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, > -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) > > row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, > 13.1186) (8, 3.32334e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, > 18178.2) (8, 1.61503e-05) > > > > ! full jacobian > > 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 > -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 > 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 > > -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 > -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 > 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 > > -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 > 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 > -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > > -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 > 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 > -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 > > -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 > -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 > 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 > -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 > 
5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 > > > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 15:00 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? > > > > The easiest way we have is to print them both out: > > > > -ksp_view_mat > > > > on both runs. We have a way to compare the analytic and FD Jacobians > (-snes_test_jacobian), but > > not two different FDs. > > > > Thanks, > > > > Matt > > > > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. 
> > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = 
residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] 
> > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
> > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dF3HkSM-0mmYDI4gDB-9LOH6k5M9HjOvEIw1XdzlTas2-tKkCBjmhfjgtouzKAhXWqb--ieWjn0Q8bbsceAb$ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dF3HkSM-0mmYDI4gDB-9LOH6k5M9HjOvEIw1XdzlTas2-tKkCBjmhfjgtouzKAhXWqb--ieWjn0Q8bbsceAb$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dF3HkSM-0mmYDI4gDB-9LOH6k5M9HjOvEIw1XdzlTas2-tKkCBjmhfjgtouzKAhXWqb--ieWjn0Q8bbsceAb$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eirik.hoydalsvik at sintef.no Tue Feb 25 08:51:08 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Tue, 25 Feb 2025 14:51:08 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: Hi, After sending you the email, I rescaled the residual function and got the two jacobians to agree down to e-7. I have tried with ?lu? and ?ilu? as preconditioners, and this does not work. However, I just tried to use ?sor? as a preconditioner, and using sor using the da jacobian works just fine! Why should it work with sor and not with ilu or lu? Eirik Jacobians: row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, -6402.63) (5, 6.19796e-08) row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) (5, 3.63946e-07) row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, -6985.51) (5, 6.19796e-08) row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, 6402.63) (8, -5.80354e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, 18140.2) (8, -5.88806e-08) 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 From: Matthew Knepley Date: Tuesday, February 25, 2025 at 15:27 To: 
Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik > wrote: Thanks again for the quick response, I tried prining the jacobians with -ksp_view_mat as you suggested, with a system of only 3 cells (I am studying at a 1d problem). Printing the jacobian in the first timestep I got the two matrices attached at the end of this email. The jacobians are in general agreement, with some small diviations, like the final element of the matrix being 1.6e-5 in the sparse case and 3.7 In the full case. We usually expect to see single precision accuracy (1e-7), so this indicates that your condition number is high. If you use LU (-pc_type lu) to solve the linear system, do you get similar results? Thanks, Matt Questions: 1. Are differences on the order of 1e-5 expected when computing the jacobians in different ways? 2. Do you think these differences can be the cause of my problems? Any suggestions for furtner debugging strategies? Eirik ! sparse jacobian row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, -13.1186) (5, 1.3237e-08) row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, -1846.04) (5, 7.08762e-08) row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, -7023.46) (5, 6.48896e-06) row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, 13.1186) (8, 3.32334e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, 18178.2) (8, 1.61503e-05) ! 
full jacobian 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 From: Matthew Knepley > Date: Monday, February 24, 2025 at 15:00 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik > wrote: 1. Thank you for the quick answer, I think this sounds reasonable? Is there any way to compare the brute-force jacobian to the one computed using the coloring information? The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt 1. From: Matthew Knepley > Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. 
When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 
optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in 
enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!evJjDgXHf6wRJbFLstjldUYC62cqJenkgdUFFAX5UBv8sQkHtWaP4fKcVADFipZFOIuAjgirRECHzDaLL0lsqa7hkri9PlmBCHA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!evJjDgXHf6wRJbFLstjldUYC62cqJenkgdUFFAX5UBv8sQkHtWaP4fKcVADFipZFOIuAjgirRECHzDaLL0lsqa7hkri9PlmBCHA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!evJjDgXHf6wRJbFLstjldUYC62cqJenkgdUFFAX5UBv8sQkHtWaP4fKcVADFipZFOIuAjgirRECHzDaLL0lsqa7hkri9PlmBCHA$ -------------- next part -------------- An HTML attachment was scrubbed... 
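For reference, the monitor and convergence-reason flags discussed in this thread can be switched on through the same PETSc.Options() database the script already uses, with no code changes. A minimal sketch (option names assume a recent PETSc release; note the true-residual monitor is spelled -ksp_monitor_true_residual):

from petsc4py import PETSc

opts = PETSc.Options()
opts["snes_monitor"] = ""               # nonlinear residual norm at every Newton step
opts["snes_converged_reason"] = ""      # why SNES stopped
opts["snes_linesearch_monitor"] = ""    # line-search progress
opts["ksp_monitor_true_residual"] = ""  # preconditioned and unpreconditioned residual norms
opts["ksp_converged_reason"] = ""       # why KSP stopped
# the flags take effect when the solver objects call setFromOptions() before ts.solve()
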
URL: From knepley at gmail.com Tue Feb 25 13:48:02 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Feb 2025 14:48:02 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Tue, Feb 25, 2025 at 9:51?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > Hi, > > After sending you the email, I rescaled the residual function and got the > two jacobians to agree down to e-7. > > I have tried with ?lu? and ?ilu? as preconditioners, and this does not > work. However, I just tried to use ?sor? as a preconditioner, and using sor > using the da jacobian works just fine! > > Why should it work with sor and not with ilu or lu? > ILU fails all the time, so that is not surprising. However, I do not understand why SOR would succeed and LU would fail, except that SOR is functioning as a kind of globalization by solving very inexactly. Can you run with -snes_monitor -snes_converged_reason -ksp_monitor_true_solution -ksp_converged_reason -snes_linesearch_monitor and send the output? Thanks, Matt > Eirik > > Jacobians: > row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, > -6402.63) (5, 6.19796e-08) > > row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) > (5, 3.63946e-07) > > row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, > -6985.51) (5, 6.19796e-08) > > row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, > -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, > -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) > > row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, > -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) > > row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, > 6402.63) (8, -5.80354e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) 
> > row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, > 18140.2) (8, -5.88806e-08) > > 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 > -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 > -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 > > -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 > 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 > 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 > > 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 > -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 > -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 > > -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 > 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 > -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 > 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 > > -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 > 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 > -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 > > -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 > -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 > 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 > -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 > 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 > > > > > *From: *Matthew Knepley > *Date: *Tuesday, February 25, 2025 at 15:27 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > Thanks again for the quick response, > > I tried prining the jacobians with -ksp_view_mat as you suggested, with a > system of only 3 cells (I am studying at a 1d problem). Printing the > jacobian in the first timestep I got the two matrices attached at the end > of this email. The jacobians are in general agreement, with some small > diviations, like the final element of the matrix being 1.6e-5 in the sparse > case and 3.7 In the full case. > > > > We usually expect to see single precision accuracy (1e-7), so this > indicates that your condition number is high. > > > > If you use LU (-pc_type lu) to solve the linear system, do you get similar > results? > > > > Thanks, > > > > Matt > > > > Questions: > > 1. Are differences on the order of 1e-5 expected when computing the > jacobians in different ways? > > 2. Do you think these differences can be the cause of my problems? Any > suggestions for furtner debugging strategies? > > Eirik > > ! 
sparse jacobian > > row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, > -13.1186) (5, 1.3237e-08) > > row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, > -1846.04) (5, 7.08762e-08) > > row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, > -7023.46) (5, 6.48896e-06) > > row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, > -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, > -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) > > row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, > -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) > > row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, > 13.1186) (8, 3.32334e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, > 18178.2) (8, 1.61503e-05) > > > > ! full jacobian > > 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 > -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 > 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 > > -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 > -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 > 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 > > -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 > 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 > -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > > -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 > 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 > -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 > > -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 > -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 > 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 > -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 > 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 > > > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 15:00 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? 
> > > > The easiest way we have is to print them both out: > > > > -ksp_view_mat > > > > on both runs. We have a way to compare the analytic and FD Jacobians > (-snes_test_jacobian), but > > not two different FDs. > > > > Thanks, > > > > Matt > > > > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. 
> > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = 
residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] 
> > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
> > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Tue Feb 25 15:39:03 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Tue, 25 Feb 2025 21:39:03 +0000 Subject: [petsc-users] building kokkos matrices on the device Message-ID: Hi I'm just wondering if there is any possibility of making: MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx publicly accessible outside of petsc, or if there is an interface I have missed for creating Kokkos matrices entirely on the device? MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so I can't link to it. I've currently just copied the code inside of those methods so that I can build without any preallocation on the host (e.g., through the COO interface) and it works really well. Thanks for your help Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Feb 25 16:16:37 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 25 Feb 2025 16:16:37 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Hi, Steven, MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a private data type Mat_SeqAIJKokkos, so it can not be directly made public. If you already use COO, then why not directly make the matrix of type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? So I am confused by your needs. Thanks! --Junchao Zhang On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Hi > > I'm just wondering if there is any possibility of making: > MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in > src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx > MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in > src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx > > publicly accessible outside of petsc, or if there is an interface I have > missed for creating Kokkos matrices entirely on the device? > MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so > I can't link to it. > > I've currently just copied the code inside of those methods so that I can > build without any preallocation on the host (e.g., through the COO > interface) and it works really well. > > Thanks for your help > Steven > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Tue Feb 25 17:35:18 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Tue, 25 Feb 2025 23:35:18 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Thanks for the response! 
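To make the stencil point from this thread concrete: a coloring-based finite-difference Jacobian is only correct when f[i] reads x[j] exclusively for j inside the stencil declared on the DMDA. Below is a minimal, self-contained petsc4py sketch (a hypothetical 1-D diffusion toy problem, not the tank model above) whose IFunction only touches i-1, i, i+1, so ts.setDM() plus the colored FD Jacobian is safe:

from petsc4py import PETSc
import numpy as np

n = 32
da = PETSc.DMDA().create(dim=(n,), dof=1, stencil_width=1)

def ifunction(ts, t, X, Xdot, F):
    da = ts.getDM()
    xl = da.createLocalVec()
    da.globalToLocal(X, xl)          # fill ghost points from neighboring ranks
    x = da.getVecArray(xl)           # global indexing, ghosts included
    xdot = da.getVecArray(Xdot)
    f = da.getVecArray(F)
    (xs, xe), = da.getRanges()
    mx = da.getSizes()[0]
    h = 1.0 / (mx - 1)
    for i in range(xs, xe):
        if i == 0 or i == mx - 1:
            f[i] = x[i]              # Dirichlet rows: u = 0 at both ends
        else:
            # only i-1, i, i+1 are read, matching stencil_width=1,
            # so the coloring sees the true sparsity of the Jacobian
            f[i] = xdot[i] - (x[i - 1] - 2.0 * x[i] + x[i + 1]) / h**2

ts = PETSc.TS().create()
ts.setDM(da)                         # no IJacobian is given, so TS/SNES differences it
ts.setIFunction(ifunction, None)     # using the DMDA coloring
ts.setType("beuler")
ts.setTimeStep(1e-3)
ts.setMaxTime(1e-2)
ts.setFromOptions()

u = da.createGlobalVec()
ua = da.getVecArray(u)
(xs, xe), = da.getRanges()
for i in range(xs, xe):
    ua[i] = np.sin(np.pi * i / (n - 1))   # smooth initial bump
ts.solve(u)

Running the same sketch without ts.setDM(da) falls back to brute-force differencing of the full Jacobian, the slow path mentioned earlier in the thread; if the residual reached beyond the declared stencil, only the brute-force version would be correct.
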
Although MatSetValuesCOO happens on the device if the input coo_v pointer is device memory, I believe MatSetPreallocationCOO requires host pointers for coo_i and coo_j, and the preallocation (and construction of the COO structures) happens on the host and is then copied onto the device? I need to be able to create a matrix object with minimal work on the host (like many of the routines in aijkok.kokkos.cxx do internally). I originally used the COO interface to build the matrices I need, but that was around 5x slower than constructing the aij structures myself on the device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made public is that the Mat_SeqAIJKokkos constructors are already publicly accessible? In particular one of those constructors takes in pointers to the Kokkos dual views which store a,i,j, and hence one can build a sequential matrix with nothing (or very little) occuring on the host. The only change I can see that would be necessary is for MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to be public is to change the PETSC_INTERN to PETSC_EXTERN? For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is required is declaring the method in the .hpp, as it's already defined as static in mpiaijkok.kokkos.cxx. In particular, the comments above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the off-diagonal block B needs to be built with global column ids, with mpiaij->garray constructed on the host along with the rewriting of the global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but checking the code there shows that if you pass in a non-null garray to MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and compatification is skipped, meaning B can be built with local column ids as long as garray is provided on the host (which I also build on the device and then just copy to the host). Again this is what some of the internal Kokkos routines rely on, like the matrix-product. I am happy to try doing this and submitting a request to the petsc gitlab if this seems sensible, I just wanted to double check that I wasn't missing something important? Thanks Steven On Tue, 25 Feb 2025 at 22:16, Junchao Zhang wrote: > Hi, Steven, > MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a > private data type Mat_SeqAIJKokkos, so it can not be directly made public. > If you already use COO, then why not directly make the matrix of type > MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? > So I am confused by your needs. > > Thanks! > --Junchao Zhang > > > On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < > dargaville.steven at gmail.com> wrote: > >> Hi >> >> I'm just wondering if there is any possibility of making: >> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in >> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >> >> publicly accessible outside of petsc, or if there is an interface I have >> missed for creating Kokkos matrices entirely on the device? >> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >> I can't link to it. >> >> I've currently just copied the code inside of those methods so that I can >> build without any preallocation on the host (e.g., through the COO >> interface) and it works really well. 
>> >> Thanks for your help >> Steven >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Feb 25 21:35:07 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 25 Feb 2025 21:35:07 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Mat_SeqAIJKokkos is private because it is in a private header src/mat/impls/aij/seq/kokkos/aijkok.hpp Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() might be right. The comment - B - the offdiag matrix using global col ids is out of date. Perhaps it should be "the offdiag matrix uses local column indices and garray contains the local to global mapping". But I need to double check it. Since you use Kokkos, I think we could provide these two constructors for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A) - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, PetscInt *garray, Mat *mat) // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); // A and B are MATSEQAIJKOKKOS matrices and use local column indices Do they meet your needs? --Junchao Zhang On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Thanks for the response! > > Although MatSetValuesCOO happens on the device if the input coo_v pointer > is device memory, I believe MatSetPreallocationCOO requires host pointers > for coo_i and coo_j, and the preallocation (and construction of the COO > structures) happens on the host and is then copied onto the device? I need > to be able to create a matrix object with minimal work on the host (like > many of the routines in aijkok.kokkos.cxx do internally). I originally used > the COO interface to build the matrices I need, but that was around 5x > slower than constructing the aij structures myself on the device and then > just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. > > The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made > public is that the Mat_SeqAIJKokkos constructors are already publicly > accessible? In particular one of those constructors takes in pointers to > the Kokkos dual views which store a,i,j, and hence one can build a > sequential matrix with nothing (or very little) occuring on the host. The > only change I can see that would be necessary is for > MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to > be public is to change the PETSC_INTERN to PETSC_EXTERN? > > For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is > required is declaring the method in the .hpp, as it's already defined as > static in mpiaijkok.kokkos.cxx. In particular, the comments > above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the > off-diagonal block B needs to be built with global column ids, with > mpiaij->garray constructed on the host along with the rewriting of the > global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but > checking the code there shows that if you pass in a non-null garray to > MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and > compatification is skipped, meaning B can be built with local column ids as > long as garray is provided on the host (which I also build on the device > and then just copy to the host). 
Again this is what some of the internal > Kokkos routines rely on, like the matrix-product. > > I am happy to try doing this and submitting a request to the petsc gitlab > if this seems sensible, I just wanted to double check that I wasn't missing > something important? > Thanks > Steven > > On Tue, 25 Feb 2025 at 22:16, Junchao Zhang > wrote: > >> Hi, Steven, >> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a >> private data type Mat_SeqAIJKokkos, so it can not be directly made public. >> If you already use COO, then why not directly make the matrix of type >> MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >> So I am confused by your needs. >> >> Thanks! >> --Junchao Zhang >> >> >> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >>> Hi >>> >>> I'm just wondering if there is any possibility of making: >>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in >>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>> >>> publicly accessible outside of petsc, or if there is an interface I have >>> missed for creating Kokkos matrices entirely on the device? >>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>> I can't link to it. >>> >>> I've currently just copied the code inside of those methods so that I >>> can build without any preallocation on the host (e.g., through the COO >>> interface) and it works really well. >>> >>> Thanks for your help >>> Steven >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From eirik.hoydalsvik at sintef.no Wed Feb 26 02:21:38 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Wed, 26 Feb 2025 08:21:38 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: Hi, Here is the output when I run with ?lu? and the settings you suggested: 0 TS dt 0.1 time 0. 
0 SNES Function norm 2.982668991189e-01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.000e-01 retrying with dt=2.500e-02 0 SNES Function norm 7.456672477972e-02 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.500e-02 retrying with dt=6.250e-03 0 SNES Function norm 1.864168119493e-02 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 6.250e-03 retrying with dt=1.563e-03 0 SNES Function norm 4.660420298733e-03 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.563e-03 retrying with dt=3.906e-04 0 SNES Function norm 1.165105074683e-03 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.906e-04 retrying with dt=9.766e-05 0 SNES Function norm 2.912762686708e-04 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 9.766e-05 retrying with dt=2.441e-05 0 SNES Function norm 7.281906716770e-05 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.441e-05 retrying with dt=6.104e-06 0 SNES Function norm 1.820476679192e-05 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 6.104e-06 retrying with dt=1.526e-06 0 SNES Function norm 4.551191697981e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 0 TSAdapt none beuler 0: step 0 accepted t=0 + 1.526e-06 dt=1.526e-06 1 TS dt 1.52588e-06 time 1.52588e-06 From: Matthew Knepley Date: Tuesday, February 25, 2025 at 20:48 To: Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Tue, Feb 25, 2025 at 9:51?AM Eirik Jaccheri H?ydalsvik > wrote: Hi, After sending you the email, I rescaled the residual function and got the two jacobians to agree down to e-7. I have tried with ?lu? and ?ilu? as preconditioners, and this does not work. However, I just tried to use ?sor? as a preconditioner, and using sor using the da jacobian works just fine! Why should it work with sor and not with ilu or lu? 
ILU fails all the time, so that is not surprising. However, I do not understand why SOR would succeed and LU would fail, except that SOR is functioning as a kind of globalization by solving very inexactly. Can you run with -snes_monitor -snes_converged_reason -ksp_monitor_true_solution -ksp_converged_reason -snes_linesearch_monitor and send the output? Thanks, Matt Eirik Jacobians: row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, -6402.63) (5, 6.19796e-08) row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) (5, 3.63946e-07) row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, -6985.51) (5, 6.19796e-08) row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, 6402.63) (8, -5.80354e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, 18140.2) (8, -5.88806e-08) 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 From: Matthew Knepley > Date: Tuesday, February 25, 2025 at 15:27 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik > wrote: Thanks again for the quick response, I tried prining the jacobians with 
-ksp_view_mat as you suggested, with a system of only 3 cells (I am studying at a 1d problem). Printing the jacobian in the first timestep I got the two matrices attached at the end of this email. The jacobians are in general agreement, with some small diviations, like the final element of the matrix being 1.6e-5 in the sparse case and 3.7 In the full case. We usually expect to see single precision accuracy (1e-7), so this indicates that your condition number is high. If you use LU (-pc_type lu) to solve the linear system, do you get similar results? Thanks, Matt Questions: 1. Are differences on the order of 1e-5 expected when computing the jacobians in different ways? 2. Do you think these differences can be the cause of my problems? Any suggestions for furtner debugging strategies? Eirik ! sparse jacobian row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, -13.1186) (5, 1.3237e-08) row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, -1846.04) (5, 7.08762e-08) row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, -7023.46) (5, 6.48896e-06) row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, 13.1186) (8, 3.32334e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, 18178.2) (8, 1.61503e-05) ! full jacobian 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 -4.9773414819747863e+01 5.1373776293551586e+04 
-1.2673397275364130e+02 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 From: Matthew Knepley > Date: Monday, February 24, 2025 at 15:00 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik > wrote: 1. Thank you for the quick answer, I think this sounds reasonable? Is there any way to compare the brute-force jacobian to the one computed using the coloring information? The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt 1. From: Matthew Knepley > Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. 
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # 
ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], 
self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dargaville.steven at gmail.com Wed Feb 26 06:26:01 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 26 Feb 2025 12:26:01 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Those two constructors would definitely meet my needs, thanks! Also I should note that the comment about garray and B in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is passed in as NULL, it's just that if you pass in a completed garray it doesn't bother creating one or changing the column indices of B. So I would suggest the comment be: "if garray is NULL the offdiag matrix B should have global col ids; if garray is not NULL the offdiag matrix B should have local col ids" On Wed, 26 Feb 2025 at 03:35, Junchao Zhang wrote: > Mat_SeqAIJKokkos is private because it is in a private header > src/mat/impls/aij/seq/kokkos/aijkok.hpp > > Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() > might be right. The comment > > - B - the offdiag matrix using global col ids > > is out of date. Perhaps it should be "the offdiag matrix uses local column > indices and garray contains the local to global mapping". But I need to > double check it. > > Since you use Kokkos, I think we could provide these two constructors for > MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively > > - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, > KokkosCsrMatrix csr, Mat *A) > > > - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, > PetscInt *garray, Mat *mat) > > // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, > Mat A, Mat B, const PetscInt garray[], Mat *mat); > // A and B are MATSEQAIJKOKKOS matrices and use local column > indices > > Do they meet your needs? > > --Junchao Zhang > > > On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < > dargaville.steven at gmail.com> wrote: > >> Thanks for the response! >> >> Although MatSetValuesCOO happens on the device if the input coo_v pointer >> is device memory, I believe MatSetPreallocationCOO requires host pointers >> for coo_i and coo_j, and the preallocation (and construction of the COO >> structures) happens on the host and is then copied onto the device? I need >> to be able to create a matrix object with minimal work on the host (like >> many of the routines in aijkok.kokkos.cxx do internally). I originally used >> the COO interface to build the matrices I need, but that was around 5x >> slower than constructing the aij structures myself on the device and then >> just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. >> >> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made >> public is that the Mat_SeqAIJKokkos constructors are already publicly >> accessible? In particular one of those constructors takes in pointers to >> the Kokkos dual views which store a,i,j, and hence one can build a >> sequential matrix with nothing (or very little) occuring on the host. The >> only change I can see that would be necessary is for >> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >> be public is to change the PETSC_INTERN to PETSC_EXTERN? >> >> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that >> is required is declaring the method in the .hpp, as it's already defined as >> static in mpiaijkok.kokkos.cxx. 
In particular, the comments >> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >> off-diagonal block B needs to be built with global column ids, with >> mpiaij->garray constructed on the host along with the rewriting of the >> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >> checking the code there shows that if you pass in a non-null garray to >> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >> compatification is skipped, meaning B can be built with local column ids as >> long as garray is provided on the host (which I also build on the device >> and then just copy to the host). Again this is what some of the internal >> Kokkos routines rely on, like the matrix-product. >> >> I am happy to try doing this and submitting a request to the petsc gitlab >> if this seems sensible, I just wanted to double check that I wasn't missing >> something important? >> Thanks >> Steven >> >> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >> wrote: >> >>> Hi, Steven, >>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a >>> private data type Mat_SeqAIJKokkos, so it can not be directly made public. >>> If you already use COO, then why not directly make the matrix of type >>> MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>> So I am confused by your needs. >>> >>> Thanks! >>> --Junchao Zhang >>> >>> >>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>>> Hi >>>> >>>> I'm just wondering if there is any possibility of making: >>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix >>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>> >>>> publicly accessible outside of petsc, or if there is an interface I >>>> have missed for creating Kokkos matrices entirely on the device? >>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>> I can't link to it. >>>> >>>> I've currently just copied the code inside of those methods so that I >>>> can build without any preallocation on the host (e.g., through the COO >>>> interface) and it works really well. >>>> >>>> Thanks for your help >>>> Steven >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 26 08:52:05 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 26 Feb 2025 09:52:05 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Wed, Feb 26, 2025 at 3:21?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > Hi, > > Here is the output when I run with ?lu? and the settings you suggested: > > 0 TS dt 0.1 time 0. > > 0 SNES Function norm 2.982668991189e-01 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > Okay, your Jacobian is rank deficient. Did you know this? This usually indicates an error either in the implementation or formulation. Is it supposed to be rank deficient? 
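If it is not supposed to be rank deficient, one quick way to see how singular it is on a problem this small is to swap LU for the dense SVD "preconditioner", which factors the Jacobian and reports its singular values (standard PETSc options, shown here just as a debugging sketch):

  -ksp_type preonly -pc_type svd -pc_svd_monitor

Singular values at (or near) zero give the dimension of the null space, which usually points at a missing or duplicated closure/boundary equation in the residual.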
Thanks, Matt > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.000e-01 retrying with dt=2.500e-02 > > 0 SNES Function norm 7.456672477972e-02 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.500e-02 retrying with dt=6.250e-03 > > 0 SNES Function norm 1.864168119493e-02 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 6.250e-03 retrying with dt=1.563e-03 > > 0 SNES Function norm 4.660420298733e-03 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.563e-03 retrying with dt=3.906e-04 > > 0 SNES Function norm 1.165105074683e-03 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.906e-04 retrying with dt=9.766e-05 > > 0 SNES Function norm 2.912762686708e-04 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 9.766e-05 retrying with dt=2.441e-05 > > 0 SNES Function norm 7.281906716770e-05 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.441e-05 retrying with dt=6.104e-06 > > 0 SNES Function norm 1.820476679192e-05 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 6.104e-06 retrying with dt=1.526e-06 > > 0 SNES Function norm 4.551191697981e-06 > > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 0 > > TSAdapt none beuler 0: step 0 accepted t=0 + 1.526e-06 > dt=1.526e-06 > > 1 TS dt 1.52588e-06 time 1.52588e-06 > > > > *From: *Matthew Knepley > *Date: *Tuesday, February 25, 2025 at 20:48 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Tue, Feb 25, 2025 at 9:51?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > Hi, > > After sending you the email, I rescaled the residual function and got the > two jacobians to agree down to e-7. > > I have tried with ?lu? and ?ilu? as preconditioners, and this does not > work. 
However, I just tried to use ?sor? as a preconditioner, and using sor > using the da jacobian works just fine! > > Why should it work with sor and not with ilu or lu? > > > > ILU fails all the time, so that is not surprising. However, I do not > understand why SOR would succeed and LU would fail, > > except that SOR is functioning as a kind of globalization by solving very > inexactly. Can you run with > > > > -snes_monitor -snes_converged_reason -ksp_monitor_true_solution > -ksp_converged_reason -snes_linesearch_monitor > > > > and send the output? > > > > Thanks, > > > > Matt > > > > Eirik > > Jacobians: > row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, > -6402.63) (5, 6.19796e-08) > > row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) > (5, 3.63946e-07) > > row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, > -6985.51) (5, 6.19796e-08) > > row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, > -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, > -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) > > row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, > -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) > > row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, > 6402.63) (8, -5.80354e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, > 18140.2) (8, -5.88806e-08) > > 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 > -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 > -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 > > -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 > 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 > 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 > > 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 > -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 > -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 > > -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 > 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 > -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 > 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 > > -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 > 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 > -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 > > -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 > -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 > 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 > -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 > 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 > > 
> > *From: *Matthew Knepley > *Date: *Tuesday, February 25, 2025 at 15:27 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > Thanks again for the quick response, > > I tried prining the jacobians with -ksp_view_mat as you suggested, with a > system of only 3 cells (I am studying at a 1d problem). Printing the > jacobian in the first timestep I got the two matrices attached at the end > of this email. The jacobians are in general agreement, with some small > diviations, like the final element of the matrix being 1.6e-5 in the sparse > case and 3.7 In the full case. > > > > We usually expect to see single precision accuracy (1e-7), so this > indicates that your condition number is high. > > > > If you use LU (-pc_type lu) to solve the linear system, do you get similar > results? > > > > Thanks, > > > > Matt > > > > Questions: > > 1. Are differences on the order of 1e-5 expected when computing the > jacobians in different ways? > > 2. Do you think these differences can be the cause of my problems? Any > suggestions for furtner debugging strategies? > > Eirik > > ! sparse jacobian > > row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, > -13.1186) (5, 1.3237e-08) > > row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, > -1846.04) (5, 7.08762e-08) > > row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, > -7023.46) (5, 6.48896e-06) > > row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, > -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, > -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) > > row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, > -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) > > row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, > 13.1186) (8, 3.32334e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, > 18178.2) (8, 1.61503e-05) > > > > ! 
full jacobian > > 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 > -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 > 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 > > -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 > -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 > 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 > > -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 > 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 > -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > > -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 > 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 > -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 > > -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 > -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 > 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 > -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 > 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 > > > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 15:00 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? > > > > The easiest way we have is to print them both out: > > > > -ksp_view_mat > > > > on both runs. We have a way to compare the analytic and FD Jacobians > (-snes_test_jacobian), but > > not two different FDs. > > > > Thanks, > > > > Matt > > > > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. 
Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > 
> x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() 
> > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] > > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 
0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? > > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to 
which their
> experiments lead.
> -- Norbert Wiener
>
> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dMjDqMTlQm8dMz4SogywwfFxLuGNm4cejXWCgg3wmaKDTGtTcYGgbOlRFLDU1IkFMEeg4UxIiNnMeGQclIz3$

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dMjDqMTlQm8dMz4SogywwfFxLuGNm4cejXWCgg3wmaKDTGtTcYGgbOlRFLDU1IkFMEeg4UxIiNnMeGQclIz3$
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From junchao.zhang at gmail.com Wed Feb 26 10:11:16 2025
From: junchao.zhang at gmail.com (Junchao Zhang)
Date: Wed, 26 Feb 2025 10:11:16 -0600
Subject: [petsc-users] building kokkos matrices on the device
In-Reply-To:
References:
Message-ID:

This function MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const
PetscInt garray[], Mat *mat) is rarely used. To compute the global
matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is
a waste, as the caller usually knows M, N already. So I think we can
depart from it and have a new one:

MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)

* M, N are the global row/col sizes of mat
* A, B are MATSEQAIJKOKKOS
* M, N could be PETSC_DECIDE; if so, petsc will compute mat's M, N from A, i.e., M = Sum of A's M, N = Sum of A's N
* if garray is NULL, B uses global column indices (and B's N should be equal to the output mat's N)
* if garray is not NULL, B uses local column indices; garray[] was allocated by PetscMalloc() and after the call, garray will be owned by mat (the user should not free garray afterwards)

What do you think? If you agree, could you contribute an MR?

BTW, I think we need to create a new header, petscmat_kokkos.hpp, to declare

PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)

but

PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)

can be in petscmat.h as it uses only C types.

Barry, what do you think of the two new APIs?
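For concreteness, here is a rough sketch of how a caller might use the proposed MPI constructor (illustrative only: neither routine exists in PETSc yet, and the names and semantics simply follow the description above):

  #include <petscmat.h>   /* the proposed MPI constructor could be declared here */

  Mat       Ad, Ao, C;    /* diag/offdiag blocks, both MATSEQAIJKOKKOS, assembled on the device */
  PetscInt *garray;       /* PetscMalloc'd local-to-global column map for Ao, or NULL */

  /* ... build Ad, Ao (and garray) entirely on the device ... */

  /* PETSC_DECIDE lets PETSc deduce the global sizes from the diagonal blocks;
     a non-NULL garray means Ao already uses local column indices and
     ownership of garray passes to C */
  PetscCall(MatCreateMPIAIJKokkosWithSeqAIJKokkos(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, Ad, Ao, garray, &C));

and similarly, for a purely sequential matrix built from an existing device CSR,

  PetscCall(MatCreateSeqAIJKokkosWithKokkosCsrMatrix(PETSC_COMM_SELF, csr, &A));

where csr is a KokkosCsrMatrix that already lives in device memory.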
--Junchao Zhang On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Those two constructors would definitely meet my needs, thanks! > > Also I should note that the comment about garray and B in > MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is > passed in as NULL, it's just that if you pass in a completed garray it > doesn't bother creating one or changing the column indices of B. So I would > suggest the comment be: "if garray is NULL the offdiag matrix B should > have global col ids; if garray is not NULL the offdiag matrix B should have > local col ids" > > On Wed, 26 Feb 2025 at 03:35, Junchao Zhang > wrote: > >> Mat_SeqAIJKokkos is private because it is in a private header >> src/mat/impls/aij/seq/kokkos/aijkok.hpp >> >> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >> might be right. The comment >> >> - B - the offdiag matrix using global col ids >> >> is out of date. Perhaps it should be "the offdiag matrix uses local >> column indices and garray contains the local to global mapping". But I >> need to double check it. >> >> Since you use Kokkos, I think we could provide these two constructors for >> MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >> >> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >> KokkosCsrMatrix csr, Mat *A) >> >> >> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, >> PetscInt *garray, Mat *mat) >> >> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >> // A and B are MATSEQAIJKOKKOS matrices and use local column >> indices >> >> Do they meet your needs? >> >> --Junchao Zhang >> >> >> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >>> Thanks for the response! >>> >>> Although MatSetValuesCOO happens on the device if the input coo_v >>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>> pointers for coo_i and coo_j, and the preallocation (and construction of >>> the COO structures) happens on the host and is then copied onto the device? >>> I need to be able to create a matrix object with minimal work on the host >>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>> originally used the COO interface to build the matrices I need, but that >>> was around 5x slower than constructing the aij structures myself on the >>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>> type methods. >>> >>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>> accessible? In particular one of those constructors takes in pointers to >>> the Kokkos dual views which store a,i,j, and hence one can build a >>> sequential matrix with nothing (or very little) occuring on the host. The >>> only change I can see that would be necessary is for >>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>> >>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that >>> is required is declaring the method in the .hpp, as it's already defined as >>> static in mpiaijkok.kokkos.cxx. 
In particular, the comments >>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>> off-diagonal block B needs to be built with global column ids, with >>> mpiaij->garray constructed on the host along with the rewriting of the >>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >>> checking the code there shows that if you pass in a non-null garray to >>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>> compatification is skipped, meaning B can be built with local column ids as >>> long as garray is provided on the host (which I also build on the device >>> and then just copy to the host). Again this is what some of the internal >>> Kokkos routines rely on, like the matrix-product. >>> >>> I am happy to try doing this and submitting a request to the petsc >>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>> missing something important? >>> Thanks >>> Steven >>> >>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>> wrote: >>> >>>> Hi, Steven, >>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>> public. >>>> If you already use COO, then why not directly make the matrix of type >>>> MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>> So I am confused by your needs. >>>> >>>> Thanks! >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>> dargaville.steven at gmail.com> wrote: >>>> >>>>> Hi >>>>> >>>>> I'm just wondering if there is any possibility of making: >>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix >>>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>> >>>>> publicly accessible outside of petsc, or if there is an interface I >>>>> have missed for creating Kokkos matrices entirely on the device? >>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>> I can't link to it. >>>>> >>>>> I've currently just copied the code inside of those methods so that I >>>>> can build without any preallocation on the host (e.g., through the COO >>>>> interface) and it works really well. >>>>> >>>>> Thanks for your help >>>>> Steven >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Wed Feb 26 10:54:32 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 26 Feb 2025 16:54:32 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: I think that sounds great, I'm happy to put together an MR (likely next week) for review. On Wed, 26 Feb 2025 at 16:11, Junchao Zhang wrote: > This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, > const PetscInt garray[], Mat *mat) *is rarely used. To compute the global > matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is > a waste, as the caller usually knows M, N already. 
So I think we can depart > from it and have a new one: > > MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt > N, Mat A, Mat B, const PetscInt garray[], Mat *mat) > * M, N are global row/col size of mat > * A, B are MATSEQAIJKOKKOS > * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from A, > i.e., M = Sum of A's M, N= Sum of A's N > * if garray is NULL, B uses global column indices (and B's N should be > equal to the output mat's N) > * if garray is not NULL, B uses local column indices; garray[] was > allocated by PetscMalloc() and after the call, garray will be owned by mat > (user should not free garray afterwards). > > What do you think? If you agree, could you contribute an MR? > > BTW, I think we need to create a new header, petscmat_kokkos.hpp to declare > PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, > KokkosCsrMatrix csr, Mat *A) > but > PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, > PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) > can be in petscmat.h as it uses only C types. > > Barry, what do you think of the two new APIs? > > --Junchao Zhang > > > On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < > dargaville.steven at gmail.com> wrote: > >> Those two constructors would definitely meet my needs, thanks! >> >> Also I should note that the comment about garray and B in >> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is >> passed in as NULL, it's just that if you pass in a completed garray it >> doesn't bother creating one or changing the column indices of B. So I would >> suggest the comment be: "if garray is NULL the offdiag matrix B should >> have global col ids; if garray is not NULL the offdiag matrix B should have >> local col ids" >> >> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >> wrote: >> >>> Mat_SeqAIJKokkos is private because it is in a private header >>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>> >>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>> might be right. The comment >>> >>> - B - the offdiag matrix using global col ids >>> >>> is out of date. Perhaps it should be "the offdiag matrix uses local >>> column indices and garray contains the local to global mapping". But I >>> need to double check it. >>> >>> Since you use Kokkos, I think we could provide these two constructors >>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>> >>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>> KokkosCsrMatrix csr, Mat *A) >>> >>> >>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, >>> PetscInt *garray, Mat *mat) >>> >>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>> indices >>> >>> Do they meet your needs? >>> >>> --Junchao Zhang >>> >>> >>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>>> Thanks for the response! >>>> >>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>> the COO structures) happens on the host and is then copied onto the device? 
>>>> I need to be able to create a matrix object with minimal work on the host >>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>> originally used the COO interface to build the matrices I need, but that >>>> was around 5x slower than constructing the aij structures myself on the >>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>> type methods. >>>> >>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>> accessible? In particular one of those constructors takes in pointers to >>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>> sequential matrix with nothing (or very little) occuring on the host. The >>>> only change I can see that would be necessary is for >>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>> >>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that >>>> is required is declaring the method in the .hpp, as it's already defined as >>>> static in mpiaijkok.kokkos.cxx. In particular, the comments >>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>>> off-diagonal block B needs to be built with global column ids, with >>>> mpiaij->garray constructed on the host along with the rewriting of the >>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >>>> checking the code there shows that if you pass in a non-null garray to >>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>>> compatification is skipped, meaning B can be built with local column ids as >>>> long as garray is provided on the host (which I also build on the device >>>> and then just copy to the host). Again this is what some of the internal >>>> Kokkos routines rely on, like the matrix-product. >>>> >>>> I am happy to try doing this and submitting a request to the petsc >>>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>>> missing something important? >>>> Thanks >>>> Steven >>>> >>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>>> wrote: >>>> >>>>> Hi, Steven, >>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>>> public. >>>>> If you already use COO, then why not directly make the matrix of >>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>> So I am confused by your needs. >>>>> >>>>> Thanks! >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>>> dargaville.steven at gmail.com> wrote: >>>>> >>>>>> Hi >>>>>> >>>>>> I'm just wondering if there is any possibility of making: >>>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix >>>>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>> >>>>>> publicly accessible outside of petsc, or if there is an interface I >>>>>> have missed for creating Kokkos matrices entirely on the device? >>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>>> I can't link to it. 
>>>>>> >>>>>> I've currently just copied the code inside of those methods so that I >>>>>> can build without any preallocation on the host (e.g., through the COO >>>>>> interface) and it works really well. >>>>>> >>>>>> Thanks for your help >>>>>> Steven >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Feb 26 11:02:16 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 26 Feb 2025 12:02:16 -0500 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: The new function doesn't seem to have anything to do with Kokkos so why have any new functions? Just have MatCreateMPIAIJWithSeqAIJ() work properly when the two matrices are Kokkos (or CUDA or HIP). Or if you want to eliminate the global reduction maybe make your new function MatCreateMPIWithSeq() and have it work for any type of submatrix and eventually we could deprecate the MatCreateMPIAIJWithSeqAIJ() Barry > On Feb 26, 2025, at 11:54?AM, Steven Dargaville wrote: > > I think that sounds great, I'm happy to put together an MR (likely next week) for review. > > On Wed, 26 Feb 2025 at 16:11, Junchao Zhang > wrote: >> This fuction MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat) is rarely used. To compute the global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is a waste, as the caller usually knows M, N already. So I think we can depart from it and have a new one: >> >> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> * M, N are global row/col size of mat >> * A, B are MATSEQAIJKOKKOS >> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from A, i.e., M = Sum of A's M, N= Sum of A's N >> * if garray is NULL, B uses global column indices (and B's N should be equal to the output mat's N) >> * if garray is not NULL, B uses local column indices; garray[] was allocated by PetscMalloc() and after the call, garray will be owned by mat (user should not free garray afterwards). >> >> What do you think? If you agree, could you contribute an MR? >> >> BTW, I think we need to create a new header, petscmat_kokkos.hpp to declare >> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A) >> but >> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> can be in petscmat.h as it uses only C types. >> >> Barry, what do you think of the two new APIs? >> >> --Junchao Zhang >> >> >> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville > wrote: >>> Those two constructors would definitely meet my needs, thanks! >>> >>> Also I should note that the comment about garray and B in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is passed in as NULL, it's just that if you pass in a completed garray it doesn't bother creating one or changing the column indices of B. So I would suggest the comment be: "if garray is NULL the offdiag matrix B should have global col ids; if garray is not NULL the offdiag matrix B should have local col ids" >>> >>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang > wrote: >>>> Mat_SeqAIJKokkos is private because it is in a private header src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>> >>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() might be right. 
The comment >>>> >>>> - B - the offdiag matrix using global col ids >>>> >>>> is out of date. Perhaps it should be "the offdiag matrix uses local column indices and garray contains the local to global mapping". But I need to double check it. >>>> >>>> Since you use Kokkos, I think we could provide these two constructors for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>> MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A) >>>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, PetscInt *garray, Mat *mat) >>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>> // A and B are MATSEQAIJKOKKOS matrices and use local column indices >>>> >>>> Do they meet your needs? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville > wrote: >>>>> Thanks for the response! >>>>> >>>>> Although MatSetValuesCOO happens on the device if the input coo_v pointer is device memory, I believe MatSetPreallocationCOO requires host pointers for coo_i and coo_j, and the preallocation (and construction of the COO structures) happens on the host and is then copied onto the device? I need to be able to create a matrix object with minimal work on the host (like many of the routines in aijkok.kokkos.cxx do internally). I originally used the COO interface to build the matrices I need, but that was around 5x slower than constructing the aij structures myself on the device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. >>>>> >>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made public is that the Mat_SeqAIJKokkos constructors are already publicly accessible? In particular one of those constructors takes in pointers to the Kokkos dual views which store a,i,j, and hence one can build a sequential matrix with nothing (or very little) occuring on the host. The only change I can see that would be necessary is for MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>> >>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is required is declaring the method in the .hpp, as it's already defined as static in mpiaijkok.kokkos.cxx. In particular, the comments above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the off-diagonal block B needs to be built with global column ids, with mpiaij->garray constructed on the host along with the rewriting of the global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but checking the code there shows that if you pass in a non-null garray to MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and compatification is skipped, meaning B can be built with local column ids as long as garray is provided on the host (which I also build on the device and then just copy to the host). Again this is what some of the internal Kokkos routines rely on, like the matrix-product. >>>>> >>>>> I am happy to try doing this and submitting a request to the petsc gitlab if this seems sensible, I just wanted to double check that I wasn't missing something important? >>>>> Thanks >>>>> Steven >>>>> >>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang > wrote: >>>>>> Hi, Steven, >>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a private data type Mat_SeqAIJKokkos, so it can not be directly made public. 
>>>>>> If you already use COO, then why not directly make the matrix of type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>>> So I am confused by your needs. >>>>>> >>>>>> Thanks! >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville > wrote: >>>>>>> Hi >>>>>>> >>>>>>> I'm just wondering if there is any possibility of making: >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>>> >>>>>>> publicly accessible outside of petsc, or if there is an interface I have missed for creating Kokkos matrices entirely on the device? MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so I can't link to it. >>>>>>> >>>>>>> I've currently just copied the code inside of those methods so that I can build without any preallocation on the host (e.g., through the COO interface) and it works really well. >>>>>>> >>>>>>> Thanks for your help >>>>>>> Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Feb 26 11:15:27 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 26 Feb 2025 11:15:27 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: That is a good idea. Perhaps a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) since garray[] is only meaningful for MATAIJ and subclasses. --Junchao Zhang On Wed, Feb 26, 2025 at 11:02?AM Barry Smith wrote: > > The new function doesn't seem to have anything to do with Kokkos so > why have any new functions? Just have *MatCreateMPIAIJWithSeqAIJ() work > properly when the two matrices are Kokkos (or CUDA or HIP). Or if you > want to eliminate the global reduction maybe make your new function > MatCreateMPIWithSeq() and have it work for any type of submatrix and > eventually we could deprecate the **MatCreateMPIAIJWithSeqAIJ() * > > * Barry* > > > > > On Feb 26, 2025, at 11:54?AM, Steven Dargaville < > dargaville.steven at gmail.com> wrote: > > I think that sounds great, I'm happy to put together an MR (likely next > week) for review. > > On Wed, 26 Feb 2025 at 16:11, Junchao Zhang > wrote: > >> This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, >> const PetscInt garray[], Mat *mat) *is rarely used. To compute the >> global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think >> it is a waste, as the caller usually knows M, N already. So I think we can >> depart from it and have a new one: >> >> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt >> N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> * M, N are global row/col size of mat >> * A, B are MATSEQAIJKOKKOS >> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from >> A, i.e., M = Sum of A's M, N= Sum of A's N >> * if garray is NULL, B uses global column indices (and B's N should be >> equal to the output mat's N) >> * if garray is not NULL, B uses local column indices; garray[] was >> allocated by PetscMalloc() and after the call, garray will be owned by mat >> (user should not free garray afterwards). >> >> What do you think? If you agree, could you contribute an MR? 
>> >> BTW, I think we need to create a new header, petscmat_kokkos.hpp to >> declare >> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >> KokkosCsrMatrix csr, Mat *A) >> but >> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, >> PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> can be in petscmat.h as it uses only C types. >> >> Barry, what do you think of the two new APIs? >> >> --Junchao Zhang >> >> >> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >>> Those two constructors would definitely meet my needs, thanks! >>> >>> Also I should note that the comment about garray and B in >>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is >>> passed in as NULL, it's just that if you pass in a completed garray it >>> doesn't bother creating one or changing the column indices of B. So I would >>> suggest the comment be: "if garray is NULL the offdiag matrix B should >>> have global col ids; if garray is not NULL the offdiag matrix B should have >>> local col ids" >>> >>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >>> wrote: >>> >>>> Mat_SeqAIJKokkos is private because it is in a private header >>>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>> >>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>>> might be right. The comment >>>> >>>> - B - the offdiag matrix using global col ids >>>> >>>> is out of date. Perhaps it should be "the offdiag matrix uses local >>>> column indices and garray contains the local to global mapping". But I >>>> need to double check it. >>>> >>>> Since you use Kokkos, I think we could provide these two constructors >>>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>> >>>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>>> KokkosCsrMatrix csr, Mat *A) >>>> >>>> >>>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat >>>> B, PetscInt *garray, Mat *mat) >>>> >>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>>> indices >>>> >>>> Do they meet your needs? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>>> dargaville.steven at gmail.com> wrote: >>>> >>>>> Thanks for the response! >>>>> >>>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>>> the COO structures) happens on the host and is then copied onto the device? >>>>> I need to be able to create a matrix object with minimal work on the host >>>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>>> originally used the COO interface to build the matrices I need, but that >>>>> was around 5x slower than constructing the aij structures myself on the >>>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>>> type methods. >>>>> >>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>>> accessible? In particular one of those constructors takes in pointers to >>>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>>> sequential matrix with nothing (or very little) occuring on the host. 
The >>>>> only change I can see that would be necessary is for >>>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>> >>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all >>>>> that is required is declaring the method in the .hpp, as it's already >>>>> defined as static in mpiaijkok.kokkos.cxx. In particular, the comments >>>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>>>> off-diagonal block B needs to be built with global column ids, with >>>>> mpiaij->garray constructed on the host along with the rewriting of the >>>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >>>>> checking the code there shows that if you pass in a non-null garray to >>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>>>> compatification is skipped, meaning B can be built with local column ids as >>>>> long as garray is provided on the host (which I also build on the device >>>>> and then just copy to the host). Again this is what some of the internal >>>>> Kokkos routines rely on, like the matrix-product. >>>>> >>>>> I am happy to try doing this and submitting a request to the petsc >>>>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>>>> missing something important? >>>>> Thanks >>>>> Steven >>>>> >>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>>>> wrote: >>>>> >>>>>> Hi, Steven, >>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>>>> public. >>>>>> If you already use COO, then why not directly make the matrix of >>>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>>> So I am confused by your needs. >>>>>> >>>>>> Thanks! >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>>>> dargaville.steven at gmail.com> wrote: >>>>>> >>>>>>> Hi >>>>>>> >>>>>>> I'm just wondering if there is any possibility of making: >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix >>>>>>> or MatCreateSeqAIJKokkosWithCSRMatrix in >>>>>>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>>> >>>>>>> publicly accessible outside of petsc, or if there is an interface I >>>>>>> have missed for creating Kokkos matrices entirely on the device? >>>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>>>> I can't link to it. >>>>>>> >>>>>>> I've currently just copied the code inside of those methods so that >>>>>>> I can build without any preallocation on the host (e.g., through the COO >>>>>>> interface) and it works really well. >>>>>>> >>>>>>> Thanks for your help >>>>>>> Steven >>>>>>> >>>>>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Wed Feb 26 11:26:54 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 26 Feb 2025 17:26:54 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Ok so just to double check the things I should do: 1. Create a new header for MatCreateSeqAIJKokkosWithCSRMatrix (and declare it PETSC_EXTERN) so users can call the existing method and build a seqaijkokkos matrix with no host involvement. 2. 
Modify *MatCreateMPIAIJWithSeqAIJ (*or equivalent*) *so it does the same thing as MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in the case that A and B are seqaijkokkos matrices. 3. Potentially remove MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices given it would be redundant? On Wed, 26 Feb 2025 at 17:15, Junchao Zhang wrote: > That is a good idea. Perhaps a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm > comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat > *mat) since garray[] is only meaningful for MATAIJ and subclasses. > > --Junchao Zhang > > > On Wed, Feb 26, 2025 at 11:02?AM Barry Smith wrote: > >> >> The new function doesn't seem to have anything to do with Kokkos so >> why have any new functions? Just have *MatCreateMPIAIJWithSeqAIJ() work >> properly when the two matrices are Kokkos (or CUDA or HIP). Or if you >> want to eliminate the global reduction maybe make your new function >> MatCreateMPIWithSeq() and have it work for any type of submatrix and >> eventually we could deprecate the **MatCreateMPIAIJWithSeqAIJ() * >> >> * Barry* >> >> >> >> >> On Feb 26, 2025, at 11:54?AM, Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >> I think that sounds great, I'm happy to put together an MR (likely next >> week) for review. >> >> On Wed, 26 Feb 2025 at 16:11, Junchao Zhang >> wrote: >> >>> This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, >>> const PetscInt garray[], Mat *mat) *is rarely used. To compute the >>> global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think >>> it is a waste, as the caller usually knows M, N already. So I think we can >>> depart from it and have a new one: >>> >>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, >>> PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>> * M, N are global row/col size of mat >>> * A, B are MATSEQAIJKOKKOS >>> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from >>> A, i.e., M = Sum of A's M, N= Sum of A's N >>> * if garray is NULL, B uses global column indices (and B's N should be >>> equal to the output mat's N) >>> * if garray is not NULL, B uses local column indices; garray[] was >>> allocated by PetscMalloc() and after the call, garray will be owned by mat >>> (user should not free garray afterwards). >>> >>> What do you think? If you agree, could you contribute an MR? >>> >>> BTW, I think we need to create a new header, petscmat_kokkos.hpp to >>> declare >>> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm >>> comm, KokkosCsrMatrix csr, Mat *A) >>> but >>> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, >>> PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>> can be in petscmat.h as it uses only C types. >>> >>> Barry, what do you think of the two new APIs? >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>>> Those two constructors would definitely meet my needs, thanks! >>>> >>>> Also I should note that the comment about garray and B in >>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray >>>> is passed in as NULL, it's just that if you pass in a completed garray it >>>> doesn't bother creating one or changing the column indices of B. 
So I would >>>> suggest the comment be: "if garray is NULL the offdiag matrix B should >>>> have global col ids; if garray is not NULL the offdiag matrix B should have >>>> local col ids" >>>> >>>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >>>> wrote: >>>> >>>>> Mat_SeqAIJKokkos is private because it is in a private header >>>>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>>> >>>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>>>> might be right. The comment >>>>> >>>>> - B - the offdiag matrix using global col ids >>>>> >>>>> is out of date. Perhaps it should be "the offdiag matrix uses local >>>>> column indices and garray contains the local to global mapping". But I >>>>> need to double check it. >>>>> >>>>> Since you use Kokkos, I think we could provide these two constructors >>>>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>>> >>>>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>>>> KokkosCsrMatrix csr, Mat *A) >>>>> >>>>> >>>>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat >>>>> B, PetscInt *garray, Mat *mat) >>>>> >>>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>>>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>>>> indices >>>>> >>>>> Do they meet your needs? >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>>>> dargaville.steven at gmail.com> wrote: >>>>> >>>>>> Thanks for the response! >>>>>> >>>>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>>>> the COO structures) happens on the host and is then copied onto the device? >>>>>> I need to be able to create a matrix object with minimal work on the host >>>>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>>>> originally used the COO interface to build the matrices I need, but that >>>>>> was around 5x slower than constructing the aij structures myself on the >>>>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>>>> type methods. >>>>>> >>>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>>>> accessible? In particular one of those constructors takes in pointers to >>>>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>>>> sequential matrix with nothing (or very little) occuring on the host. The >>>>>> only change I can see that would be necessary is for >>>>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>>> >>>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all >>>>>> that is required is declaring the method in the .hpp, as it's already >>>>>> defined as static in mpiaijkok.kokkos.cxx. In particular, the comments >>>>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>>>>> off-diagonal block B needs to be built with global column ids, with >>>>>> mpiaij->garray constructed on the host along with the rewriting of the >>>>>> global column indices in B. 
This happens in MatSetUpMultiply_MPIAIJ, but >>>>>> checking the code there shows that if you pass in a non-null garray to >>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>>>>> compatification is skipped, meaning B can be built with local column ids as >>>>>> long as garray is provided on the host (which I also build on the device >>>>>> and then just copy to the host). Again this is what some of the internal >>>>>> Kokkos routines rely on, like the matrix-product. >>>>>> >>>>>> I am happy to try doing this and submitting a request to the petsc >>>>>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>>>>> missing something important? >>>>>> Thanks >>>>>> Steven >>>>>> >>>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>>>>> wrote: >>>>>> >>>>>>> Hi, Steven, >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>>>>> public. >>>>>>> If you already use COO, then why not directly make the matrix of >>>>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>>>> So I am confused by your needs. >>>>>>> >>>>>>> Thanks! >>>>>>> --Junchao Zhang >>>>>>> >>>>>>> >>>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>>>>> dargaville.steven at gmail.com> wrote: >>>>>>> >>>>>>>> Hi >>>>>>>> >>>>>>>> I'm just wondering if there is any possibility of making: >>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix >>>>>>>> or MatCreateSeqAIJKokkosWithCSRMatrix in >>>>>>>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>>>> >>>>>>>> publicly accessible outside of petsc, or if there is an interface I >>>>>>>> have missed for creating Kokkos matrices entirely on the device? >>>>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>>>>> I can't link to it. >>>>>>>> >>>>>>>> I've currently just copied the code inside of those methods so that >>>>>>>> I can build without any preallocation on the host (e.g., through the COO >>>>>>>> interface) and it works really well. >>>>>>>> >>>>>>>> Thanks for your help >>>>>>>> Steven >>>>>>>> >>>>>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Feb 26 12:00:23 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 26 Feb 2025 12:00:23 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: On Wed, Feb 26, 2025 at 11:27?AM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Ok so just to double check the things I should do: > > 1. Create a new header for MatCreateSeqAIJKokkosWithCSRMatrix (and declare > it PETSC_EXTERN) so users can call the existing method and build a > seqaijkokkos matrix with no host involvement. > No, We already have a private MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A), you just need to make it public in a new header petscmat_kokkos.hpp. BTW, I am also thinking MatCreateSeqAIJKokkosWithKokkosViews(MPI_Comm comm, PetscInt m, PetscInt n, Kokkos::View i, Kokkos::View j, Kokkos::View a, Mat *mat), as we already have MatCreateSeqAIJWithArrays(MPI_Comm comm, PetscInt m, PetscInt n, PetscInt i[], PetscInt j[], PetscScalar a[], Mat *mat) The benefit is that we don't need to include in petscmat_kokkos.hpp, to decouple petsc and kokkos to the least. 
But either is fine with me. > > 2. Modify *MatCreateMPIAIJWithSeqAIJ (*or equivalent*) *so it does the > same thing as MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in the case > that A and B are seqaijkokkos matrices. > Keep the existing MatCreateMPIAIJWithSeqAIJ() but depreciate it in favor of a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat). The new function should handle cases that the A, B are MATSEQAIJKOKKOS. > > 3. Potentially remove MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices > given it would be redundant? > Yes, remove it and use the new API at places calling it. > > On Wed, 26 Feb 2025 at 17:15, Junchao Zhang > wrote: > >> That is a good idea. Perhaps a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm >> comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat >> *mat) since garray[] is only meaningful for MATAIJ and subclasses. >> >> --Junchao Zhang >> >> >> On Wed, Feb 26, 2025 at 11:02?AM Barry Smith wrote: >> >>> >>> The new function doesn't seem to have anything to do with Kokkos so >>> why have any new functions? Just have *MatCreateMPIAIJWithSeqAIJ() work >>> properly when the two matrices are Kokkos (or CUDA or HIP). Or if you >>> want to eliminate the global reduction maybe make your new function >>> MatCreateMPIWithSeq() and have it work for any type of submatrix and >>> eventually we could deprecate the **MatCreateMPIAIJWithSeqAIJ() * >>> >>> * Barry* >>> >>> >>> >>> >>> On Feb 26, 2025, at 11:54?AM, Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>> I think that sounds great, I'm happy to put together an MR (likely next >>> week) for review. >>> >>> On Wed, 26 Feb 2025 at 16:11, Junchao Zhang >>> wrote: >>> >>>> This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, >>>> const PetscInt garray[], Mat *mat) *is rarely used. To compute the >>>> global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think >>>> it is a waste, as the caller usually knows M, N already. So I think we can >>>> depart from it and have a new one: >>>> >>>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, >>>> PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>>> * M, N are global row/col size of mat >>>> * A, B are MATSEQAIJKOKKOS >>>> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from >>>> A, i.e., M = Sum of A's M, N= Sum of A's N >>>> * if garray is NULL, B uses global column indices (and B's N should be >>>> equal to the output mat's N) >>>> * if garray is not NULL, B uses local column indices; garray[] was >>>> allocated by PetscMalloc() and after the call, garray will be owned by mat >>>> (user should not free garray afterwards). >>>> >>>> What do you think? If you agree, could you contribute an MR? >>>> >>>> BTW, I think we need to create a new header, petscmat_kokkos.hpp to >>>> declare >>>> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm >>>> comm, KokkosCsrMatrix csr, Mat *A) >>>> but >>>> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, >>>> PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>>> can be in petscmat.h as it uses only C types. >>>> >>>> Barry, what do you think of the two new APIs? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < >>>> dargaville.steven at gmail.com> wrote: >>>> >>>>> Those two constructors would definitely meet my needs, thanks! 
>>>>> >>>>> Also I should note that the comment about garray and B in >>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray >>>>> is passed in as NULL, it's just that if you pass in a completed garray it >>>>> doesn't bother creating one or changing the column indices of B. So I would >>>>> suggest the comment be: "if garray is NULL the offdiag matrix B >>>>> should have global col ids; if garray is not NULL the offdiag matrix B >>>>> should have local col ids" >>>>> >>>>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >>>>> wrote: >>>>> >>>>>> Mat_SeqAIJKokkos is private because it is in a private header >>>>>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>>>> >>>>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>>>>> might be right. The comment >>>>>> >>>>>> - B - the offdiag matrix using global col ids >>>>>> >>>>>> is out of date. Perhaps it should be "the offdiag matrix uses local >>>>>> column indices and garray contains the local to global mapping". But I >>>>>> need to double check it. >>>>>> >>>>>> Since you use Kokkos, I think we could provide these two constructors >>>>>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>>>> >>>>>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>>>>> KokkosCsrMatrix csr, Mat *A) >>>>>> >>>>>> >>>>>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat >>>>>> B, PetscInt *garray, Mat *mat) >>>>>> >>>>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>>>>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>>>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>>>>> indices >>>>>> >>>>>> Do they meet your needs? >>>>>> >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>>>>> dargaville.steven at gmail.com> wrote: >>>>>> >>>>>>> Thanks for the response! >>>>>>> >>>>>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>>>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>>>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>>>>> the COO structures) happens on the host and is then copied onto the device? >>>>>>> I need to be able to create a matrix object with minimal work on the host >>>>>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>>>>> originally used the COO interface to build the matrices I need, but that >>>>>>> was around 5x slower than constructing the aij structures myself on the >>>>>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>>>>> type methods. >>>>>>> >>>>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>>>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>>>>> accessible? In particular one of those constructors takes in pointers to >>>>>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>>>>> sequential matrix with nothing (or very little) occuring on the host. The >>>>>>> only change I can see that would be necessary is for >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>>>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>>>> >>>>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all >>>>>>> that is required is declaring the method in the .hpp, as it's already >>>>>>> defined as static in mpiaijkok.kokkos.cxx. 
In particular, the comments
>>>>>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the
>>>>>>> off-diagonal block B needs to be built with global column ids, with
>>>>>>> mpiaij->garray constructed on the host along with the rewriting of the
>>>>>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but
>>>>>>> checking the code there shows that if you pass in a non-null garray to
>>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and
>>>>>>> compatification is skipped, meaning B can be built with local column ids as
>>>>>>> long as garray is provided on the host (which I also build on the device
>>>>>>> and then just copy to the host). Again this is what some of the internal
>>>>>>> Kokkos routines rely on, like the matrix-product.
>>>>>>>
>>>>>>> I am happy to try doing this and submitting a request to the petsc
>>>>>>> gitlab if this seems sensible, I just wanted to double check that I wasn't
>>>>>>> missing something important?
>>>>>>> Thanks
>>>>>>> Steven
>>>>>>>
>>>>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi, Steven,
>>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses
>>>>>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made
>>>>>>>> public.
>>>>>>>> If you already use COO, then why not directly make the matrix of
>>>>>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()?
>>>>>>>> So I am confused by your needs.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> --Junchao Zhang
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville <
>>>>>>>> dargaville.steven at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> I'm just wondering if there is any possibility of making:
>>>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix
>>>>>>>>> or MatCreateSeqAIJKokkosWithCSRMatrix in
>>>>>>>>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx
>>>>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in
>>>>>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx
>>>>>>>>>
>>>>>>>>> publicly accessible outside of petsc, or if there is an interface
>>>>>>>>> I have missed for creating Kokkos matrices entirely on the device?
>>>>>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so
>>>>>>>>> I can't link to it.
>>>>>>>>>
>>>>>>>>> I've currently just copied the code inside of those methods so
>>>>>>>>> that I can build without any preallocation on the host (e.g., through the
>>>>>>>>> COO interface) and it works really well.
>>>>>>>>>
>>>>>>>>> Thanks for your help
>>>>>>>>> Steven
>>>>>>>>>
>>>>>>>>
>>> -------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From liufield at gmail.com  Thu Feb 27 17:12:11 2025
From: liufield at gmail.com (neil liu)
Date: Thu, 27 Feb 2025 18:12:11 -0500
Subject: [petsc-users] Inquiry about resetting a petscsection for a dmplex
Message-ID:

Dear PETSc community,

I am currently working on a 3D adaptive vector FEM solver. In my case, I
need to solve two systems: one for the primal equation using a low-order
discretization and another for the adjoint equation using a high-order
discretization.

Afterward, I need to reset the section associated with the DMPlex.
Whichever is set first, 20 DOFs (second-order) or 6 DOFs (first-order), the
final mapping always follows that of the first-defined configuration.

Did I miss something?
Thanks, Xiaodong PetscErrorCode DMManage::SetupSection(CaseInfo &objCaseInfo){ PetscSection s; PetscInt edgeStart, edgeEnd, pStart, pEnd; PetscInt cellStart, cellEnd; PetscInt faceStart, faceEnd; PetscFunctionBeginUser; DMPlexGetChart(dm, &pStart, &pEnd); DMPlexGetHeightStratum(dm, 0, &cellStart, &cellEnd); DMPlexGetHeightStratum(dm, 1, &faceStart, &faceEnd); DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd); /* edges */; PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s); PetscSectionSetNumFields(s, 1); PetscSectionSetFieldComponents(s, 0, 1); if (objCaseInfo.getnumberDof_local() == 6){ PetscSectionSetChart(s, edgeStart, edgeEnd); for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); PetscSectionSetFieldDof(s, edgeIndex, 0, 1); } } else if(objCaseInfo.getnumberDof_local() == 20){ PetscSectionSetChart(s, faceStart, edgeEnd); for (PetscInt faceIndex = faceStart; faceIndex < faceEnd; ++faceIndex) { PetscSectionSetDof(s, faceIndex, objCaseInfo.numdofPerFace); PetscSectionSetFieldDof(s, faceIndex, 0, 1); } //Test for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); PetscSectionSetFieldDof(s, edgeIndex, 0, 1); } } // PetscSectionSetUp(s); DMSetLocalSection(dm, s); PetscSectionDestroy(&s); //Output map for check ISLocalToGlobalMapping ltogm; const PetscInt *g_idx; DMGetLocalToGlobalMapping(dm, <ogm); ISLocalToGlobalMappingView(ltogm, PETSC_VIEWER_STDOUT_WORLD); ISLocalToGlobalMappingGetIndices(ltogm, &g_idx); PetscFunctionReturn(PETSC_SUCCESS); } -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Feb 27 20:16:31 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Feb 2025 21:16:31 -0500 Subject: [petsc-users] Inquiry about resetting a petscsection for a dmplex In-Reply-To: References: Message-ID: On Thu, Feb 27, 2025 at 6:12?PM neil liu wrote: > Dear Pestc community, > > I am currently working on a 3D adaptive vector FEM solver. In my case, I > need to solve two systems: one for the primal equation using a low-order > discretization and another for the adjoint equation using a high-order > discretization. > > Afterward, I need to reset the section associated with the DMPlex. > Whichever is set first?20 DOFs (second-order) or 6 DOFs (first-order)?the > final mapping always follows that of the first-defined configuration. > > Did I miss something? > > When solving two systems like this on the same mesh, I recommend using DMClone(). What this does is create you a new DM with the same backend topology (Plex), but a different function space (Section). This is how I do everything internally in Plex. Does that make sense? 
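A minimal sketch (the Build*Section helpers below are only stand-ins for the two
branches of your SetupSection code, one clone per discretization):

  DM dmPrimal, dmAdjoint;

  PetscCall(DMClone(dm, &dmPrimal));          /* same Plex topology as dm */
  PetscCall(DMClone(dm, &dmAdjoint));
  PetscCall(BuildEdgeSection(dmPrimal));      /* hypothetical helper: 6-dof, first-order section */
  PetscCall(BuildFaceEdgeSection(dmAdjoint)); /* hypothetical helper: 20-dof, second-order section */

  /* Each clone keeps its own local section, so the two mappings no longer overwrite each other */
  Mat Kprimal, Kadjoint;
  PetscCall(DMCreateMatrix(dmPrimal, &Kprimal));
  PetscCall(DMCreateMatrix(dmAdjoint, &Kadjoint));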
Thanks, Matt > Thanks, > > > Xiaodong > > PetscErrorCode DMManage::SetupSection(CaseInfo &objCaseInfo){ > PetscSection s; > PetscInt edgeStart, edgeEnd, pStart, pEnd; > PetscInt cellStart, cellEnd; > PetscInt faceStart, faceEnd; > > PetscFunctionBeginUser; > DMPlexGetChart(dm, &pStart, &pEnd); > DMPlexGetHeightStratum(dm, 0, &cellStart, &cellEnd); > DMPlexGetHeightStratum(dm, 1, &faceStart, &faceEnd); > DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd); /* edges */; > PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s); > PetscSectionSetNumFields(s, 1); > PetscSectionSetFieldComponents(s, 0, 1); > if (objCaseInfo.getnumberDof_local() == 6){ > PetscSectionSetChart(s, edgeStart, edgeEnd); > for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { > PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); > PetscSectionSetFieldDof(s, edgeIndex, 0, 1); > } > } > else if(objCaseInfo.getnumberDof_local() == 20){ > PetscSectionSetChart(s, faceStart, edgeEnd); > for (PetscInt faceIndex = faceStart; faceIndex < faceEnd; ++faceIndex) { > PetscSectionSetDof(s, faceIndex, objCaseInfo.numdofPerFace); > PetscSectionSetFieldDof(s, faceIndex, 0, 1); > } > //Test > for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { > PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); > PetscSectionSetFieldDof(s, edgeIndex, 0, 1); > } > } > // > PetscSectionSetUp(s); > DMSetLocalSection(dm, s); > PetscSectionDestroy(&s); > > //Output map for check > ISLocalToGlobalMapping ltogm; > const PetscInt *g_idx; > DMGetLocalToGlobalMapping(dm, <ogm); > ISLocalToGlobalMappingView(ltogm, PETSC_VIEWER_STDOUT_WORLD); > ISLocalToGlobalMappingGetIndices(ltogm, &g_idx); > > PetscFunctionReturn(PETSC_SUCCESS); > } > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZoOMxv_X_kOSiYdGQlrrfxNqBqX-JwvUe4wg8Lmx9ICyyEgKROX7IMg4jQIW9310TtkewqWflxHLqw8Z6USM$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Fri Feb 28 21:56:45 2025 From: liufield at gmail.com (neil liu) Date: Fri, 28 Feb 2025 22:56:45 -0500 Subject: [petsc-users] Inquiry about resetting a petscsection for a dmplex In-Reply-To: References: Message-ID: Thanks a lot, Matt! It works well. I have another question regarding future p-adaptivity. Will the section support defining different DOFs for each face and edge? Maybe I should try this. Thanks, Xiaodong On Thu, Feb 27, 2025 at 9:16?PM Matthew Knepley wrote: > On Thu, Feb 27, 2025 at 6:12?PM neil liu wrote: > >> Dear Pestc community, >> >> I am currently working on a 3D adaptive vector FEM solver. In my case, I >> need to solve two systems: one for the primal equation using a low-order >> discretization and another for the adjoint equation using a high-order >> discretization. >> >> Afterward, I need to reset the section associated with the DMPlex. >> Whichever is set first?20 DOFs (second-order) or 6 DOFs (first-order)?the >> final mapping always follows that of the first-defined configuration. >> >> Did I miss something? >> >> When solving two systems like this on the same mesh, I recommend using > DMClone(). What this does is create you a new > DM with the same backend topology (Plex), but a different function space > (Section). This is how I do everything internally in Plex. Does that make > sense? 
> > Thanks, > > Matt > >> Thanks, >> >> >> Xiaodong >> >> PetscErrorCode DMManage::SetupSection(CaseInfo &objCaseInfo){ >> PetscSection s; >> PetscInt edgeStart, edgeEnd, pStart, pEnd; >> PetscInt cellStart, cellEnd; >> PetscInt faceStart, faceEnd; >> >> PetscFunctionBeginUser; >> DMPlexGetChart(dm, &pStart, &pEnd); >> DMPlexGetHeightStratum(dm, 0, &cellStart, &cellEnd); >> DMPlexGetHeightStratum(dm, 1, &faceStart, &faceEnd); >> DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd); /* edges */; >> PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s); >> PetscSectionSetNumFields(s, 1); >> PetscSectionSetFieldComponents(s, 0, 1); >> if (objCaseInfo.getnumberDof_local() == 6){ >> PetscSectionSetChart(s, edgeStart, edgeEnd); >> for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { >> PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); >> PetscSectionSetFieldDof(s, edgeIndex, 0, 1); >> } >> } >> else if(objCaseInfo.getnumberDof_local() == 20){ >> PetscSectionSetChart(s, faceStart, edgeEnd); >> for (PetscInt faceIndex = faceStart; faceIndex < faceEnd; ++faceIndex) { >> PetscSectionSetDof(s, faceIndex, objCaseInfo.numdofPerFace); >> PetscSectionSetFieldDof(s, faceIndex, 0, 1); >> } >> //Test >> for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { >> PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); >> PetscSectionSetFieldDof(s, edgeIndex, 0, 1); >> } >> } >> // >> PetscSectionSetUp(s); >> DMSetLocalSection(dm, s); >> PetscSectionDestroy(&s); >> >> //Output map for check >> ISLocalToGlobalMapping ltogm; >> const PetscInt *g_idx; >> DMGetLocalToGlobalMapping(dm, <ogm); >> ISLocalToGlobalMappingView(ltogm, PETSC_VIEWER_STDOUT_WORLD); >> ISLocalToGlobalMappingGetIndices(ltogm, &g_idx); >> >> PetscFunctionReturn(PETSC_SUCCESS); >> } >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZTmIg2QHRHk4rNyrPzO0mJpbo5uTsYN7umDaXzGGtb2o3qeMrQtB0zvmFa55nwwfw-UtpYaOFEvs3PMXfa-oIQ$ > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
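A note on the p-adaptivity question above: PetscSectionSetDof() assigns the dof
count point by point, so a single section can already carry a different number of
dofs on every edge or face. A minimal sketch, reusing the stratum calls from
SetupSection above and assuming a per-edge order[] array supplied by the error
indicator:

  PetscSection s;
  PetscInt     edgeStart, edgeEnd;

  PetscCall(DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd)); /* edges of the 3D mesh */
  PetscCall(PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s));
  PetscCall(PetscSectionSetChart(s, edgeStart, edgeEnd));
  for (PetscInt e = edgeStart; e < edgeEnd; ++e) {
    /* order[] is assumed data: 1 dof on p=1 edges, 2 dofs on p=2 edges */
    PetscCall(PetscSectionSetDof(s, e, order[e - edgeStart] == 1 ? 1 : 2));
  }
  PetscCall(PetscSectionSetUp(s));
  PetscCall(DMSetLocalSection(dm, s));
  PetscCall(PetscSectionDestroy(&s));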