From Elena.Moral.Sanchez at ipp.mpg.de Mon Nov 3 11:39:07 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Mon, 3 Nov 2025 17:39:07 +0000 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution Message-ID: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> Hi, I am running CG with a Jacobi preconditioner. I have a monitor function that prints the residual and saves the solution at every iteration. To get the solution at every iteration, I am using the function KSPBuildSolution. I am setting the KSP norm as UNPRECONDITIONED. The convergence test is consistent with the 2-norm of KSPBuildResidual. However this norm does not match the 2-norm of the residual computed explicitly from the solution (obtained with KSPBuildSolution). It also does not match the preconditioned norm. What norm is it computing? When the CG is not preconditioned, the norm of KSPBuildResidual and the norm of the residual computed from the solution match, as I expected. This is KSPView(): KSP Object: 1 MPI process type: cg variant HERMITIAN maximum iterations=100, nonzero initial guess tolerances: relative=1e-08, absolute=1e-08, divergence=10000. left preconditioning using UNPRECONDITIONED norm type for convergence test PC Object: 1 MPI process type: jacobi type DIAGONAL linear system matrix = precond matrix: Mat Object: 1 MPI process type: nest rows=524, cols=524 Matrix object: type=nest, rows=3, cols=3 MatNest structure: (0,0) : type=mpiaij, rows=176, cols=176 (0,1) : NULL (0,2) : NULL (1,0) : NULL (1,1) : type=mpiaij, rows=172, cols=172 (1,2) : NULL (2,0) : NULL (2,1) : NULL (2,2) : type=mpiaij, rows=176, cols=176 Cheers, Elena Moral S?nchez -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Nov 3 20:01:08 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 3 Nov 2025 21:01:08 -0500 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> Message-ID: <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> Elena, I have attached a modification to src/snes/tutorials/ex5.c that adds a monitor routine in the style I think you are suggesting. ? 
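(The attached file is not reproduced in this archive. A minimal sketch of a monitor in this style, written here only for illustration and not necessarily matching the attached ex5.c modification, rebuilds the residual explicitly from KSPBuildSolution(); the routine name MyMonitor is simply suggested by the output below.)

static PetscErrorCode MyMonitor(KSP ksp, PetscInt it, PetscReal rnorm, void *ctx)
{
  Vec       x, b, r;
  Mat       A;
  PetscReal norm;

  PetscFunctionBeginUser;
  PetscCall(KSPBuildSolution(ksp, NULL, &x)); /* current iterate assembled by the KSP; do not destroy */
  PetscCall(KSPGetOperators(ksp, &A, NULL));
  PetscCall(KSPGetRhs(ksp, &b));
  PetscCall(VecDuplicate(b, &r));
  PetscCall(MatMult(A, x, r));    /* r = A x */
  PetscCall(VecAYPX(r, -1.0, b)); /* r = b - A x */
  PetscCall(VecNorm(r, NORM_2, &norm));
  PetscCall(PetscPrintf(PetscObjectComm((PetscObject)ksp), "My monitor %" PetscInt_FMT " %18.12e\n", it, (double)norm));
  PetscCall(VecDestroy(&r));
  PetscFunctionReturn(PETSC_SUCCESS);
}

Such a routine would be registered before KSPSolve() with KSPMonitorSet(ksp, MyMonitor, NULL, NULL).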
Below I cut and paste the beginning of the output from running the command ./ex5 -ksp_type cg -ksp_monitor_true_residual -ksp_norm_type unpreconditioned -pc_type jacobi -da_refine 3 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 My monitor 0 1.265943996096e+00 1 KSP unpreconditioned resid norm 1.030361071579e+00 true resid norm 1.030361071579e+00 ||r(i)||/||b|| 8.139073092933e-01 My monitor 1 1.030361071579e+00 2 KSP unpreconditioned resid norm 7.753237278390e-01 true resid norm 7.753237278390e-01 ||r(i)||/||b|| 6.124470989473e-01 My monitor 2 7.753237278390e-01 3 KSP unpreconditioned resid norm 6.674186105521e-01 true resid norm 6.674186105521e-01 ||r(i)||/||b|| 5.272102183115e-01 My monitor 3 6.674186105521e-01 4 KSP unpreconditioned resid norm 5.745948088398e-01 true resid norm 5.745948088398e-01 ||r(i)||/||b|| 4.538864362181e-01 My monitor 4 5.745948088398e-01 5 KSP unpreconditioned resid norm 5.103132053010e-01 true resid norm 5.103132053010e-01 ||r(i)||/||b|| 4.031088317292e-01 My monitor 5 5.103132053010e-01 6 KSP unpreconditioned resid norm 4.581737850155e-01 true resid norm 4.581737850155e-01 ||r(i)||/||b|| 3.619226335670e-01 My monitor 6 4.581737850155e-01 7 KSP unpreconditioned resid norm 4.202213342980e-01 true resid norm 4.202213342980e-01 ||r(i)||/||b|| 3.319430682509e-01 My monitor 7 4.202213342980e-01 8 KSP unpreconditioned resid norm 3.936600255123e-01 true resid norm 3.936600255123e-01 ||r(i)||/||b|| 3.109616434267e-01 My monitor 8 3.936600255123e-01 9 KSP unpreconditioned resid norm 3.811944420804e-01 true resid norm 3.811944420804e-01 ||r(i)||/||b|| 3.011147754212e-01 My monitor 9 3.811944420804e-01 10 KSP unpreconditioned resid norm 3.851182669108e-01 true resid norm 3.851182669108e-01 ||r(i)||/||b|| 3.042143002363e-01 My monitor 10 3.851182669108e-01 11 KSP unpreconditioned resid norm 4.107620195902e-01 true resid norm 4.107620195902e-01 ||r(i)||/||b|| 3.244709251411e-01 My monitor 11 4.107620195902e-01 12 KSP unpreconditioned resid norm 3.678610761984e-01 true resid norm 3.678610761984e-01 ||r(i)||/||b|| 2.905824249198e-01 My monitor 12 3.678610761984e-01 13 KSP unpreconditioned resid norm 3.891700469761e-01 true resid norm 3.891700469761e-01 ||r(i)||/||b|| 3.074149000083e-01 My monitor 13 3.891700469761e-01 14 KSP unpreconditioned resid norm 4.123002052123e-01 true resid norm 4.123002052123e-01 ||r(i)||/||b|| 3.256859754331e-01 My monitor 14 4.123002052123e-01 15 KSP unpreconditioned resid norm 4.456104079353e-01 true resid norm 4.456104079353e-01 ||r(i)||/||b|| 3.519985159765e-01 My monitor 15 4.456104079353e-01 16 KSP unpreconditioned resid norm 5.125721163597e-01 true resid norm 5.125721163597e-01 ||r(i)||/||b|| 4.048932005999e-01 My monitor 16 5.125721163597e-01 17 KSP unpreconditioned resid norm 4.475156370525e-01 true resid norm 4.475156370525e-01 ||r(i)||/||b|| 3.535035028662e-01 My monitor 17 4.475156370525e-01 18 KSP unpreconditioned resid norm 2.977755423590e-01 true resid norm 2.977755423590e-01 ||r(i)||/||b|| 2.352201545070e-01 My monitor 18 2.977755423590e-01 19 KSP unpreconditioned resid norm 2.317275576684e-01 true resid norm 2.317275576684e-01 ||r(i)||/||b|| 1.830472425186e-01 My monitor 19 2.317275576684e-01 20 KSP unpreconditioned resid norm 2.388542347249e-01 true resid norm 2.388542347249e-01 ||r(i)||/||b|| 1.886767783262e-01 My monitor 20 2.388542347249e-01 21 KSP unpreconditioned resid norm 1.722165062986e-01 true resid norm 1.722165062986e-01 ||r(i)||/||b|| 
1.360380133953e-01 My monitor 21 1.722165062986e-01 22 KSP unpreconditioned resid norm 1.161869442046e-01 true resid norm 1.161869442046e-01 ||r(i)||/||b|| 9.177889745747e-02 My monitor 22 1.161869442046e-01 23 KSP unpreconditioned resid norm 6.594339583731e-02 true resid norm 6.594339583731e-02 ||r(i)||/||b|| 5.209029470549e-02 My monitor 23 6.594339583731e-02 24 KSP unpreconditioned resid norm 4.351679748574e-02 true resid norm 4.351679748574e-02 ||r(i)||/||b|| 3.437497837181e-02 My monitor 24 4.351679748573e-02 25 KSP unpreconditioned resid norm 3.847638846864e-02 true resid norm 3.847638846864e-02 ||r(i)||/||b|| 3.039343650847e-02 My monitor 25 3.847638846864e-02 26 KSP unpreconditioned resid norm 2.063424248358e-02 true resid norm 2.063424248358e-02 ||r(i)||/||b|| 1.629949077306e-02 My monitor 26 2.063424248358e-02 27 KSP unpreconditioned resid norm 1.402462240396e-02 true resid norm 1.402462240396e-02 ||r(i)||/||b|| 1.107839086659e-02 My monitor 27 1.402462240396e-02 28 KSP unpreconditioned resid norm 7.732817953098e-03 true resid norm 7.732817953098e-03 ||r(i)||/||b|| 6.108341267025e-03 My monitor 28 7.732817953099e-03 29 KSP unpreconditioned resid norm 5.109464751004e-03 true resid norm 5.109464751004e-03 ||r(i)||/||b|| 4.036090669698e-03 My monitor 29 5.109464751004e-03 30 KSP unpreconditioned resid norm 2.628714079103e-03 true resid norm 2.628714079103e-03 ||r(i)||/||b|| 2.076485284664e-03 My monitor 30 2.628714079103e-03 31 KSP unpreconditioned resid norm 1.211324322673e-03 true resid norm 1.211324322673e-03 ||r(i)||/||b|| 9.568545894671e-04 My monitor 31 1.211324322673e-03 At iteration 32 we see a slight difference in the reported norms 32 KSP unpreconditioned resid norm 5.638897702485e-04 true resid norm 5.638897702485e-04 ||r(i)||/||b|| 4.454302654678e-04 My monitor 32 5.638897702491e-04 Then they continue to be different with more and more digits 33 KSP unpreconditioned resid norm 2.557920120696e-04 true resid norm 2.557920120696e-04 ||r(i)||/||b|| 2.020563412429e-04 My monitor 33 2.557920120695e-04 34 KSP unpreconditioned resid norm 1.249567288159e-04 true resid norm 1.249567288159e-04 ||r(i)||/||b|| 9.870636394758e-05 My monitor 34 1.249567288156e-04 35 KSP unpreconditioned resid norm 6.554146400697e-05 true resid norm 6.554146400697e-05 ||r(i)||/||b|| 5.177279896194e-05 My monitor 35 6.554146400761e-05 36 KSP unpreconditioned resid norm 3.360138566154e-05 true resid norm 3.360138566154e-05 ||r(i)||/||b|| 2.654255303959e-05 My monitor 36 3.360138566057e-05 37 KSP unpreconditioned resid norm 1.963635751089e-05 true resid norm 1.963635751089e-05 ||r(i)||/||b|| 1.551123712537e-05 My monitor 37 1.963635751179e-05 38 KSP unpreconditioned resid norm 1.111922577034e-05 true resid norm 1.111922577034e-05 ||r(i)||/||b|| 8.783347292320e-06 My monitor 38 1.111922577016e-05 Is this the type of discrepancy you're seeing in your code, or are you seeing enormous differences right off the bat? The discrepancy shown above is normal. It arises because KSPSolve_CG() uses PetscCall(VecAXPY(X, a, P)); /* x <- x + ap */ PetscCall(VecAXPY(R, -a, W)); /* r <- r - aw */ to update the solution and the residual. Where W has been computed further up in the code as A*P. If I change KSPSolve_CG() to instead compute R = b - A X explicitly (attached)? 
then the output becomes 32 KSP unpreconditioned resid norm 5.638897702486e-04 true resid norm 5.638897702486e-04 ||r(i)||/||b|| 4.454302654678e-04 My monitor 32 5.638897702486e-04 33 KSP unpreconditioned resid norm 2.557920120698e-04 true resid norm 2.557920120698e-04 ||r(i)||/||b|| 2.020563412431e-04 My monitor 33 2.557920120698e-04 34 KSP unpreconditioned resid norm 1.249567288163e-04 true resid norm 1.249567288163e-04 ||r(i)||/||b|| 9.870636394784e-05 My monitor 34 1.249567288163e-04 35 KSP unpreconditioned resid norm 6.554146400720e-05 true resid norm 6.554146400720e-05 ||r(i)||/||b|| 5.177279896212e-05 My monitor 35 6.554146400720e-05 36 KSP unpreconditioned resid norm 3.360138566157e-05 true resid norm 3.360138566157e-05 ||r(i)||/||b|| 2.654255303962e-05 My monitor 36 3.360138566157e-05 37 KSP unpreconditioned resid norm 1.963635751099e-05 true resid norm 1.963635751099e-05 ||r(i)||/||b|| 1.551123712545e-05 My monitor 37 1.963635751099e-05 38 KSP unpreconditioned resid norm 1.111922577015e-05 true resid norm 1.111922577015e-05 ||r(i)||/||b|| 8.783347292168e-06 My monitor 38 1.111922577015e-05 Now the residual norm printed by -ksp_monitor and MyMonitor are the same to all digits for all iterations. KSPSolve_CG() uses R = R - a W = R - a (A * P) instead of R = B - A*X because it saves a matrix-vector multiply per iteration (generally, for CG the matrix-vector multiply dominates the solution time). One final note. How come "The convergence test is consistent with the 2-norm of KSPBuildResidual" but not computed explicitly from KSPBuildSolution()? That is because the KSPBuildResidual() routine cheats PETSC_INTERN PetscErrorCode KSPBuildResidual_CG(KSP ksp, Vec t, Vec v, Vec *V) { PetscFunctionBegin; PetscCall(VecCopy(ksp->work[0], v)); *V = v; It knows from the KSPSolve_CG() code where the computed residual is stored and gives you that (slightly incorrect :-) one. Now some Krylov methods do not explicitly store (or even compute) the residual vector so they explicitly compute it with PetscErrorCode KSPBuildResidualDefault(KSP ksp, Vec t, Vec v, Vec *V) { Mat Amat, Pmat; PetscFunctionBegin; if (!ksp->pc) PetscCall(KSPGetPC(ksp, &ksp->pc)); PetscCall(PCGetOperators(ksp->pc, &Amat, &Pmat)); PetscCall(KSPBuildSolution(ksp, t, NULL)); PetscCall(KSP_MatMult(ksp, Amat, t, v)); PetscCall(VecAYPX(v, -1.0, ksp->vec_rhs)); *V = v; PetscFunctionReturn(PETSC_SUCCESS); } Barry This is an interesting phenomenon that often is not discussed in elementary introductions to Krylov methods (or even in advanced discussions), so I think I will write an FAQ that explains it for petsc.org > On Nov 3, 2025, at 12:39?PM, Moral Sanchez, Elena wrote: > > Hi, > I am running CG with a Jacobi preconditioner. I have a monitor function that prints the residual and saves the solution at every iteration. To get the solution at every iteration, I am using the function KSPBuildSolution. I am setting the KSP norm as UNPRECONDITIONED. > > The convergence test is consistent with the 2-norm of KSPBuildResidual. However this norm does not match the 2-norm of the residual computed explicitly from the solution (obtained with KSPBuildSolution). It also does not match the preconditioned norm. What norm is it computing? > > When the CG is not preconditioned, the norm of KSPBuildResidual and the norm of the residual computed from the solution match, as I expected. 
> > This is KSPView(): > > KSP Object: 1 MPI process > type: cg > variant HERMITIAN > maximum iterations=100, nonzero initial guess > tolerances: relative=1e-08, absolute=1e-08, divergence=10000. > left preconditioning > using UNPRECONDITIONED norm type for convergence test > PC Object: 1 MPI process > type: jacobi > type DIAGONAL > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: nest > rows=524, cols=524 > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=176, cols=176 > (0,1) : NULL > (0,2) : NULL > (1,0) : NULL > (1,1) : type=mpiaij, rows=172, cols=172 > (1,2) : NULL > (2,0) : NULL > (2,1) : NULL > (2,2) : type=mpiaij, rows=176, cols=176 > > Cheers, > Elena Moral S?nchez -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex5.c Type: application/octet-stream Size: 38511 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cg.c Type: application/octet-stream Size: 30555 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From prateekgupta1709 at gmail.com Tue Nov 4 02:09:52 2025 From: prateekgupta1709 at gmail.com (prateekgupta1709 at gmail.com) Date: Tue, 4 Nov 2025 13:39:52 +0530 Subject: [petsc-users] Segmentation violation in DMSwarmMigrate(sdm_, PETSC_TRUE) Message-ID: <054F3C46-1719-4A52-9CFF-8CDBB9D46B76@gmail.com> Hi, I am writing a basic particle tracking code with an existing periodic DMDA. All velocity calculations and position updates work fine but migration throws segfault. I have checked and particles are within the periodic box as well. I discovered that there is an internal field for coordinates which has to be defined using its identifier instead of explicitly registering a field. I am not registering any velocity field (using external vectors). Is there something I am missing? Thanks, Prateek From prateekgupta1709 at gmail.com Tue Nov 4 02:45:49 2025 From: prateekgupta1709 at gmail.com (prateekgupta1709 at gmail.com) Date: Tue, 4 Nov 2025 14:15:49 +0530 Subject: [petsc-users] Segmentation violation in DMSwarmMigrate(sdm_, PETSC_TRUE) In-Reply-To: <054F3C46-1719-4A52-9CFF-8CDBB9D46B76@gmail.com> References: <054F3C46-1719-4A52-9CFF-8CDBB9D46B76@gmail.com> Message-ID: <0BFB7586-0108-47EA-98F8-F9EC0ED21A6D@gmail.com> Forgot to add, even with PETSC_FALSE option, I get the same segfault error. > > On 4 Nov 2025, at 1:39?PM, prateekgupta1709 at gmail.com wrote: > > ?Hi, > I am writing a basic particle tracking code with an existing periodic DMDA. All velocity calculations and position updates work fine but migration throws segfault. I have checked and particles are within the periodic box as well. > > I discovered that there is an internal field for coordinates which has to be defined using its identifier instead of explicitly registering a field. I am not registering any velocity field (using external vectors). > > Is there something I am missing? 
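(For reference, a minimal DMSwarm PIC setup against an existing cell DM usually follows the pattern sketched below; this is a generic illustration, not the poster's code. It assumes #include <petscdmswarm.h>, an existing DMDA called dmda, and a local particle count npoints; the "velocity" field is optional and only shown as an example of a user-registered field. The important details are that the DMSWARM_PIC type and the cell DM are set before DMSwarmFinalizeFieldRegister(), and that particle positions live in the built-in DMSwarmPICField_coor field rather than in a user-registered one.)

  DM         sw;
  PetscReal *coor;
  PetscCall(DMCreate(PETSC_COMM_WORLD, &sw));
  PetscCall(DMSetType(sw, DMSWARM));
  PetscCall(DMSetDimension(sw, 3));
  PetscCall(DMSwarmSetType(sw, DMSWARM_PIC));
  PetscCall(DMSwarmSetCellDM(sw, dmda));                                       /* the existing (periodic) DMDA */
  PetscCall(DMSwarmRegisterPetscDatatypeField(sw, "velocity", 3, PETSC_REAL)); /* optional user field */
  PetscCall(DMSwarmFinalizeFieldRegister(sw));
  PetscCall(DMSwarmSetLocalSizes(sw, npoints, 4));
  PetscCall(DMSwarmGetField(sw, DMSwarmPICField_coor, NULL, NULL, (void **)&coor));
  /* ... fill coor[3*p + d] with the position of local particle p ... */
  PetscCall(DMSwarmRestoreField(sw, DMSwarmPICField_coor, NULL, NULL, (void **)&coor));
  PetscCall(DMSwarmMigrate(sw, PETSC_TRUE)); /* move particles to the ranks that own their cells */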
> > Thanks, > Prateek From Elena.Moral.Sanchez at ipp.mpg.de Tue Nov 4 03:06:07 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 4 Nov 2025 09:06:07 +0000 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> Message-ID: Dear Barry, Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. With the flags -ksp_monitor_true_residual -ksp_norm_type unpreconditioned this is the output: 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 and this is the output of my own monitor function: Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.5568889644229376 PRECONDITIONED norm: 2.049041078011257 KSPBuildResidual 2-norm: 0.5568889644229299 difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.7661983589104541 PRECONDITIONED norm: 2.7387602134717137 KSPBuildResidual 2-norm: 0.2831772665189212 difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.7050518160900253 PRECONDITIONED norm: 2.421773833445645 KSPBuildResidual 2-norm: 0.18759500941469456 difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 Here u is the vector in the KSPSolve. After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. They are the same when I run the same code without preconditioner. Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. Thanks for the help. Cheers, Elena On 11/4/25 03:01, Barry Smith wrote: 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 My monitor 0 1.265943996096e+00 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Elena.Moral.Sanchez at ipp.mpg.de Tue Nov 4 05:09:21 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 4 Nov 2025 11:09:21 +0000 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev>, Message-ID: Dear Barry, I just realized that the operator from KSPGetOperators()[0] does not match my operator. In fact, KSPGetOperators()[0] returns the operator used for the preconditioner. This explains all the issues I reported. 
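(A quick way to confirm which matrices the KSP will actually use, sketched here with the C interface purely for illustration (the petsc4py equivalent is ksp.getOperators()), is to query the KSP after all the setup calls:

  Mat     Amat, Pmat;
  MatType atype, ptype;
  PetscCall(KSPGetOperators(ksp, &Amat, &Pmat));
  PetscCall(MatGetType(Amat, &atype));
  PetscCall(MatGetType(Pmat, &ptype));
  PetscCall(PetscPrintf(PETSC_COMM_WORLD, "Amat type: %s, Pmat type: %s\n", atype, ptype));

Here Amat is the operator applied in the Krylov method and Pmat is the matrix the preconditioner is built from.)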
The way I was setting up the KSP is ksp.setOperators(ksp, lhs, None) precond = PETSc.PC().create(comm=comm) precond.setType(PETSc.PC.Type.JACOBI) precond.setOperators(A=nest_mass_matrix, P=None) precond.setUp() ksp.setPC(precond) ksp.setUp() This turns out to behave as ksp.setOperators(A=nest_mass_matrix, P=nest_mass_matrix) precond = ksp.getPC() precond.setType(PETSc.PC.Type.JACOBI) ksp.setUp() I managed to set up what I want with ksp.setOperators(A=lhs, P=nest_mass_matrix) precond = ksp.getPC() precond.setType(PETSc.PC.Type.JACOBI) ksp.setUp() Here I am solving lhs x = rhs, lhs is a matrix-free operator (python type) and my preconditioner is the diagonal of nest_mass_matrix, which is of type nest. I find this extremely confusing, especially because from the output of KSPView I could not detect the problem. For illustration, after the fix, PCView prints Mat Object: 1 MPI process type: python Python: __main__.LHSOperator Mat Object: 1 MPI process type: nest Matrix object: type=nest, rows=3, cols=3 MatNest structure: (0,0) : type=mpiaij, rows=176, cols=176 (0,1) : NULL (0,2) : NULL (1,0) : NULL (1,1) : type=mpiaij, rows=172, cols=172 (1,2) : NULL (2,0) : NULL (2,1) : NULL (2,2) : type=mpiaij, rows=176, cols=176 This brings me to the question: is the Jacobi preconditioner using the operator lhs or the nested operator? I think that the output of PCView and KSPView should be more clear on this. Cheers, Elena ________________________________ From: Moral Sanchez, Elena Sent: 04 November 2025 10:06:07 To: Barry Smith Cc: PETSc Subject: Re: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution Dear Barry, Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. With the flags -ksp_monitor_true_residual -ksp_norm_type unpreconditioned this is the output: 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 and this is the output of my own monitor function: Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.5568889644229376 PRECONDITIONED norm: 2.049041078011257 KSPBuildResidual 2-norm: 0.5568889644229299 difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.7661983589104541 PRECONDITIONED norm: 2.7387602134717137 KSPBuildResidual 2-norm: 0.2831772665189212 difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.7050518160900253 PRECONDITIONED norm: 2.421773833445645 KSPBuildResidual 2-norm: 0.18759500941469456 difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 Here u is the vector in the KSPSolve. After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. They are the same when I run the same code without preconditioner. Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? 
By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. Thanks for the help. Cheers, Elena On 11/4/25 03:01, Barry Smith wrote: 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 My monitor 0 1.265943996096e+00 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Nov 4 08:36:15 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 4 Nov 2025 09:36:15 -0500 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> Message-ID: > On Nov 4, 2025, at 4:06?AM, Moral Sanchez, Elena wrote: > > Dear Barry, > Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. With the flags > -ksp_monitor_true_residual -ksp_norm_type unpreconditioned > this is the output: > 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 > 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 > and this is the output of my own monitor function: > Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.5568889644229376 > PRECONDITIONED norm: 2.049041078011257 > KSPBuildResidual 2-norm: 0.5568889644229299 > difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 > > Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.7661983589104541 > PRECONDITIONED norm: 2.7387602134717137 > KSPBuildResidual 2-norm: 0.2831772665189212 > difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 > > Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.7050518160900253 > PRECONDITIONED norm: 2.421773833445645 > KSPBuildResidual 2-norm: 0.18759500941469456 > difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 > Here u is the vector in the KSPSolve. > After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. They are the same when I run the same code without preconditioner. > Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? > > By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. I thought I attached it to the email. > Thanks for the help. > Cheers, > Elena > > > > > > On 11/4/25 03:01, Barry Smith wrote: >> 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 >> My monitor 0 1.265943996096e+00 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Nov 4 08:43:57 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 4 Nov 2025 09:43:57 -0500 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> Message-ID: Are you sure your matrix is symmetric, positive definite, and that the sign of all the diagonal entries is the same? You can run with -ksp_view_mat binary -ksp_view_rhs binary. This will produce a file called binaryoutput, you can email that file. Barry > On Nov 4, 2025, at 4:06?AM, Moral Sanchez, Elena wrote: > > Dear Barry, > Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. With the flags > -ksp_monitor_true_residual -ksp_norm_type unpreconditioned > this is the output: > 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 > 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 > and this is the output of my own monitor function: > Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.5568889644229376 > PRECONDITIONED norm: 2.049041078011257 > KSPBuildResidual 2-norm: 0.5568889644229299 > difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 > > Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.7661983589104541 > PRECONDITIONED norm: 2.7387602134717137 > KSPBuildResidual 2-norm: 0.2831772665189212 > difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 > > Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.7050518160900253 > PRECONDITIONED norm: 2.421773833445645 > KSPBuildResidual 2-norm: 0.18759500941469456 > difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 > Here u is the vector in the KSPSolve. > After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. They are the same when I run the same code without preconditioner. > Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? > > By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. > Thanks for the help. > Cheers, > Elena > > > > > > On 11/4/25 03:01, Barry Smith wrote: >> 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 >> My monitor 0 1.265943996096e+00 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Nov 4 09:08:21 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 4 Nov 2025 10:08:21 -0500 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> Message-ID: <3197B431-68DD-437E-997B-7E8276D1479B@petsc.dev> The preconditioner is always built from the second of the two matrices passed in KSPSetOperators() or PCSetOperators(). In the first case below both matrices are the same (the nest matrix) and so diagonal is extracted from the nest matrix (which is the only matrix). In the second case below the first matrix is custom and the second is the next matrix, again the preconditioner is constructed from the second one. linear system matrix = precond matrix: Mat Object: 1 MPI process type: nest rows=524, cols=524 Matrix object: type=nest, rows=3, cols=3 MatNest structure: (0,0) : type=mpiaij, rows=176, cols=176 (0,1) : NULL (0,2) : NULL (1,0) : NULL (1,1) : type=mpiaij, rows=172, cols=172 (1,2) : NULL (2,0) : NULL (2,1) : NULL (2,2) : type=mpiaij, rows=176, cols=176 Mat Object: 1 MPI process type: python Python: __main__.LHSOperator Mat Object: 1 MPI process type: nest Matrix object: type=nest, rows=3, cols=3 MatNest structure: (0,0) : type=mpiaij, rows=176, cols=176 (0,1) : NULL (0,2) : NULL (1,0) : NULL (1,1) : type=mpiaij, rows=172, cols=172 (1,2) : NULL (2,0) : NULL (2,1) : NULL (2,2) : type=mpiaij, rows=176, cols=176 The arguments to KSPSetOperators() and PCSetOperators() are stored in the same place (inside the PC) In fact, PetscErrorCode KSPSetOperators(KSP ksp, Mat Amat, Mat Pmat) { .... PetscCall(PCSetOperators(ksp->pc, Amat, Pmat)); so when you call precond.setOperators(A=nest_mass_matrix, P=None) ksp.setPC(precond) you are overwriting the A = lhs that you passed into ksp.setOperators(ksp, lhs, None). Given the names ksp.setOperators() and precond.setOperators() I can see how this can be confusing. It is reasonable to conclude that ksp.setOperators is providing the linear system and that pc.setOperators() is providing the matrix with which to build the preconditioner, but that is not the case. But the code has been this way for 31 years. I am not sure what we can change with the documentation. Perhaps in KSPSetOperators it can say that it sets them into the PC. Barry > On Nov 4, 2025, at 4:06?AM, Moral Sanchez, Elena wrote: > > Dear Barry, > Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. 
With the flags > -ksp_monitor_true_residual -ksp_norm_type unpreconditioned > this is the output: > 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 > 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 > 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 > and this is the output of my own monitor function: > Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.5568889644229376 > PRECONDITIONED norm: 2.049041078011257 > KSPBuildResidual 2-norm: 0.5568889644229299 > difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 > > Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.7661983589104541 > PRECONDITIONED norm: 2.7387602134717137 > KSPBuildResidual 2-norm: 0.2831772665189212 > difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 > > Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s > difference KSPBuildSolution and u: 0.0 > UNPRECONDITIONED norm: 0.7050518160900253 > PRECONDITIONED norm: 2.421773833445645 > KSPBuildResidual 2-norm: 0.18759500941469456 > difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 > Here u is the vector in the KSPSolve. > After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. They are the same when I run the same code without preconditioner. > Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? > > By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. > Thanks for the help. > Cheers, > Elena > > > > > > On 11/4/25 03:01, Barry Smith wrote: >> 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 >> My monitor 0 1.265943996096e+00 -------------- next part -------------- An HTML attachment was scrubbed... URL: From Elena.Moral.Sanchez at ipp.mpg.de Tue Nov 4 10:14:00 2025 From: Elena.Moral.Sanchez at ipp.mpg.de (Moral Sanchez, Elena) Date: Tue, 4 Nov 2025 16:14:00 +0000 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: <3197B431-68DD-437E-997B-7E8276D1479B@petsc.dev> References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> , <3197B431-68DD-437E-997B-7E8276D1479B@petsc.dev> Message-ID: <74096301abcb458bbd82fc6adace2e6d@ipp.mpg.de> Dear Barry, Thank you for the clear answer. Indeed, it would be useful to know that they are both stored in PC, that PCSetOperators may overwrite KSPSetOperators and that the order (1 for A, 2 for PC) is the order that appears in KSPView. In any case, it is clear to me now. Thanks for the help! Cheers, Elena ________________________________ From: Barry Smith Sent: 04 November 2025 16:08:21 To: Moral Sanchez, Elena Cc: PETSc Subject: Re: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution The preconditioner is always built from the second of the two matrices passed in KSPSetOperators() or PCSetOperators(). 
In the first case below both matrices are the same (the nest matrix) and so diagonal is extracted from the nest matrix (which is the only matrix). In the second case below the first matrix is custom and the second is the next matrix, again the preconditioner is constructed from the second one. linear system matrix = precond matrix: Mat Object: 1 MPI process type: nest rows=524, cols=524 Matrix object: type=nest, rows=3, cols=3 MatNest structure: (0,0) : type=mpiaij, rows=176, cols=176 (0,1) : NULL (0,2) : NULL (1,0) : NULL (1,1) : type=mpiaij, rows=172, cols=172 (1,2) : NULL (2,0) : NULL (2,1) : NULL (2,2) : type=mpiaij, rows=176, cols=176 Mat Object: 1 MPI process type: python Python: __main__.LHSOperator Mat Object: 1 MPI process type: nest Matrix object: type=nest, rows=3, cols=3 MatNest structure: (0,0) : type=mpiaij, rows=176, cols=176 (0,1) : NULL (0,2) : NULL (1,0) : NULL (1,1) : type=mpiaij, rows=172, cols=172 (1,2) : NULL (2,0) : NULL (2,1) : NULL (2,2) : type=mpiaij, rows=176, cols=176 The arguments to KSPSetOperators() and PCSetOperators() are stored in the same place (inside the PC) In fact, PetscErrorCode KSPSetOperators(KSP ksp, Mat Amat, Mat Pmat) { .... PetscCall(PCSetOperators(ksp->pc, Amat, Pmat)); so when you call precond.setOperators(A=nest_mass_matrix, P=None) ksp.setPC(precond) you are overwriting the A = lhs that you passed into ksp.setOperators(ksp, lhs, None). Given the names ksp.setOperators() and precond.setOperators() I can see how this can be confusing. It is reasonable to conclude that ksp.setOperators is providing the linear system and that pc.setOperators() is providing the matrix with which to build the preconditioner, but that is not the case. But the code has been this way for 31 years. I am not sure what we can change with the documentation. Perhaps in KSPSetOperators it can say that it sets them into the PC. Barry On Nov 4, 2025, at 4:06?AM, Moral Sanchez, Elena wrote: Dear Barry, Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. With the flags -ksp_monitor_true_residual -ksp_norm_type unpreconditioned this is the output: 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 and this is the output of my own monitor function: Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.5568889644229376 PRECONDITIONED norm: 2.049041078011257 KSPBuildResidual 2-norm: 0.5568889644229299 difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.7661983589104541 PRECONDITIONED norm: 2.7387602134717137 KSPBuildResidual 2-norm: 0.2831772665189212 difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s difference KSPBuildSolution and u: 0.0 UNPRECONDITIONED norm: 0.7050518160900253 PRECONDITIONED norm: 2.421773833445645 KSPBuildResidual 2-norm: 0.18759500941469456 difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 Here u is the vector in the KSPSolve. After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. 
They are the same when I run the same code without preconditioner. Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. Thanks for the help. Cheers, Elena On 11/4/25 03:01, Barry Smith wrote: 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 My monitor 0 1.265943996096e+00 -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Tue Nov 4 20:10:36 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 4 Nov 2025 21:10:36 -0500 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: <74096301abcb458bbd82fc6adace2e6d@ipp.mpg.de> References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> <3197B431-68DD-437E-997B-7E8276D1479B@petsc.dev> <74096301abcb458bbd82fc6adace2e6d@ipp.mpg.de> Message-ID: <98DD57FA-A301-4B5F-AFFD-041465090438@petsc.dev> https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8825__;!!G_uCfscf7eWS!bN5mKdIPW5Qb1phs3E5csHyDe8rTEZ1Wg5FYGcuTnKoIRKJqNhC4ydBAK9k4w-OY6HGXdZdeTCdxXp3afWUypHg$ > On Nov 4, 2025, at 11:14?AM, Moral Sanchez, Elena wrote: > > Dear Barry, > Thank you for the clear answer. Indeed, it would be useful to know that they are both stored in PC, that PCSetOperators may overwrite KSPSetOperators and that the order (1 for A, 2 for PC) is the order that appears in KSPView. > > In any case, it is clear to me now. Thanks for the help! > Cheers, > Elena > From: Barry Smith > > Sent: 04 November 2025 16:08:21 > To: Moral Sanchez, Elena > Cc: PETSc > Subject: Re: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution > > The preconditioner is always built from the second of the two matrices passed in KSPSetOperators() or PCSetOperators(). In the first case below both matrices are the same (the nest matrix) and so diagonal is extracted from the nest matrix (which is the only matrix). In the second case below the first matrix is custom and the second is the next matrix, again the preconditioner is constructed from the second one. > > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: nest > rows=524, cols=524 > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=176, cols=176 > (0,1) : NULL > (0,2) : NULL > (1,0) : NULL > (1,1) : type=mpiaij, rows=172, cols=172 > (1,2) : NULL > (2,0) : NULL > (2,1) : NULL > (2,2) : type=mpiaij, rows=176, cols=176 > > Mat Object: 1 MPI process > type: python > Python: __main__.LHSOperator > Mat Object: 1 MPI process > type: nest > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=176, cols=176 > (0,1) : NULL > (0,2) : NULL > (1,0) : NULL > (1,1) : type=mpiaij, rows=172, cols=172 > (1,2) : NULL > (2,0) : NULL > (2,1) : NULL > (2,2) : type=mpiaij, rows=176, cols=176 > > > The arguments to KSPSetOperators() and PCSetOperators() are stored in the same place (inside the PC) > > In fact, > > PetscErrorCode KSPSetOperators(KSP ksp, Mat Amat, Mat Pmat) > { > .... 
> PetscCall(PCSetOperators(ksp->pc, Amat, Pmat)); > > > > so when you call > > precond.setOperators(A=nest_mass_matrix, P=None) > ksp.setPC(precond) > > you are overwriting the A = lhs that you passed into ksp.setOperators(ksp, lhs, None). > > Given the names ksp.setOperators() and precond.setOperators() I can see how this can be confusing. It is reasonable to conclude that ksp.setOperators is providing the linear system and that pc.setOperators() is providing the matrix with which to build the preconditioner, but that is not the case. > > But the code has been this way for 31 years. I am not sure what we can change with the documentation. Perhaps in KSPSetOperators it can say that it sets them into the PC. > > Barry > > > > >> On Nov 4, 2025, at 4:06?AM, Moral Sanchez, Elena > wrote: >> >> Dear Barry, >> Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. With the flags >> -ksp_monitor_true_residual -ksp_norm_type unpreconditioned >> this is the output: >> 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 >> 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 >> and this is the output of my own monitor function: >> Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s >> difference KSPBuildSolution and u: 0.0 >> UNPRECONDITIONED norm: 0.5568889644229376 >> PRECONDITIONED norm: 2.049041078011257 >> KSPBuildResidual 2-norm: 0.5568889644229299 >> difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 >> >> Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s >> difference KSPBuildSolution and u: 0.0 >> UNPRECONDITIONED norm: 0.7661983589104541 >> PRECONDITIONED norm: 2.7387602134717137 >> KSPBuildResidual 2-norm: 0.2831772665189212 >> difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 >> >> Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s >> difference KSPBuildSolution and u: 0.0 >> UNPRECONDITIONED norm: 0.7050518160900253 >> PRECONDITIONED norm: 2.421773833445645 >> KSPBuildResidual 2-norm: 0.18759500941469456 >> difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 >> Here u is the vector in the KSPSolve. >> After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. They are the same when I run the same code without preconditioner. >> Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? >> >> By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. >> Thanks for the help. >> Cheers, >> Elena >> >> >> >> >> >> On 11/4/25 03:01, Barry Smith wrote: >>> 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 >>> My monitor 0 1.265943996096e+00 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Nov 4 20:58:40 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 4 Nov 2025 21:58:40 -0500 Subject: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution In-Reply-To: <74096301abcb458bbd82fc6adace2e6d@ipp.mpg.de> References: <4a3cebd0f2494a26b49e8b1b19595433@ipp.mpg.de> <3A29735F-9AB3-49CC-9ECB-7576B563949C@petsc.dev> <3197B431-68DD-437E-997B-7E8276D1479B@petsc.dev> <74096301abcb458bbd82fc6adace2e6d@ipp.mpg.de> Message-ID: <946FAEEA-83D0-480C-8BB5-487BBB4F36E0@petsc.dev> https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8826__;!!G_uCfscf7eWS!diDfPQOitwJEJCtXeieGgQxBQzXqEaYV-O_91c4jZTPklWudrcBr8bBjZGioRCn5h4NI49X6mRTD-BCf0N0RWKU$ ? Fix terminology for Pmat in KSPView output. (!8826) ? Merge requests ? PETSc / petsc ? GitLab gitlab.com > On Nov 4, 2025, at 11:14?AM, Moral Sanchez, Elena wrote: > > Dear Barry, > Thank you for the clear answer. Indeed, it would be useful to know that they are both stored in PC, that PCSetOperators may overwrite KSPSetOperators and that the order (1 for A, 2 for PC) is the order that appears in KSPView. > > In any case, it is clear to me now. Thanks for the help! > Cheers, > Elena > From: Barry Smith > > Sent: 04 November 2025 16:08:21 > To: Moral Sanchez, Elena > Cc: PETSc > Subject: Re: [petsc-users] norm of KSPBuildResidual does not match norm computed from KSPBuildSolution > > The preconditioner is always built from the second of the two matrices passed in KSPSetOperators() or PCSetOperators(). In the first case below both matrices are the same (the nest matrix) and so diagonal is extracted from the nest matrix (which is the only matrix). In the second case below the first matrix is custom and the second is the next matrix, again the preconditioner is constructed from the second one. > > linear system matrix = precond matrix: > Mat Object: 1 MPI process > type: nest > rows=524, cols=524 > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=176, cols=176 > (0,1) : NULL > (0,2) : NULL > (1,0) : NULL > (1,1) : type=mpiaij, rows=172, cols=172 > (1,2) : NULL > (2,0) : NULL > (2,1) : NULL > (2,2) : type=mpiaij, rows=176, cols=176 > > Mat Object: 1 MPI process > type: python > Python: __main__.LHSOperator > Mat Object: 1 MPI process > type: nest > Matrix object: > type=nest, rows=3, cols=3 > MatNest structure: > (0,0) : type=mpiaij, rows=176, cols=176 > (0,1) : NULL > (0,2) : NULL > (1,0) : NULL > (1,1) : type=mpiaij, rows=172, cols=172 > (1,2) : NULL > (2,0) : NULL > (2,1) : NULL > (2,2) : type=mpiaij, rows=176, cols=176 > > > The arguments to KSPSetOperators() and PCSetOperators() are stored in the same place (inside the PC) > > In fact, > > PetscErrorCode KSPSetOperators(KSP ksp, Mat Amat, Mat Pmat) > { > .... > PetscCall(PCSetOperators(ksp->pc, Amat, Pmat)); > > > > so when you call > > precond.setOperators(A=nest_mass_matrix, P=None) > ksp.setPC(precond) > > you are overwriting the A = lhs that you passed into ksp.setOperators(ksp, lhs, None). > > Given the names ksp.setOperators() and precond.setOperators() I can see how this can be confusing. It is reasonable to conclude that ksp.setOperators is providing the linear system and that pc.setOperators() is providing the matrix with which to build the preconditioner, but that is not the case. > > But the code has been this way for 31 years. I am not sure what we can change with the documentation. Perhaps in KSPSetOperators it can say that it sets them into the PC. 
> > Barry > > > > >> On Nov 4, 2025, at 4:06 AM, Moral Sanchez, Elena > wrote: >> >> Dear Barry, >> Thanks for the fast answer. Unfortunately in my case the discrepancy is huge. With the flags >> -ksp_monitor_true_residual -ksp_norm_type unpreconditioned >> this is the output: >> 0 KSP unpreconditioned resid norm 5.568889644229e-01 true resid norm 5.568889644229e-01 ||r(i)||/||b|| 1.000000000000e+00 >> 1 KSP unpreconditioned resid norm 2.831772665189e-01 true resid norm 2.831772665189e-01 ||r(i)||/||b|| 5.084986139245e-01 >> 2 KSP unpreconditioned resid norm 1.875950094147e-01 true resid norm 1.875950094147e-01 ||r(i)||/||b|| 3.368625011435e-01 >> and this is the output of my own monitor function: >> Iter 0/10 | res = 5.57e-01/1.00e-08 | 0.0 s >> difference KSPBuildSolution and u: 0.0 >> UNPRECONDITIONED norm: 0.5568889644229376 >> PRECONDITIONED norm: 2.049041078011257 >> KSPBuildResidual 2-norm: 0.5568889644229299 >> difference KSPBuildResidual and b-A(KSPBuildSolution): 6.573603152700697e-13 >> >> Iter 1/10 | res = 2.83e-01/1.00e-08 | 0.0 s >> difference KSPBuildSolution and u: 0.0 >> UNPRECONDITIONED norm: 0.7661983589104541 >> PRECONDITIONED norm: 2.7387602134717137 >> KSPBuildResidual 2-norm: 0.2831772665189212 >> difference KSPBuildResidual and b-A(KSPBuildSolution): 0.1700718741085172 >> >> Iter 2/10 | res = 1.88e-01/1.00e-08 | 0.0 s >> difference KSPBuildSolution and u: 0.0 >> UNPRECONDITIONED norm: 0.7050518160900253 >> PRECONDITIONED norm: 2.421773833445645 >> KSPBuildResidual 2-norm: 0.18759500941469456 >> difference KSPBuildResidual and b-A(KSPBuildSolution): 0.19327058976599623 >> Here u is the vector in the KSPSolve. >> After the first iteration, the residual computed from KSPBuildSolution and the residual from KSPBuildResidual diverge. They are the same when I run the same code without preconditioner. >> Another observation is that after convergence (wrt. unpreconditioned norm == 2-norm of KSPBuildResidual) the solution with and without preconditioner looks quite different. How is this possible if my preconditioner is SPD? >> >> By the way, where can I find your implementation of "My monitor" in src/snes/tutorials/ex5.c? I tried to look at the Gitlab repository but could not find it. >> Thanks for the help. >> Cheers, >> Elena >> >> >> >> >> >> On 11/4/25 03:01, Barry Smith wrote: >>> 0 KSP unpreconditioned resid norm 1.265943996096e+00 true resid norm 1.265943996096e+00 ||r(i)||/||b|| 1.000000000000e+00 >>> My monitor 0 1.265943996096e+00

From sale987 at live.com Wed Nov 5 08:17:32 2025 From: sale987 at live.com (Samuele Ferri) Date: Wed, 5 Nov 2025 14:17:32 +0000 Subject: [petsc-users] Two SNES on the same DM not working Message-ID: Dear petsc users, in petsc version 3.24, I'm trying to create two snes over the same DM, but with different functions and jacobians. Despite making different calls to SNESSetFunction it happens the second snes uses the same function of the first. Can you help me finding the problem, please?
Here below there is a minimal working example showing the issue:

static char help[] = "Test SNES.\n";
/* the #include targets were lost in the archive; these headers cover the calls below */
#include <petscdm.h>
#include <petscdmda.h>
#include <petscsnes.h>

PetscErrorCode Jac_1(SNES snes, Vec x, Mat J, Mat B, void *){
    PetscFunctionBegin;
    printf("Jac 1\n");
    PetscFunctionReturn(PETSC_SUCCESS);
}

PetscErrorCode Function_1(SNES snes, Vec x, Vec f, void *){
    PetscFunctionBegin;
    printf("Function 1\n");
    PetscFunctionReturn(PETSC_SUCCESS);
}

PetscErrorCode Jac_2(SNES snes, Vec x, Mat J, Mat B, void *){
    PetscFunctionBegin;
    printf("Jac 2\n");
    PetscFunctionReturn(PETSC_SUCCESS);
}

PetscErrorCode Function_2(SNES snes, Vec x, Vec f, void *){
    PetscFunctionBegin;
    printf("Function 2\n");
    PetscFunctionReturn(PETSC_SUCCESS);
}

int main(int argc, char **argv) {

    PetscFunctionBeginUser;
    PetscCall(PetscInitialize(&argc, &argv, NULL, help));

    DM dm;
    PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 100, 1, 1, NULL, &dm));
    PetscCall(DMSetFromOptions(dm));
    PetscCall(DMSetUp(dm));

    SNES snes1, snes2;
    Vec r1, r2;
    Mat J1, J2;

    PetscCall(DMCreateGlobalVector(dm, &r1));
    PetscCall(DMCreateGlobalVector(dm, &r2));
    PetscCall(DMCreateMatrix(dm, &J1));
    PetscCall(DMCreateMatrix(dm, &J2));

    PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes1));
    PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes2));
    PetscCall(SNESSetType(snes1, SNESNEWTONLS));
    PetscCall(SNESSetType(snes2, SNESNEWTONLS));
    PetscCall(SNESSetFromOptions(snes1));
    PetscCall(SNESSetFromOptions(snes2));
    PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL));
    PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL));
    PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL));
    PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL));
    PetscCall(SNESSetDM(snes1, dm));
    PetscCall(SNESSetDM(snes2, dm));

    PetscCall(SNESSolve(snes1, NULL, NULL));
    PetscCall(SNESSolve(snes2, NULL, NULL));

    printf("snes1 %p; snes2 %p\n", snes1, snes2);

    SNESFunctionFn *p;
    PetscCall(SNESGetFunction(snes1, NULL, &p, NULL));
    printf("snes1: pointer %p, true function %p\n", *p, Function_1);
    PetscCall(SNESGetFunction(snes2, NULL, &p, NULL));
    printf("snes2: pointer %p, true function %p\n", *p, Function_2);

    PetscCall(PetscFinalize());
    PetscFunctionReturn(PETSC_SUCCESS);
}
-------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 5 08:47:27 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 5 Nov 2025 09:47:27 -0500 Subject: [petsc-users] Two SNES on the same DM not working In-Reply-To: References: Message-ID: This is not supported. Duplicate your DM. > On Nov 5, 2025, at 9:17 AM, Samuele Ferri wrote: > > Dear petsc users, > > in petsc version 3.24, I'm trying to create two snes over the same DM, but with different functions and jacobians. Despite making different calls to SNESSetFunction it happens the second snes uses the same function of the first. > Can you help me finding the problem, please?
> > Here below there is a minimal working example showing the issue: > > static char help[] = "Test SNES.\n"; > #include > #include > #include > > PetscErrorCode Jac_1(SNES snes, Vec x, Mat J, Mat B, void *){ > PetscFunctionBegin; > printf("Jac 1\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > PetscErrorCode Function_1(SNES snes, Vec x, Vec f, void *){ > PetscFunctionBegin; > printf("Function 1\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > PetscErrorCode Jac_2(SNES snes, Vec x, Mat J, Mat B, void *){ > PetscFunctionBegin; > printf("Jac 2\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > PetscErrorCode Function_2(SNES snes, Vec x, Vec f, void *){ > PetscFunctionBegin; > printf("Function 2\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > int main(int argc, char **argv) { > > PetscFunctionBeginUser; > PetscCall(PetscInitialize(&argc, &argv, NULL, help)); > > DM dm; > PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 100, 1, 1, NULL, &dm)); > PetscCall(DMSetFromOptions(dm)); > PetscCall(DMSetUp(dm)); > > SNES snes1, snes2; > Vec r1,r2; > Mat J1, J2; > > PetscCall(DMCreateGlobalVector(dm, &r1)); > PetscCall(DMCreateGlobalVector(dm, &r2)) > PetscCall(DMCreateMatrix(dm, &J1)); > PetscCall(DMCreateMatrix(dm, &J2)); > > PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes1)); > PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes2)); > PetscCall(SNESSetType(snes1, SNESNEWTONLS)); > PetscCall(SNESSetType(snes2, SNESNEWTONLS)); > PetscCall(SNESSetFromOptions(snes1)); > PetscCall(SNESSetFromOptions(snes2)); > PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL)); > PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL)); > PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL)); > PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL)); > PetscCall(SNESSetDM(snes1, dm)); > PetscCall(SNESSetDM(snes2, dm)); > > PetscCall(SNESSolve(snes1, NULL, NULL)); > PetscCall(SNESSolve(snes2, NULL, NULL)); > > printf("snes1 %p; snes2 %p\n", snes1, snes2); > > SNESFunctionFn *p; > PetscCall(SNESGetFunction(snes1, NULL, &p, NULL)); > printf("snes1: pointer %p, true function %p\n", *p, Function_1); > PetscCall(SNESGetFunction(snes2, NULL, &p, NULL)); > printf("snes2: pointer %p, true function %p\n", *p, Function_2); > > PetscCall(PetscFinalize()); > PetscFunctionReturn(PETSC_SUCCESS); > } -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Thu Nov 6 00:41:33 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Thu, 6 Nov 2025 07:41:33 +0100 Subject: [petsc-users] Probelm with DMPlexExtractSubMesh Message-ID: <1e9e4725-5274-4cef-a035-bb399deaac55@unibas.it> Dear all, I am having troubles in using DMPlexExtractSubMesh to extract the different strata of the Face Sets of a given mesh. 
When run on the enclosed tetrahedral mesh of the unit cube generated with gmsh > Face Sets: 6 strata with value/size (1 (246), 2 (246), 3 (246), 4 > (246), 5 (242), 6 (242)) > I would expect 246 "points" on stratum 3, but when I DMview the subdm (and plot it) the surface mesh looks incomplete > DM Object: patch_03 1 MPI process > ?type: plex > patch_03 in 2 dimensions: > ?Cells are at height 1 > ?Number of 0-cells per rank: 122 > ?Number of 1-cells per rank: 325 > Number of 2-cells per rank: 204 > Number of 3-cells per rank: 204 [204] > Labels: > celltype: 4 strata with value/size (0 (122), 1 (325), 3 (204), 12 (204)) > depth: 4 strata with value/size (0 (122), 1 (325), 2 (204), 3 (204)) > Cell Sets: 1 strata with value/size (1 (204)) > Face Sets: 1 strata with value/size (3 (204)) > Edge Sets: 2 strata with value/size (1 (8), 5 (8)) > see also patch_03.pdf What am I doing wrong? A simple reproducer (compiles with petsc-3.24.0)?and the gmsh mesh are enclosed. Thanks, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!aMKmGG4aim9XcbNSnDyHUkDyhUkQHGZ-u-xX2C-sycYUMmtTij6AwqsQbZPXJSvPp9KUfgwRJK2Ok6Me2BLgO0en1w4QF2fHo7s$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: submesh_xmple.F90 Type: text/x-fortran Size: 3882 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: cube6.msh Type: model/mesh Size: 186084 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: patch_03.pdf Type: application/pdf Size: 160538 bytes Desc: not available URL: From sale987 at live.com Thu Nov 6 00:49:12 2025 From: sale987 at live.com (Samuele Ferri) Date: Thu, 6 Nov 2025 06:49:12 +0000 Subject: [petsc-users] R: Two SNES on the same DM not working In-Reply-To: References: Message-ID: Dear Barry, thank you for your reply. Now everything works fine. Best regards Samuele ________________________________ Da: Barry Smith Inviato: mercoled? 5 novembre 2025 15:47 A: Samuele Ferri Cc: petsc-users at mcs.anl.gov Oggetto: Re: [petsc-users] Two SNES on the same DM not working This is not supported. Duplicate your DM. On Nov 5, 2025, at 9:17?AM, Samuele Ferri wrote: Dear petsc users, in petsc version 3.24, I'm trying to create two snes over the same DM, but with different functions and jacobians. Despite making different calls to SNESSetFunction it happens the second snes uses the same function of the first. Can you help me finding the problem, please? 
Here below there is a minimal working example showing the issue: static char help[] = "Test SNES.\n"; #include #include #include PetscErrorCode Jac_1(SNES snes, Vec x, Mat J, Mat B, void *){ PetscFunctionBegin; printf("Jac 1\n"); PetscFunctionReturn(PETSC_SUCCESS); } PetscErrorCode Function_1(SNES snes, Vec x, Vec f, void *){ PetscFunctionBegin; printf("Function 1\n"); PetscFunctionReturn(PETSC_SUCCESS); } PetscErrorCode Jac_2(SNES snes, Vec x, Mat J, Mat B, void *){ PetscFunctionBegin; printf("Jac 2\n"); PetscFunctionReturn(PETSC_SUCCESS); } PetscErrorCode Function_2(SNES snes, Vec x, Vec f, void *){ PetscFunctionBegin; printf("Function 2\n"); PetscFunctionReturn(PETSC_SUCCESS); } int main(int argc, char **argv) { PetscFunctionBeginUser; PetscCall(PetscInitialize(&argc, &argv, NULL, help)); DM dm; PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 100, 1, 1, NULL, &dm)); PetscCall(DMSetFromOptions(dm)); PetscCall(DMSetUp(dm)); SNES snes1, snes2; Vec r1,r2; Mat J1, J2; PetscCall(DMCreateGlobalVector(dm, &r1)); PetscCall(DMCreateGlobalVector(dm, &r2)) PetscCall(DMCreateMatrix(dm, &J1)); PetscCall(DMCreateMatrix(dm, &J2)); PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes1)); PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes2)); PetscCall(SNESSetType(snes1, SNESNEWTONLS)); PetscCall(SNESSetType(snes2, SNESNEWTONLS)); PetscCall(SNESSetFromOptions(snes1)); PetscCall(SNESSetFromOptions(snes2)); PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL)); PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL)); PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL)); PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL)); PetscCall(SNESSetDM(snes1, dm)); PetscCall(SNESSetDM(snes2, dm)); PetscCall(SNESSolve(snes1, NULL, NULL)); PetscCall(SNESSolve(snes2, NULL, NULL)); printf("snes1 %p; snes2 %p\n", snes1, snes2); SNESFunctionFn *p; PetscCall(SNESGetFunction(snes1, NULL, &p, NULL)); printf("snes1: pointer %p, true function %p\n", *p, Function_1); PetscCall(SNESGetFunction(snes2, NULL, &p, NULL)); printf("snes2: pointer %p, true function %p\n", *p, Function_2); PetscCall(PetscFinalize()); PetscFunctionReturn(PETSC_SUCCESS); } -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Thu Nov 6 02:19:14 2025 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Thu, 6 Nov 2025 09:19:14 +0100 Subject: [petsc-users] R: Two SNES on the same DM not working In-Reply-To: References: Message-ID: Dear Barry, ? ? sorry for jumping into this. I am wondering if your reply is related to DMDA or to DM in general. I have at least one code where I do something similar to what Samuele did in his sample code: create a DMPlex, create a section on this DMPlex, create two SNES solving for Vecs defined on that same section and attach to each of them a different SNESFunction and SNESJacobian (one solves a predictor and the other is a corrector). Everything seems fine, but I am wondering if that code is somewhat weak and should be changed by DMCloning the plex as you suggested to Samuele. Thanks ? ? Matteo On 06/11/2025 07:49, Samuele Ferri wrote: > > > sale987 at live.com sembra simile a un utente che in precedenza ti ha > inviato un messaggio di posta elettronica, ma potrebbe non essere lo > stesso. Scopri perch? potrebbe trattarsi di un rischio > > > > Dear Barry, > > thank you for your reply. Now everything works fine. 
> > Best regards > Samuele > ------------------------------------------------------------------------ > *Da:* Barry Smith > *Inviato:* mercoled? 5 novembre 2025 15:47 > *A:* Samuele Ferri > *Cc:* petsc-users at mcs.anl.gov > *Oggetto:* Re: [petsc-users] Two SNES on the same DM not working > > ? ?This is not supported. Duplicate your DM. > >> On Nov 5, 2025, at 9:17?AM, Samuele Ferri wrote: >> >> Dear petsc users, >> >> in petsc version 3.24, I'm trying to create two snes over the same >> DM, but with different functions and jacobians. Despite making >> different calls to SNESSetFunction it happens the second snes uses >> the same function of the first. >> Can you help me finding the problem, please? >> >> Here below there is a minimal working example showing the issue: >> >> static char help[] = "Test SNES.\n"; >> #include >> #include >> #include >> >> PetscErrorCode Jac_1(SNES/snes/, Vec/x/, Mat/J/, Mat/B/, void *){ >> ? ? PetscFunctionBegin; >> ? ? printf("Jac 1\n"); >> ? ? PetscFunctionReturn(PETSC_SUCCESS); >> } >> >> PetscErrorCode Function_1(SNES/snes/, Vec/x/, Vec/f/, void *){ >> ? ? PetscFunctionBegin; >> ? ? printf("Function 1\n"); >> ? ? PetscFunctionReturn(PETSC_SUCCESS); >> } >> >> PetscErrorCode Jac_2(SNES/snes/, Vec/x/, Mat/J/, Mat/B/, void *){ >> ? ? PetscFunctionBegin; >> ? ? printf("Jac 2\n"); >> ? ? PetscFunctionReturn(PETSC_SUCCESS); >> } >> >> PetscErrorCode Function_2(SNES/snes/, Vec/x/, Vec/f/, void *){ >> ? ? PetscFunctionBegin; >> ? ? printf("Function 2\n"); >> ? ? PetscFunctionReturn(PETSC_SUCCESS); >> } >> >> int main(int/argc/, char **/argv/) { >> >> ? ? PetscFunctionBeginUser; >> ? ? PetscCall(PetscInitialize(&/argc/, &/argv/, NULL, help)); >> >> ? ? DM dm; >> ? ? PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 100, >> 1, 1, NULL, &dm)); >> ? ? PetscCall(DMSetFromOptions(dm)); >> ? ? PetscCall(DMSetUp(dm)); >> >> ? ? SNES snes1, snes2; >> ? ? Vec r1,r2; >> ? ? Mat J1, J2; >> >> ? ? PetscCall(DMCreateGlobalVector(dm, &r1)); >> ? ? PetscCall(DMCreateGlobalVector(dm, &r2)) >> ? ? PetscCall(DMCreateMatrix(dm, &J1)); >> ? ? PetscCall(DMCreateMatrix(dm, &J2)); >> >> ? ? PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes1)); >> ? ? PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes2)); >> ? ? PetscCall(SNESSetType(snes1, SNESNEWTONLS)); >> ? ? PetscCall(SNESSetType(snes2, SNESNEWTONLS)); >> ? ? PetscCall(SNESSetFromOptions(snes1)); >> ? ? PetscCall(SNESSetFromOptions(snes2)); >> ? ? PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL)); >> ? ? PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL)); >> ? ? PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL)); >> ? ? PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL)); >> ? ? PetscCall(SNESSetDM(snes1, dm)); >> ? ? PetscCall(SNESSetDM(snes2, dm)); >> >> ? ? PetscCall(SNESSolve(snes1, NULL, NULL)); >> ? ? PetscCall(SNESSolve(snes2, NULL, NULL)); >> >> ? ? printf("snes1 %p; snes2 %p\n", snes1, snes2); >> >> ? ? SNESFunctionFn *p; >> ? ? PetscCall(SNESGetFunction(snes1, NULL, &p, NULL)); >> ? ? printf("snes1: pointer %p, true function %p\n", *p, Function_1); >> ? ? PetscCall(SNESGetFunction(snes2, NULL, &p, NULL)); >> ? ? printf("snes2: pointer %p, true function %p\n", *p, Function_2); >> ? ? PetscCall(PetscFinalize()); >> ? ? PetscFunctionReturn(PETSC_SUCCESS); >> } > -- Prof. Matteo Semplice Universit? degli Studi dell?Insubria Dipartimento di Scienza e Alta Tecnologia ? DiSAT Professore Associato Via Valleggio, 11 ? 22100 Como (CO) ? 
Italia tel.: +39 031 2386316 -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Thu Nov 6 03:11:26 2025 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 6 Nov 2025 12:11:26 +0300 Subject: [petsc-users] R: Two SNES on the same DM not working In-Reply-To: References: Message-ID: Matteo eventually (and in some sense counterintuitively) the DM stores the information on the problem, not SNES. See the snippet below to make things more clear SNESSetDM(snes1,dm) SNESSetFunction(snes1,F) SNESSolve(snes1) // Solves F(x)=0 SNESSetDM(snes2,dm) SNESSetFunction(snes2,G) SNESSolve(snes2) // Solves G(x)=0 SNESSolve(snes1) // Solves G(x), not F(x)!! If you have a plex you can call DMClone(dm,dm2) and set a new section on dm2 to be used on snes2 (the mesh won't be duplicated, only the problem dependent part) I guess you can follow the same approach with a DMDA, it should work. If not, you may need to call DMDuplicate on the DMDA. Il giorno gio 6 nov 2025 alle ore 11:19 Matteo Semplice via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > Dear Barry, > > sorry for jumping into this. > > > I am wondering if your reply is related to DMDA or to DM in general. I > have at least one code where I do something similar to what Samuele did in > his sample code: create a DMPlex, create a section on this DMPlex, create > two SNES solving for Vecs defined on that same section and attach to each > of them a different SNESFunction and SNESJacobian (one solves a predictor > and the other is a corrector). Everything seems fine, but I am wondering if > that code is somewhat weak and should be changed by DMCloning the plex as > you suggested to Samuele. > > > Thanks > > Matteo > > > On 06/11/2025 07:49, Samuele Ferri wrote: > > > sale987 at live.com sembra simile a un utente che in precedenza ti ha > inviato un messaggio di posta elettronica, ma potrebbe non essere lo > stesso. Scopri perch? potrebbe trattarsi di un rischio > > > Dear Barry, > > thank you for your reply. Now everything works fine. > > Best regards > Samuele > ------------------------------ > *Da:* Barry Smith > *Inviato:* mercoled? 5 novembre 2025 15:47 > *A:* Samuele Ferri > *Cc:* petsc-users at mcs.anl.gov > > *Oggetto:* Re: [petsc-users] Two SNES on the same DM not working > > > This is not supported. Duplicate your DM. > > On Nov 5, 2025, at 9:17?AM, Samuele Ferri > wrote: > > Dear petsc users, > > in petsc version 3.24, I'm trying to create two snes over the same DM, but > with different functions and jacobians. Despite making different calls to > SNESSetFunction it happens the second snes uses the same function of the > first. > Can you help me finding the problem, please? 
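(The original example is quoted again just below.) As a concrete sketch of the fix Barry and Stefano describe, the example can give each SNES its own DM; the names dm, snes1, snes2, r1, r2, J1 and J2 are those of the posted example, and, per Stefano's caveat, if DMClone() turns out not to be enough for a DMDA the same pattern applies with a second, separately created DMDA:

    DM dm2;
    PetscCall(DMClone(dm, &dm2));       /* shares the mesh, not the attached problem data */
    PetscCall(SNESSetDM(snes1, dm));
    PetscCall(SNESSetDM(snes2, dm2));   /* each solver now stores its callbacks in its own DM */
    PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL));
    PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL));
    PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL));
    PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL));

With the DMs separated, a later SNESSolve(snes1, NULL, NULL) keeps calling Function_1 and Jac_1 even after snes2 has been configured.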
> > Here below there is a minimal working example showing the issue: > > static char help[] = "Test SNES.\n"; > #include > #include > #include > > PetscErrorCode Jac_1(SNES *snes*, Vec *x*, Mat *J*, Mat *B*, void *){ > PetscFunctionBegin; > printf("Jac 1\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > PetscErrorCode Function_1(SNES *snes*, Vec *x*, Vec *f*, void *){ > PetscFunctionBegin; > printf("Function 1\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > PetscErrorCode Jac_2(SNES *snes*, Vec *x*, Mat *J*, Mat *B*, void *){ > PetscFunctionBegin; > printf("Jac 2\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > PetscErrorCode Function_2(SNES *snes*, Vec *x*, Vec *f*, void *){ > PetscFunctionBegin; > printf("Function 2\n"); > PetscFunctionReturn(PETSC_SUCCESS); > } > > int main(int *argc*, char ***argv*) { > > PetscFunctionBeginUser; > PetscCall(PetscInitialize(&*argc*, &*argv*, NULL, help)); > > DM dm; > PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 100, 1, 1, > NULL, &dm)); > PetscCall(DMSetFromOptions(dm)); > PetscCall(DMSetUp(dm)); > > SNES snes1, snes2; > Vec r1,r2; > Mat J1, J2; > > PetscCall(DMCreateGlobalVector(dm, &r1)); > PetscCall(DMCreateGlobalVector(dm, &r2)) > PetscCall(DMCreateMatrix(dm, &J1)); > PetscCall(DMCreateMatrix(dm, &J2)); > > PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes1)); > PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes2)); > PetscCall(SNESSetType(snes1, SNESNEWTONLS)); > PetscCall(SNESSetType(snes2, SNESNEWTONLS)); > PetscCall(SNESSetFromOptions(snes1)); > PetscCall(SNESSetFromOptions(snes2)); > PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL)); > PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL)); > PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL)); > PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL)); > PetscCall(SNESSetDM(snes1, dm)); > PetscCall(SNESSetDM(snes2, dm)); > > PetscCall(SNESSolve(snes1, NULL, NULL)); > PetscCall(SNESSolve(snes2, NULL, NULL)); > > printf("snes1 %p; snes2 %p\n", snes1, snes2); > > SNESFunctionFn *p; > PetscCall(SNESGetFunction(snes1, NULL, &p, NULL)); > printf("snes1: pointer %p, true function %p\n", *p, Function_1); > PetscCall(SNESGetFunction(snes2, NULL, &p, NULL)); > printf("snes2: pointer %p, true function %p\n", *p, Function_2); > > PetscCall(PetscFinalize()); > PetscFunctionReturn(PETSC_SUCCESS); > } > > > -- > Prof. Matteo Semplice > Universit? degli Studi dell?Insubria > Dipartimento di Scienza e Alta Tecnologia ? DiSAT > Professore Associato > Via Valleggio, 11 ? 22100 Como (CO) ? Italia > tel.: +39 031 2386316 > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From hng.email at gmail.com Thu Nov 6 13:02:05 2025 From: hng.email at gmail.com (Hom Nath Gharti) Date: Thu, 6 Nov 2025 14:02:05 -0500 Subject: [petsc-users] Fortran program compilation hangs with new PETSc version Message-ID: I compiled the latest version of PETSc (3.24.1) and attempted to compile my package, which uses Fortran and MPI. But the compilation of my package hangs forever. I tried it on a different cluster with the same behaviour. The compilation precisely hangs when linking to the PETSc program. It seems to be entering into some sort of infinite loop. This problem did not happen with PETSc 3.22.4. I tried with GCC versions 11 and 12. Any advice would be greatly appreciated. Best, Hom Nath -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Thu Nov 6 19:10:06 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 6 Nov 2025 20:10:06 -0500 Subject: [petsc-users] Two SNES on the same DM not working In-Reply-To: References: Message-ID: <45F50A6B-E163-4750-A970-E7AE504A1B38@petsc.dev> > On Nov 6, 2025, at 4:11?AM, Stefano Zampini wrote: > > Matteo > > eventually (and in some sense counterintuitively) the DM stores the information on the problem, not SNES. "The information on the problem" here means the function you set with SNESSetFunction, SNESSetJacobian etc. (and possibly other stuff, I am not sure). > See the snippet below to make things more clear > > SNESSetDM(snes1,dm) > SNESSetFunction(snes1,F) > SNESSolve(snes1) // Solves F(x)=0 > > SNESSetDM(snes2,dm) > SNESSetFunction(snes2,G) > SNESSolve(snes2) // Solves G(x)=0 > > SNESSolve(snes1) // Solves G(x), not F(x)!! > > If you have a plex you can call DMClone(dm,dm2) and set a new section on dm2 to be used on snes2 (the mesh won't be duplicated, only the problem dependent part) > I guess you can follow the same approach with a DMDA, it should work. If not, you may need to call DMDuplicate on the DMDA. > > > > Il giorno gio 6 nov 2025 alle ore 11:19 Matteo Semplice via petsc-users > ha scritto: >> Dear Barry, >> >> sorry for jumping into this. >> >> >> >> I am wondering if your reply is related to DMDA or to DM in general. I have at least one code where I do something similar to what Samuele did in his sample code: create a DMPlex, create a section on this DMPlex, create two SNES solving for Vecs defined on that same section and attach to each of them a different SNESFunction and SNESJacobian (one solves a predictor and the other is a corrector). Everything seems fine, but I am wondering if that code is somewhat weak and should be changed by DMCloning the plex as you suggested to Samuele. >> >> >> >> Thanks >> >> Matteo >> >> >> >> On 06/11/2025 07:49, Samuele Ferri wrote: >>> >>> >>> sale987 at live.com sembra simile a un utente che in precedenza ti ha inviato un messaggio di posta elettronica, ma potrebbe non essere lo stesso. Scopri perch? potrebbe trattarsi di un rischio >>> Dear Barry, >>> >>> thank you for your reply. Now everything works fine. >>> >>> Best regards >>> Samuele >>> Da: Barry Smith >>> Inviato: mercoled? 5 novembre 2025 15:47 >>> A: Samuele Ferri >>> Cc: petsc-users at mcs.anl.gov >>> Oggetto: Re: [petsc-users] Two SNES on the same DM not working >>> >>> >>> This is not supported. Duplicate your DM. >>> >>>> On Nov 5, 2025, at 9:17?AM, Samuele Ferri wrote: >>>> >>>> Dear petsc users, >>>> >>>> in petsc version 3.24, I'm trying to create two snes over the same DM, but with different functions and jacobians. Despite making different calls to SNESSetFunction it happens the second snes uses the same function of the first. >>>> Can you help me finding the problem, please? 
>>>> >>>> Here below there is a minimal working example showing the issue: >>>> >>>> static char help[] = "Test SNES.\n"; >>>> #include >>>> #include >>>> #include >>>> >>>> PetscErrorCode Jac_1(SNES snes, Vec x, Mat J, Mat B, void *){ >>>> PetscFunctionBegin; >>>> printf("Jac 1\n"); >>>> PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> PetscErrorCode Function_1(SNES snes, Vec x, Vec f, void *){ >>>> PetscFunctionBegin; >>>> printf("Function 1\n"); >>>> PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> PetscErrorCode Jac_2(SNES snes, Vec x, Mat J, Mat B, void *){ >>>> PetscFunctionBegin; >>>> printf("Jac 2\n"); >>>> PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> PetscErrorCode Function_2(SNES snes, Vec x, Vec f, void *){ >>>> PetscFunctionBegin; >>>> printf("Function 2\n"); >>>> PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> int main(int argc, char **argv) { >>>> >>>> PetscFunctionBeginUser; >>>> PetscCall(PetscInitialize(&argc, &argv, NULL, help)); >>>> >>>> DM dm; >>>> PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 100, 1, 1, NULL, &dm)); >>>> PetscCall(DMSetFromOptions(dm)); >>>> PetscCall(DMSetUp(dm)); >>>> >>>> SNES snes1, snes2; >>>> Vec r1,r2; >>>> Mat J1, J2; >>>> >>>> PetscCall(DMCreateGlobalVector(dm, &r1)); >>>> PetscCall(DMCreateGlobalVector(dm, &r2)) >>>> PetscCall(DMCreateMatrix(dm, &J1)); >>>> PetscCall(DMCreateMatrix(dm, &J2)); >>>> >>>> PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes1)); >>>> PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes2)); >>>> PetscCall(SNESSetType(snes1, SNESNEWTONLS)); >>>> PetscCall(SNESSetType(snes2, SNESNEWTONLS)); >>>> PetscCall(SNESSetFromOptions(snes1)); >>>> PetscCall(SNESSetFromOptions(snes2)); >>>> PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL)); >>>> PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL)); >>>> PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL)); >>>> PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL)); >>>> PetscCall(SNESSetDM(snes1, dm)); >>>> PetscCall(SNESSetDM(snes2, dm)); >>>> >>>> PetscCall(SNESSolve(snes1, NULL, NULL)); >>>> PetscCall(SNESSolve(snes2, NULL, NULL)); >>>> >>>> printf("snes1 %p; snes2 %p\n", snes1, snes2); >>>> >>>> SNESFunctionFn *p; >>>> PetscCall(SNESGetFunction(snes1, NULL, &p, NULL)); >>>> printf("snes1: pointer %p, true function %p\n", *p, Function_1); >>>> PetscCall(SNESGetFunction(snes2, NULL, &p, NULL)); >>>> printf("snes2: pointer %p, true function %p\n", *p, Function_2); >>>> >>>> PetscCall(PetscFinalize()); >>>> PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>> >> -- >> Prof. Matteo Semplice >> Universit? degli Studi dell?Insubria >> Dipartimento di Scienza e Alta Tecnologia ? DiSAT >> Professore Associato >> Via Valleggio, 11 ? 22100 Como (CO) ? Italia >> tel.: +39 031 2386316 > > > > -- > Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From matteo.semplice at uninsubria.it Fri Nov 7 11:21:17 2025 From: matteo.semplice at uninsubria.it (Matteo Semplice) Date: Fri, 7 Nov 2025 18:21:17 +0100 Subject: [petsc-users] Two SNES on the same DM not working In-Reply-To: <45F50A6B-E163-4750-A970-E7AE504A1B38@petsc.dev> References: <45F50A6B-E163-4750-A970-E7AE504A1B38@petsc.dev> Message-ID: <81621b5f-a171-4c60-af37-43955406454d@uninsubria.it> Il 07/11/25 02:10, Barry Smith ha scritto: > > >> On Nov 6, 2025, at 4:11?AM, Stefano Zampini >> wrote: >> >> Matteo >> >> eventually (and in some sense counterintuitively) the DM stores the >> information on the problem, not SNES. > > ? 
"The information on the problem" here means the function you set > with SNESSetFunction, SNESSetJacobian etc. (and possibly other stuff, > I am not sure). Oddly, sometimes the two SNES seems to work nevertheless, but?after your anwers I will make sure that I explicitly DMClone when I need more that 1 snes. Matteo > > >> See the snippet below to make things more clear >> >> SNESSetDM(snes1,dm) >> SNESSetFunction(snes1,F) >> SNESSolve(snes1) // Solves F(x)=0 >> >> SNESSetDM(snes2,dm) >> SNESSetFunction(snes2,G) >> SNESSolve(snes2) // Solves G(x)=0 >> >> SNESSolve(snes1) // Solves G(x), not F(x)!! >> >> If you have a plex you can call DMClone(dm,dm2) and set a new section >> on dm2 to be used on snes2 (the mesh won't be duplicated, only?the >> problem dependent part) >> I guess you can follow the same approach with a DMDA, it should work. >> If not, you may need to call DMDuplicate on the DMDA. >> >> >> >> Il giorno gio 6 nov 2025 alle ore 11:19 Matteo Semplice via >> petsc-users ha scritto: >> >> Dear Barry, >> >> ? ? sorry for jumping into this. >> >> >> I am wondering if your reply is related to DMDA or to DM in >> general. I have at least one code where I do something similar to >> what Samuele did in his sample code: create a DMPlex, create a >> section on this DMPlex, create two SNES solving for Vecs defined >> on that same section and attach to each of them a different >> SNESFunction and SNESJacobian (one solves a predictor and the >> other is a corrector). Everything seems fine, but I am wondering >> if that code is somewhat weak and should be changed by DMCloning >> the plex as you suggested to Samuele. >> >> >> Thanks >> >> ? ? Matteo >> >> >> On 06/11/2025 07:49, Samuele Ferri wrote: >>> >>> >>> sale987 at live.com sembra simile a un utente che in precedenza ti >>> ha inviato un messaggio di posta elettronica, ma potrebbe non >>> essere lo stesso. Scopri perch? potrebbe trattarsi di un rischio >>> >>> >>> >>> >>> Dear Barry, >>> >>> thank you for your reply. Now everything works fine. >>> >>> Best regards >>> Samuele >>> ------------------------------------------------------------------------ >>> *Da:* Barry Smith >>> *Inviato:* mercoled? 5 novembre 2025 15:47 >>> *A:* Samuele Ferri >>> *Cc:* petsc-users at mcs.anl.gov >>> >>> *Oggetto:* Re: [petsc-users] Two SNES on the same DM not working >>> >>> ? ?This is not supported. Duplicate your DM. >>> >>>> On Nov 5, 2025, at 9:17?AM, Samuele Ferri >>>> wrote: >>>> >>>> Dear petsc users, >>>> >>>> in petsc version 3.24, I'm trying to create two snes over the >>>> same DM, but with different functions and jacobians. Despite >>>> making different calls to SNESSetFunction it happens the second >>>> snes uses the same function of the first. >>>> Can you help me finding the problem, please? >>>> >>>> Here below there is a minimal working example showing the issue: >>>> >>>> static char help[] = "Test SNES.\n"; >>>> #include >>>> #include >>>> #include >>>> >>>> PetscErrorCode Jac_1(SNES/snes/, Vec/x/, Mat/J/, Mat/B/, void *){ >>>> ? ? PetscFunctionBegin; >>>> ? ? printf("Jac 1\n"); >>>> ? ? PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> PetscErrorCode Function_1(SNES/snes/, Vec/x/, Vec/f/, void *){ >>>> ? ? PetscFunctionBegin; >>>> ? ? printf("Function 1\n"); >>>> ? ? PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> PetscErrorCode Jac_2(SNES/snes/, Vec/x/, Mat/J/, Mat/B/, void *){ >>>> ? ? PetscFunctionBegin; >>>> ? ? printf("Jac 2\n"); >>>> ? ? 
PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> PetscErrorCode Function_2(SNES/snes/, Vec/x/, Vec/f/, void *){ >>>> ? ? PetscFunctionBegin; >>>> ? ? printf("Function 2\n"); >>>> ? ? PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>>> >>>> int main(int/argc/, char **/argv/) { >>>> >>>> ? ? PetscFunctionBeginUser; >>>> ? ? PetscCall(PetscInitialize(&/argc/, &/argv/, NULL, help)); >>>> >>>> ? ? DM dm; >>>> PetscCall(DMDACreate1d(PETSC_COMM_WORLD, DM_BOUNDARY_NONE, 100, >>>> 1, 1, NULL, &dm)); >>>> ? ? PetscCall(DMSetFromOptions(dm)); >>>> ? ? PetscCall(DMSetUp(dm)); >>>> >>>> ? ? SNES snes1, snes2; >>>> ? ? Vec r1,r2; >>>> ? ? Mat J1, J2; >>>> >>>> ? ? PetscCall(DMCreateGlobalVector(dm, &r1)); >>>> ? ? PetscCall(DMCreateGlobalVector(dm, &r2)) >>>> ? ? PetscCall(DMCreateMatrix(dm, &J1)); >>>> ? ? PetscCall(DMCreateMatrix(dm, &J2)); >>>> >>>> ? ? PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes1)); >>>> ? ? PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes2)); >>>> ? ? PetscCall(SNESSetType(snes1, SNESNEWTONLS)); >>>> ? ? PetscCall(SNESSetType(snes2, SNESNEWTONLS)); >>>> ? ? PetscCall(SNESSetFromOptions(snes1)); >>>> ? ? PetscCall(SNESSetFromOptions(snes2)); >>>> ? ? PetscCall(SNESSetFunction(snes1, r1, Function_1, NULL)); >>>> ? ? PetscCall(SNESSetFunction(snes2, r2, Function_2, NULL)); >>>> ? ? PetscCall(SNESSetJacobian(snes1, J1, J1, Jac_1, NULL)); >>>> ? ? PetscCall(SNESSetJacobian(snes2, J2, J2, Jac_2, NULL)); >>>> ? ? PetscCall(SNESSetDM(snes1, dm)); >>>> ? ? PetscCall(SNESSetDM(snes2, dm)); >>>> >>>> ? ? PetscCall(SNESSolve(snes1, NULL, NULL)); >>>> ? ? PetscCall(SNESSolve(snes2, NULL, NULL)); >>>> >>>> ? ? printf("snes1 %p; snes2 %p\n", snes1, snes2); >>>> >>>> ? ? SNESFunctionFn *p; >>>> ? ? PetscCall(SNESGetFunction(snes1, NULL, &p, NULL)); >>>> ? ? printf("snes1: pointer %p, true function %p\n", *p, >>>> Function_1); >>>> ? ? PetscCall(SNESGetFunction(snes2, NULL, &p, NULL)); >>>> ? ? printf("snes2: pointer %p, true function %p\n", *p, >>>> Function_2); >>>> ? ? PetscCall(PetscFinalize()); >>>> ? ? PetscFunctionReturn(PETSC_SUCCESS); >>>> } >>> >> -- >> Prof. Matteo Semplice >> Universit? degli Studi dell?Insubria >> Dipartimento di Scienza e Alta Tecnologia ? DiSAT >> Professore Associato >> Via Valleggio, 11 ? 22100 Como (CO) ? Italia >> tel.: +39 031 2386316 >> >> >> >> -- >> Stefano > -- --- Professore Associato in Analisi Numerica Dipartimento di Scienza e Alta Tecnologia Universit? degli Studi dell'Insubria Via Valleggio, 11 - Como -------------- next part -------------- An HTML attachment was scrubbed... URL: From ctchengben at mail.scut.edu.cn Tue Nov 11 03:43:10 2025 From: ctchengben at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Tue, 11 Nov 2025 17:43:10 +0800 (GMT+08:00) Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI Message-ID: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> Hello, Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: 1. PETSc: version 3.14.1 2. VS: version 2022 3. MS MPI: download Microsoft MPI v10.1.2 4. 
Cygwin

And the configure options are:

./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2

but it returns an error:

*********************************************************************************************
=============================================================================================
=============================================================================================
          Configuring PARMETIS with CMake; this may take several minutes
=============================================================================================
=============================================================================================
          Compiling and installing PARMETIS; this may take several minutes
=============================================================================================
*********************************************************************************************
  UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details):
---------------------------------------------------------------------------------------------
  Error running make on PARMETIS
*********************************************************************************************

The configure.log is attached below. So I am writing this email to report my problem and ask for your help.

Looking forward to your reply!

Sincerely,
Cheng.
-------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: configure.log URL: From 202321009113 at mail.scut.edu.cn Tue Nov 11 03:45:38 2025 From: 202321009113 at mail.scut.edu.cn (=?UTF-8?B?56iL5aWU?=) Date: Tue, 11 Nov 2025 17:45:38 +0800 (GMT+08:00) Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI In-Reply-To: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> References: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: <692e614a.59c9.19a724e80b4.Coremail.202321009113@mail.scut.edu.cn> Sorry, the PETSc version is 3.24.1. -----Original Message----- From: Cheng Sent: 2025-11-11 17:43:10 (Tuesday) To: petsc-users at mcs.anl.gov Subject: Error in configuring PETSc with Cygwin on Windows by using MS-MPI Hello, Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: 1. PETSc: version 3.14.1 2. VS: version 2022 3. MS MPI: download Microsoft MPI v10.1.2 4.
Cygwin And the compiler option in configuration is: ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2 but there return an error: ********************************************************************************************* ============================================================================================= ============================================================================================= Configuring PARMETIS with CMake; this may take several minutes ============================================================================================= ============================================================================================= Compiling and installing PARMETIS; this may take several minutes ============================================================================================= ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- Error running make on PARMETIS ********************************************************************************************* The configure.log is attached below. So I write this email to report my problem and ask for your help. Looking forward your reply! sinserely, Cheng. -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 11 06:35:41 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 11 Nov 2025 07:35:41 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI In-Reply-To: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> References: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: On Tue, Nov 11, 2025 at 4:44?AM ?? wrote: > Hello, > Recently I try to install PETSc with Cygwin since I'd like to use PETSc > with Visual Studio on Windows10 plateform.For the sake of clarity, I > firstly list the softwares/packages used below: > 1. PETSc: version 3.14.1 > 2. VS: version 2022 > 3. MS MPI: download Microsoft MPI v10.1.2 > 4. Cygwin > Quick question: Have you considered installing on WSL? I have had much better luck with that on Windows. 
This seems to be an incompatibility of ParMetis Windows support and your version: G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types^M G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t'^M G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types^M G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t'^M Thanks, Matt > And the compiler option in configuration is: > ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl > --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz > > --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] > > --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] > > --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz > > --with-strict-petscerrorcode=0 --with-64-bit-indices > --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2 > > > > > > > but there return an error: > > ********************************************************************************************* > > ============================================================================================= > > ============================================================================================= > Configuring PARMETIS with CMake; this may take several > minutes > > ============================================================================================= > > ============================================================================================= > Compiling and installing PARMETIS; this may take several > minutes > > ============================================================================================= > > > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > Error running make on PARMETIS > > > > ********************************************************************************************* > > > The configure.log is attached below. > > So I write this email to report my problem and ask for your help. > > Looking forward your reply! > > > sinserely, > Cheng. > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dLmtriOEmUVP2A1oc3Mf52cboEA1wjKSpm11szn5VzeEqH4dEZEbvnyoNwoTWleZIFdbzRu6B635UJstTIIb$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Tue Nov 11 09:29:01 2025 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 11 Nov 2025 10:29:01 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI In-Reply-To: References: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> Message-ID: <823C2320-52A3-4679-8BB2-26DA296E4ACA@petsc.dev> Where/how did you obtain /cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz ? Was it from PETSc ./configure? self.version = '4.0.3' self.versionname = 'PARMETIS_MAJOR_VERSION.PARMETIS_MINOR_VERSION.PARMETIS_SUBMINOR_VERSION' self.gitcommit = 'v'+self.version+'-p9' self.download = ['git://https://bitbucket.org/petsc/pkg-parmetis.git','https://bitbucket.org/petsc/pkg-parmetis/get/'+self.gitcommit+'.tar.gz'] > On Nov 11, 2025, at 7:35?AM, Matthew Knepley wrote: > > On Tue, Nov 11, 2025 at 4:44?AM ?? > wrote: >> Hello, >> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: >> 1. PETSc: version 3.14.1 >> 2. VS: version 2022 >> 3. MS MPI: download Microsoft MPI v10.1.2 >> 4. Cygwin > > Quick question: Have you considered installing on WSL? I have had much better luck with that on Windows. > > This seems to be an incompatibility of ParMetis Windows support and your version: > > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types^M > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t'^M > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types^M > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t'^M > > Thanks, > > Matt > >> >> And the compiler option in configuration is: >> ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl >> --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz >> --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] >> --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] >> --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec >> --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz >> --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz >> --with-strict-petscerrorcode=0 --with-64-bit-indices --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2 >> >> >> >> >> >> >> but there return an error: >> ********************************************************************************************* >> ============================================================================================= >> ============================================================================================= >> Configuring PARMETIS with CMake; this may take several minutes >> ============================================================================================= >> ============================================================================================= >> Compiling and installing PARMETIS; this may take several minutes >> ============================================================================================= >> 
>> >> ********************************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> --------------------------------------------------------------------------------------------- >> Error running make on PARMETIS >> >> >> ********************************************************************************************* >> >> >> >> The configure.log is attached below. >> >> So I write this email to report my problem and ask for your help. >> >> >> Looking forward your reply! >> >> >> sinserely, >> Cheng. > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dR4RcpHZAmunWDbmeNsF6mKarUO8DHjbwajZkjXJy-_DKnCMYIt_pdxNJd1ZnSGAKlBTYKkzGncQU7Y1GZWvVy4$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Tue Nov 11 11:07:48 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Tue, 11 Nov 2025 11:07:48 -0600 (CST) Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI In-Reply-To: <823C2320-52A3-4679-8BB2-26DA296E4ACA@petsc.dev> References: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> <823C2320-52A3-4679-8BB2-26DA296E4ACA@petsc.dev> Message-ID: <12b3e0c6-18a8-dffa-37f9-ac9663101f0d@fastmail.org> Also --download-hdf5 won't work with MS compilers on windows. Satish On Tue, 11 Nov 2025, Barry Smith wrote: > > Where/how did you obtain /cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz ? Was it from PETSc ./configure? > > self.version = '4.0.3' > self.versionname = 'PARMETIS_MAJOR_VERSION.PARMETIS_MINOR_VERSION.PARMETIS_SUBMINOR_VERSION' > self.gitcommit = 'v'+self.version+'-p9' > self.download = ['git://https://bitbucket.org/petsc/pkg-parmetis.git','https://bitbucket.org/petsc/pkg-parmetis/get/'+self.gitcommit+'.tar.gz'] > > > > > On Nov 11, 2025, at 7:35?AM, Matthew Knepley wrote: > > > > On Tue, Nov 11, 2025 at 4:44?AM ?? > wrote: > >> Hello, > >> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > >> 1. PETSc: version 3.14.1 > >> 2. VS: version 2022 > >> 3. MS MPI: download Microsoft MPI v10.1.2 > >> 4. Cygwin > > > > Quick question: Have you considered installing on WSL? I have had much better luck with that on Windows. 
> > > > This seems to be an incompatibility of ParMetis Windows support and your version: > > > > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types^M > > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t'^M > > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types^M > > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t'^M > > > > Thanks, > > > > Matt > > > >> > >> And the compiler option in configuration is: > >> ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl > >> --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz > >> --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] > >> --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] > >> --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec > >> --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz > >> --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz > >> --with-strict-petscerrorcode=0 --with-64-bit-indices --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2 > >> > >> > >> > >> > >> > >> > >> but there return an error: > >> ********************************************************************************************* > >> ============================================================================================= > >> ============================================================================================= > >> Configuring PARMETIS with CMake; this may take several minutes > >> ============================================================================================= > >> ============================================================================================= > >> Compiling and installing PARMETIS; this may take several minutes > >> ============================================================================================= > >> > >> > >> ********************************************************************************************* > >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > >> --------------------------------------------------------------------------------------------- > >> Error running make on PARMETIS > >> > >> > >> ********************************************************************************************* > >> > >> > >> > >> The configure.log is attached below. > >> > >> So I write this email to report my problem and ask for your help. > >> > >> > >> Looking forward your reply! > >> > >> > >> sinserely, > >> Cheng. > > > > > > > > -- > > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
> > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dR4RcpHZAmunWDbmeNsF6mKarUO8DHjbwajZkjXJy-_DKnCMYIt_pdxNJd1ZnSGAKlBTYKkzGncQU7Y1GZWvVy4$ > > From zhaowenbo.npic at gmail.com Tue Nov 11 19:50:10 2025 From: zhaowenbo.npic at gmail.com (Wenbo Zhao) Date: Wed, 12 Nov 2025 09:50:10 +0800 Subject: [petsc-users] gpu cpu parallel Message-ID: Dear all, We are trying to solve ksp using GPUs. We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which the matrix is created and assembling using COO way provided by PETSc. In this example, the number of CPU is as same as the number of GPU. In our case, computation of the parameters of matrix is performed on CPUs. And the cost of it is expensive, which might take half of total time or even more. We want to use more CPUs to compute parameters in parallel. And a smaller communication domain (such as gpu_comm) for the CPUs corresponding to the GPUs is created. The parameters are computed by all of the CPUs (in MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via MPI. Matrix (type of aijcusparse) is then created and assembled within gpu_comm. Finally, ksp_solve is performed on GPUs. I?m not sure if this approach will work in practice. Are there any comparable examples I can look to for guidance? Best, Wenbo -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Nov 11 21:48:47 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 11 Nov 2025 21:48:47 -0600 Subject: [petsc-users] gpu cpu parallel In-Reply-To: References: Message-ID: Hi, Wenbo, I think your approach should work. But before going this extra step with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, using nvidia's multiple process service (MPS)? If MPS works well, then you can avoid the extra complexity. --Junchao Zhang On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao wrote: > Dear all, > > We are trying to solve ksp using GPUs. > We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which the > matrix is created and assembling using COO way provided by PETSc. In this > example, the number of CPU is as same as the number of GPU. > In our case, computation of the parameters of matrix is performed on CPUs. > And the cost of it is expensive, which might take half of total time or > even more. > > We want to use more CPUs to compute parameters in parallel. And a smaller > communication domain (such as gpu_comm) for the CPUs corresponding to the > GPUs is created. The parameters are computed by all of the CPUs (in > MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via > MPI. Matrix (type of aijcusparse) is then created and assembled within > gpu_comm. Finally, ksp_solve is performed on GPUs. > > I?m not sure if this approach will work in practice. Are there any > comparable examples I can look to for guidance? > > Best, > Wenbo > -------------- next part -------------- An HTML attachment was scrubbed... URL: From grantchao2018 at 163.com Wed Nov 12 01:31:35 2025 From: grantchao2018 at 163.com (Grant Chao) Date: Wed, 12 Nov 2025 15:31:35 +0800 (CST) Subject: [petsc-users] gpu cpu parallel In-Reply-To: References: Message-ID: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> Thank you for the suggestion. We have already tried running multiple CPU ranks with a single GPU. However, we observed that as the number of ranks increases, the EPS solver becomes significantly slower. 
We are not sure of the exact cause?could it be due to process access contention, hidden data transfers, or perhaps another reason? We would be very interested to hear your insight on this matter. To avoid this problem, we used the gpu_comm approach mentioned before. During testing, we noticed that the mapping between rank ID and GPU ID seems to be set automatically and is not user-specifiable. For example, with 4 GPUs (0-3) and 8 CPU ranks (0-7), the program binds ranks 0 and 4 to GPU 0, ranks 1 and 5 to GPU 1, and so on. We tested possible solutions, such as calling cudaSetDevice() manually to set rank 4 to device 1, but it did not work as expected. Ranks 0 and 4 still used GPU 0. We would appreciate your guidance on how to customize this mapping. Thank you for your support. Best wishes, Grant At 2025-11-12 11:48:47, "Junchao Zhang" , said: Hi, Wenbo, I think your approach should work. But before going this extra step with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, using nvidia's multiple process service (MPS)? If MPS works well, then you can avoid the extra complexity. --Junchao Zhang On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao wrote: Dear all, We are trying to solve ksp using GPUs. We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which the matrix is created and assembling using COO way provided by PETSc. In this example, the number of CPU is as same as the number of GPU. In our case, computation of the parameters of matrix is performed on CPUs. And the cost of it is expensive, which might take half of total time or even more. We want to use more CPUs to compute parameters in parallel. And a smaller communication domain (such as gpu_comm) for the CPUs corresponding to the GPUs is created. The parameters are computed by all of the CPUs (in MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via MPI. Matrix (type of aijcusparse) is then created and assembled within gpu_comm. Finally, ksp_solve is performed on GPUs. I?m not sure if this approach will work in practice. Are there any comparable examples I can look to for guidance? Best, Wenbo -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Nov 12 09:58:21 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 12 Nov 2025 09:58:21 -0600 Subject: [petsc-users] gpu cpu parallel In-Reply-To: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> References: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> Message-ID: On Wed, Nov 12, 2025 at 1:31?AM Grant Chao wrote: > > Thank you for the suggestion. > > We have already tried running multiple CPU ranks with a single GPU. > However, we observed that as the number of ranks increases, the EPS solver > becomes significantly slower. We are not sure of the exact cause?could it > be due to process access contention, hidden data transfers, or perhaps > another reason? We would be very interested to hear your insight on this > matter. > Have you started the MPS, see https://urldefense.us/v3/__https://docs.nvidia.com/deploy/mps/index.html*starting-and-stopping-mps-on-linux__;Iw!!G_uCfscf7eWS!fRqGFSTH6neOLcmMT1alt2Uma1K1jVsAm1kXTHrg5nNNe-dVKOn6jIJvkO6q0AKcW9s3WvmnXT3jqrh2NFk1hBiuCBlC$ > > To avoid this problem, we used the gpu_comm approach mentioned before. > During testing, we noticed that the mapping between rank ID and GPU ID > seems to be set automatically and is not user-specifiable. 
> > For example, with 4 GPUs (0-3) and 8 CPU ranks (0-7), the program binds > ranks 0 and 4 to GPU 0, ranks 1 and 5 to GPU 1, and so on. > Yes, that is the current round-robin algorithm. Do you want ranks 0,1 on GPU 0, and ranks 2, 3 on GPU 1, and so on? > We tested possible solutions, such as calling cudaSetDevice() manually to > set rank 4 to device 1, but it did not work as expected. Ranks 0 and 4 > still used GPU 0. > > We would appreciate your guidance on how to customize this mapping. Thank > you for your support. > > Best wishes, > Grant > > > At 2025-11-12 11:48:47, "Junchao Zhang" , said: > > Hi, Wenbo, > I think your approach should work. But before going this extra step > with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, > using nvidia's multiple process service (MPS)? If MPS works well, then > you can avoid the extra complexity. > > --Junchao Zhang > > > On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao > wrote: > >> Dear all, >> >> We are trying to solve ksp using GPUs. >> We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which >> the matrix is created and assembling using COO way provided by PETSc. In >> this example, the number of CPU is as same as the number of GPU. >> In our case, computation of the parameters of matrix is performed on >> CPUs. And the cost of it is expensive, which might take half of total time >> or even more. >> >> We want to use more CPUs to compute parameters in parallel. And a >> smaller communication domain (such as gpu_comm) for the CPUs corresponding >> to the GPUs is created. The parameters are computed by all of the CPUs (in >> MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via >> MPI. Matrix (type of aijcusparse) is then created and assembled within >> gpu_comm. Finally, ksp_solve is performed on GPUs. >> >> I?m not sure if this approach will work in practice. Are there any >> comparable examples I can look to for guidance? >> >> Best, >> Wenbo >> > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Wed Nov 12 10:03:50 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Nov 2025 11:03:50 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI In-Reply-To: <46e195ab.5cea.19a779d6908.Coremail.202321009113@mail.scut.edu.cn> References: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> <823C2320-52A3-4679-8BB2-26DA296E4ACA@petsc.dev> <46e195ab.5cea.19a779d6908.Coremail.202321009113@mail.scut.edu.cn> Message-ID: G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t' G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t' G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(49): warning C4005: 'INT8_MIN': macro redefinition G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(107): note: see previous definition of 'INT8_MIN' Parmetis has its own definitions for many C standard types, etc in headers\ms_stdint.h that duplicate what is available in stdint.h on Unix systems. Normally, this gets included when __MSC_ is defined instead of stdint.h (in gk_arch.h). But for some reason, with your system it appears that Microsoft's stdint.h is also getting included; presumably brought in through some other system include file since it is only included in one place. $ git grep stdint.h headers/gk_arch.h: #include "ms_stdint.h" headers/gk_arch.h: #include headers/ms_inttypes.h:#include "ms_stdint.h" headers/ms_stdint.h:// ISO C9x compliant stdint.h for Microsoft Visual Studio You have a fairly old VisualStudio, 2022. Can you upgrade to the latest? Let us know if this resolves the problem. Barry > On Nov 12, 2025, at 5:29?AM, ?? <202321009113 at mail.scut.edu.cn> wrote: > > Hi Barry > > Thanks for your reply. > > I check the package parmetis,and the "petsc-pkg-parmetis-45100eac9301.tar.gz" is form https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/v4.0.3.tar.gz__;!!G_uCfscf7eWS!anttFLuihC7sv3xitFbNls4Ab1QfxVAGNr1EttbSarqqFMdkXJIg9_aN1RakIYDBWqtKJJM8jYn3SxcuaKW6S2Q$ . So I made a mistake about the package. 
> > Then I download the package form https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/v4.0.3-p9__;!!G_uCfscf7eWS!anttFLuihC7sv3xitFbNls4Ab1QfxVAGNr1EttbSarqqFMdkXJIg9_aN1RakIYDBWqtKJJM8jYn3Sxcu4By6gtk$ .tar.gz and it is "petsc-pkg-parmetis-f5e3aab04fd5.tar.gz" > > > > > Then the compiler option in configuration is: > ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-f5e3aab04fd5.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices > > > but it still have the same error: > ********************************************************************************************* > ============================================================================================= > ============================================================================================= > Configuring PARMETIS with CMake; this may take several minutes > ============================================================================================= > ============================================================================================= > Compiling and installing PARMETIS; this may take several minutes > ============================================================================================= > > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error running make on PARMETIS > > > > ********************************************************************************************* > > > > > The new configure.log is attached below. > > So I ask for your help again. > > > Looking forward your reply! > > > sinserely, > Cheng. > > > > > > > > > -----????----- > ???: "Barry Smith" > ????: 2025-11-11 23:29:01 (???) > ???: "Matthew Knepley" > ??: ?? , petsc-users at mcs.anl.gov > ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI > > > Where/how did you obtain /cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz ? Was it from PETSc ./configure? > > self.version = '4.0.3' > self.versionname = 'PARMETIS_MAJOR_VERSION.PARMETIS_MINOR_VERSION.PARMETIS_SUBMINOR_VERSION' > self.gitcommit = 'v'+self.version+'-p9' > self.download = ['git://https://bitbucket.org/petsc/pkg-parmetis.git','https://bitbucket.org/petsc/pkg-parmetis/get/'+self.gitcommit+'.tar.gz'] > > > >> On Nov 11, 2025, at 7:35?AM, Matthew Knepley wrote: >> >> On Tue, Nov 11, 2025 at 4:44?AM ?? > wrote: >>> Hello, >>> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: >>> 1. PETSc: version 3.14.1 >>> 2. VS: version 2022 >>> 3. MS MPI: download Microsoft MPI v10.1.2 >>> 4. Cygwin >> >> Quick question: Have you considered installing on WSL? I have had much better luck with that on Windows. 
>> >> This seems to be an incompatibility of ParMetis Windows support and your version: >> >> G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types^M >> G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t'^M >> G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types^M >> G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t'^M >> >> Thanks, >> >> Matt >> >>> >>> And the compiler option in configuration is: >>> ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl >>> --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz >>> --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] >>> --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] >>> --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec >>> --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz >>> --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz >>> --with-strict-petscerrorcode=0 --with-64-bit-indices --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2 >>> >>> >>> >>> >>> >>> >>> but there return an error: >>> ********************************************************************************************* >>> ============================================================================================= >>> ============================================================================================= >>> Configuring PARMETIS with CMake; this may take several minutes >>> ============================================================================================= >>> ============================================================================================= >>> Compiling and installing PARMETIS; this may take several minutes >>> ============================================================================================= >>> >>> >>> ********************************************************************************************* >>> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >>> --------------------------------------------------------------------------------------------- >>> Error running make on PARMETIS >>> >>> >>> ********************************************************************************************* >>> >>> >>> >>> The configure.log is attached below. >>> >>> So I write this email to report my problem and ask for your help. >>> >>> >>> Looking forward your reply! >>> >>> >>> sinserely, >>> Cheng. >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!anttFLuihC7sv3xitFbNls4Ab1QfxVAGNr1EttbSarqqFMdkXJIg9_aN1RakIYDBWqtKJJM8jYn3SxcurPKaHgI$ > ? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: application/octet-stream Size: 1253211 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 12 10:20:41 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 12 Nov 2025 11:20:41 -0500 Subject: [petsc-users] gpu cpu parallel In-Reply-To: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> References: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> Message-ID: <4E8E7829-A856-495A-ADA3-710C91F8B3EF@petsc.dev> > On Nov 12, 2025, at 2:31?AM, Grant Chao wrote: > > > Thank you for the suggestion. > > We have already tried running multiple CPU ranks with a single GPU. However, we observed that as the number of ranks increases, the EPS solver becomes significantly slower. We are not sure of the exact cause?could it be due to process access contention, hidden data transfers, or perhaps another reason? We would be very interested to hear your insight on this matter. > > To avoid this problem, we used the gpu_comm approach mentioned before. During testing, we noticed that the mapping between rank ID and GPU ID seems to be set automatically and is not user-specifiable. > > For example, with 4 GPUs (0-3) and 8 CPU ranks (0-7), the program binds ranks 0 and 4 to GPU 0, ranks 1 and 5 to GPU 1, and so on. > We tested possible solutions, such as calling cudaSetDevice() manually to set rank 4 to device 1, but it did not work as expected. Ranks 0 and 4 still used GPU 0. > > We would appreciate your guidance on how to customize this mapping. Thank you for your support. So you have a single compute "node" connected to multiple GPUs? Then the mapping of MPI ranks to GPUs doesn't matter and changing it won't improve the performance. > However, we observed that as the number of ranks increases, the EPS solver becomes significantly slower. Does the number of EPS "iterations" increase? Run with one, two, four and eight MPI ranks (and the same number of "GPUs" (if you only have say four GPUs that is fine, just virtualize them so two different MPI ranks share one) and the option -log_view and send the output. We need to know what is slowing down before trying to find any cure. Barry > > Best wishes, > Grant > > > At 2025-11-12 11:48:47, "Junchao Zhang" , said: > Hi, Wenbo, > I think your approach should work. But before going this extra step with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, using nvidia's multiple process service (MPS)? If MPS works well, then you can avoid the extra complexity. > > --Junchao Zhang > > > On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao > wrote: >> Dear all, >> >> We are trying to solve ksp using GPUs. >> We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which the matrix is created and assembling using COO way provided by PETSc. In this example, the number of CPU is as same as the number of GPU. >> In our case, computation of the parameters of matrix is performed on CPUs. And the cost of it is expensive, which might take half of total time or even more. >> >> We want to use more CPUs to compute parameters in parallel. And a smaller communication domain (such as gpu_comm) for the CPUs corresponding to the GPUs is created. The parameters are computed by all of the CPUs (in MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via MPI. Matrix (type of aijcusparse) is then created and assembled within gpu_comm. Finally, ksp_solve is performed on GPUs. 
>> >> I?m not sure if this approach will work in practice. Are there any comparable examples I can look to for guidance? >> >> Best, >> Wenbo -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Nov 12 15:58:05 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 12 Nov 2025 15:58:05 -0600 Subject: [petsc-users] gpu cpu parallel In-Reply-To: <4E8E7829-A856-495A-ADA3-710C91F8B3EF@petsc.dev> References: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> <4E8E7829-A856-495A-ADA3-710C91F8B3EF@petsc.dev> Message-ID: A common approach is to use CUDA_VISIBLE_DEVICES to manipulate MPI ranks to GPUs mapping, see the section at https://urldefense.us/v3/__https://docs.nersc.gov/jobs/affinity/*gpu-nodes__;Iw!!G_uCfscf7eWS!ags1Nog_0A9TnDudT9S81jm72t1NQYuOCg3--XMIlL4LXQCv-SFhCbQzesjgOxMAaRoyDOeYcqInlCRwOorJ0HFSR5q_$ With OpenMPI, you can use OMPI_COMM_WORLD_LOCAL_RANK in place of SLURM_LOCALID (see https://urldefense.us/v3/__https://docs.open-mpi.org/en/v5.0.x/tuning-apps/environment-var.html__;!!G_uCfscf7eWS!ags1Nog_0A9TnDudT9S81jm72t1NQYuOCg3--XMIlL4LXQCv-SFhCbQzesjgOxMAaRoyDOeYcqInlCRwOorJ0DsDgr-l$ ). For example, with 8 MPI ranks and 4 GPUs per node, the following script will map ranks 0, 1 to GPU 0, ranks 2, 3 to GPU 1. #!/bin/bash # select_gpu_device wrapper script export CUDA_VISIBLE_DEVICES=$((OMPI_COMM_WORLD_LOCAL_RANK/(OMPI_COMM_WORLD_LOCAL_SIZE/4))) exec $* On Wed, Nov 12, 2025 at 10:20?AM Barry Smith wrote: > > > On Nov 12, 2025, at 2:31?AM, Grant Chao wrote: > > > Thank you for the suggestion. > > We have already tried running multiple CPU ranks with a single GPU. > However, we observed that as the number of ranks increases, the EPS solver > becomes significantly slower. We are not sure of the exact cause?could it > be due to process access contention, hidden data transfers, or perhaps > another reason? We would be very interested to hear your insight on this > matter. > > To avoid this problem, we used the gpu_comm approach mentioned before. > During testing, we noticed that the mapping between rank ID and GPU ID > seems to be set automatically and is not user-specifiable. > > For example, with 4 GPUs (0-3) and 8 CPU ranks (0-7), the program binds > ranks 0 and 4 to GPU 0, ranks 1 and 5 to GPU 1, and so on. > > > > > We tested possible solutions, such as calling cudaSetDevice() manually to > set rank 4 to device 1, but it did not work as expected. Ranks 0 and 4 > still used GPU 0. > > We would appreciate your guidance on how to customize this mapping. Thank > you for your support. > > > So you have a single compute "node" connected to multiple GPUs? Then > the mapping of MPI ranks to GPUs doesn't matter and changing it won't > improve the performance. > > However, we observed that as the number of ranks increases, the EPS solver > becomes significantly slower. > > > Does the number of EPS "iterations" increase? Run with one, two, four > and eight MPI ranks (and the same number of "GPUs" (if you only have say > four GPUs that is fine, just virtualize them so two different MPI ranks > share one) and the option -log_view and send the output. We need to know > what is slowing down before trying to find any cure. > > Barry > > > > > > Best wishes, > Grant > > > At 2025-11-12 11:48:47, "Junchao Zhang" , said: > > Hi, Wenbo, > I think your approach should work. But before going this extra step > with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, > using nvidia's multiple process service (MPS)? 
If MPS works well, then > you can avoid the extra complexity. > > --Junchao Zhang > > > On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao > wrote: > >> Dear all, >> >> We are trying to solve ksp using GPUs. >> We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which >> the matrix is created and assembling using COO way provided by PETSc. In >> this example, the number of CPU is as same as the number of GPU. >> In our case, computation of the parameters of matrix is performed on >> CPUs. And the cost of it is expensive, which might take half of total time >> or even more. >> >> We want to use more CPUs to compute parameters in parallel. And a >> smaller communication domain (such as gpu_comm) for the CPUs corresponding >> to the GPUs is created. The parameters are computed by all of the CPUs (in >> MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via >> MPI. Matrix (type of aijcusparse) is then created and assembled within >> gpu_comm. Finally, ksp_solve is performed on GPUs. >> >> I?m not sure if this approach will work in practice. Are there any >> comparable examples I can look to for guidance? >> >> Best, >> Wenbo >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin.chapman at mail.utoronto.ca Wed Nov 12 23:08:01 2025 From: benjamin.chapman at mail.utoronto.ca (Benjamin Chapman) Date: Thu, 13 Nov 2025 05:08:01 +0000 Subject: [petsc-users] Inquiry about issue when compiling with clang Message-ID: Hello, We are using PETSc as part of a larger project and are switching from compiling with gcc to using AOCC (clang). However, I am getting an error from my include statements stating that the data type __complex128 is an "unknown type name". The full error log is attached. I found this solution online (FreeFem - PETSc compilation error - libblas.a/liblapack.a cannot be used - FreeFEM installation - FreeFEM), which says to comment out a line in the petscconf.h header file. Is this a safe fix or is there a more elegant way to go about this? Best, Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- An embedded and charset-unspecified text was scrubbed... Name: petsc_build_error_messages.txt URL: From jed at jedbrown.org Wed Nov 12 23:59:21 2025 From: jed at jedbrown.org (Jed Brown) Date: Wed, 12 Nov 2025 22:59:21 -0700 Subject: [petsc-users] Inquiry about issue when compiling with clang In-Reply-To: References: Message-ID: <87qzu2qxqe.fsf@jedbrown.org> I'm assuming that you configured and built PETSc with gcc, and now are building a package that depends on PETSc using AOCC? Is it possible to configure PETSc with the same compiler, or to use -std=gnu++17 (any dialect that supports GNU extensions)? Benjamin Chapman via petsc-users writes: > Hello, > > We are using PETSc as part of a larger project and are switching from compiling with gcc to using AOCC (clang). However, I am getting an error from my include statements stating that the data type __complex128 is an "unknown type name". The full error log is attached. > > I found this solution online (FreeFem - PETSc compilation error - libblas.a/liblapack.a cannot be used - FreeFEM installation - FreeFEM), which says to comment out a line in the petscconf.h header file. Is this a safe fix or is there a more elegant way to go about this? 
> > Best, > Ben > [ 64%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/coordinate_system.cpp.o > [ 65%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/FFTW_interface.cpp.o > [ 65%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/fileIO.cpp.o > [ 66%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/lapacke_interface.cpp.o > [ 66%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/petsc_extensions.cpp.o > In file included from /mnt/scratch/bchapman/rebel/src/common/petsc_extensions.cpp:7: > In file included from /mnt/scratch/bchapman/rebel/src/common/petsc_extensions.h:14: > In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscksp.h:6: > In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscpc.h:6: > In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscmat.h:6: > In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscvec.h:8: > In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscsys.h:193: > /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscmath.h:429:45: error: unknown type name '__complex128' > 429 | PETSC_EXTERN MPI_Datatype MPIU___COMPLEX128 MPIU___COMPLEX128_ATTR_TAG; > | ^ > /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscmath.h:424:71: note: expanded from macro 'MPIU___COMPLEX128_ATTR_TAG' > 424 | #define MPIU___COMPLEX128_ATTR_TAG PETSC_ATTRIBUTE_MPI_TYPE_TAG(__complex128) > | ^ > /mnt/scratch/bchapman/rebel/src/common/petsc_extensions.cpp:2570:13: warning: decomposition declarations are a C++17 extension [-Wc++17-extensions] > 2570 | const auto [min, max] = std::minmax_element(std::begin(s), std::end(s)); > | ^~~~~~~~~~ > 1 warning and 1 error generated. > make[2]: *** [src/CMakeFiles/rebel_lib.dir/build.make:132: src/CMakeFiles/rebel_lib.dir/common/petsc_extensions.cpp.o] Error 1 > make[1]: *** [CMakeFiles/Makefile2:772: src/CMakeFiles/rebel_lib.dir/all] Error 2 > make: *** [Makefile:136: all] Error 2 From pierre at joliv.et Thu Nov 13 03:44:02 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 13 Nov 2025 10:44:02 +0100 Subject: [petsc-users] Inquiry about issue when compiling with clang In-Reply-To: <87qzu2qxqe.fsf@jedbrown.org> References: <87qzu2qxqe.fsf@jedbrown.org> Message-ID: <069EBDD7-E501-4076-8EB2-B32EB75C6134@joliv.et> > On 13 Nov 2025, at 6:59?AM, Jed Brown wrote: > > I'm assuming that you configured and built PETSc with gcc, and now are building a package that depends on PETSc using AOCC? Is it possible to configure PETSc with the same compiler, or to use -std=gnu++17 (any dialect that supports GNU extensions)? Do what Jed suggests. But to give you a thorough answer, this FreeFEM post got me to fix some missing code in PETSc https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/7868__;!!G_uCfscf7eWS!ZCDn8GMjZjjHKboEmYNh_DysRggGBuwNc5uVk819bJyjAPCkifplTawg0JU1vik-GoaLgcKII1fsCidj5CrSqw$ . So now, the more ?elegant? solution of defining PETSC_SKIP_REAL___FLOAT128 (which was not working back then) should work given that you are using a recent enough PETSc (it seems you are using version 3.21.0, we are at 3.24.1, and the fix is there since 3.21.6). But again, try what Jed suggests first and foremost. 
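For what it's worth, a minimal sketch of that route (assuming a recent enough PETSc; the include and the solver calls below are just placeholders for your own sources):

#define PETSC_SKIP_REAL___FLOAT128 /* must be seen before the first PETSc header */
#include <petscksp.h>

int main(int argc, char **argv)
{
  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* ... create and use Mat/Vec/KSP exactly as before ... */
  PetscCall(PetscFinalize());
  return 0;
}

Passing -DPETSC_SKIP_REAL___FLOAT128 on the compile line of the dependent project has the same effect and avoids touching the sources at all.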
Thanks, Pierre > Benjamin Chapman via petsc-users writes: > >> Hello, >> >> We are using PETSc as part of a larger project and are switching from compiling with gcc to using AOCC (clang). However, I am getting an error from my include statements stating that the data type __complex128 is an "unknown type name". The full error log is attached. >> >> I found this solution online (FreeFem - PETSc compilation error - libblas.a/liblapack.a cannot be used - FreeFEM installation - FreeFEM), which says to comment out a line in the petscconf.h header file. Is this a safe fix or is there a more elegant way to go about this? >> >> Best, >> Ben >> [ 64%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/coordinate_system.cpp.o >> [ 65%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/FFTW_interface.cpp.o >> [ 65%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/fileIO.cpp.o >> [ 66%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/lapacke_interface.cpp.o >> [ 66%] Building CXX object src/CMakeFiles/rebel_lib.dir/common/petsc_extensions.cpp.o >> In file included from /mnt/scratch/bchapman/rebel/src/common/petsc_extensions.cpp:7: >> In file included from /mnt/scratch/bchapman/rebel/src/common/petsc_extensions.h:14: >> In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscksp.h:6: >> In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscpc.h:6: >> In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscmat.h:6: >> In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscvec.h:8: >> In file included from /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscsys.h:193: >> /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscmath.h:429:45: error: unknown type name '__complex128' >> 429 | PETSC_EXTERN MPI_Datatype MPIU___COMPLEX128 MPIU___COMPLEX128_ATTR_TAG; >> | ^ >> /mnt/scratch/bchapman/rebel/build/external/builds/petsc-3.21.0/include/petscmath.h:424:71: note: expanded from macro 'MPIU___COMPLEX128_ATTR_TAG' >> 424 | #define MPIU___COMPLEX128_ATTR_TAG PETSC_ATTRIBUTE_MPI_TYPE_TAG(__complex128) >> | ^ >> /mnt/scratch/bchapman/rebel/src/common/petsc_extensions.cpp:2570:13: warning: decomposition declarations are a C++17 extension [-Wc++17-extensions] >> 2570 | const auto [min, max] = std::minmax_element(std::begin(s), std::end(s)); >> | ^~~~~~~~~~ >> 1 warning and 1 error generated. >> make[2]: *** [src/CMakeFiles/rebel_lib.dir/build.make:132: src/CMakeFiles/rebel_lib.dir/common/petsc_extensions.cpp.o] Error 1 >> make[1]: *** [CMakeFiles/Makefile2:772: src/CMakeFiles/rebel_lib.dir/all] Error 2 >> make: *** [Makefile:136: all] Error 2 -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Thu Nov 13 09:05:05 2025 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 13 Nov 2025 10:05:05 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI In-Reply-To: <6de01bb0.5fe2.19a7c6bb95f.Coremail.202321009113@mail.scut.edu.cn> References: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> <823C2320-52A3-4679-8BB2-26DA296E4ACA@petsc.dev> <46e195ab.5cea.19a779d6908.Coremail.202321009113@mail.scut.edu.cn> <6de01bb0.5fe2.19a7c6bb95f.Coremail.202321009113@mail.scut.edu.cn> Message-ID: <84886CB9-D3B8-433C-943B-31E85A47C3B3@petsc.dev> Change __attribute__((packed)) to /* __attribute__((packed)) */ in include/petscmath.h and run make again. I think you should install a new version of Microsoft's compilers etc. Barry > On Nov 13, 2025, at 3:53?AM, ?? <202321009113 at mail.scut.edu.cn> wrote: > > Hi Barry > > > > Thanks for your advice. > I use AI help me that change the file on the petsc-3.24.1/arch-mswin-c-opt/externalpackages/petsc-pkg-parmetis-f5e3aab04fd5/headers/gk_arch. > > > The change is from: > > #ifdef __MSC__ > #include "ms_stdint.h" > #include "ms_inttypes.h" > #include "ms_stat.h" > #else > #ifndef SUNOS > #include > #endif > #if !defined(WIN32) && !defined(__MINGW32__) > #include > #endif > #include > #include > #include > #endif > > To: > > #if (defined(__MSC__) || defined(_MSC_VER)) && defined(_MSC_VER) && _MSC_VER < 1900 > #include "ms_stdint.h" > #include "ms_inttypes.h" > #include "ms_stat.h" > #else > #ifndef SUNOS > #include > #endif > #if !defined(WIN32) && !defined(__MINGW32__) && !defined(_MSC_VER) > #include > #endif > #include > #include > #if !defined(_MSC_VER) > #include > #endif > #endif > > > > Then I configure the PETSc: > > ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-f5e3aab04fd5.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices > > It seems good, but then I make it > > it have the error: > > > > make[3]: *** [gmakefile:211: arch-mswin-c-opt/obj/src/sys/objects/device/interface/mark_dcontext.o] Error 2 > make[3]: Leaving directory '/cygdrive/g/mypetsc/petsc-3.24.1' > make[2]: *** [/cygdrive/g/mypetsc/petsc-3.24.1/lib/petsc/conf/rules_doc.mk:5: libs] Error 2 > make[2]: Leaving directory '/cygdrive/g/mypetsc/petsc-3.24.1' > **************************ERROR************************************* > Error during compile, check arch-mswin-c-opt/lib/petsc/conf/make.log > Send it and arch-mswin-c-opt/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov > ******************************************************************** > make[1]: *** [makefile:44: all] Error 1 > make: *** [GNUmakefile:9: all] Error 2 > > > The new configure.log and make.log is attached below. > > I don't know if it is caused by the change I made or the other problems. > > > > So I ask for your help again. > Looking forward your reply! > > > sinserely, > Cheng. > > > > > > > > -----????----- > ???: "Barry Smith" > > ????: 2025-11-13 00:03:50 (???) > ???: ?? 
<202321009113 at mail.scut.edu.cn > > ??: PETSc > > ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI > > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t' > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t' > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(49): warning C4005: 'INT8_MIN': macro redefinition > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(107): note: see previous definition of 'INT8_MIN' > > Parmetis has its own definitions for many C standard types, etc in headers\ms_stdint.h that duplicate what is available in stdint.h on Unix systems. Normally, this gets included when __MSC_ is defined instead of stdint.h (in gk_arch.h). > > But for some reason, with your system it appears that Microsoft's stdint.h is also getting included; presumably brought in through some other system include file since it is only included in one place. > > $ git grep stdint.h > headers/gk_arch.h: #include "ms_stdint.h" > headers/gk_arch.h: #include > headers/ms_inttypes.h:#include "ms_stdint.h" > headers/ms_stdint.h:// ISO C9x compliant stdint.h for Microsoft Visual Studio > > You have a fairly old VisualStudio, 2022. Can you upgrade to the latest? Let us know if this resolves the problem. > > Barry > > > > > > > > > > > On Nov 12, 2025, at 5:29?AM, ?? <202321009113 at mail.scut.edu.cn > wrote: > > Hi Barry > Thanks for your reply. > I check the package parmetis,and the "petsc-pkg-parmetis-45100eac9301.tar.gz" is form https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/v4.0.3.tar.gz__;!!G_uCfscf7eWS!fxWXWboQRNUYFGMA0mW58ZDCE6A4aGfOZvzcj0EG2lHsj_174DkztA-YDWKfPXg9WJxjRkZ13WsNy2TXkQuzMkw$ . So I made a mistake about the package. 
> Then I download the package form https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/v4.0.3-p9.tar.gz__;!!G_uCfscf7eWS!fxWXWboQRNUYFGMA0mW58ZDCE6A4aGfOZvzcj0EG2lHsj_174DkztA-YDWKfPXg9WJxjRkZ13WsNy2TXPQoo3R0$ and it is "petsc-pkg-parmetis-f5e3aab04fd5.tar.gz" > > > Then the compiler option in configuration is: > ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-f5e3aab04fd5.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices > > but it still have the same error: > ********************************************************************************************* > ============================================================================================= > ============================================================================================= > Configuring PARMETIS with CMake; this may take several minutes > ============================================================================================= > ============================================================================================= > Compiling and installing PARMETIS; this may take several minutes > ============================================================================================= > > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error running make on PARMETIS > > > ********************************************************************************************* > > > The new configure.log is attached below. > So I ask for your help again. > Looking forward your reply! > > > sinserely, > Cheng. > > > > > -----????----- > ???: "Barry Smith" > > ????: 2025-11-11 23:29:01 (???) > ???: "Matthew Knepley" > > ??: ?? >, petsc-users at mcs.anl.gov > ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI > > > Where/how did you obtain /cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz ? Was it from PETSc ./configure? > > self.version = '4.0.3' > self.versionname = 'PARMETIS_MAJOR_VERSION.PARMETIS_MINOR_VERSION.PARMETIS_SUBMINOR_VERSION' > self.gitcommit = 'v'+self.version+'-p9' > self.download = ['git://https://bitbucket.org/petsc/pkg-parmetis.git','https://bitbucket.org/petsc/pkg-parmetis/get/'+self.gitcommit+'.tar.gz '] > > > > On Nov 11, 2025, at 7:35?AM, Matthew Knepley > wrote: > > On Tue, Nov 11, 2025 at 4:44?AM ?? > wrote: > Hello, > Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: > 1. PETSc: version 3.14.1 > 2. VS: version 2022 > 3. MS MPI: download Microsoft MPI v10.1.2 > 4. Cygwin > > Quick question: Have you considered installing on WSL? I have had much better luck with that on Windows. 
> > This seems to be an incompatibility of ParMetis Windows support and your version: > > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types^M > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t'^M > G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types^M > G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t'^M > > Thanks, > > Matt > > And the compiler option in configuration is: > ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl > --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz > --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] > --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] > --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec > --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz > --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz > --with-strict-petscerrorcode=0 --with-64-bit-indices --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2 > > > > > > > but there return an error: > ********************************************************************************************* > ============================================================================================= > ============================================================================================= > Configuring PARMETIS with CMake; this may take several minutes > ============================================================================================= > ============================================================================================= > Compiling and installing PARMETIS; this may take several minutes > ============================================================================================= > > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > Error running make on PARMETIS > > > ********************************************************************************************* > > > The configure.log is attached below. > So I write this email to report my problem and ask for your help. > Looking forward your reply! > > > sinserely, > Cheng. > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!fxWXWboQRNUYFGMA0mW58ZDCE6A4aGfOZvzcj0EG2lHsj_174DkztA-YDWKfPXg9WJxjRkZ13WsNy2TXA-nTYTc$ > > ?? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 2302425 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 17055 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From herbert.owen at bsc.es Thu Nov 13 11:11:04 2025 From: herbert.owen at bsc.es (howen) Date: Thu, 13 Nov 2025 18:11:04 +0100 Subject: [petsc-users] Petsc + nvhpc In-Reply-To: References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> Message-ID: <2A940B06-46A3-40AB-BD66-835C46970CA1@bsc.es> Dear Junchao, Thank you for response and sorry for taking so long to answer back. I cannot avoid using the nvidia tools. Gfortran is not mature for OpenACC and gives us problems when compiling our code. What I have done to enable using the latest petsc is to create my own C code to call petsc. I have little experience with c and it took me some time, but I can now use petsc 3.24.1 ;) The behaviour remains the same as in my original email . Parallel+GPU gives bad results. CPU(serial and parallel) and GPU serial all work ok and give the same result. I have gone a bit into petsc comparing the CPU and GPU version with 2 mpi. I see that the difference starts in src/ksp/ksp/impls/cg/cg.c L170 PetscCall(KSP_PCApply(ksp, R, Z)); /* z <- Br */ I have printed the vectors R and Z and the norm dp. R is identical on both CPU and GPU; but Z differs. The correct value of dp (for the first time it enters) is 14.3014, while running on the GPU with 2 mpis it gives 14.7493. If you wish I can send you prints I introduced in cg.c The folder with the input files to run the case can be downloaded from https://urldefense.us/v3/__https://b2drop.eudat.eu/s/wKRQ4LK7RTKz2iQ__;!!G_uCfscf7eWS!Y_YIXIdrN81gDKgNed6V4icL3nN9OG62-ZsdnB1Bkc7iiGAoJ2riwbTzxJMnIROon3mXgiFVLnbH0RTlsXrLEAh7n_UO$ For submitting the gpu run I use mpirun -np 2 --map-by ppr:4:node:PE=20 --report-bindings ./mn5_bind.sh /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_gpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json For the cpu run mpirun -np 2 /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_cpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json Our code can be downloaded with : git clone --recursive https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab.git__;!!G_uCfscf7eWS!Y_YIXIdrN81gDKgNed6V4icL3nN9OG62-ZsdnB1Bkc7iiGAoJ2riwbTzxJMnIROon3mXgiFVLnbH0RTlsXrLEFjsBTIo$ -and the branch I am using with git checkout 140-add-petsc To use exactly the same commit I am using git checkout 09a923c9b57e46b14ae54b935845d50272691ace I am currently using: Currently Loaded Modules: 1) nvidia-hpc-sdk/25.1 2) hdf5/1.14.1-2-nvidia-nvhpcx 3) cmake/3.25.1 I guess/hope similar modules should be available in any supercomputer. To build the cpu version mkdir build_cpu cd build_cpu export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241_cpu/hhinstal export LD_LIBRARY_PATH=$PETSC_INSTALL/lib:$LD_LIBRARY_PATH export LIBRARY_PATH=$PETSC_INSTALL/lib:$LIBRARY_PATH export C_INCLUDE_PATH=$PETSC_INSTALL/include:$C_INCLUDE_PATH export CPLUS_INCLUDE_PATH=$PETSC_INSTALL/include:$CPLUS_INCLUDE_PATH export PKG_CONFIG_PATH=$PETSC_INSTALL/lib/pkgconfig:$PKG_CONFIG_PATH cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=OFF -DDEBUG_MODE=OFF .. 
make -j 80 I have built petsc myself as follows git clone -b release https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!Y_YIXIdrN81gDKgNed6V4icL3nN9OG62-ZsdnB1Bkc7iiGAoJ2riwbTzxJMnIROon3mXgiFVLnbH0RTlsXrLELP8U6d0$ petsc cd petsc git checkout v3.24.1 module purge module load nvidia-hpc-sdk/25.1 hdf5/1.14.1-2-nvidia-nvhpcx cmake/3.25.1 ./configure --PETSC_DIR=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/petsc --prefix=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal --with-fortran-bindings=0 --with-fc=0 --with-petsc-arch=linux-x86_64-opt --with-scalar-type=real --with-debugging=yes --with-64-bit-indices=1 --with-precision=single --download-hypre CFLAGS=-I/apps/ACC/HDF5/1.14.1-2/NVIDIA/NVHPCX/include CXXFLAGS= FCFLAGS= --with-shared-libraries=1 --with-mpi=1 --with-blacs-lib=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/lib/intel64/libmkl_blacs_openmpi_lp64.a --with-blacs-include=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/include --with-mpi-dir=/apps/ACC/NVIDIA-HPC-SDK/25.1/Linux_x86_64/25.1/comm_libs/12.6/hpcx/latest/ompi/ --download-ptscotch=yes --download-metis --download-parmetis make all check make install ------------------- For the GPU version when configuring petsc I add : --with-cuda I then change the export PETSC_INSTALL to export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal and repeat all other exports mkdir build_gpu cd build_gpu cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=ON -DDEBUG_MODE=OFF .. make -j 80 As you can see from the submit instructions the executable is found in sod2d_gitlab/build_gpu/src/app_sod2d/sod2d I hope I have not forgotten anything and my instructions are 'easy' to follow. If you have any issue do not doubt to contact me. The wiki for our code can be found in https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab/-/wikis/home__;!!G_uCfscf7eWS!Y_YIXIdrN81gDKgNed6V4icL3nN9OG62-ZsdnB1Bkc7iiGAoJ2riwbTzxJMnIROon3mXgiFVLnbH0RTlsXrLEA1vqPYk$ Best, Herbert Owen Herbert Owen Senior Researcher, Dpt. Computer Applications in Science and Engineering Barcelona Supercomputing Center (BSC-CNS) Tel: +34 93 413 4038 Skype: herbert.owen https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!Y_YIXIdrN81gDKgNed6V4icL3nN9OG62-ZsdnB1Bkc7iiGAoJ2riwbTzxJMnIROon3mXgiFVLnbH0RTlsXrLEAA5PwtO$ > On 16 Oct 2025, at 18:30, Junchao Zhang wrote: > > Hi, Herbert, > I don't have much experience on OpenACC and PETSc CI doesn't have such tests. Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code? If you, then you can use the latest petsc code and make our debugging easier. > Also, could you provide us with a test and instructions to reproduce the problem? > > Thanks! > --Junchao Zhang > > > On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users > wrote: >> Dear All, >> >> I am interfacing our CFD code (Fortran + OpenACC) to Petsc. >> Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc compiler. The Gnu compiler does not work well and we do not have access to the Cray compiler. >> >> I already know that the latest version of Petsc does not compile with nvhpc, I am therefore using version 3.21. >> I get good results on the CPU both in serial and parallel (MPI). However, the GPU implementation, that is what we are interested in, only work correctly for the serial version. In parallel, the results are different. Even for a CG solve. 
>> >> I would like to know, if you have experience with the Nvidia compiler. I am particularly interested if you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing is highly appreciated. >> >> Best, >> >> Herbert Owen >> Senior Researcher, Dpt. Computer Applications in Science and Engineering >> Barcelona Supercomputing Center (BSC-CNS) >> Tel: +34 93 413 4038 >> Skype: herbert.owen >> >> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!Y_YIXIdrN81gDKgNed6V4icL3nN9OG62-ZsdnB1Bkc7iiGAoJ2riwbTzxJMnIROon3mXgiFVLnbH0RTlsXrLEAA5PwtO$ >> >> >> >> >> >> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From grantchao2018 at 163.com Thu Nov 13 11:16:33 2025 From: grantchao2018 at 163.com (Grant Chao) Date: Fri, 14 Nov 2025 01:16:33 +0800 (CST) Subject: [petsc-users] gpu cpu parallel In-Reply-To: References: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> <4E8E7829-A856-495A-ADA3-710C91F8B3EF@petsc.dev> Message-ID: <3d5b96a2.a3e4.19a7e380b1f.Coremail.grantchao2018@163.com> Junchao, We have tried cudaSetDevice. The test code is attached. 8 cpu and 2 gpu are used. And we create a gpu_comm including rank 0 and rank 4. Then we set gpu 0 to rank 0, gpu 1 to rank 1 respectively. After MatSetType, rank 1 is mapped to gpu0 again. The run cmd is mpirun -n 8 ./a.out -eps_type jd -st_ksp_type gmres -st_pc_type none The std out is show below, [Rank 0] using GPU 0, [line 22]. [Rank 1] no computation assigned. [Rank 2] no computation assigned. [Rank 3] no computation assigned. [Rank 4] using GPU 0, [line 22]. [Rank 5] no computation assigned. [Rank 6] no computation assigned. [Rank 7] no computation assigned. [Rank 4] using GPU 1, [line 31] after setdevice. -------- Here set device successfully [Rank 0] using GPU 0, [line 31] after setdevice. [Rank 4] using GPU 1, [line 41] after create A. [Rank 0] using GPU 0, [line 41] after create A. [Rank 0] using GPU 0, [line 45] after set A type. [Rank 4] using GPU 0, [line 45] after set A type. ------ change to 0? [Rank 4] using GPU 0, [line 49] after MatSetUp. [Rank 0] using GPU 0, [line 49] after MatSetUp. [Rank 4] using GPU 0, [line 62] after Mat Assemble. [Rank 0] using GPU 0, [line 62] after Mat Assemble. Smallest eigenvalue = 100.000000 Smallest eigenvalue = 100.000000 BEST, Grant At 2025-11-13 05:58:05, "Junchao Zhang" wrote: A common approach is to use CUDA_VISIBLE_DEVICES to manipulate MPI ranks to GPUs mapping, see the section at https://urldefense.us/v3/__https://docs.nersc.gov/jobs/affinity/*gpu-nodes__;Iw!!G_uCfscf7eWS!Z_gIM7FfeDHQ5dHmPBQcDcmQnG0t6iMrPQU7OgVoGBU_BV3clXDllaQuK7A2zJlgP_o477Up1LHyn0VK4A3ULkoO7PrHMQ$ With OpenMPI, you can use OMPI_COMM_WORLD_LOCAL_RANK in place of SLURM_LOCALID (see https://urldefense.us/v3/__https://docs.open-mpi.org/en/v5.0.x/tuning-apps/environment-var.html__;!!G_uCfscf7eWS!Z_gIM7FfeDHQ5dHmPBQcDcmQnG0t6iMrPQU7OgVoGBU_BV3clXDllaQuK7A2zJlgP_o477Up1LHyn0VK4A3ULkpfQizn9g$ ). For example, with 8 MPI ranks and 4 GPUs per node, the following script will map ranks 0, 1 to GPU 0, ranks 2, 3 to GPU 1. #!/bin/bash # select_gpu_device wrapper script export CUDA_VISIBLE_DEVICES=$((OMPI_COMM_WORLD_LOCAL_RANK/(OMPI_COMM_WORLD_LOCAL_SIZE/4))) exec $* On Wed, Nov 12, 2025 at 10:20?AM Barry Smith wrote: On Nov 12, 2025, at 2:31?AM, Grant Chao wrote: Thank you for the suggestion. 
We have already tried running multiple CPU ranks with a single GPU. However, we observed that as the number of ranks increases, the EPS solver becomes significantly slower. We are not sure of the exact cause?could it be due to process access contention, hidden data transfers, or perhaps another reason? We would be very interested to hear your insight on this matter. To avoid this problem, we used the gpu_comm approach mentioned before. During testing, we noticed that the mapping between rank ID and GPU ID seems to be set automatically and is not user-specifiable. For example, with 4 GPUs (0-3) and 8 CPU ranks (0-7), the program binds ranks 0 and 4 to GPU 0, ranks 1 and 5 to GPU 1, and so on. We tested possible solutions, such as calling cudaSetDevice() manually to set rank 4 to device 1, but it did not work as expected. Ranks 0 and 4 still used GPU 0. We would appreciate your guidance on how to customize this mapping. Thank you for your support. So you have a single compute "node" connected to multiple GPUs? Then the mapping of MPI ranks to GPUs doesn't matter and changing it won't improve the performance. However, we observed that as the number of ranks increases, the EPS solver becomes significantly slower. Does the number of EPS "iterations" increase? Run with one, two, four and eight MPI ranks (and the same number of "GPUs" (if you only have say four GPUs that is fine, just virtualize them so two different MPI ranks share one) and the option -log_view and send the output. We need to know what is slowing down before trying to find any cure. Barry Best wishes, Grant At 2025-11-12 11:48:47, "Junchao Zhang" , said: Hi, Wenbo, I think your approach should work. But before going this extra step with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, using nvidia's multiple process service (MPS)? If MPS works well, then you can avoid the extra complexity. --Junchao Zhang On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao wrote: Dear all, We are trying to solve ksp using GPUs. We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which the matrix is created and assembling using COO way provided by PETSc. In this example, the number of CPU is as same as the number of GPU. In our case, computation of the parameters of matrix is performed on CPUs. And the cost of it is expensive, which might take half of total time or even more. We want to use more CPUs to compute parameters in parallel. And a smaller communication domain (such as gpu_comm) for the CPUs corresponding to the GPUs is created. The parameters are computed by all of the CPUs (in MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via MPI. Matrix (type of aijcusparse) is then created and assembled within gpu_comm. Finally, ksp_solve is performed on GPUs. I?m not sure if this approach will work in practice. Are there any comparable examples I can look to for guidance? Best, Wenbo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: test_cpu_gpu.cpp Type: text/x-c Size: 3190 bytes Desc: not available URL: From knepley at gmail.com Thu Nov 13 11:23:20 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Nov 2025 12:23:20 -0500 Subject: [petsc-users] Petsc + nvhpc In-Reply-To: <2A940B06-46A3-40AB-BD66-835C46970CA1@bsc.es> References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> <2A940B06-46A3-40AB-BD66-835C46970CA1@bsc.es> Message-ID: On Thu, Nov 13, 2025 at 12:11?PM howen via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear Junchao, > > Thank you for response and sorry for taking so long to answer back. > I cannot avoid using the nvidia tools. Gfortran is not mature for OpenACC > and gives us problems when compiling our code. > What I have done to enable using the latest petsc is to create my own C > code to call petsc. > I have little experience with c and it took me some time, but I can now > use petsc 3.24.1 ;) > > The behaviour remains the same as in my original email . > Parallel+GPU gives bad results. CPU(serial and parallel) and GPU serial > all work ok and give the same result. > > I have gone a bit into petsc comparing the CPU and GPU version with 2 mpi. > I see that the difference starts in > src/ksp/ksp/impls/cg/cg.c L170 > PetscCall(KSP_PCApply(ksp, R, Z)); /* z <- Br > */ > I have printed the vectors R and Z and the norm dp. > R is identical on both CPU and GPU; but Z differs. > The correct value of dp (for the first time it enters) is 14.3014, while > running on the GPU with 2 mpis it gives 14.7493. > If you wish I can send you prints I introduced in cg.c > Thank you for all the detail in this report. However, since you see a problem in KSPCG, I believe we can reduce the complexity. You can use -ksp_view_mat binary:A.bin -ksp_view_rhs binary:b.bin and send us those files. Then we can run your system directly using KSP ex10 (and so can you). Thanks, Matt > The folder with the input files to run the case can be downloaded from > https://urldefense.us/v3/__https://b2drop.eudat.eu/s/wKRQ4LK7RTKz2iQ__;!!G_uCfscf7eWS!bOzafvznhRJly5r11WId0BSmM38vOBG5qnlJMJf02uLM44-t4g7Xm8NCG7h_D7BTAe3ACc19jaFdq9hQ4klS$ > > > For submitting the gpu run I use > mpirun -np 2 --map-by ppr:4:node:PE=20 --report-bindings ./mn5_bind.sh > /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_gpu/src/app_sod2d/sod2d > ChannelFlowSolverIncomp.json > > For the cpu run > mpirun -np 2 > /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_cpu/src/app_sod2d/sod2d > ChannelFlowSolverIncomp.json > > Our code can be downloaded with : > git clone --recursive https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab.git__;!!G_uCfscf7eWS!bOzafvznhRJly5r11WId0BSmM38vOBG5qnlJMJf02uLM44-t4g7Xm8NCG7h_D7BTAe3ACc19jaFdq_ZmRVRG$ > > > -and the branch I am using with > git checkout 140-add-petsc > > To use exactly the same commit I am using > git checkout 09a923c9b57e46b14ae54b935845d50272691ace > > > I am currently using: Currently Loaded Modules: > 1) nvidia-hpc-sdk/25.1 2) hdf5/1.14.1-2-nvidia-nvhpcx 3) cmake/3.25.1 > I guess/hope similar modules should be available in any supercomputer. 
> > To build the cpu version > mkdir build_cpu > cd build_cpu > > export > PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241_cpu/hhinstal > export LD_LIBRARY_PATH=$PETSC_INSTALL/lib:$LD_LIBRARY_PATH > export LIBRARY_PATH=$PETSC_INSTALL/lib:$LIBRARY_PATH > export C_INCLUDE_PATH=$PETSC_INSTALL/include:$C_INCLUDE_PATH > export CPLUS_INCLUDE_PATH=$PETSC_INSTALL/include:$CPLUS_INCLUDE_PATH > export PKG_CONFIG_PATH=$PETSC_INSTALL/lib/pkgconfig:$PKG_CONFIG_PATH > > cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=OFF > -DDEBUG_MODE=OFF .. > make -j 80 > > I have built petsc myself as follows > > git clone -b release https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!bOzafvznhRJly5r11WId0BSmM38vOBG5qnlJMJf02uLM44-t4g7Xm8NCG7h_D7BTAe3ACc19jaFdq58G6Hkk$ > > petsc > cd petsc > git checkout v3.24.1 > module purge > module load nvidia-hpc-sdk/25.1 hdf5/1.14.1-2-nvidia-nvhpcx cmake/3.25.1 > ./configure > --PETSC_DIR=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/petsc > --prefix=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal > --with-fortran-bindings=0 --with-fc=0 --with-petsc-arch=linux-x86_64-opt > --with-scalar-type=real --with-debugging=yes --with-64-bit-indices=1 > --with-precision=single --download-hypre > CFLAGS=-I/apps/ACC/HDF5/1.14.1-2/NVIDIA/NVHPCX/include CXXFLAGS= FCFLAGS= > --with-shared-libraries=1 --with-mpi=1 > --with-blacs-lib=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/lib/intel64/libmkl_blacs_openmpi_lp64.a > --with-blacs-include=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/include > --with-mpi-dir=/apps/ACC/NVIDIA-HPC-SDK/25.1/Linux_x86_64/25.1/comm_libs/12.6/hpcx/latest/ompi/ > --download-ptscotch=yes --download-metis --download-parmetis > make all check > make install > > ------------------- > For the GPU version when configuring petsc I add : --with-cuda > > I then change the export PETSC_INSTALL to > export > PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal > and repeat all other exports > > mkdir build_gpu > cd build_gpu > cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=ON > -DDEBUG_MODE=OFF .. > make -j 80 > > As you can see from the submit instructions the executable is found in > sod2d_gitlab/build_gpu/src/app_sod2d/sod2d > > I hope I have not forgotten anything and my instructions are 'easy' to > follow. If you have any issue do not doubt to contact me. > The wiki for our code can be found in > https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab/-/wikis/home__;!!G_uCfscf7eWS!bOzafvznhRJly5r11WId0BSmM38vOBG5qnlJMJf02uLM44-t4g7Xm8NCG7h_D7BTAe3ACc19jaFdq4yklS7Y$ > > > Best, > > Herbert Owen > > Herbert Owen > Senior Researcher, Dpt. Computer Applications in Science and Engineering > Barcelona Supercomputing Center (BSC-CNS) > Tel: +34 93 413 4038 > Skype: herbert.owen > > https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!bOzafvznhRJly5r11WId0BSmM38vOBG5qnlJMJf02uLM44-t4g7Xm8NCG7h_D7BTAe3ACc19jaFdq7rqqXKl$ > > > > > > > > > > On 16 Oct 2025, at 18:30, Junchao Zhang wrote: > > Hi, Herbert, > I don't have much experience on OpenACC and PETSc CI doesn't have such > tests. Could you avoid using nvfortran and instead use gfortran to compile > your Fortran + OpenACC code? If you, then you can use the latest petsc > code and make our debugging easier. > Also, could you provide us with a test and instructions to reproduce > the problem? > > Thanks! 
> --Junchao Zhang > > > On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear All, >> >> I am interfacing our CFD code (Fortran + OpenACC) to Petsc. >> Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc >> compiler. The Gnu compiler does not work well and we do not have access to >> the Cray compiler. >> >> I already know that the latest version of Petsc does not compile with >> nvhpc, I am therefore using version 3.21. >> I get good results on the CPU both in serial and parallel (MPI). However, >> the GPU implementation, that is what we are interested in, only work >> correctly for the serial version. In parallel, the results are different. >> Even for a CG solve. >> >> I would like to know, if you have experience with the Nvidia compiler. I >> am particularly interested if you have already observed issues with it. >> Your opinion on whether to put further effort into trying to find a bug I >> may have introduced during the interfacing is highly appreciated. >> >> Best, >> >> Herbert Owen >> Senior Researcher, Dpt. Computer Applications in Science and Engineering >> Barcelona Supercomputing Center (BSC-CNS) >> Tel: +34 93 413 4038 >> Skype: herbert.owen >> >> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!bOzafvznhRJly5r11WId0BSmM38vOBG5qnlJMJf02uLM44-t4g7Xm8NCG7h_D7BTAe3ACc19jaFdq7rqqXKl$ >> >> >> >> >> >> >> >> >> >> > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bOzafvznhRJly5r11WId0BSmM38vOBG5qnlJMJf02uLM44-t4g7Xm8NCG7h_D7BTAe3ACc19jaFdq3vxkBC_$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Nov 13 12:48:25 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Nov 2025 13:48:25 -0500 Subject: [petsc-users] Probelm with DMPlexExtractSubMesh In-Reply-To: <1e9e4725-5274-4cef-a035-bb399deaac55@unibas.it> References: <1e9e4725-5274-4cef-a035-bb399deaac55@unibas.it> Message-ID: Sorry, I have been traveling. I just got back to this. The problem is that _everything_ that goes in the submesh has to have the same label value. That way you can distinguish exactly what you want in. However, the boundary label has to make decisions about shared edges and vertices. I am attaching a modified code that does what you want by making a separate label for each side. I apologize for the C. I am just not as quick in Fortran. Thanks, Matt On Thu, Nov 6, 2025 at 1:42?AM Aldo Bonfiglioli wrote: > Dear all, > > I am having troubles in using DMPlexExtractSubMesh to extract the > different strata of the Face Sets of a given mesh. 
> > When run on the enclosed tetrahedral mesh of the unit cube generated with > gmsh > > Face Sets: 6 strata with value/size (1 (246), 2 (246), 3 (246), 4 (246), 5 > (242), 6 (242)) > > I would expect 246 "points" on stratum 3, but when I DMview the subdm (and > plot it) the surface mesh looks incomplete > > DM Object: patch_03 1 MPI process > type: plex > patch_03 in 2 dimensions: > Cells are at height 1 > Number of 0-cells per rank: 122 > Number of 1-cells per rank: 325 > Number of 2-cells per rank: 204 > Number of 3-cells per rank: 204 [204] > Labels: > celltype: 4 strata with value/size (0 (122), 1 (325), 3 (204), 12 (204)) > depth: 4 strata with value/size (0 (122), 1 (325), 2 (204), 3 (204)) > Cell Sets: 1 strata with value/size (1 (204)) > Face Sets: 1 strata with value/size (3 (204)) > Edge Sets: 2 strata with value/size (1 (8), 5 (8)) > > see also patch_03.pdf > > What am I doing wrong? > > A simple reproducer (compiles with petsc-3.24.0) and the gmsh mesh are > enclosed. > > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!aNqQNIAnqfeL74GBiwHA9seVWu0ove-CSJIwX6f353WAN55As1veo1pVXphJIAAgvQIkWls9Xnm5sW-es9gN$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aNqQNIAnqfeL74GBiwHA9seVWu0ove-CSJIwX6f353WAN55As1veo1pVXphJIAAgvQIkWls9Xnm5sYVo0yux$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: ex_submesh.c Type: application/octet-stream Size: 2524 bytes Desc: not available URL: From knepley at gmail.com Thu Nov 13 15:27:44 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Nov 2025 16:27:44 -0500 Subject: [petsc-users] How to map global vector to natural vector In-Reply-To: References: Message-ID: Here is the MR: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8845__;!!G_uCfscf7eWS!ZmoN7SHfAQw89DROdmXJv-lFozEHO_b5_4vXYl19TEES-ofkxjaaeq13Z5aj1-VVMY43qgXqgj5Y1WIH2Mba$ Thanks, Matt On Tue, Oct 21, 2025 at 4:24?PM Matthew Knepley wrote: > I will fix it. > > Thanks, > > Matt > > On Tue, Oct 21, 2025 at 12:09?PM Xu, Donghui via petsc-users < > petsc-users at mcs.anl.gov> wrote: > >> Dear PETSc Team, >> >> I am working with petsc4py for my model. I had some experience of using >> PETSc in Fortran. In Fortran, I used the following subroutines: >> >> call DMPlexCreateNaturalVector(dm, natural, ierr) >> call DMPlexNaturalToGlobalBegin(dm,natural,X,ierr) >> call DMPlexNaturalToGlobalEnd(dm,natural,X,ierr) >> >> However, I found there are no such interfaces in petsc4py. Can you advise >> me on how to get the global vector in natural order with DMPLEX in petsc4py? >> >> Thanks, >> Donghui >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZmoN7SHfAQw89DROdmXJv-lFozEHO_b5_4vXYl19TEES-ofkxjaaeq13Z5aj1-VVMY43qgXqgj5Y1TFMijNU$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZmoN7SHfAQw89DROdmXJv-lFozEHO_b5_4vXYl19TEES-ofkxjaaeq13Z5aj1-VVMY43qgXqgj5Y1TFMijNU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From donghui.xu at pnnl.gov Thu Nov 13 15:28:43 2025 From: donghui.xu at pnnl.gov (Xu, Donghui) Date: Thu, 13 Nov 2025 21:28:43 +0000 Subject: [petsc-users] How to map global vector to natural vector In-Reply-To: References: Message-ID: Thank you, Matt! From: Matthew Knepley Date: Thursday, November 13, 2025 at 1:28?PM To: Xu, Donghui Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] How to map global vector to natural vector Check twice before you click! This email originated from outside PNNL. Here is the MR: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8845__;!!G_uCfscf7eWS!eigadZs4YANflOBAegy-QTp9ga_gDR04kBT3z1MTqB-hieojq_WyFnV_8kjEYFw4cGI5ugeoekJUtEVOobw-94jwWXM$ Thanks, Matt On Tue, Oct 21, 2025 at 4:24?PM Matthew Knepley > wrote: I will fix it. Thanks, Matt On Tue, Oct 21, 2025 at 12:09?PM Xu, Donghui via petsc-users > wrote: Dear PETSc Team, I am working with petsc4py for my model. I had some experience of using PETSc in Fortran. In Fortran, I used the following subroutines: call DMPlexCreateNaturalVector(dm, natural, ierr) call DMPlexNaturalToGlobalBegin(dm,natural,X,ierr) call DMPlexNaturalToGlobalEnd(dm,natural,X,ierr) However, I found there are no such interfaces in petsc4py. Can you advise me on how to get the global vector in natural order with DMPLEX in petsc4py? Thanks, Donghui -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!eigadZs4YANflOBAegy-QTp9ga_gDR04kBT3z1MTqB-hieojq_WyFnV_8kjEYFw4cGI5ugeoekJUtEVOobw-Y8UMmvs$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!eigadZs4YANflOBAegy-QTp9ga_gDR04kBT3z1MTqB-hieojq_WyFnV_8kjEYFw4cGI5ugeoekJUtEVOobw-Y8UMmvs$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Nov 13 17:02:20 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 13 Nov 2025 17:02:20 -0600 Subject: [petsc-users] gpu cpu parallel In-Reply-To: <3d5b96a2.a3e4.19a7e380b1f.Coremail.grantchao2018@163.com> References: <1f9310f.309.19a76fa21d9.Coremail.grantchao2018@163.com> <4E8E7829-A856-495A-ADA3-710C91F8B3EF@petsc.dev> <3d5b96a2.a3e4.19a7e380b1f.Coremail.grantchao2018@163.com> Message-ID: Hi, Grant, I could reproduce the issue with your code. I think petsc code has some problems and I created an issue at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/issues/1826__;!!G_uCfscf7eWS!ZSsk7IMQF7yL-THgMdfh_H3K7F1HUJg38n2dhkaBkJR1IvhSOpfX3c1TZLEL6JDNyCGACV-PEFWtIy-WgsKA8roDoTvm$ . 
Though we should fix it (not sure how for now), I think a much simpler approach is to use CUDA_VISIBLE_DEVICES. For example, if you just want ranks 0, 4 to use GPUs 0, 1 respectively, you can just delete these lines in your example if (global_rank == 0) { cudaSetDevice(0); } else if (global_rank == 4) { cudaSetDevice(1); } Then, instead, just make GPUs 0, 1 visible to ranks 0, 4 respectively upfront, by $ cat set_gpu_device #!/bin/bash # select_gpu_device wrapper script export CUDA_VISIBLE_DEVICES=$((OMPI_COMM_WORLD_LOCAL_RANK/(OMPI_COMM_WORLD_LOCAL_SIZE/2))) exec $* $ mpirun -n 8 ./set_gpu_device ./ex0 [Rank 5] no computation assigned. [Rank 6] no computation assigned. [Rank 7] no computation assigned. [Rank 0] using GPU 0, [line 23]. [Rank 0] using GPU 0, [line 32] after setdevice. [Rank 1] no computation assigned. [Rank 2] no computation assigned. [Rank 3] no computation assigned. [Rank 4] using GPU 0, [line 23]. [Rank 4] using GPU 0, [line 32] after setdevice. [Rank 0] using GPU 0, [line 42] after create A. [Rank 4] using GPU 0, [line 42] after create A. [Rank 4] using GPU 0, [line 46] after set A type. [Rank 0] using GPU 0, [line 46] after set A type. [Rank 0] using GPU 0, [line 50] after MatSetUp. [Rank 4] using GPU 0, [line 50] after MatSetUp. [Rank 0] using GPU 0, [line 63] after Mat Assemble. [Rank 4] using GPU 0, [line 63] after Mat Assemble. Smallest eigenvalue = 100.000000 Smallest eigenvalue = 100.000000 Note for rank 4, GPU 0 is actually the physical GPU 1. Let me know if it works. --Junchao Zhang On Thu, Nov 13, 2025 at 11:17?AM Grant Chao wrote: > Junchao, > We have tried cudaSetDevice. > The test code is attached. 8 cpu and 2 gpu are used. And we create a > gpu_comm including rank 0 and rank 4. > Then we set gpu 0 to rank 0, gpu 1 to rank 1 respectively. > After MatSetType, rank 1 is mapped to gpu0 again. > > The run cmd is > mpirun -n 8 ./a.out -eps_type jd -st_ksp_type gmres -st_pc_type none > > The std out is show below, > [Rank 0] using GPU 0, [line 22]. > [Rank 1] no computation assigned. > [Rank 2] no computation assigned. > [Rank 3] no computation assigned. > [Rank 4] using GPU 0, [line 22]. > [Rank 5] no computation assigned. > [Rank 6] no computation assigned. > [Rank 7] no computation assigned. > [Rank 4] using GPU 1, [line 31] after setdevice. -------- Here set > device successfully > [Rank 0] using GPU 0, [line 31] after setdevice. > [Rank 4] using GPU 1, [line 41] after create A. > [Rank 0] using GPU 0, [line 41] after create A. > [Rank 0] using GPU 0, [line 45] after set A type. > [Rank 4] using GPU 0, [line 45] after set A type. ------ change to 0? > [Rank 4] using GPU 0, [line 49] after MatSetUp. > [Rank 0] using GPU 0, [line 49] after MatSetUp. > [Rank 4] using GPU 0, [line 62] after Mat Assemble. > [Rank 0] using GPU 0, [line 62] after Mat Assemble. 
> Smallest eigenvalue = 100.000000 > Smallest eigenvalue = 100.000000 > > BEST, > Grant > > > > > At 2025-11-13 05:58:05, "Junchao Zhang" wrote: > > A common approach is to use CUDA_VISIBLE_DEVICES to manipulate MPI ranks > to GPUs mapping, see the section at > https://urldefense.us/v3/__https://docs.nersc.gov/jobs/affinity/*gpu-nodes__;Iw!!G_uCfscf7eWS!ZSsk7IMQF7yL-THgMdfh_H3K7F1HUJg38n2dhkaBkJR1IvhSOpfX3c1TZLEL6JDNyCGACV-PEFWtIy-WgsKA8pWxGvch$ > > With OpenMPI, you can use OMPI_COMM_WORLD_LOCAL_RANK in place of > SLURM_LOCALID (see > https://urldefense.us/v3/__https://docs.open-mpi.org/en/v5.0.x/tuning-apps/environment-var.html__;!!G_uCfscf7eWS!ZSsk7IMQF7yL-THgMdfh_H3K7F1HUJg38n2dhkaBkJR1IvhSOpfX3c1TZLEL6JDNyCGACV-PEFWtIy-WgsKA8khuXtvj$ ). > For example, with 8 MPI ranks and 4 GPUs per node, the following script > will map ranks 0, 1 to GPU 0, ranks 2, 3 to GPU 1. > > #!/bin/bash > # select_gpu_device wrapper script > export > CUDA_VISIBLE_DEVICES=$((OMPI_COMM_WORLD_LOCAL_RANK/(OMPI_COMM_WORLD_LOCAL_SIZE/4))) > exec $* > > On Wed, Nov 12, 2025 at 10:20?AM Barry Smith wrote: > >> >> >> On Nov 12, 2025, at 2:31?AM, Grant Chao wrote: >> >> >> Thank you for the suggestion. >> >> We have already tried running multiple CPU ranks with a single GPU. >> However, we observed that as the number of ranks increases, the EPS solver >> becomes significantly slower. We are not sure of the exact cause?could it >> be due to process access contention, hidden data transfers, or perhaps >> another reason? We would be very interested to hear your insight on this >> matter. >> >> To avoid this problem, we used the gpu_comm approach mentioned before. >> During testing, we noticed that the mapping between rank ID and GPU ID >> seems to be set automatically and is not user-specifiable. >> >> For example, with 4 GPUs (0-3) and 8 CPU ranks (0-7), the program binds >> ranks 0 and 4 to GPU 0, ranks 1 and 5 to GPU 1, and so on. >> >> >> >> >> We tested possible solutions, such as calling cudaSetDevice() manually to >> set rank 4 to device 1, but it did not work as expected. Ranks 0 and 4 >> still used GPU 0. >> >> We would appreciate your guidance on how to customize this mapping. Thank >> you for your support. >> >> >> So you have a single compute "node" connected to multiple GPUs? Then >> the mapping of MPI ranks to GPUs doesn't matter and changing it won't >> improve the performance. >> > >> However, we observed that as the number of ranks increases, the EPS >> solver becomes significantly slower. >> >> >> Does the number of EPS "iterations" increase? Run with one, two, four >> and eight MPI ranks (and the same number of "GPUs" (if you only have say >> four GPUs that is fine, just virtualize them so two different MPI ranks >> share one) and the option -log_view and send the output. We need to know >> what is slowing down before trying to find any cure. >> >> Barry >> >> >> >> >> >> Best wishes, >> Grant >> >> >> At 2025-11-12 11:48:47, "Junchao Zhang" , said: >> >> Hi, Wenbo, >> I think your approach should work. But before going this extra step >> with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, >> using nvidia's multiple process service (MPS)? If MPS works well, then >> you can avoid the extra complexity. >> >> --Junchao Zhang >> >> >> On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao >> wrote: >> >>> Dear all, >>> >>> We are trying to solve ksp using GPUs. 
>>> We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which >>> the matrix is created and assembling using COO way provided by PETSc. In >>> this example, the number of CPU is as same as the number of GPU. >>> In our case, computation of the parameters of matrix is performed on >>> CPUs. And the cost of it is expensive, which might take half of total time >>> or even more. >>> >>> We want to use more CPUs to compute parameters in parallel. And a >>> smaller communication domain (such as gpu_comm) for the CPUs corresponding >>> to the GPUs is created. The parameters are computed by all of the CPUs (in >>> MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via >>> MPI. Matrix (type of aijcusparse) is then created and assembled within >>> gpu_comm. Finally, ksp_solve is performed on GPUs. >>> >>> I?m not sure if this approach will work in practice. Are there any >>> comparable examples I can look to for guidance? >>> >>> Best, >>> Wenbo >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Thu Nov 13 21:15:35 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Thu, 13 Nov 2025 21:15:35 -0600 Subject: [petsc-users] Fwd: Fw: gpu cpu parallel In-Reply-To: <7a48f48d.29317.19a80534c97.Coremail.amarantos@126.com> References: <7a48f48d.29317.19a80534c97.Coremail.amarantos@126.com> Message-ID: Glad to hear it works! --Junchao Zhang ---------- Forwarded message --------- From: Grace Date: Thu, Nov 13, 2025 at 9:05?PM Subject: Fw: [petsc-users] gpu cpu parallel To: junchao.zhang at gmail.com Hello, Junchao, Thank you for your prompt help and the detailed solution. We have tested the approach you suggested, using the set_gpu_device wrapper script to control GPU visibility via CUDA_VISIBLE_DEVICES. It works perfectly and now correctly maps the ranks to the intended GPUs as we desired. We really appreciate your guidance in resolving this issue. Best regards, Grace Gao ---- Forwarded Message ---- >From Grant Chao Date 11/14/2025 08:40 To amarantos at 126.com Cc Subject Fw:Re: Re: [petsc-users] gpu cpu parallel -- sent by my netease email phone version -------- Forward mail content -------- From: "Junchao Zhang" Date: 2025-11-14 07:02:20 To: "Grant Chao" CC: "Barry Smith" ,petsc-users Subject: Re: Re: [petsc-users] gpu cpu parallel Hi, Grant, I could reproduce the issue with your code. I think petsc code has some problems and I created an issue at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/issues/1826__;!!G_uCfscf7eWS!cbp-7TpeWxYGe77z37dPkFan7mFckYzyKehvf7UVDZ8djmigGrIvIj5PYrJhgumqnq2Gi5vpjoVuOPymh3NEBptx_yOV$ . Though we should fix it (not sure how for now), I think a much simpler approach is to use CUDA_VISIBLE_DEVICES. For example, if you just want ranks 0, 4 to use GPUs 0, 1 respectively, you can just delete these lines in your example if (global_rank == 0) { cudaSetDevice(0); } else if (global_rank == 4) { cudaSetDevice(1); } Then, instead, just make GPUs 0, 1 visible to ranks 0, 4 respectively upfront, by $ cat set_gpu_device #!/bin/bash # select_gpu_device wrapper script export CUDA_VISIBLE_DEVICES=$((OMPI_COMM_WORLD_LOCAL_RANK/(OMPI_COMM_WORLD_LOCAL_SIZE/2))) exec $* $ mpirun -n 8 ./set_gpu_device ./ex0 [Rank 5] no computation assigned. [Rank 6] no computation assigned. [Rank 7] no computation assigned. [Rank 0] using GPU 0, [line 23]. [Rank 0] using GPU 0, [line 32] after setdevice. [Rank 1] no computation assigned. [Rank 2] no computation assigned. 
[Rank 3] no computation assigned. [Rank 4] using GPU 0, [line 23]. [Rank 4] using GPU 0, [line 32] after setdevice. [Rank 0] using GPU 0, [line 42] after create A. [Rank 4] using GPU 0, [line 42] after create A. [Rank 4] using GPU 0, [line 46] after set A type. [Rank 0] using GPU 0, [line 46] after set A type. [Rank 0] using GPU 0, [line 50] after MatSetUp. [Rank 4] using GPU 0, [line 50] after MatSetUp. [Rank 0] using GPU 0, [line 63] after Mat Assemble. [Rank 4] using GPU 0, [line 63] after Mat Assemble. Smallest eigenvalue = 100.000000 Smallest eigenvalue = 100.000000 Note for rank 4, GPU 0 is actually the physical GPU 1. Let me know if it works. --Junchao Zhang On Thu, Nov 13, 2025 at 11:17?AM Grant Chao wrote: > Junchao, > We have tried cudaSetDevice. > The test code is attached. 8 cpu and 2 gpu are used. And we create a > gpu_comm including rank 0 and rank 4. > Then we set gpu 0 to rank 0, gpu 1 to rank 1 respectively. > After MatSetType, rank 1 is mapped to gpu0 again. > > The run cmd is > mpirun -n 8 ./a.out -eps_type jd -st_ksp_type gmres -st_pc_type none > > The std out is show below, > [Rank 0] using GPU 0, [line 22]. > [Rank 1] no computation assigned. > [Rank 2] no computation assigned. > [Rank 3] no computation assigned. > [Rank 4] using GPU 0, [line 22]. > [Rank 5] no computation assigned. > [Rank 6] no computation assigned. > [Rank 7] no computation assigned. > [Rank 4] using GPU 1, [line 31] after setdevice. -------- Here set > device successfully > [Rank 0] using GPU 0, [line 31] after setdevice. > [Rank 4] using GPU 1, [line 41] after create A. > [Rank 0] using GPU 0, [line 41] after create A. > [Rank 0] using GPU 0, [line 45] after set A type. > [Rank 4] using GPU 0, [line 45] after set A type. ------ change to 0? > [Rank 4] using GPU 0, [line 49] after MatSetUp. > [Rank 0] using GPU 0, [line 49] after MatSetUp. > [Rank 4] using GPU 0, [line 62] after Mat Assemble. > [Rank 0] using GPU 0, [line 62] after Mat Assemble. > Smallest eigenvalue = 100.000000 > Smallest eigenvalue = 100.000000 > > BEST, > Grant > > > > > At 2025-11-13 05:58:05, "Junchao Zhang" wrote: > > A common approach is to use CUDA_VISIBLE_DEVICES to manipulate MPI ranks > to GPUs mapping, see the section at > https://urldefense.us/v3/__https://docs.nersc.gov/jobs/affinity/*gpu-nodes__;Iw!!G_uCfscf7eWS!cbp-7TpeWxYGe77z37dPkFan7mFckYzyKehvf7UVDZ8djmigGrIvIj5PYrJhgumqnq2Gi5vpjoVuOPymh3NEBtfb0PXl$ > > With OpenMPI, you can use OMPI_COMM_WORLD_LOCAL_RANK in place of > SLURM_LOCALID (see > https://urldefense.us/v3/__https://docs.open-mpi.org/en/v5.0.x/tuning-apps/environment-var.html__;!!G_uCfscf7eWS!cbp-7TpeWxYGe77z37dPkFan7mFckYzyKehvf7UVDZ8djmigGrIvIj5PYrJhgumqnq2Gi5vpjoVuOPymh3NEBgxjjYYZ$ ). > For example, with 8 MPI ranks and 4 GPUs per node, the following script > will map ranks 0, 1 to GPU 0, ranks 2, 3 to GPU 1. > > #!/bin/bash > # select_gpu_device wrapper script > export > CUDA_VISIBLE_DEVICES=$((OMPI_COMM_WORLD_LOCAL_RANK/(OMPI_COMM_WORLD_LOCAL_SIZE/4))) > exec $* > > On Wed, Nov 12, 2025 at 10:20?AM Barry Smith wrote: > >> >> >> On Nov 12, 2025, at 2:31?AM, Grant Chao wrote: >> >> >> Thank you for the suggestion. >> >> We have already tried running multiple CPU ranks with a single GPU. >> However, we observed that as the number of ranks increases, the EPS solver >> becomes significantly slower. We are not sure of the exact cause?could it >> be due to process access contention, hidden data transfers, or perhaps >> another reason? We would be very interested to hear your insight on this >> matter. 
>> >> To avoid this problem, we used the gpu_comm approach mentioned before. >> During testing, we noticed that the mapping between rank ID and GPU ID >> seems to be set automatically and is not user-specifiable. >> >> For example, with 4 GPUs (0-3) and 8 CPU ranks (0-7), the program binds >> ranks 0 and 4 to GPU 0, ranks 1 and 5 to GPU 1, and so on. >> >> >> >> >> We tested possible solutions, such as calling cudaSetDevice() manually to >> set rank 4 to device 1, but it did not work as expected. Ranks 0 and 4 >> still used GPU 0. >> >> We would appreciate your guidance on how to customize this mapping. Thank >> you for your support. >> >> >> So you have a single compute "node" connected to multiple GPUs? Then >> the mapping of MPI ranks to GPUs doesn't matter and changing it won't >> improve the performance. >> > >> However, we observed that as the number of ranks increases, the EPS >> solver becomes significantly slower. >> >> >> Does the number of EPS "iterations" increase? Run with one, two, four >> and eight MPI ranks (and the same number of "GPUs" (if you only have say >> four GPUs that is fine, just virtualize them so two different MPI ranks >> share one) and the option -log_view and send the output. We need to know >> what is slowing down before trying to find any cure. >> >> Barry >> >> >> >> >> >> Best wishes, >> Grant >> >> >> At 2025-11-12 11:48:47, "Junchao Zhang" , said: >> >> Hi, Wenbo, >> I think your approach should work. But before going this extra step >> with gpu_comm, have you tried to map multiple MPI ranks (CPUs) to one GPU, >> using nvidia's multiple process service (MPS)? If MPS works well, then >> you can avoid the extra complexity. >> >> --Junchao Zhang >> >> >> On Tue, Nov 11, 2025 at 7:50?PM Wenbo Zhao >> wrote: >> >>> Dear all, >>> >>> We are trying to solve ksp using GPUs. >>> We found the example, src/ksp/ksp/tutorials/bench_kspsolve.c, in which >>> the matrix is created and assembling using COO way provided by PETSc. In >>> this example, the number of CPU is as same as the number of GPU. >>> In our case, computation of the parameters of matrix is performed on >>> CPUs. And the cost of it is expensive, which might take half of total time >>> or even more. >>> >>> We want to use more CPUs to compute parameters in parallel. And a >>> smaller communication domain (such as gpu_comm) for the CPUs corresponding >>> to the GPUs is created. The parameters are computed by all of the CPUs (in >>> MPI_COMM_WORLD). Then, the parameters are send to gpu_comm related CPUs via >>> MPI. Matrix (type of aijcusparse) is then created and assembled within >>> gpu_comm. Finally, ksp_solve is performed on GPUs. >>> >>> I?m not sure if this approach will work in practice. Are there any >>> comparable examples I can look to for guidance? >>> >>> Best, >>> Wenbo >>> >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From herbert.owen at bsc.es Fri Nov 14 09:08:24 2025 From: herbert.owen at bsc.es (howen) Date: Fri, 14 Nov 2025 16:08:24 +0100 Subject: [petsc-users] Petsc + nvhpc In-Reply-To: References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> <2A940B06-46A3-40AB-BD66-835C46970CA1@bsc.es> Message-ID: Thank you very much Matthew, I did what you suggested and I also added ierr = MatView(*amat, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); Now that I can see the matrices I notice that some values differ. I will debug and simplify my code to try to understand where the difference comes from . 
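A minimal sketch of how each run's assembled operator could also be dumped to a PETSc binary file for offline comparison (the helper name, the file name, and where it would be called are only placeholders here, not part of the actual sod2d code):

#include <petscmat.h>

/* Sketch: write a matrix to a binary file so the CPU and GPU runs
   can be dumped under different names and compared offline. */
static PetscErrorCode DumpMatrixBinary(Mat A, const char *path)
{
  PetscViewer viewer;

  PetscFunctionBeginUser;
  PetscCall(PetscViewerBinaryOpen(PetscObjectComm((PetscObject)A), path, FILE_MODE_WRITE, &viewer));
  PetscCall(MatView(A, viewer));
  PetscCall(PetscViewerDestroy(&viewer));
  PetscFunctionReturn(PETSC_SUCCESS);
}

Loading the two files afterwards with MatLoad() and subtracting them with MatAXPY() would then show exactly which entries differ between the two builds.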
As soon as I have a more clear picture I will contact you back. Best, Herbert Owen Senior Researcher, Dpt. Computer Applications in Science and Engineering Barcelona Supercomputing Center (BSC-CNS) Tel: +34 93 413 4038 Skype: herbert.owen https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnGxsT7iF2$ > On 13 Nov 2025, at 18:23, Matthew Knepley wrote: > > On Thu, Nov 13, 2025 at 12:11?PM howen via petsc-users > wrote: >> Dear Junchao, >> >> Thank you for response and sorry for taking so long to answer back. >> I cannot avoid using the nvidia tools. Gfortran is not mature for OpenACC and gives us problems when compiling our code. >> What I have done to enable using the latest petsc is to create my own C code to call petsc. >> I have little experience with c and it took me some time, but I can now use petsc 3.24.1 ;) >> >> The behaviour remains the same as in my original email . >> Parallel+GPU gives bad results. CPU(serial and parallel) and GPU serial all work ok and give the same result. >> >> I have gone a bit into petsc comparing the CPU and GPU version with 2 mpi. >> I see that the difference starts in >> src/ksp/ksp/impls/cg/cg.c L170 >> PetscCall(KSP_PCApply(ksp, R, Z)); /* z <- Br */ >> I have printed the vectors R and Z and the norm dp. >> R is identical on both CPU and GPU; but Z differs. >> The correct value of dp (for the first time it enters) is 14.3014, while running on the GPU with 2 mpis it gives 14.7493. >> If you wish I can send you prints I introduced in cg.c > > Thank you for all the detail in this report. However, since you see a problem in KSPCG, I believe we can reduce the complexity. You can use > > -ksp_view_mat binary:A.bin -ksp_view_rhs binary:b.bin > > and send us those files. Then we can run your system directly using KSP ex10 (and so can you). > > Thanks, > > Matt > >> The folder with the input files to run the case can be downloaded from https://urldefense.us/v3/__https://b2drop.eudat.eu/s/wKRQ4LK7RTKz2iQ__;!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnG1yKXAMP$ >> >> For submitting the gpu run I use >> mpirun -np 2 --map-by ppr:4:node:PE=20 --report-bindings ./mn5_bind.sh /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_gpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json >> >> For the cpu run >> mpirun -np 2 /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_cpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json >> >> Our code can be downloaded with : >> git clone --recursive https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab.git__;!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnG8xQvHi_$ >> >> -and the branch I am using with >> git checkout 140-add-petsc >> >> To use exactly the same commit I am using >> git checkout 09a923c9b57e46b14ae54b935845d50272691ace >> >> >> I am currently using: Currently Loaded Modules: >> 1) nvidia-hpc-sdk/25.1 2) hdf5/1.14.1-2-nvidia-nvhpcx 3) cmake/3.25.1 >> I guess/hope similar modules should be available in any supercomputer. 
>> >> To build the cpu version >> mkdir build_cpu >> cd build_cpu >> >> export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241_cpu/hhinstal >> export LD_LIBRARY_PATH=$PETSC_INSTALL/lib:$LD_LIBRARY_PATH >> export LIBRARY_PATH=$PETSC_INSTALL/lib:$LIBRARY_PATH >> export C_INCLUDE_PATH=$PETSC_INSTALL/include:$C_INCLUDE_PATH >> export CPLUS_INCLUDE_PATH=$PETSC_INSTALL/include:$CPLUS_INCLUDE_PATH >> export PKG_CONFIG_PATH=$PETSC_INSTALL/lib/pkgconfig:$PKG_CONFIG_PATH >> >> cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=OFF -DDEBUG_MODE=OFF .. >> make -j 80 >> >> I have built petsc myself as follows >> >> git clone -b release https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnG9OyCmiL$ petsc >> cd petsc >> git checkout v3.24.1 >> module purge >> module load nvidia-hpc-sdk/25.1 hdf5/1.14.1-2-nvidia-nvhpcx cmake/3.25.1 >> ./configure --PETSC_DIR=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/petsc --prefix=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal --with-fortran-bindings=0 --with-fc=0 --with-petsc-arch=linux-x86_64-opt --with-scalar-type=real --with-debugging=yes --with-64-bit-indices=1 --with-precision=single --download-hypre CFLAGS=-I/apps/ACC/HDF5/1.14.1-2/NVIDIA/NVHPCX/include CXXFLAGS= FCFLAGS= --with-shared-libraries=1 --with-mpi=1 --with-blacs-lib=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/lib/intel64/libmkl_blacs_openmpi_lp64.a --with-blacs-include=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/include --with-mpi-dir=/apps/ACC/NVIDIA-HPC-SDK/25.1/Linux_x86_64/25.1/comm_libs/12.6/hpcx/latest/ompi/ --download-ptscotch=yes --download-metis --download-parmetis >> make all check >> make install >> >> ------------------- >> For the GPU version when configuring petsc I add : --with-cuda >> >> I then change the export PETSC_INSTALL to >> export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal >> and repeat all other exports >> >> mkdir build_gpu >> cd build_gpu >> cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=ON -DDEBUG_MODE=OFF .. >> make -j 80 >> >> As you can see from the submit instructions the executable is found in sod2d_gitlab/build_gpu/src/app_sod2d/sod2d >> >> I hope I have not forgotten anything and my instructions are 'easy' to follow. If you have any issue do not doubt to contact me. >> The wiki for our code can be found in https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab/-/wikis/home__;!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnG49E2dbs$ >> >> Best, >> >> Herbert Owen >> >> Herbert Owen >> Senior Researcher, Dpt. Computer Applications in Science and Engineering >> Barcelona Supercomputing Center (BSC-CNS) >> Tel: +34 93 413 4038 >> Skype: herbert.owen >> >> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnGxsT7iF2$ >> >> >> >> >> >> >> >> >>> On 16 Oct 2025, at 18:30, Junchao Zhang > wrote: >>> >>> Hi, Herbert, >>> I don't have much experience on OpenACC and PETSc CI doesn't have such tests. Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code? If you, then you can use the latest petsc code and make our debugging easier. >>> Also, could you provide us with a test and instructions to reproduce the problem? >>> >>> Thanks! 
>>> --Junchao Zhang >>> >>> >>> On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users > wrote: >>>> Dear All, >>>> >>>> I am interfacing our CFD code (Fortran + OpenACC) to Petsc. >>>> Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc compiler. The Gnu compiler does not work well and we do not have access to the Cray compiler. >>>> >>>> I already know that the latest version of Petsc does not compile with nvhpc, I am therefore using version 3.21. >>>> I get good results on the CPU both in serial and parallel (MPI). However, the GPU implementation, that is what we are interested in, only work correctly for the serial version. In parallel, the results are different. Even for a CG solve. >>>> >>>> I would like to know, if you have experience with the Nvidia compiler. I am particularly interested if you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing is highly appreciated. >>>> >>>> Best, >>>> >>>> Herbert Owen >>>> Senior Researcher, Dpt. Computer Applications in Science and Engineering >>>> Barcelona Supercomputing Center (BSC-CNS) >>>> Tel: +34 93 413 4038 >>>> Skype: herbert.owen >>>> >>>> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnGxsT7iF2$ >>>> >>>> >>>> >>>> >>>> >>>> >>>> >>>> >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dLyvJQfp1uwWG0xt9W0Mel0ZD1L7iEt2qnUs6XfEM-gvuPIwdzmzUE8dkvjDoYsah4-z0d0W6hCI9jZ_17GnG02oLPV3$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Nov 14 09:45:13 2025 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 14 Nov 2025 10:45:13 -0500 Subject: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI In-Reply-To: <2980bc3b.63f0.19a824df26a.Coremail.202321009113@mail.scut.edu.cn> References: <71f7ca4b.59c6.19a724c3c2c.Coremail.ctchengben@mail.scut.edu.cn> <823C2320-52A3-4679-8BB2-26DA296E4ACA@petsc.dev> <46e195ab.5cea.19a779d6908.Coremail.202321009113@mail.scut.edu.cn> <6de01bb0.5fe2.19a7c6bb95f.Coremail.202321009113@mail.scut.edu.cn> <84886CB9-D3B8-433C-943B-31E85A47C3B3@petsc.dev> <2980bc3b.63f0.19a824df26a.Coremail.202321009113@mail.scut.edu.cn> Message-ID: <9A855973-12C8-4A76-B898-A69AD14962D2@petsc.dev> The C preprocessor may be failing on the complicated gymnastics of PETSC_DEPRECATED_FUNCTION My conclusion is PETSc is unbuildable on the version of the Microsoft compilers you are using, and you need to upgrade to the latest Microsoft compilers. The Microsoft C compiler has never been properly standard-compliant (and proudly so), so it can fail on correct C code. Barry > On Nov 14, 2025, at 7:18?AM, ?? <202321009113 at mail.scut.edu.cn> wrote: > >> Hi Barry >> >> >> >> Thanks for your advice. >> I follow your advice and configure and make again. 
>> >> /****************************************** >> ./configure --with-debugging=0 --with-cc="win32fe_cl" --with-fc=0 --with-cxx="win32fe_cl" --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-f5e3aab04fd5.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices -CFLAGS='-O2 -MD -wd4996' -CXXFLAGS='-O2 -MD -wd4996' >> *******************************************/ >> >> >> It happen to be another problems when performing the make all. >> >> >> >> The new configure.log and make.log is attached below. >> >> Sorry for bother you so many times but I wish your can help me again. >> >> >> Looking forward your reply! >> >> sinserely, >> Cheng. >> > > > > -----????----- > ???: "Barry Smith" > > ????: 2025-11-13 23:05:05 (???) > ???: ?? <202321009113 at mail.scut.edu.cn > > ??: petsc-users > > ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI > > > Change > > __attribute__((packed)) > > to > > /* __attribute__((packed)) */ > > in include/petscmath.h > > and run make again. > > I think you should install a new version of Microsoft's compilers etc. > > Barry > > >> On Nov 13, 2025, at 3:53?AM, ?? <202321009113 at mail.scut.edu.cn > wrote: >> >> Hi Barry >> >> >> >> Thanks for your advice. >> I use AI help me that change the file on the petsc-3.24.1/arch-mswin-c-opt/externalpackages/petsc-pkg-parmetis-f5e3aab04fd5/headers/gk_arch. 
>> >> >> The change is from: >> >> #ifdef __MSC__ >> #include "ms_stdint.h" >> #include "ms_inttypes.h" >> #include "ms_stat.h" >> #else >> #ifndef SUNOS >> #include >> #endif >> #if !defined(WIN32) && !defined(__MINGW32__) >> #include >> #endif >> #include >> #include >> #include >> #endif >> >> To: >> >> #if (defined(__MSC__) || defined(_MSC_VER)) && defined(_MSC_VER) && _MSC_VER < 1900 >> #include "ms_stdint.h" >> #include "ms_inttypes.h" >> #include "ms_stat.h" >> #else >> #ifndef SUNOS >> #include >> #endif >> #if !defined(WIN32) && !defined(__MINGW32__) && !defined(_MSC_VER) >> #include >> #endif >> #include >> #include >> #if !defined(_MSC_VER) >> #include >> #endif >> #endif >> >> >> >> Then I configure the PETSc: >> >> ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-f5e3aab04fd5.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices >> >> It seems good, but then I make it >> >> it have the error: >> >> >> >> make[3]: *** [gmakefile:211: arch-mswin-c-opt/obj/src/sys/objects/device/interface/mark_dcontext.o] Error 2 >> make[3]: Leaving directory '/cygdrive/g/mypetsc/petsc-3.24.1' >> make[2]: *** [/cygdrive/g/mypetsc/petsc-3.24.1/lib/petsc/conf/rules_doc.mk:5: libs] Error 2 >> make[2]: Leaving directory '/cygdrive/g/mypetsc/petsc-3.24.1' >> **************************ERROR************************************* >> Error during compile, check arch-mswin-c-opt/lib/petsc/conf/make.log >> Send it and arch-mswin-c-opt/lib/petsc/conf/configure.log to petsc-maint at mcs.anl.gov >> ******************************************************************** >> make[1]: *** [makefile:44: all] Error 1 >> make: *** [GNUmakefile:9: all] Error 2 >> >> >> The new configure.log and make.log is attached below. >> >> I don't know if it is caused by the change I made or the other problems. >> >> >> >> So I ask for your help again. >> Looking forward your reply! >> >> >> sinserely, >> Cheng. >> >> >> >> >> >> >> >> -----????----- >> ???: "Barry Smith" > >> ????: 2025-11-13 00:03:50 (???) >> ???: ?? 
<202321009113 at mail.scut.edu.cn > >> ??: PETSc > >> ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI >> >> G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types >> G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t' >> G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types >> G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t' >> G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(49): warning C4005: 'INT8_MIN': macro redefinition >> G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(107): note: see previous definition of 'INT8_MIN' >> >> Parmetis has its own definitions for many C standard types, etc in headers\ms_stdint.h that duplicate what is available in stdint.h on Unix systems. Normally, this gets included when __MSC_ is defined instead of stdint.h (in gk_arch.h). >> >> But for some reason, with your system it appears that Microsoft's stdint.h is also getting included; presumably brought in through some other system include file since it is only included in one place. >> >> $ git grep stdint.h >> headers/gk_arch.h: #include "ms_stdint.h" >> headers/gk_arch.h: #include >> headers/ms_inttypes.h:#include "ms_stdint.h" >> headers/ms_stdint.h:// ISO C9x compliant stdint.h for Microsoft Visual Studio >> >> You have a fairly old VisualStudio, 2022. Can you upgrade to the latest? Let us know if this resolves the problem. >> >> Barry >> >> >> >> >> >> >> >> >> >> >> On Nov 12, 2025, at 5:29?AM, ?? <202321009113 at mail.scut.edu.cn > wrote: >> >> Hi Barry >> Thanks for your reply. >> I check the package parmetis,and the "petsc-pkg-parmetis-45100eac9301.tar.gz" is form https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/v4.0.3.tar.gz__;!!G_uCfscf7eWS!dCHHyY-VxDm-ywDHjCTx7TuexZfZSpNvZITmJoKuqThj3NRnYxGB_lEcLPxzGRFkPS8_uyJCqoS_NU4txpTE0xQ$ . So I made a mistake about the package. 
>> Then I download the package form https://urldefense.us/v3/__https://bitbucket.org/petsc/pkg-parmetis/get/v4.0.3-p9.tar.gz__;!!G_uCfscf7eWS!dCHHyY-VxDm-ywDHjCTx7TuexZfZSpNvZITmJoKuqThj3NRnYxGB_lEcLPxzGRFkPS8_uyJCqoS_NU4twjES1lo$ and it is "petsc-pkg-parmetis-f5e3aab04fd5.tar.gz" >> >> >> Then the compiler option in configuration is: >> ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-f5e3aab04fd5.tar.gz --with-strict-petscerrorcode=0 --with-64-bit-indices >> >> but it still have the same error: >> ********************************************************************************************* >> ============================================================================================= >> ============================================================================================= >> Configuring PARMETIS with CMake; this may take several minutes >> ============================================================================================= >> ============================================================================================= >> Compiling and installing PARMETIS; this may take several minutes >> ============================================================================================= >> >> >> ********************************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> --------------------------------------------------------------------------------------------- >> Error running make on PARMETIS >> >> >> ********************************************************************************************* >> >> >> The new configure.log is attached below. >> So I ask for your help again. >> Looking forward your reply! >> >> >> sinserely, >> Cheng. >> >> >> >> >> -----????----- >> ???: "Barry Smith" > >> ????: 2025-11-11 23:29:01 (???) >> ???: "Matthew Knepley" > >> ??: ?? >, petsc-users at mcs.anl.gov >> ??: Re: [petsc-users] Error in configuring PETSc with Cygwin on Windows by using MS-MPI >> >> >> Where/how did you obtain /cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz ? Was it from PETSc ./configure? >> >> self.version = '4.0.3' >> self.versionname = 'PARMETIS_MAJOR_VERSION.PARMETIS_MINOR_VERSION.PARMETIS_SUBMINOR_VERSION' >> self.gitcommit = 'v'+self.version+'-p9' >> self.download = ['git://https://bitbucket.org/petsc/pkg-parmetis.git','https://bitbucket.org/petsc/pkg-parmetis/get/'+self.gitcommit+'.tar.gz '] >> >> >> >> On Nov 11, 2025, at 7:35?AM, Matthew Knepley > wrote: >> >> On Tue, Nov 11, 2025 at 4:44?AM ?? > wrote: >> Hello, >> Recently I try to install PETSc with Cygwin since I'd like to use PETSc with Visual Studio on Windows10 plateform.For the sake of clarity, I firstly list the softwares/packages used below: >> 1. PETSc: version 3.14.1 >> 2. VS: version 2022 >> 3. MS MPI: download Microsoft MPI v10.1.2 >> 4. Cygwin >> >> Quick question: Have you considered installing on WSL? I have had much better luck with that on Windows. 
>> >> This seems to be an incompatibility of ParMetis Windows support and your version: >> >> G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(37): error C2371: 'int_fast16_t': redefinition; different basic types^M >> G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(80): note: see declaration of 'int_fast16_t'^M >> G:\VisualStudio2022\VC\Tools\MSVC\14.37.32822\include\stdint.h(41): error C2371: 'uint_fast16_t': redefinition; different basic types^M >> G:\mypetsc\petsc-3.24.1\arch-mswin-c-opt\externalpackages\petsc-pkg-parmetis-f5e3aab04fd5\headers\ms_stdint.h(84): note: see declaration of 'uint_fast16_t'^M >> >> Thanks, >> >> Matt >> >> And the compiler option in configuration is: >> ./configure --with-debugging=0 --with-cc=cl --with-fc=0 --with-cxx=cl >> --download-f2cblaslapack=/cygdrive/g/mypetsc/f2cblaslapack-3.8.0.q2.tar.gz >> --with-mpi-include=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Include,/cygdrive/g/MSmpi/MicrosoftSDKs/Include/x64\] >> --with-mpi-lib=\[/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpifec.lib,/cygdrive/g/MSmpi/MicrosoftSDKs/Lib/x64/msmpi.lib\] >> --with-mpiexec=/cygdrive/g/MSmpi/MicrosoftMPI/Bin/mpiexec >> --download-metis=/cygdrive/g/mypetsc/petsc-pkg-metis-69fb26dd0428.tar.gz >> --download-parmetis=/cygdrive/g/mypetsc/petsc-pkg-parmetis-45100eac9301.tar.gz >> --with-strict-petscerrorcode=0 --with-64-bit-indices --download-hdf5=/cygdrive/g/mypetsc/hdf5-1.14.3-p1.tar.bz2 >> >> >> >> >> >> >> but there return an error: >> ********************************************************************************************* >> ============================================================================================= >> ============================================================================================= >> Configuring PARMETIS with CMake; this may take several minutes >> ============================================================================================= >> ============================================================================================= >> Compiling and installing PARMETIS; this may take several minutes >> ============================================================================================= >> >> >> ********************************************************************************************* >> UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): >> --------------------------------------------------------------------------------------------- >> Error running make on PARMETIS >> >> >> ********************************************************************************************* >> >> >> The configure.log is attached below. >> So I write this email to report my problem and ask for your help. >> Looking forward your reply! >> >> >> sinserely, >> Cheng. >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dCHHyY-VxDm-ywDHjCTx7TuexZfZSpNvZITmJoKuqThj3NRnYxGB_lEcLPxzGRFkPS8_uyJCqoS_NU4tpGoejyM$ >> >> > > > ?? -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure.log Type: application/octet-stream Size: 1387907 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... 
URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: application/octet-stream Size: 137025 bytes Desc: not available URL: -------------- next part -------------- An HTML attachment was scrubbed... URL: From yin.shi1 at icloud.com Sat Nov 15 08:17:23 2025 From: yin.shi1 at icloud.com (Yin Shi) Date: Sat, 15 Nov 2025 22:17:23 +0800 Subject: [petsc-users] solveBackward in parallel Message-ID: Dear Developers, In short, I need to explicitly use A.solveBackward(b, x) in parallel with petsc4py, where A is a Cholesky factored matrix, but it seems that this is not supported (e.g., for mumps and superlu_dist factorization solver backend). Is it possible to work around this? In detail, the problem I need to solve is to generate a set of correlated random numbers (denoted by a vector, w) from an uncorrelated one (denoted by a vector n). Denote the covariance matrix of n as C (symmetric). One needs to first factorize C, C = L L^T, and then solve the linear system L^T w = n for w in parallel. Is it possible to reformulate this problem for it to be implemented using petsc4py? Thank you! Yin From bsmith at petsc.dev Sat Nov 15 18:59:33 2025 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 15 Nov 2025 19:59:33 -0500 Subject: [petsc-users] solveBackward in parallel In-Reply-To: References: Message-ID: <5B2C79DF-736E-4113-AF9B-D8A40B64C192@petsc.dev> It appears that only MATSOLVERMKL_CPARDISO provides a parallel backward solve currently. The only seperation of forward and backward solves in MUMPS appears to be provided with (from its users manual) A special case is the one where the forward elimination step is performed during factorization (see Subsection 3.8), instead of during the solve phase. This allows accessing the L factors right after they have been computed, with a better locality, and can avoid writing the L factors to disk in an out-of-core context. In this case (forward > On Nov 15, 2025, at 9:17?AM, Yin Shi via petsc-users wrote: > > Dear Developers, > > In short, I need to explicitly use A.solveBackward(b, x) in parallel with petsc4py, where A is a Cholesky factored matrix, but it seems that this is not supported (e.g., for mumps and superlu_dist factorization solver backend). Is it possible to work around this? > > In detail, the problem I need to solve is to generate a set of correlated random numbers (denoted by a vector, w) from an uncorrelated one (denoted by a vector n). Denote the covariance matrix of n as C (symmetric). One needs to first factorize C, C = L L^T, and then solve the linear system L^T w = n for w in parallel. Is it possible to reformulate this problem for it to be implemented using petsc4py? > > Thank you! > Yin -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Mon Nov 17 01:50:59 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 17 Nov 2025 08:50:59 +0100 Subject: [petsc-users] Probelm with DMPlexExtractSubMesh In-Reply-To: References: <1e9e4725-5274-4cef-a035-bb399deaac55@unibas.it> Message-ID: <13143062-44a5-4855-9929-47bfa668431f@unibas.it> On 11/13/25 19:48, Matthew Knepley wrote: > Sorry, I have been traveling. I just got back to this. > > The problem is that _everything_ that goes in the submesh has to have > the same label value. That way you can distinguish?exactly what?you > want in. However, the boundary label has to make decisions about > shared edges and vertices. 
I am attaching a modified code that does > what you want by making a separate label for each side. > > I apologize for the C. I am just not as quick in Fortran. > > Thanks, > > Matt > > On Thu, Nov 6, 2025 at 1:42 AM Aldo Bonfiglioli > wrote: > > Dear all, > > I am having trouble using DMPlexExtractSubMesh to extract the > different strata of the Face Sets of a given mesh. > > When run on the enclosed tetrahedral mesh of the unit cube > generated with gmsh > >> Face Sets: 6 strata with value/size (1 (246), 2 (246), 3 (246), 4 >> (246), 5 (242), 6 (242)) >> > I would expect 246 "points" on stratum 3, but when I DMView the > subdm (and plot it) the surface mesh looks incomplete > >> DM Object: patch_03 1 MPI process >> type: plex >> patch_03 in 2 dimensions: >> Cells are at height 1 >> Number of 0-cells per rank: 122 >> Number of 1-cells per rank: 325 >> Number of 2-cells per rank: 204 >> Number of 3-cells per rank: 204 [204] >> Labels: >> celltype: 4 strata with value/size (0 (122), 1 (325), 3 (204), 12 >> (204)) >> depth: 4 strata with value/size (0 (122), 1 (325), 2 (204), 3 (204)) >> Cell Sets: 1 strata with value/size (1 (204)) >> Face Sets: 1 strata with value/size (3 (204)) >> Edge Sets: 2 strata with value/size (1 (8), 5 (8)) >> > see also patch_03.pdf > > What am I doing wrong? > > A simple reproducer (compiles with petsc-3.24.0) and the gmsh mesh > are enclosed. > > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Mechanics > Dipartimento di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!abF6mWj2v5ivSvUET6QFY34S5Jw6daKMHiS5E9ztz2YbV2jQPr-0WGi09d7IEArZlAwqdLwjjsQeUl2PlNwJMcq6AAnRpMwHc3Q$ > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!abF6mWj2v5ivSvUET6QFY34S5Jw6daKMHiS5E9ztz2YbV2jQPr-0WGi09d7IEArZlAwqdLwjjsQeUl2PlNwJMcq6AAnRT1dw3uY$ > Matthew, thank you for providing the working C code. I will be back to you in case I need further advice. Regards, Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Mechanics Dipartimento di Ingegneria Universita' della Basilicata V.le dell'Ateneo Lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!abF6mWj2v5ivSvUET6QFY34S5Jw6daKMHiS5E9ztz2YbV2jQPr-0WGi09d7IEArZlAwqdLwjjsQeUl2PlNwJMcq6AAnRpMwHc3Q$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From herbert.owen at bsc.es Mon Nov 17 09:40:16 2025 From: herbert.owen at bsc.es (howen) Date: Mon, 17 Nov 2025 16:40:16 +0100 Subject: [petsc-users] Petsc + nvhpc In-Reply-To: References: <0B70CA06-D787-4D97-8E33-27E71D08BBF0@bsc.es> <2A940B06-46A3-40AB-BD66-835C46970CA1@bsc.es> Message-ID: <50E23080-79B1-4EF6-BE7C-9527978EFF8A@bsc.es> Dear Matthew and Junchao, I finally found my error; now everything works fine. I was a bit stuck at some moment and your small comments were very helpful. Thanks!!! Herbert Owen Senior Researcher, Dpt.
Computer Applications in Science and Engineering Barcelona Supercomputing Center (BSC-CNS) Tel: +34 93 413 4038 Skype: herbert.owen https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRfz8esh1$ > On 14 Nov 2025, at 16:08, howen wrote: > > Thank you very much Matthew, > > I did what you suggested and I also added > > ierr = MatView(*amat, PETSC_VIEWER_STDOUT_WORLD); CHKERRQ(ierr); > > Now that I can see the matrices I notice that some values differ. I will debug and simplify my code to try to understand where the difference comes from . > > As soon as I have a more clear picture I will contact you back. > > Best, > > > Herbert Owen > Senior Researcher, Dpt. Computer Applications in Science and Engineering > Barcelona Supercomputing Center (BSC-CNS) > Tel: +34 93 413 4038 > Skype: herbert.owen > > https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRfz8esh1$ > > > > > > > > >> On 13 Nov 2025, at 18:23, Matthew Knepley wrote: >> >> On Thu, Nov 13, 2025 at 12:11?PM howen via petsc-users > wrote: >>> Dear Junchao, >>> >>> Thank you for response and sorry for taking so long to answer back. >>> I cannot avoid using the nvidia tools. Gfortran is not mature for OpenACC and gives us problems when compiling our code. >>> What I have done to enable using the latest petsc is to create my own C code to call petsc. >>> I have little experience with c and it took me some time, but I can now use petsc 3.24.1 ;) >>> >>> The behaviour remains the same as in my original email . >>> Parallel+GPU gives bad results. CPU(serial and parallel) and GPU serial all work ok and give the same result. >>> >>> I have gone a bit into petsc comparing the CPU and GPU version with 2 mpi. >>> I see that the difference starts in >>> src/ksp/ksp/impls/cg/cg.c L170 >>> PetscCall(KSP_PCApply(ksp, R, Z)); /* z <- Br */ >>> I have printed the vectors R and Z and the norm dp. >>> R is identical on both CPU and GPU; but Z differs. >>> The correct value of dp (for the first time it enters) is 14.3014, while running on the GPU with 2 mpis it gives 14.7493. >>> If you wish I can send you prints I introduced in cg.c >> >> Thank you for all the detail in this report. However, since you see a problem in KSPCG, I believe we can reduce the complexity. You can use >> >> -ksp_view_mat binary:A.bin -ksp_view_rhs binary:b.bin >> >> and send us those files. Then we can run your system directly using KSP ex10 (and so can you). 
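(A quick way to cross-check the step that differs is to reproduce the Jacobi application z = diag(A)^{-1} r outside of KSPSolve. The sketch below is written in petsc4py only for brevity; the names A, r and z are placeholders for the operator, the residual and the preconditioned vector printed from cg.c, and are not taken from the code above.)

from petsc4py import PETSc

# reference Jacobi application, independent of the KSP internals
d = A.getDiagonal()            # Vec holding diag(A)
z_ref = r.duplicate()
z_ref.pointwiseDivide(r, d)    # z_ref[i] = r[i] / d[i]
z_ref.axpy(-1.0, z)            # z_ref <- z_ref - z
print(z_ref.norm())            # should be near machine precision on CPU and GPU alike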
>> >> Thanks, >> >> Matt >> >>> The folder with the input files to run the case can be downloaded from https://urldefense.us/v3/__https://b2drop.eudat.eu/s/wKRQ4LK7RTKz2iQ__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRSYLlx3K$ >>> >>> For submitting the gpu run I use >>> mpirun -np 2 --map-by ppr:4:node:PE=20 --report-bindings ./mn5_bind.sh /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_gpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json >>> >>> For the cpu run >>> mpirun -np 2 /gpfs/scratch/bsc21/bsc021257/git/140-add-petsc/sod2d_gitlab/build_cpu/src/app_sod2d/sod2d ChannelFlowSolverIncomp.json >>> >>> Our code can be downloaded with : >>> git clone --recursive https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab.git__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRQCr0eq8$ >>> >>> -and the branch I am using with >>> git checkout 140-add-petsc >>> >>> To use exactly the same commit I am using >>> git checkout 09a923c9b57e46b14ae54b935845d50272691ace >>> >>> >>> I am currently using: Currently Loaded Modules: >>> 1) nvidia-hpc-sdk/25.1 2) hdf5/1.14.1-2-nvidia-nvhpcx 3) cmake/3.25.1 >>> I guess/hope similar modules should be available in any supercomputer. >>> >>> To build the cpu version >>> mkdir build_cpu >>> cd build_cpu >>> >>> export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241_cpu/hhinstal >>> export LD_LIBRARY_PATH=$PETSC_INSTALL/lib:$LD_LIBRARY_PATH >>> export LIBRARY_PATH=$PETSC_INSTALL/lib:$LIBRARY_PATH >>> export C_INCLUDE_PATH=$PETSC_INSTALL/include:$C_INCLUDE_PATH >>> export CPLUS_INCLUDE_PATH=$PETSC_INSTALL/include:$CPLUS_INCLUDE_PATH >>> export PKG_CONFIG_PATH=$PETSC_INSTALL/lib/pkgconfig:$PKG_CONFIG_PATH >>> >>> cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=OFF -DDEBUG_MODE=OFF .. >>> make -j 80 >>> >>> I have built petsc myself as follows >>> >>> git clone -b release https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRZKzWAoJ$ petsc >>> cd petsc >>> git checkout v3.24.1 >>> module purge >>> module load nvidia-hpc-sdk/25.1 hdf5/1.14.1-2-nvidia-nvhpcx cmake/3.25.1 >>> ./configure --PETSC_DIR=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/petsc --prefix=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal --with-fortran-bindings=0 --with-fc=0 --with-petsc-arch=linux-x86_64-opt --with-scalar-type=real --with-debugging=yes --with-64-bit-indices=1 --with-precision=single --download-hypre CFLAGS=-I/apps/ACC/HDF5/1.14.1-2/NVIDIA/NVHPCX/include CXXFLAGS= FCFLAGS= --with-shared-libraries=1 --with-mpi=1 --with-blacs-lib=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/lib/intel64/libmkl_blacs_openmpi_lp64.a --with-blacs-include=/gpfs/apps/MN5/ACC/ONEAPI/2025.1/mkl/2025.1/include --with-mpi-dir=/apps/ACC/NVIDIA-HPC-SDK/25.1/Linux_x86_64/25.1/comm_libs/12.6/hpcx/latest/ompi/ --download-ptscotch=yes --download-metis --download-parmetis >>> make all check >>> make install >>> >>> ------------------- >>> For the GPU version when configuring petsc I add : --with-cuda >>> >>> I then change the export PETSC_INSTALL to >>> export PETSC_INSTALL=/gpfs/scratch/bsc21/bsc021257/git/petsc_oct25/3241/hhinstal >>> and repeat all other exports >>> >>> mkdir build_gpu >>> cd build_gpu >>> cmake -DUSE_RP=8 -DUSE_PORDER=3 -DUSE_PETSC=ON -DUSE_GPU=ON -DDEBUG_MODE=OFF .. 
>>> make -j 80 >>> >>> As you can see from the submit instructions the executable is found in sod2d_gitlab/build_gpu/src/app_sod2d/sod2d >>> >>> I hope I have not forgotten anything and my instructions are 'easy' to follow. If you have any issue do not doubt to contact me. >>> The wiki for our code can be found in https://urldefense.us/v3/__https://gitlab.com/bsc_sod2d/sod2d_gitlab/-/wikis/home__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRTtC2VEI$ >>> >>> Best, >>> >>> Herbert Owen >>> >>> Herbert Owen >>> Senior Researcher, Dpt. Computer Applications in Science and Engineering >>> Barcelona Supercomputing Center (BSC-CNS) >>> Tel: +34 93 413 4038 >>> Skype: herbert.owen >>> >>> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRfz8esh1$ >>> >>> >>> >>> >>> >>> >>> >>> >>>> On 16 Oct 2025, at 18:30, Junchao Zhang > wrote: >>>> >>>> Hi, Herbert, >>>> I don't have much experience on OpenACC and PETSc CI doesn't have such tests. Could you avoid using nvfortran and instead use gfortran to compile your Fortran + OpenACC code? If you, then you can use the latest petsc code and make our debugging easier. >>>> Also, could you provide us with a test and instructions to reproduce the problem? >>>> >>>> Thanks! >>>> --Junchao Zhang >>>> >>>> >>>> On Thu, Oct 16, 2025 at 5:07?AM howen via petsc-users > wrote: >>>>> Dear All, >>>>> >>>>> I am interfacing our CFD code (Fortran + OpenACC) to Petsc. >>>>> Since we use OpenACC the natural choice for us is to use Nvidia?s nvhpc compiler. The Gnu compiler does not work well and we do not have access to the Cray compiler. >>>>> >>>>> I already know that the latest version of Petsc does not compile with nvhpc, I am therefore using version 3.21. >>>>> I get good results on the CPU both in serial and parallel (MPI). However, the GPU implementation, that is what we are interested in, only work correctly for the serial version. In parallel, the results are different. Even for a CG solve. >>>>> >>>>> I would like to know, if you have experience with the Nvidia compiler. I am particularly interested if you have already observed issues with it. Your opinion on whether to put further effort into trying to find a bug I may have introduced during the interfacing is highly appreciated. >>>>> >>>>> Best, >>>>> >>>>> Herbert Owen >>>>> Senior Researcher, Dpt. Computer Applications in Science and Engineering >>>>> Barcelona Supercomputing Center (BSC-CNS) >>>>> Tel: +34 93 413 4038 >>>>> Skype: herbert.owen >>>>> >>>>> https://urldefense.us/v3/__https://scholar.google.es/citations?user=qe5O2IYAAAAJ&hl=en__;!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRfz8esh1$ >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>>>> >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f2aJLIJjM-mazjlNYof4HlyYtStFvSqBARrm1edFZiRRKxeneBWEc4um7RyuOjp8es6iTTywGdvGPHvmzdeHRY11r9Bz$ > -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From sebastian.blauth at itwm.fraunhofer.de Tue Nov 18 10:11:44 2025 From: sebastian.blauth at itwm.fraunhofer.de (Blauth, Sebastian) Date: Tue, 18 Nov 2025 16:11:44 +0000 Subject: [petsc-users] Ordering of DoFs in submatrices with PCFieldsplit Message-ID: Dear PETSc developers and users, I have a question regarding the Fieldsplit preconditioner in PETSc. In particular, I want to know how the submatrices there are created from the parent matrix. The "obvious" way would be to take the DoF indices of the corresponding split and "renumber" them so that the DoFs in the submatrix have the same order as the ones of the parent matrix. I did not find any documentation on this and as it is at least possible that the DoFs are re-ordered, I wanted to ask this question. Obviously, in case the DoFs are re-ordered, how can I get the mapping between the DoFs of the parent and the submatrix? The thing I am wanting to work on is implementing a pressure convection diffusion preconditioner with FEniCS for the incompressible Navier-Stokes equations. The parent matrix is assembled via a mixed FEM and then I use PETSc to solve the system. I want to assemble the corresponding operators on the pressure space from a collapsed (i.e. sub-space of the mixed FEM) function space. However, FEniCS re-orders the DoFs there, but I can get a mapping between the DoFs so this should not be problematic. However, I am not sure if PETSc also does a re-ordering. Thanks a lot in advance and best regards, Sebastian -- Dr. Sebastian Blauth Fraunhofer-Institut f?r Techno- und Wirtschaftsmathematik ITWM Abteilung Transportvorg?nge Fraunhofer-Platz 1, 67663 Kaiserslautern Telefon: +49 631 31600-4968 sebastian.blauth at itwm.fraunhofer.de https://urldefense.us/v3/__https://www.itwm.fraunhofer.de__;!!G_uCfscf7eWS!f_qaoCRxX3prMgl6ev5fvSFQegVfZo84xW9eJTz7uYmLjZiyJFIlm1tlqYrM3LqjOpkEoMrIJZo6J63-23-atPBnJn4et_4R-UvZVnIkaQ0$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Nov 18 10:23:27 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 18 Nov 2025 11:23:27 -0500 Subject: [petsc-users] Ordering of DoFs in submatrices with PCFieldsplit In-Reply-To: References: Message-ID: On Tue, Nov 18, 2025 at 11:12?AM Blauth, Sebastian < sebastian.blauth at itwm.fraunhofer.de> wrote: > Dear PETSc developers and users, > > > > I have a question regarding the Fieldsplit preconditioner in PETSc. In > particular, I want to know how the submatrices there are created from the > parent matrix. The ?obvious? way would be to take the DoF indices of the > corresponding split and ?renumber? them so that the DoFs in the submatrix > have the same order as the ones of the parent matrix. I did not find any > documentation on this and as it is at least possible that the DoFs are > re-ordered, I wanted to ask this question. Obviously, in case the DoFs are > re-ordered, how can I get the mapping between the DoFs of the parent and > the submatrix? > Hi Sebastian, Inside, we call MatCreateSubmatrix(), which takes an IS on each process, and selects those global rows, in the order in which they appear in the IS, into a new parallel matrix. PCFieldsplitSetIS() can be used to specify those IS, so you can control the reordering. Does that make sense? > The thing I am wanting to work on is implementing a pressure convection > diffusion preconditioner with FEniCS for the incompressible Navier-Stokes > equations. > The parent matrix is assembled via a mixed FEM and then I use PETSc to > solve the system. 
I want to assemble the corresponding operators on the > pressure space from a collapsed (i.e. sub-space of the mixed FEM) function > space. However, FEniCS re-orders the DoFs there, but I can get a mapping > between the DoFs so this should not be problematic. However, I am not sure > if PETSc also does a re-ordering. > You can just create an IS with that reordering. What operator are you planning on assembling on the pressure space? Have you seen https://urldefense.us/v3/__https://arxiv.org/abs/1810.03315?__;!!G_uCfscf7eWS!ZFlvrtpVlFuXdYWcwujVNh1WjnSmuEKqsh1s3GCYbyN0_wNsVgBaJo3x-lWG3Iea3iQhp_iniM9QzDSr9iD3$ Thanks, Matt > Thanks a lot in advance and best regards, > > Sebastian > > > > -- > > Dr. Sebastian Blauth > > Fraunhofer-Institut f?r > > Techno- und Wirtschaftsmathematik ITWM > > Abteilung Transportvorg?nge > > Fraunhofer-Platz 1, 67663 Kaiserslautern > > Telefon: +49 631 31600-4968 > > sebastian.blauth at itwm.fraunhofer.de > > https://urldefense.us/v3/__https://www.itwm.fraunhofer.de__;!!G_uCfscf7eWS!ZFlvrtpVlFuXdYWcwujVNh1WjnSmuEKqsh1s3GCYbyN0_wNsVgBaJo3x-lWG3Iea3iQhp_iniM9QzNhlmkaU$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZFlvrtpVlFuXdYWcwujVNh1WjnSmuEKqsh1s3GCYbyN0_wNsVgBaJo3x-lWG3Iea3iQhp_iniM9QzGRu26U1$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From sebastian.blauth at itwm.fraunhofer.de Wed Nov 19 01:03:20 2025 From: sebastian.blauth at itwm.fraunhofer.de (Blauth, Sebastian) Date: Wed, 19 Nov 2025 07:03:20 +0000 Subject: [petsc-users] Ordering of DoFs in submatrices with PCFieldsplit In-Reply-To: References: Message-ID: Dear Matt, thanks for the clarification. Yes, that makes sense. Basically, I use two approaches for defining the splits in my code, see https://urldefense.us/v3/__https://github.com/sblauth/cashocs/blob/46c0d91467d03a4906b7bde29727b45d4bb0d6d2/cashocs/_utils/linalg.py*L245-L287__;Iw!!G_uCfscf7eWS!ZrZxEenSD9yoVQBgqWHSpGUGp75YsbFopexb0vZKBu8oG5soqUBYoVKVAGETh1eMtV2aO-XjQUFcjY-OdaJUjHL04TxyhunHGM7Y_93bJVg$ I think the first one, where the IS is defined, then does exactly what I thought it would do. In the second approach, which I need for nested fieldsplits, I use a DMShell with a Section defined analogously - so I guess the same applies here. Well, yes I could just reorder the DoFs for the creation of the submatrices - but I usually don't need these sub-functionspaces and would not want to create them every time. I thought of using MatPermute (https://urldefense.us/v3/__https://petsc.org/release/petsc4py/reference/petsc4py.PETSc.Mat.html*petsc4py.PETSc.Mat.permute__;Iw!!G_uCfscf7eWS!ZrZxEenSD9yoVQBgqWHSpGUGp75YsbFopexb0vZKBu8oG5soqUBYoVKVAGETh1eMtV2aO-XjQUFcjY-OdaJUjHL04TxyhunHGM7YwxbMqnE$ ) with the permutation I get from FEniCS - or is there any reason not to do so? And thank you very much for the reference. Yes, I am aware of the paper you sent. However I think the function spaces involved in the method make it more or less infeasible for me - usually using Taylor-Hood elements is already very expensive. I usually use a stabilized P1-P1 discretization or try to get the linear Crouzeix?Raviart with elementwise constant pressure working (for slow flows, this works okay, but as I go to higher Reynolds numbers, things become more problematic). 
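For concreteness, the IS-based splitting mentioned above (the first approach) comes down to only a few petsc4py calls. The following is a minimal sketch, assuming the mixed matrix A, the vectors b and x, and the global index arrays u_dofs and p_dofs already exist; the field names are made up:

from petsc4py import PETSc

comm = PETSc.COMM_WORLD
# global dof indices of each field, listed in the order the sub-blocks should use
is_u = PETSc.IS().createGeneral(u_dofs, comm=comm)
is_p = PETSc.IS().createGeneral(p_dofs, comm=comm)

ksp = PETSc.KSP().create(comm)
ksp.setOperators(A)
pc = ksp.getPC()
pc.setType(PETSc.PC.Type.FIELDSPLIT)
# each sub-matrix is extracted with exactly these rows/columns, in exactly this order
pc.setFieldSplitIS(("u", is_u), ("p", is_p))
ksp.setFromOptions()
ksp.solve(b, x)

Since the sub-matrix keeps the rows in the order given in the IS, choosing that order to match the collapsed pressure space avoids a separate permutation step.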
And regarding your question on which operators I am planning to assemble on the pressure space: Basically the pressure mass matrix, pressure convection-diffusion matrix and a pressure Laplacian. If you have any tips for solving the incompressible Navier Stokes equations (steady state) at higher Reynolds numbers I certainly welcome them. I can also go a bit more into detail of what kind of solution approach I am using - if that is appropriate here. Thanks a lot and best regards, Sebastian -- Dr. Sebastian Blauth Fraunhofer-Institut f?r Techno- und Wirtschaftsmathematik ITWM Abteilung Transportvorg?nge Fraunhofer-Platz 1, 67663 Kaiserslautern Telefon: +49 631 31600-4968 sebastian.blauth at itwm.fraunhofer.de https://urldefense.us/v3/__https://www.itwm.fraunhofer.de__;!!G_uCfscf7eWS!ZrZxEenSD9yoVQBgqWHSpGUGp75YsbFopexb0vZKBu8oG5soqUBYoVKVAGETh1eMtV2aO-XjQUFcjY-OdaJUjHL04TxyhunHGM7YW0PRsVU$ > -----Original Message----- > From: Matthew Knepley > Sent: Tuesday, November 18, 2025 5:23 PM > To: Blauth, Sebastian > Cc: PETSc users list > Subject: Re: [petsc-users] Ordering of DoFs in submatrices with PCFieldsplit > > On Tue, Nov 18, 2025 at 11:12?AM Blauth, Sebastian > > wrote: > > Dear PETSc developers and users, > > > > I have a question regarding the Fieldsplit preconditioner in PETSc. In > particular, I want to know how the submatrices there are created from the parent > matrix. The ?obvious? way would be to take the DoF indices of the corresponding > split and ?renumber? them so that the DoFs in the submatrix have the same order > as the ones of the parent matrix. I did not find any documentation on this and as > it is at least possible that the DoFs are re-ordered, I wanted to ask this question. > Obviously, in case the DoFs are re-ordered, how can I get the mapping between > the DoFs of the parent and the submatrix? > > > Hi Sebastian, > > Inside, we call MatCreateSubmatrix(), which takes an IS on each process, and > selects those global rows, in the order in which they appear in the IS, into a new > parallel matrix. PCFieldsplitSetIS() can be used to specify those IS, so you can > control the reordering. Does that make sense? > > > The thing I am wanting to work on is implementing a pressure convection > diffusion preconditioner with FEniCS for the incompressible Navier-Stokes > equations. > > The parent matrix is assembled via a mixed FEM and then I use PETSc to > solve the system. I want to assemble the corresponding operators on the pressure > space from a collapsed (i.e. sub-space of the mixed FEM) function space. > However, FEniCS re-orders the DoFs there, but I can get a mapping between the > DoFs so this should not be problematic. However, I am not sure if PETSc also does > a re-ordering. > > > You can just create an IS with that reordering. What operator are you planning on > assembling on the pressure space? Have you seen > https://urldefense.us/v3/__https://arxiv.org/abs/1810.03315?__;!!G_uCfscf7eWS!ZrZxEenSD9yoVQBgqWHSpGUGp75YsbFopexb0vZKBu8oG5soqUBYoVKVAGETh1eMtV2aO-XjQUFcjY-OdaJUjHL04TxyhunHGM7YE-1UtAk$ > > Thanks, > > Matt > > > Thanks a lot in advance and best regards, > > Sebastian > > > > -- > > Dr. 
Sebastian Blauth > > Fraunhofer-Institut f?r > > Techno- und Wirtschaftsmathematik ITWM > > Abteilung Transportvorg?nge > > Fraunhofer-Platz 1, 67663 Kaiserslautern > > Telefon: +49 631 31600-4968 > > sebastian.blauth at itwm.fraunhofer.de > > > https://urldefense.us/v3/__https://www.itwm.fraunhofer.de__;!!G_uCfscf7eWS!ZrZxEenSD9yoVQBgqWHSpGUGp75YsbFopexb0vZKBu8oG5soqUBYoVKVAGETh1eMtV2aO-XjQUFcjY-OdaJUjHL04TxyhunHGM7YW0PRsVU$ > S!f_qaoCRxX3prMgl6ev5fvSFQegVfZo84xW9eJTz7uYmLjZiyJFIlm1tlqYrM3LqjOpkE > oMrIJZo6J63-23-atPBnJn4et_4R-UvZoWlBpHM$> > > > > > > -- > > What most experimenters take for granted before they begin their experiments is > infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZrZxEenSD9yoVQBgqWHSpGUGp75YsbFopexb0vZKBu8oG5soqUBYoVKVAGETh1eMtV2aO-XjQUFcjY-OdaJUjHL04TxyhunHGM7YQCIJKn8$ > From knepley at gmail.com Wed Nov 19 07:18:50 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 19 Nov 2025 08:18:50 -0500 Subject: [petsc-users] Ordering of DoFs in submatrices with PCFieldsplit In-Reply-To: References: Message-ID: On Wed, Nov 19, 2025 at 2:03?AM Blauth, Sebastian < sebastian.blauth at itwm.fraunhofer.de> wrote: > Dear Matt, > > thanks for the clarification. Yes, that makes sense. Basically, I use two > approaches for defining the splits in my code, see > https://urldefense.us/v3/__https://github.com/sblauth/cashocs/blob/46c0d91467d03a4906b7bde29727b45d4bb0d6d2/cashocs/_utils/linalg.py*L245-L287__;Iw!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vygrOWHgH$ > I think the first one, where the IS is defined, then does exactly what I > thought it would do. In the second approach, which I need for nested > fieldsplits, I use a DMShell with a Section defined analogously - so I > guess the same applies here. > Okay. > Well, yes I could just reorder the DoFs for the creation of the > submatrices - but I usually don't need these sub-functionspaces and would > not want to create them every time. I thought of using MatPermute ( > https://urldefense.us/v3/__https://petsc.org/release/petsc4py/reference/petsc4py.PETSc.Mat.html*petsc4py.PETSc.Mat.permute__;Iw!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vyiWtyehi$ ) > with the permutation I get from FEniCS - or is there any reason not to do > so? > That is more memory movement. I am not understanding why you would not just permute the input defining the nested FS. > And thank you very much for the reference. Yes, I am aware of the paper > you sent. However I think the function spaces involved in the method make > it more or less infeasible for me - usually using Taylor-Hood elements is > already very expensive. I usually use a stabilized P1-P1 discretization or > try to get the linear Crouzeix?Raviart with elementwise constant pressure > working (for slow flows, this works okay, but as I go to higher Reynolds > numbers, things become more problematic). > Oh, yes those spaces are crazy, but not necessary. In https://urldefense.us/v3/__https://arxiv.org/pdf/2107.00820__;!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vysoKtJkS$ , on page 9, you can see that they are able to prove the kernel decomposition property for simple Taylor-Hood. 
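Coming back to the permutation point above: permuting the input that defines the split, rather than the assembled matrix, would amount to building the pressure IS directly in the collapsed-space ordering. A rough petsc4py sketch, in which perm and p_dofs_mixed are hypothetical names for the FEniCS dof map and for the pressure dofs of the mixed space:

from petsc4py import PETSc

# i-th entry of the IS = mixed-space dof of the i-th collapsed-space pressure dof
is_p = PETSc.IS().createGeneral([p_dofs_mixed[perm[i]] for i in range(len(perm))],
                                comm=PETSc.COMM_WORLD)
pc.setFieldSplitIS(("u", is_u), ("p", is_p))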
> And regarding your question on which operators I am planning to assemble > on the pressure space: Basically the pressure mass matrix, pressure > convection-diffusion matrix and a pressure Laplacian. > > If you have any tips for solving the incompressible Navier Stokes > equations (steady state) at higher Reynolds numbers I certainly welcome > them. I can also go a bit more into detail of what kind of solution > approach I am using - if that is appropriate here. > I think that the Augmented Lagrangian strategy from the Stadler paper is currently the best option I know of. Thanks, Matt > Thanks a lot and best regards, > Sebastian > > > -- > Dr. Sebastian Blauth > Fraunhofer-Institut f?r > Techno- und Wirtschaftsmathematik ITWM > Abteilung Transportvorg?nge > Fraunhofer-Platz 1, 67663 Kaiserslautern > Telefon: +49 631 31600-4968 > sebastian.blauth at itwm.fraunhofer.de > https://urldefense.us/v3/__https://www.itwm.fraunhofer.de__;!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vyu_13JIA$ > > > -----Original Message----- > > From: Matthew Knepley > > Sent: Tuesday, November 18, 2025 5:23 PM > > To: Blauth, Sebastian > > Cc: PETSc users list > > Subject: Re: [petsc-users] Ordering of DoFs in submatrices with > PCFieldsplit > > > > On Tue, Nov 18, 2025 at 11:12?AM Blauth, Sebastian > > > > wrote: > > > > Dear PETSc developers and users, > > > > > > > > I have a question regarding the Fieldsplit preconditioner in > PETSc. In > > particular, I want to know how the submatrices there are created from > the parent > > matrix. The ?obvious? way would be to take the DoF indices of the > corresponding > > split and ?renumber? them so that the DoFs in the submatrix have the > same order > > as the ones of the parent matrix. I did not find any documentation on > this and as > > it is at least possible that the DoFs are re-ordered, I wanted to ask > this question. > > Obviously, in case the DoFs are re-ordered, how can I get the mapping > between > > the DoFs of the parent and the submatrix? > > > > > > Hi Sebastian, > > > > Inside, we call MatCreateSubmatrix(), which takes an IS on each process, > and > > selects those global rows, in the order in which they appear in the IS, > into a new > > parallel matrix. PCFieldsplitSetIS() can be used to specify those IS, so > you can > > control the reordering. Does that make sense? > > > > > > The thing I am wanting to work on is implementing a pressure > convection > > diffusion preconditioner with FEniCS for the incompressible Navier-Stokes > > equations. > > > > The parent matrix is assembled via a mixed FEM and then I use > PETSc to > > solve the system. I want to assemble the corresponding operators on the > pressure > > space from a collapsed (i.e. sub-space of the mixed FEM) function space. > > However, FEniCS re-orders the DoFs there, but I can get a mapping > between the > > DoFs so this should not be problematic. However, I am not sure if PETSc > also does > > a re-ordering. > > > > > > You can just create an IS with that reordering. What operator are you > planning on > > assembling on the pressure space? Have you seen > > https://urldefense.us/v3/__https://arxiv.org/abs/1810.03315?__;!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vyv_7q3wU$ > > > > Thanks, > > > > Matt > > > > > > Thanks a lot in advance and best regards, > > > > Sebastian > > > > > > > > -- > > > > Dr. 
Sebastian Blauth > > > > Fraunhofer-Institut f?r > > > > Techno- und Wirtschaftsmathematik ITWM > > > > Abteilung Transportvorg?nge > > > > Fraunhofer-Platz 1, 67663 Kaiserslautern > > > > Telefon: +49 631 31600-4968 > > > > sebastian.blauth at itwm.fraunhofer.de > > > > > > https://urldefense.us/v3/__https://www.itwm.fraunhofer.de__;!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vyu_13JIA$ > > < > https://urldefense.us/v3/__https://www.itwm.fraunhofer.de/__;!!G_uCfscf7eW > > S!f_qaoCRxX3prMgl6ev5fvSFQegVfZo84xW9eJTz7uYmLjZiyJFIlm1tlqYrM3LqjOpkE > > oMrIJZo6J63-23-atPBnJn4et_4R-UvZoWlBpHM$> > > > > > > > > > > > > -- > > > > What most experimenters take for granted before they begin their > experiments is > > infinitely more interesting than any results to which their experiments > lead. > > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vypAp6Vm8$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3vaW6mH3kTXdhSyLFZ2XcUc1E6hcY4dtjtLWLAzcg4Pj_S8issujZ0Khj24yL9Bb5KfynJ0mj5vypAp6Vm8$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sun Nov 23 14:09:10 2025 From: bsmith at petsc.dev (Barry Smith) Date: Sun, 23 Nov 2025 15:09:10 -0500 Subject: [petsc-users] solveBackward in parallel In-Reply-To: <7D84E81D-60FF-4316-818C-4D8D0FCA1ACD@icloud.com> References: <5B2C79DF-736E-4113-AF9B-D8A40B64C192@petsc.dev> <7D84E81D-60FF-4316-818C-4D8D0FCA1ACD@icloud.com> Message-ID: I would be stunned and amazed if this worked. Sparse factorization codes use very complicated data structures to store the resulting "factors" and the solves are complicated code that traverse through the "factor" data structures to perform the solve. Barry > On Nov 22, 2025, at 6:58?AM, Yin Shi wrote: > > Thank you very much for your reply. Given this, when using MUMPS in parallel, I can still get the factor matrix (using getFactorMatrix method of a PC object) and use it to do matrix multiplications (e.g., using matMult method of the factor matrix), correct? I also would like to confirm whether the factor matrix returned is really triangular and multiplying it with another matrix gives the intended result. > >> On Nov 16, 2025, at 08:59, Barry Smith wrote: >> >> It appears that only MATSOLVERMKL_CPARDISO provides a parallel backward solve currently. >> >> The only seperation of forward and backward solves in MUMPS appears to be provided with (from its users manual) >> >> A special case is the one >> where the forward elimination step is performed during factorization (see Subsection 3.8), instead of >> during the solve phase. This allows accessing the L factors right after they have been computed, with a >> better locality, and can avoid writing the L factors to disk in an out-of-core context. In this case (forward >> >> >> >>> On Nov 15, 2025, at 9:17?AM, Yin Shi via petsc-users wrote: >>> >>> Dear Developers, >>> >>> In short, I need to explicitly use A.solveBackward(b, x) in parallel with petsc4py, where A is a Cholesky factored matrix, but it seems that this is not supported (e.g., for mumps and superlu_dist factorization solver backend). Is it possible to work around this? 
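For reference, the sequential setting in which the separate triangular solves are available looks roughly like the sketch below (petsc4py, single rank, PETSc's built-in Cholesky). Here C and n are placeholders, the solver-type string "petsc" is an assumption, and exactly which triangular system solveBackward applies for a given factorization should be verified on a small test case:

from petsc4py import PETSc

pc = PETSc.PC().create(PETSc.COMM_SELF)
pc.setOperators(C)                  # C: symmetric positive definite matrix on one rank
pc.setType(PETSc.PC.Type.CHOLESKY)
pc.setFactorSolverType("petsc")     # PETSc's native (sequential-only) Cholesky
pc.setUp()
F = pc.getFactorMatrix()            # factored matrix holding the Cholesky factors

w = C.createVecLeft()
F.solveBackward(n, w)               # intended: solve L^T w = n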
>>> >>> In detail, the problem I need to solve is to generate a set of correlated random numbers (denoted by a vector, w) from an uncorrelated one (denoted by a vector n). Denote the covariance matrix of n as C (symmetric). One needs to first factorize C, C = L L^T, and then solve the linear system L^T w = n for w in parallel. Is it possible to reformulate this problem for it to be implemented using petsc4py? >>> >>> Thank you! >>> Yin >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Nov 23 16:41:33 2025 From: jed at jedbrown.org (Jed Brown) Date: Sun, 23 Nov 2025 15:41:33 -0700 Subject: [petsc-users] 41st Colorado Conference on Iterative and Multigrid Methods, June 2026 Message-ID: <87h5ukjrs2.fsf@jedbrown.org> After many successful Copper Mountain Conferences on Iterative and Multigrid Methods, we have made the difficult decision that it is time to continue the conference series in a new venue. Thus, it is with great pleasure that we announce the 41st Colorado Conference on Iterative and Multigrid Methods, to be held June 21-26, 2026 in Boulder, Colorado, on the CU Campus. We hope to continue the many wonderful traditions of Copper Mountain, with a rich technical problem, opportunities for informal discussion and work time amongst the participants, and a focus on student and early career participants. We are currently finalizing many details for the conference, and information about participation, registration, and lodging will be forthcoming at https://urldefense.us/v3/__https://coloradoconference.github.io/2026/__;!!G_uCfscf7eWS!dHswLmycwzRakxfip9Nnp9U-IWvLvaqe3za8V4FcT8p9hZZSWxi7SQhnV7CiKrk6nkktLJdS7ODcOXoUn20$ . Expect student paper competition deadlines in February, 2026, and abstract and registration deadlines in Spring, 2026. We look forward to welcoming you in Boulder! Jed Brown, University of Colorado at Boulder Rob Falgout, Lawrence Livermore National Laboratory Scott MacLachlan, Memorial University of Newfoundland Luke Olson, University of Illinois Urbana-Champaign From yin.shi1 at icloud.com Sat Nov 22 05:58:57 2025 From: yin.shi1 at icloud.com (Yin Shi) Date: Sat, 22 Nov 2025 19:58:57 +0800 Subject: [petsc-users] solveBackward in parallel In-Reply-To: <5B2C79DF-736E-4113-AF9B-D8A40B64C192@petsc.dev> References: <5B2C79DF-736E-4113-AF9B-D8A40B64C192@petsc.dev> Message-ID: <7D84E81D-60FF-4316-818C-4D8D0FCA1ACD@icloud.com> Thank you very much for your reply. Given this, when using MUMPS in parallel, I can still get the factor matrix (using getFactorMatrix method of a PC object) and use it to do matrix multiplications (e.g., using matMult method of the factor matrix), correct? I also would like to confirm whether the factor matrix returned is really triangular and multiplying it with another matrix gives the intended result. > On Nov 16, 2025, at 08:59, Barry Smith wrote: > > It appears that only MATSOLVERMKL_CPARDISO provides a parallel backward solve currently. > > The only seperation of forward and backward solves in MUMPS appears to be provided with (from its users manual) > > A special case is the one > where the forward elimination step is performed during factorization (see Subsection 3.8), instead of > during the solve phase. This allows accessing the L factors right after they have been computed, with a > better locality, and can avoid writing the L factors to disk in an out-of-core context. 
In this case (forward > > > >> On Nov 15, 2025, at 9:17?AM, Yin Shi via petsc-users wrote: >> >> Dear Developers, >> >> In short, I need to explicitly use A.solveBackward(b, x) in parallel with petsc4py, where A is a Cholesky factored matrix, but it seems that this is not supported (e.g., for mumps and superlu_dist factorization solver backend). Is it possible to work around this? >> >> In detail, the problem I need to solve is to generate a set of correlated random numbers (denoted by a vector, w) from an uncorrelated one (denoted by a vector n). Denote the covariance matrix of n as C (symmetric). One needs to first factorize C, C = L L^T, and then solve the linear system L^T w = n for w in parallel. Is it possible to reformulate this problem for it to be implemented using petsc4py? >> >> Thank you! >> Yin > -------------- next part -------------- An HTML attachment was scrubbed... URL: From benjamin.chapman at mail.utoronto.ca Wed Nov 26 13:46:35 2025 From: benjamin.chapman at mail.utoronto.ca (Benjamin Chapman) Date: Wed, 26 Nov 2025 19:46:35 +0000 Subject: [petsc-users] Issue using AOCC + AOCL to build PETSc Message-ID: Hello, I am currently trying to build PETSc using the AOCC compiler and using AOCL libraries for BLAS/LAPACK. However, I am running into an issue when it tries to download and construct the MPI library, it cannot find the ".libs/libevent.so" library in libevent (which it downloads). I attached the configure.log file. The reason I'm so confused is because PETSc built successfully when I did AOCC + MKL and gcc + AOCL, but not AOCC + AOCL together. Is it even possible to build PETSc using AOCC + AOCL or is it not designed for that? If so, is there a specific procedure I should follow? Thanks in advance. Best, Ben -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: configure_AOCC_AOCL_add.log Type: application/octet-stream Size: 2060079 bytes Desc: configure_AOCC_AOCL_add.log URL: From rlmackie862 at gmail.com Wed Nov 26 15:26:08 2025 From: rlmackie862 at gmail.com (Randall Mackie) Date: Wed, 26 Nov 2025 13:26:08 -0800 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]GMRES plus BlockJacobi behave differently for seemlingy identical matrices In-Reply-To: References: <877bwkms9h.fsf@mpi-hd.mpg.de> <87o6pwpdpm.fsf@mpi-hd.mpg.de> Message-ID: <574147FD-AC56-4236-B7BA-23AB03D209AC@gmail.com> Hi Pierre, Following up on this email, what are good options to use when trying PCHPDDM for the first time? For example, what did you use here that worked so well? Thanks, Randy M. > On Oct 24, 2025, at 8:14?AM, Pierre Jolivet wrote: > > >> On 24 Oct 2025, at 4:38?PM, Nils Schween wrote: >> >> Thank you very much Pierre! >> >> I was not aware of the fact that the fill-in in the ILU decides about >> its quality. But its clear now. I will just test what level of fill we >> need for our application. > > I?ll note that block Jacobi and ILU are not very efficient solvers in most instances. > I tried much fancier algebraic preconditioners such as BoomerAMG and GAMG on your problem, and they are failing hard out-of-the-box. > Without knowing much more on the problem, it?s difficult to setup. > We also have other more robust preconditioners in PETSc by means of domain decomposition methods. > deal.II is interfaced with PCBDDC (which is also somewhat difficult to tune in a fully algebraic mode) and you could also use PCHPDDM (in fully algebraic mode). 
> On this toy problem, PCHPDDM performs much better in terms of iteration than the simpler PCBJACOBI + (sub) PCILU. > Of course, as we always advise our users, it?s best to do a little bit of literature survey to find the best method for your application, I doubt it?s PCBJACOBI. > If the solver part is not a problem in your application, just carry on with what?s easiest for you. > If you want some precise help on either PCBDDC or PCHPDDM, feel free to get in touch with me in private. > > Thanks, > Pierre > > PCGAMG > Linear A_ solve did not converge due to DIVERGED_ITS iterations 1000 > Linear B_ solve did not converge due to DIVERGED_ITS iterations 1000 > PCHYPRE > Linear A_ solve did not converge due to DIVERGED_NANORINF iterations 0 > Linear B_ solve did not converge due to DIVERGED_NANORINF iterations 0 > PCHPDDM > Linear A_ solve converged due to CONVERGED_RTOL iterations 4 > Linear B_ solve converged due to CONVERGED_RTOL iterations 38 > PCBJACOBI > Linear A_ solve converged due to CONVERGED_RTOL iterations 134 > Linear B_ solve did not converge due to DIVERGED_ITS iterations 1000 > >> Once more thanks, >> Nils >> >> >> Pierre Jolivet writes: >> >>>> On 24 Oct 2025, at 1:52?PM, Nils Schween wrote: >>>> >>>> Dear PETSc users, Dear PETSc developers, >>>> >>>> in our software we are solving a linear system with PETSc using GMRES >>>> in conjunction with a BlockJacobi preconditioner, i.e. the default of >>>> the KSP object. >>>> >>>> We have two versions of the system matrix, say A and B. The difference >>>> between them is the non-zero pattern. The non-zero pattern of matrix B >>>> is a subset of the one of matrix A. Their values should be identical. >>>> >>>> We solve the linear system, using A yields a solution after some >>>> iterations, whereas using B does not converge. >>>> >>>> I created binary files of the two matrices, the right-hand side, and >>>> wrote a small PETSc programm, which loads them and demonstrates the >>>> issue. I attach the files to this email. >>>> >>>> We would like to understand why the solver-preconditioner combination >>>> works in case A and not in case B. Can you help us finding this out? >>>> >>>> To test if the two matrices are identical, I substracted them and >>>> computed the Frobenius norm of the result. It is zero. >>> >>> The default subdomain solver is ILU(0). >>> By definition, this won?t allow fill-in. >>> So when you are not storing the zeros in B, the quality of your PC is much worse. >>> You can check this yourself with -A_ksp_view -B_ksp_view: >>> [?] >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> matrix ordering: natural >>> factor fill ratio given 1., needed 1. >>> Factored matrix follows: >>> Mat Object: (A_) 1 MPI process >>> type: seqaij >>> rows=1664, cols=1664 >>> package used to perform factorization: petsc >>> total: nonzeros=117760, allocated nonzeros=117760 >>> using I-node routines: found 416 nodes, limit used is 5 >>> [?] >>> 0 levels of fill >>> tolerance for zero pivot 2.22045e-14 >>> matrix ordering: natural >>> factor fill ratio given 1., needed 1. >>> Factored matrix follows: >>> Mat Object: (B_) 1 MPI process >>> type: seqaij >>> rows=1664, cols=1664 >>> package used to perform factorization: petsc >>> total: nonzeros=49408, allocated nonzeros=49408 >>> not using I-node routines >>> >>> Check the number of nonzeros of both factored Mat. >>> With -B_pc_factor_levels 3, you?ll get roughly similar convergence speed (and density in the factored Mat of both PC). 
>>> >>> Thanks, >>> Pierre >>> >>>> >>>> To give you more context, we solve a system of partial differential >>>> equations that models astrophysical plasmas. It is essentially a system >>>> of advection-reaction equations. We use a discontinuous Galerkin (dG) >>>> method. Our code relies on the finite element library library deal.ii >>>> and its PETSc interface. The system matrices A and B are the result of >>>> the (dG) discretisation. We GMRES with a BlockJaboci preconditioner, >>>> because we do not know any better. >>>> >>>> I tested the code I sent with PETSc 3.24.0 and 3.19.1 on my workstation, i.e. >>>> Linux home-desktop 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000 x86_64 GNU/Linux >>>> I use OpenMPI 5.0.8 and I compiled with mpicc, which in my cases use >>>> gcc. >>>> >>>> In case you need more information. Please let me know. >>>> Any help is appreciated. >>>> >>>> Thank you, >>>> Nils >>>> >>>> >>>> -- >>>> Nils Schween >>>> >>>> Phone: +49 6221 516 557 >>>> Mail: nils.schween at mpi-hd.mpg.de >>>> PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 >>>> >>>> Max Planck Institute for Nuclear Physics >>>> Astrophysical Plasma Theory (APT) >>>> Saupfercheckweg 1, D-69117 Heidelberg >>>> https://urldefense.us/v3/__https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt__;!!G_uCfscf7eWS!YoOlZjX4v-hbz0Oaawvh2Yy3nCbpHcafn1VjON06Or7f-WVrzGzD9SMcky5YJAyzVu62BIfzC5cpshSkkkpKpA$ >> >> -- >> Nils Schween >> PhD Student >> >> Phone: +49 6221 516 557 >> Mail: nils.schween at mpi-hd.mpg.de >> PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 >> >> Max Planck Institute for Nuclear Physics >> Astrophysical Plasma Theory (APT) >> Saupfercheckweg 1, D-69117 Heidelberg >> https://urldefense.us/v3/__https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt__;!!G_uCfscf7eWS!YoOlZjX4v-hbz0Oaawvh2Yy3nCbpHcafn1VjON06Or7f-WVrzGzD9SMcky5YJAyzVu62BIfzC5cpshSkkkpKpA$ > From bsmith at petsc.dev Wed Nov 26 20:13:20 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 26 Nov 2025 21:13:20 -0500 Subject: [petsc-users] Issue using AOCC + AOCL to build PETSc In-Reply-To: References: Message-ID: Probably not an issue at all but why does --with-blaslapack-dir=/home/bchapman/aocl/5.1.0/gcc end with gcc? > On Nov 26, 2025, at 2:46?PM, Benjamin Chapman via petsc-users wrote: > > Hello, > > I am currently trying to build PETSc using the AOCC compiler and using AOCL libraries for BLAS/LAPACK. However, I am running into an issue when it tries to download and construct the MPI library, it cannot find the ?.libs/libevent.so ? library in libevent (which it downloads). I attached the configure.log file. > > The reason I?m so confused is because PETSc built successfully when I did AOCC + MKL and gcc + AOCL, but not AOCC + AOCL together. > > Is it even possible to build PETSc using AOCC + AOCL or is it not designed for that? If so, is there a specific procedure I should follow? There is no specific reason with PETSc that this should not work. The failure takes place in building OpenMPI which should have nothing to do with aocl. What you set for --with-blaslapack-dir should have no effect on the building of OpenMPI. OpenMPI builds libevent as part of its build process and then uses it. I cannot see a failure in building libevent just that it cannot find it later while working on other parts of OpenMPI. 
Can you delete your PETSC_ARCH directory completely and rerun the ./configure Don't use --with-fortran-kernels=1 it is pretty worthless. Don't use --with-threadsafety unless you know specifically that you need it. If the rerun of ./configure fails you can try a trick. Run without the --with-blaslapack-dir (and let it use the default it finds) then immediately run ./configure again this time with the --with-blaslapack-dir option (it will reuse the OpenMPI that it has already just built and won't rebuild it). > > Thanks in advance. > > Best, > Ben > -------------- next part -------------- An HTML attachment was scrubbed... URL: From pierre at joliv.et Wed Nov 26 23:26:42 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Thu, 27 Nov 2025 06:26:42 +0100 Subject: [petsc-users] [WARNING: UNSCANNABLE EXTRACTION FAILED]GMRES plus BlockJacobi behave differently for seemlingy identical matrices In-Reply-To: <574147FD-AC56-4236-B7BA-23AB03D209AC@gmail.com> References: <877bwkms9h.fsf@mpi-hd.mpg.de> <87o6pwpdpm.fsf@mpi-hd.mpg.de> <574147FD-AC56-4236-B7BA-23AB03D209AC@gmail.com> Message-ID: <90A737CA-1392-4CE8-AED8-7677683CBA6F@joliv.et> > On 26 Nov 2025, at 10:26?PM, Randall Mackie wrote: > > Hi Pierre, > > Following up on this email, what are good options to use when trying PCHPDDM for the first time? There is no single answer to this question, especially when solving systems in a purely algebraic manner. Some examples can be found in the repository: $ git grep " hpddm" "*/ex*" > For example, what did you use here that worked so well? What worked well here may totally fail for other problems, so unless you are solving the same problem, then the following may not work, but here is what I get with -options_view. Thanks, Pierre Linear A_ solve converged due to CONVERGED_RTOL iterations 5 Linear B_ solve converged due to CONVERGED_RTOL iterations 39 #PETSc Option Table entries: -A_ksp_converged_reason # (source: command line) -A_ksp_type fgmres # (source: command line) -A_pc_hpddm_coarse_mat_type baij # (source: command line) -A_pc_hpddm_harmonic_overlap 2 # (source: command line) -A_pc_hpddm_levels_1_sub_pc_type lu # (source: command line) -A_pc_hpddm_levels_1_svd_nsv 120 # (source: command line) -A_pc_type hpddm # (source: command line) -B_ksp_converged_reason # (source: command line) -B_ksp_type fgmres # (source: command line) -B_pc_hpddm_coarse_mat_type baij # (source: command line) -B_pc_hpddm_harmonic_overlap 2 # (source: command line) -B_pc_hpddm_levels_1_sub_pc_type lu # (source: command line) -B_pc_hpddm_levels_1_svd_nsv 120 # (source: command line) -B_pc_type hpddm # (source: command line) -matload_block_size 1 # (source: file) -options_view # (source: command line) -vecload_block_size 1 # (source: file) #End of PETSc Option Table entries > Thanks, > > Randy M. > > > >> On Oct 24, 2025, at 8:14?AM, Pierre Jolivet wrote: >> >> >>> On 24 Oct 2025, at 4:38?PM, Nils Schween wrote: >>> >>> Thank you very much Pierre! >>> >>> I was not aware of the fact that the fill-in in the ILU decides about >>> its quality. But its clear now. I will just test what level of fill we >>> need for our application. >> >> I?ll note that block Jacobi and ILU are not very efficient solvers in most instances. >> I tried much fancier algebraic preconditioners such as BoomerAMG and GAMG on your problem, and they are failing hard out-of-the-box. >> Without knowing much more on the problem, it?s difficult to setup. >> We also have other more robust preconditioners in PETSc by means of domain decomposition methods. 
>> deal.II is interfaced with PCBDDC (which is also somewhat difficult to tune in a fully algebraic mode) and you could also use PCHPDDM (in fully algebraic mode). >> On this toy problem, PCHPDDM performs much better in terms of iteration than the simpler PCBJACOBI + (sub) PCILU. >> Of course, as we always advise our users, it?s best to do a little bit of literature survey to find the best method for your application, I doubt it?s PCBJACOBI. >> If the solver part is not a problem in your application, just carry on with what?s easiest for you. >> If you want some precise help on either PCBDDC or PCHPDDM, feel free to get in touch with me in private. >> >> Thanks, >> Pierre >> >> PCGAMG >> Linear A_ solve did not converge due to DIVERGED_ITS iterations 1000 >> Linear B_ solve did not converge due to DIVERGED_ITS iterations 1000 >> PCHYPRE >> Linear A_ solve did not converge due to DIVERGED_NANORINF iterations 0 >> Linear B_ solve did not converge due to DIVERGED_NANORINF iterations 0 >> PCHPDDM >> Linear A_ solve converged due to CONVERGED_RTOL iterations 4 >> Linear B_ solve converged due to CONVERGED_RTOL iterations 38 >> PCBJACOBI >> Linear A_ solve converged due to CONVERGED_RTOL iterations 134 >> Linear B_ solve did not converge due to DIVERGED_ITS iterations 1000 >> >>> Once more thanks, >>> Nils >>> >>> >>> Pierre Jolivet writes: >>> >>>>> On 24 Oct 2025, at 1:52?PM, Nils Schween wrote: >>>>> >>>>> Dear PETSc users, Dear PETSc developers, >>>>> >>>>> in our software we are solving a linear system with PETSc using GMRES >>>>> in conjunction with a BlockJacobi preconditioner, i.e. the default of >>>>> the KSP object. >>>>> >>>>> We have two versions of the system matrix, say A and B. The difference >>>>> between them is the non-zero pattern. The non-zero pattern of matrix B >>>>> is a subset of the one of matrix A. Their values should be identical. >>>>> >>>>> We solve the linear system, using A yields a solution after some >>>>> iterations, whereas using B does not converge. >>>>> >>>>> I created binary files of the two matrices, the right-hand side, and >>>>> wrote a small PETSc programm, which loads them and demonstrates the >>>>> issue. I attach the files to this email. >>>>> >>>>> We would like to understand why the solver-preconditioner combination >>>>> works in case A and not in case B. Can you help us finding this out? >>>>> >>>>> To test if the two matrices are identical, I substracted them and >>>>> computed the Frobenius norm of the result. It is zero. >>>> >>>> The default subdomain solver is ILU(0). >>>> By definition, this won?t allow fill-in. >>>> So when you are not storing the zeros in B, the quality of your PC is much worse. >>>> You can check this yourself with -A_ksp_view -B_ksp_view: >>>> [?] >>>> 0 levels of fill >>>> tolerance for zero pivot 2.22045e-14 >>>> matrix ordering: natural >>>> factor fill ratio given 1., needed 1. >>>> Factored matrix follows: >>>> Mat Object: (A_) 1 MPI process >>>> type: seqaij >>>> rows=1664, cols=1664 >>>> package used to perform factorization: petsc >>>> total: nonzeros=117760, allocated nonzeros=117760 >>>> using I-node routines: found 416 nodes, limit used is 5 >>>> [?] >>>> 0 levels of fill >>>> tolerance for zero pivot 2.22045e-14 >>>> matrix ordering: natural >>>> factor fill ratio given 1., needed 1. 
>>>> Factored matrix follows: >>>> Mat Object: (B_) 1 MPI process >>>> type: seqaij >>>> rows=1664, cols=1664 >>>> package used to perform factorization: petsc >>>> total: nonzeros=49408, allocated nonzeros=49408 >>>> not using I-node routines >>>> >>>> Check the number of nonzeros of both factored Mat. >>>> With -B_pc_factor_levels 3, you'll get roughly similar convergence speed (and density in the factored Mat of both PC). >>>> >>>> Thanks, >>>> Pierre >>>> >>>>> >>>>> To give you more context, we solve a system of partial differential >>>>> equations that models astrophysical plasmas. It is essentially a system >>>>> of advection-reaction equations. We use a discontinuous Galerkin (dG) >>>>> method. Our code relies on the finite element library deal.ii >>>>> and its PETSc interface. The system matrices A and B are the result of >>>>> the (dG) discretisation. We use GMRES with a BlockJacobi preconditioner, >>>>> because we do not know any better. >>>>> >>>>> I tested the code I sent with PETSc 3.24.0 and 3.19.1 on my workstation, i.e. >>>>> Linux home-desktop 6.17.2-arch1-1 #1 SMP PREEMPT_DYNAMIC Sun, 12 Oct 2025 12:45:18 +0000 x86_64 GNU/Linux >>>>> I use OpenMPI 5.0.8 and I compiled with mpicc, which in my case uses >>>>> gcc. >>>>> >>>>> In case you need more information, please let me know. >>>>> Any help is appreciated. >>>>> >>>>> Thank you, >>>>> Nils >>>>> >>>>> >>>>> -- >>>>> Nils Schween >>>>> >>>>> Phone: +49 6221 516 557 >>>>> Mail: nils.schween at mpi-hd.mpg.de >>>>> PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 >>>>> >>>>> Max Planck Institute for Nuclear Physics >>>>> Astrophysical Plasma Theory (APT) >>>>> Saupfercheckweg 1, D-69117 Heidelberg >>>>> https://urldefense.us/v3/__https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt__;!!G_uCfscf7eWS!YoOlZjX4v-hbz0Oaawvh2Yy3nCbpHcafn1VjON06Or7f-WVrzGzD9SMcky5YJAyzVu62BIfzC5cpshSkkkpKpA$ >>> >>> -- >>> Nils Schween >>> PhD Student >>> >>> Phone: +49 6221 516 557 >>> Mail: nils.schween at mpi-hd.mpg.de >>> PGP-Key: 4DD3DCC0532EE96DB0C1F8B5368DBFA14CB81849 >>> >>> Max Planck Institute for Nuclear Physics >>> Astrophysical Plasma Theory (APT) >>> Saupfercheckweg 1, D-69117 Heidelberg >>> https://urldefense.us/v3/__https://www.mpi-hd.mpg.de/mpi/en/research/scientific-divisions-and-groups/independent-research-groups/apt__;!!G_uCfscf7eWS!YoOlZjX4v-hbz0Oaawvh2Yy3nCbpHcafn1VjON06Or7f-WVrzGzD9SMcky5YJAyzVu62BIfzC5cpshSkkkpKpA$ >> > From C.J.Berends at uu.nl Wed Nov 26 02:31:06 2025 From: C.J.Berends at uu.nl (Berends, C.J. (Tijn)) Date: Wed, 26 Nov 2025 08:31:06 +0000 Subject: [petsc-users] Bug report: petscds fortran module missing? Message-ID: Dear petsc folks, I am trying to set up a very basic example program in Fortran, using Petsc to solve a simple Poisson problem (d2u/dx2 = f). However, I am running into problems when I try to call PetscDSSetResidual to set the function pointers for the residual. According to the documentation, this function is part of the petscds module, but while the header file (petscds.h) exists in my Petsc include directory, the Fortran module (petscds.mod) does not. Therefore, my code won't compile. I have currently got Petsc installed via Homebrew. I have tried cloning the Petsc git repository and building that, using the configuration option --with-fortran-interfaces=1, but still the petscds.mod file is not there. Am I doing something wrong, or can it be that this particular Fortran interface is just missing?
If you need any additional information from me, please let me know. Kind regards, Tijn Berends dr. C. J. (Tijn) Berends Post-doc (palaeo)glaciology Institute for Marine and Atmospheric research Utrecht (IMAU), Utrecht University, The Netherlands Buys Ballot Building (BBG), Room 6.67 -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Thu Nov 27 09:49:33 2025 From: jroman at dsic.upv.es (Jose E. Roman) Date: Thu, 27 Nov 2025 15:49:33 +0000 Subject: [petsc-users] Bug report: petscds fortran module missing? In-Reply-To: References: Message-ID: <09F62CDD-EE2F-47DF-B041-5DB4FF1C7D78@dsic.upv.es> PetscDS is part of DM, so you have to "use petscdm". The example src/dm/impls/plex/tutorials/ex4f90.F90 uses PetscDS. The configure option --with-fortran-interfaces=1 is deprecated, it is no longer needed. Jose > El 26 nov 2025, a las 9:31, Berends, C.J. (Tijn) via petsc-users escribió: > > Dear petsc folks, > > I am trying to set up a very basic example program in Fortran, using Petsc to solve a simple Poisson problem (d2u/dx2 = f). > > However, I am running into problems when I try to call PetscDSSetResidual to set the function pointers for the residual. According to the documentation, this function is part of the petscds module, but while the header file (petscds.h) exists in my Petsc include directory, the Fortran module (petscds.mod) does not. Therefore, my code won't compile. > > I have currently got Petsc installed via Homebrew. I have tried cloning the Petsc git repository and building that, using the configuration option --with-fortran-interfaces=1, but still the petscds.mod file is not there. > > Am I doing something wrong, or can it be that this particular Fortran interface is just missing? > > If you need any additional information from me, please let me know. > > Kind regards, > Tijn Berends > > > > dr. C. J. (Tijn) Berends > Post-doc (palaeo)glaciology > Institute for Marine and Atmospheric research Utrecht (IMAU), Utrecht University, The Netherlands > Buys Ballot Building (BBG), Room 6.67