From daniel.pino_munoz at mines-paristech.fr Mon Nov 4 04:23:35 2024 From: daniel.pino_munoz at mines-paristech.fr (Daniel Pino Munoz) Date: Mon, 4 Nov 2024 11:23:35 +0100 Subject: [petsc-users] Example of SNES using matrix free jacobian In-Reply-To: References: <161ca7d4-82c4-4696-b123-4d441eb903da@mines-paristech.fr> <7096B68A-67A7-41AA-9DAF-A3FE78C8679E@petsc.dev> <4257d3ef-9a69-4b3c-a435-f8f9fe77f50c@mines-paristech.fr> <78522DC0-7204-42C0-8A0C-47E2CA55AB80@petsc.dev> <40d13023-ede8-4226-bdd9-a8a495ef4daa@mines-paristech.fr> Message-ID: Dear all, The problem was the context. The context was not properly set, and yet for some reason it was running correctly in Debug mode. Now this problem has been solved. Thank you Barry for your help. Best, ? Daniel On 29/10/2024 23:04, Barry Smith wrote: > Run in the debugger, even though there will not be full debugging support with the optimizations it should still provide you some information > > >> On Oct 29, 2024, at 4:35?PM, Daniel Pino Munoz wrote: >> >> That's what I thought, so I replaced its content by: >> >> VecZeroEntries(f); >> >> and the result is the same... >> >> >> On 29/10/2024 21:31, Barry Smith wrote: >>> This >>> [0]PETSC ERROR: #1 SNES callback function >>> indicates the crash is in your computeResidual function and thus you need to debug your function >>> >>> Barry >>> >>> >>>> On Oct 29, 2024, at 4:28?PM, Daniel Pino Munoz wrote: >>>> >>>> I ran it with -malloc_debug and it does not change anything. >>>> >>>> The output is the following: >>>> >>>> he absolute tolerance is 0.001 >>>> The relative tolerance is 0.001 >>>> The divergence tolerance is 10000 >>>> The maximum iterations is 10000 >>>> Initial load ! >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!dkbZ6vbH8gLYOkcINufnJg_JyJcRHtNge8M_c8f-iuQ3J8EP-KVV58zSZ1cBNyyStNlGaXrJUcwnob5EdBnPDpfAiOKcpEClnwS21YZuJw$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dkbZ6vbH8gLYOkcINufnJg_JyJcRHtNge8M_c8f-iuQ3J8EP-KVV58zSZ1cBNyyStNlGaXrJUcwnob5EdBnPDpfAiOKcpEClnwTJP97sJQ$ >>>> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>>> [0]PETSC ERROR: The line numbers in the error traceback are not always exact. >>>> [0]PETSC ERROR: #1 SNES callback function >>>> [0]PETSC ERROR: #2 SNESComputeFunction() at /home/daniel-pino/Software/Dependencies/petsc/src/snes/interface/snes.c:2489 >>>> [0]PETSC ERROR: #3 SNESSolve_KSPONLY() at /home/daniel-pino/Software/Dependencies/petsc/src/snes/impls/ksponly/ksponly.c:27 >>>> [0]PETSC ERROR: #4 SNESSolve() at /home/daniel-pino/Software/Dependencies/petsc/src/snes/interface/snes.c:4841 >>>> -------------------------------------------------------------------------- >>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>> Proc: [[27669,1],0] >>>> Errorcode: 59 >>>> >>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>> You may or may not see output from other processes, depending on >>>> exactly when Open MPI kills them. 
>>>> -------------------------------------------------------------------------- >>>> >>>> >>>> On 29/10/2024 20:17, Barry Smith wrote: >>>>> Hmm, cut and paste the output when it crashes. >>>>> >>>>> Also run with -malloc_debug and see what happens >>>>> >>>>> >>>>> Barry >>>>> >>>>> >>>>>> On Oct 29, 2024, at 3:13 PM, Daniel Pino Munoz wrote: >>>>>> >>>>>> Hi Barry, >>>>>> >>>>>> Thanks for getting back to me! >>>>>> >>>>>> I tried replacing KSPSetOperators(ksp, J, J); by SNESSetJacobian(snes, J, J, MatMFFDComputeJacobian) >>>>>> >>>>>> and I get the same result: it works in Debug mode but not in Release. I also ran valgrind and it did not catch any memory problem. >>>>>> >>>>>> Any ideas? >>>>>> >>>>>> PS: You are right regarding the number of iterations of the non-preconditioned problem. In the previous version of the code, which only used a KSP, I already had to set -ksp_gmres_restart 100. But thanks for the heads up. >>>>>> >>>>>> Best, >>>>>> >>>>>> Daniel >>>>>> >>>>>> On 29/10/2024 20:01, Barry Smith wrote: >>>>>>> Don't call >>>>>>> >>>>>>> KSPSetOperators(ksp, J, J); >>>>>>> >>>>>>> >>>>>>> instead call >>>>>>> >>>>>>> SNESSetJacobian(snes, J, J, MatMFFDComputeJacobian) >>>>>>> >>>>>>> but I am not sure that would explain the crash. >>>>>>> >>>>>>> BTW: since you are applying no preconditioner, if the matrix is ill-conditioned it may take many iterations or fail to converge. You can try something like -ksp_gmres_restart 100 or a similar value to try to improve convergence (the default is 30). >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On Oct 29, 2024, at 12:37 PM, Daniel Pino Munoz wrote: >>>>>>>> >>>>>>>> Dear all, >>>>>>>> >>>>>>>> I have a linear problem that I am currently solving with a matrix-free KSP. >>>>>>>> >>>>>>>> I would like to move on to a nonlinear problem, so I figured I could start by solving the same linear problem using SNES. So I am setting up the problem as follows: >>>>>>>> >>>>>>>> SNESCreate(PETSC_COMM_WORLD, &snes); >>>>>>>> MatCreateShell(PETSC_COMM_WORLD, n_dofs, n_dofs, PETSC_DETERMINE, PETSC_DETERMINE, &ctx, &J); >>>>>>>> MatCreateShell(PETSC_COMM_WORLD, n_dofs, n_dofs, PETSC_DETERMINE, PETSC_DETERMINE, &ctx, &B); >>>>>>>> MatShellSetOperation(J, MATOP_MULT, (void (*)(void))(Multiplication)); >>>>>>>> MatCreateVecs(J, &x_sol, &b); >>>>>>>> VecDuplicate(x_sol, &r); >>>>>>>> SNESSetFromOptions(snes); >>>>>>>> SNESSetFunction(snes, r, &(computeResidual), &ctx); >>>>>>>> SNESSetUseMatrixFree(snes, PETSC_FALSE, PETSC_TRUE); >>>>>>>> SNESGetLineSearch(snes, &linesearch); >>>>>>>> SNESGetKSP(snes, &ksp); >>>>>>>> KSPSetOperators(ksp, J, J); >>>>>>>> KSPSetInitialGuessNonzero(ksp, PETSC_TRUE); >>>>>>>> >>>>>>>> I tested it with a small problem (compiled in Debug) and it works. >>>>>>>> >>>>>>>> When I compiled it in Release, it crashes with a segfault. I tried running the Debug version through valgrind, but even for a small problem it is too slow. So I was wondering if you guys could see any rookie mistake in the lines I used above? >>>>>>>> >>>>>>>> Otherwise, is there any example that uses a SNES combined with a matrix-free KSP operator? 
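For reference, a minimal sketch of the built-in way to combine SNES with a matrix-free Krylov operator, using MatCreateSNESMF() so that PETSc applies J(u)*v by finite differencing of the residual; computeResidual, ctx and n_local are placeholders for the application's own routine, context and local size, and no preconditioner is applied, matching the setup above:

    SNES snes;
    Vec  x, r;
    Mat  J;
    KSP  ksp;
    PC   pc;

    PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes));
    PetscCall(VecCreateMPI(PETSC_COMM_WORLD, n_local, PETSC_DETERMINE, &x));
    PetscCall(VecDuplicate(x, &r));
    PetscCall(SNESSetFunction(snes, r, computeResidual, &ctx)); /* nonlinear residual F(u) */
    PetscCall(MatCreateSNESMF(snes, &J));                       /* J applies J(u)*v by differencing F */
    PetscCall(SNESSetJacobian(snes, J, J, MatMFFDComputeJacobian, NULL));
    PetscCall(SNESGetKSP(snes, &ksp));
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCNONE));                           /* unpreconditioned Krylov solve */
    PetscCall(SNESSetFromOptions(snes));
    PetscCall(SNESSolve(snes, NULL, x));

The same behaviour can be requested from the command line with -snes_mf; with a user MatShell instead of MatCreateSNESMF(), the shell's multiply routine must itself know the current linearization point.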
>>>>>>>> >>>>>>>> Thank you, >>>>>>>> >>>>>>>> Daniel >>>>>>>> From mail2amneet at gmail.com Mon Nov 4 11:36:15 2024 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Mon, 4 Nov 2024 09:36:15 -0800 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: <87v7x8pagg.fsf@jedbrown.org> References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> Message-ID: Hi Jed, Do I need to create two separate MattNullSpace objects if I want to use both MatSetNullSpace() and MatSetNearNullSpace()? Thanks, On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: > Pierre Jolivet writes: > > >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: > >> > >> Interesting. I have seen hypre do fine on elasticity, but do you know > if boomeramg (classical) uses these vectors or is there a smoothed > aggregation solver in hypre? > > > > I?m not sure it is precisely ?standard? smoothed aggregation, see bottom > paragraph of > https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ > > I?ve never made it to work, but I know some do. > > A while back, Stefano gave me this pointer as well: > https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ > > It's still classical AMG, and in my experience, struggles on very thin > structures (e.g., aspect ratio 1000 cantilever beams) when compared to SA. > However, it can be quite competitive for many structures. I found that the > "MFEM elasticity suite", which is based on Baker et al 2010, gave rather > poor results. This is a configuration that works on GPUs and gives good > convergence and performance for elasticity: > > > https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ > > In the above issue, I was only using BoomerAMG as a coarse level for p-MG > so all the options have a `-mg_coarse_` prefix; here are those options > without the prefix: > > -pc_hypre_boomeramg_coarsen_type pmis > -pc_hypre_boomeramg_interp_type ext+i > -pc_hypre_boomeramg_no_CF > -pc_hypre_boomeramg_P_max 6 > -pc_hypre_boomeramg_print_statistics 1 > -pc_hypre_boomeramg_relax_type_down Chebyshev > -pc_hypre_boomeramg_relax_type_up Chebyshev > -pc_hypre_boomeramg_strong_threshold 0.5 > -pc_type hypre > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Mon Nov 4 12:48:08 2024 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 4 Nov 2024 19:48:08 +0100 Subject: [petsc-users] Advice on setting BCs in a PLEX Message-ID: Dear users, I have been using petsc's KSP for over 20 years and I am considering using DMPLEX to replace my own data structure in a mixed FEM/FVM CFD code. To do so, I am trying to understand DMPLEX by writing a 1D code that solves u_t = u_xx using PSEUDOTS. In order to prescribe Dirichlet BCs at the endpoints of the 1D box, I use PetscSectionSetConstraintDof when building the PetscSection. 
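As an illustration only (not the attached code), a sketch of such a section for a 1D mesh with one unknown per vertex and the two endpoint vertices constrained; here the endpoints are assumed to be the first and last vertex of the depth-0 stratum, whereas a general code would locate them through the "marker" or "Face Sets" label:

    PetscSection s;
    PetscInt     pStart, pEnd, vStart, vEnd;
    PetscInt     comp0 = 0;                                   /* the single component is the constrained one */

    PetscCall(DMPlexGetChart(dm, &pStart, &pEnd));
    PetscCall(DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd));  /* vertices */
    PetscCall(PetscSectionCreate(PETSC_COMM_WORLD, &s));
    PetscCall(PetscSectionSetChart(s, pStart, pEnd));
    for (PetscInt v = vStart; v < vEnd; ++v) PetscCall(PetscSectionSetDof(s, v, 1));
    PetscCall(PetscSectionSetConstraintDof(s, vStart, 1));    /* left Dirichlet vertex  */
    PetscCall(PetscSectionSetConstraintDof(s, vEnd - 1, 1));  /* right Dirichlet vertex */
    PetscCall(PetscSectionSetUp(s));
    PetscCall(PetscSectionSetConstraintIndices(s, vStart, &comp0));
    PetscCall(PetscSectionSetConstraintIndices(s, vEnd - 1, &comp0));
    PetscCall(DMSetLocalSection(dm, s));
    PetscCall(PetscSectionDestroy(&s));

With such a section in place, DMCreateGlobalVector() omits the two constrained dofs (hence a global vector of size 9 for 11 vertices in the attached log), while local vectors obtained from DMGlobalToLocal() still carry them.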
> A global vector is missing both the shared dofs which are not owned by > this process, as well as /constrained/ dofs. These constraints > represent essential (Dirichlet) boundary conditions. They are dofs > that have a given fixed value, so they are present in local vectors > for assembly purposes, but absent from global vectors since they are > never solved for during algebraic solves. My global Vec has indeed two entries less than the local one. When initializing the solution or evaluating the rhs function I transfer data from the global to local representation, do the calculation, then transfer back. I am doing something wrong, though, which shows up in the attached log file. My simple code is also attached. Thank you for your advice. Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Machines Scuola di Ingegneria Universita' della Basilicata V.le dell'Ateneo lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!cWgDpzKW9KGEkqxu775gDU488CBQ5itLl2l_HUsDKcPstbcuSm0_bZPHkMciXajTi3539fuMPe8TD1YZuirDfmSBMSXXxrQZWns$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- DM Object: box 1 MPI process type: plex box in 1 dimension: Number of 0-cells per rank: 11 Number of 1-cells per rank: 10 Labels: marker: 1 strata with value/size (1 (2)) Face Sets: 2 strata with value/size (1 (1), 2 (1)) depth: 2 strata with value/size (0 (11), 1 (10)) celltype: 2 strata with value/size (0 (11), 1 (10)) [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Point 10: Global dof 0 != 1 size - number of constraints [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [0]PETSC ERROR: Option left: name:-ts_max_steps value: 10 source: command line [0]PETSC ERROR: Option left: name:-ts_monitor (no value) source: command line [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.22.0, Sep 28, 2024 [0]PETSC ERROR: ./ex15f90 with 1 MPI process(es) and PETSC_ARCH linux_gnu on DESKTOP-04AR6V6.lan by abonfi Mon Nov 4 19:28:00 2024 [0]PETSC ERROR: Configure options: --with-mpi=1 --with-mpi-dir=/usr/lib64/mpi/gcc/mpich --with-shared-libraries=1 --with-debugging=1 --with-blas-lib=/usr/lib64/libblas.a --with-lapack-lib=/usr/lib64/liblapack.a --with-triangle=1 --download-triangle=yes [0]PETSC ERROR: #1 PetscSFSetGraphSection() at /home/abonfi/src/petsc-3.22.0/src/vec/is/sf/utils/sfutils.c:179 [0]PETSC ERROR: #2 DMCreateSectionSF() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:4783 [0]PETSC ERROR: #3 DMGetSectionSF() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:4722 [0]PETSC ERROR: #4 DMGlobalToLocalBegin() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:2851 [0]PETSC ERROR: #5 DMGlobalToLocal() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:2812 [0]PETSC ERROR: #6 ex15f90.F90:280 Abort(62) on node 0 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 there are 11 entries in depth: 0 there are 10 entries in depth: 1 Setting up a section on 11 meshpoints with 1 dofs The DM is marked Point 10 is a boundary point Point 20 is a boundary point Array u has type : seq (Global) Array u has size : 9 coordVec has size 11 -------------- next part -------------- A non-text attachment was scrubbed... Name: ex15f90.F90 Type: text/x-fortran Size: 14065 bytes Desc: not available URL: From jed at jedbrown.org Mon Nov 4 12:11:54 2024 From: jed at jedbrown.org (Jed Brown) Date: Mon, 04 Nov 2024 11:11:54 -0700 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> Message-ID: <87wmhi27it.fsf@jedbrown.org> Unless the problem is entirely floating (the true null space is all six rigid body modes), then they will be different, so yes, you'll typically have two MatNullSpace objects. Amneet Bhalla writes: > Hi Jed, > > Do I need to create two separate MattNullSpace objects if I want to use > both MatSetNullSpace() and MatSetNearNullSpace()? > > Thanks, > > > On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: > >> Pierre Jolivet writes: >> >> >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: >> >> >> >> Interesting. I have seen hypre do fine on elasticity, but do you know >> if boomeramg (classical) uses these vectors or is there a smoothed >> aggregation solver in hypre? >> > >> > I?m not sure it is precisely ?standard? smoothed aggregation, see bottom >> paragraph of >> https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ >> > I?ve never made it to work, but I know some do. >> > A while back, Stefano gave me this pointer as well: >> https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ >> >> It's still classical AMG, and in my experience, struggles on very thin >> structures (e.g., aspect ratio 1000 cantilever beams) when compared to SA. >> However, it can be quite competitive for many structures. 
I found that the >> "MFEM elasticity suite", which is based on Baker et al 2010, gave rather >> poor results. This is a configuration that works on GPUs and gives good >> convergence and performance for elasticity: >> >> >> https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ >> >> In the above issue, I was only using BoomerAMG as a coarse level for p-MG >> so all the options have a `-mg_coarse_` prefix; here are those options >> without the prefix: >> >> -pc_hypre_boomeramg_coarsen_type pmis >> -pc_hypre_boomeramg_interp_type ext+i >> -pc_hypre_boomeramg_no_CF >> -pc_hypre_boomeramg_P_max 6 >> -pc_hypre_boomeramg_print_statistics 1 >> -pc_hypre_boomeramg_relax_type_down Chebyshev >> -pc_hypre_boomeramg_relax_type_up Chebyshev >> -pc_hypre_boomeramg_strong_threshold 0.5 >> -pc_type hypre >> > > > -- > --Amneet From knepley at gmail.com Mon Nov 4 17:20:27 2024 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Nov 2024 00:20:27 +0100 Subject: [petsc-users] Advice on setting BCs in a PLEX In-Reply-To: References: Message-ID: On Mon, Nov 4, 2024 at 7:10?PM Aldo Bonfiglioli wrote: > Dear users, > > I have been using petsc's KSP for over 20 years and I am considering using > DMPLEX to replace my own data structure in a mixed FEM/FVM CFD code. > > To do so, I am trying to understand DMPLEX by writing a 1D code that > solves u_t = u_xx using PSEUDOTS. > > In order to prescribe Dirichlet BCs at the endpoints of the 1D box, I use > PetscSectionSetConstraintDof when building the PetscSection. > > A global vector is missing both the shared dofs which are not owned by > this process, as well as *constrained* dofs. These constraints represent > essential (Dirichlet) boundary conditions. They are dofs that have a given > fixed value, so they are present in local vectors for assembly purposes, > but absent from global vectors since they are never solved for during > algebraic solves. > > > > My global Vec has indeed two entries less than the local one. > > When initializing the solution or evaluating the rhs function I transfer > data from the global to local representation, do the calculation, then > transfer back. > > I am doing something wrong, though, which shows up in the attached log > file. > > My simple code is also attached. > > Hi Aldo, I think we are very close to working. The PetscSection only care about the sizes of things, like vectors, and it seems like that is right. Now you want the boundary values to be put in your vectors (I think). There needs to be a routine to stick in the boundary values, which is DMPlexInsertBoundaryValues(). You can change this function by calling PetscObjectFunctionCompose() for "DMPlexInsertBoundaryValues_C". Does that make sense? Thanks, Matt > Thank you for your advice. > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Machines > Scuola di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Yef4Gll06T0STqTT9tclyiR99EATrA8nCwlcq7LOmCo4TPWg3fYGMZS0Ohp-XcuAHGtnTiIEQOhfmV77lCcX$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
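A rough sketch of the override Matt describes; the composed-function name is taken from his message, the composing call is spelled PetscObjectComposeFunction() in current PETSc, the custom routine is assumed to need the same calling sequence as DMPlexInsertBoundaryValues() itself, and MyInsertBC with its two boundary values is a placeholder:

    static PetscErrorCode MyInsertBC(DM dm, PetscBool insertEssential, Vec locX, PetscReal time, Vec faceGeomFVM, Vec cellGeomFVM, Vec gradFVM)
    {
      PetscSection s;
      PetscScalar *x;
      PetscInt     vStart, vEnd, off;

      PetscFunctionBeginUser;
      PetscCall(DMGetLocalSection(dm, &s));
      PetscCall(DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd));
      PetscCall(VecGetArray(locX, &x));
      PetscCall(PetscSectionGetOffset(s, vStart, &off));
      x[off] = 1.0;                                 /* left Dirichlet value (placeholder) */
      PetscCall(PetscSectionGetOffset(s, vEnd - 1, &off));
      x[off] = 0.0;                                 /* right Dirichlet value (placeholder) */
      PetscCall(VecRestoreArray(locX, &x));
      PetscFunctionReturn(PETSC_SUCCESS);
    }

    /* after building the DM: */
    PetscCall(PetscObjectComposeFunction((PetscObject)dm, "DMPlexInsertBoundaryValues_C", MyInsertBC));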
-- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yef4Gll06T0STqTT9tclyiR99EATrA8nCwlcq7LOmCo4TPWg3fYGMZS0Ohp-XcuAHGtnTiIEQOhfmSraytgu$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Mon Nov 4 20:03:23 2024 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Mon, 4 Nov 2024 18:03:23 -0800 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: <87wmhi27it.fsf@jedbrown.org> References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> <87wmhi27it.fsf@jedbrown.org> Message-ID: I set the rigid body null vectors but PETSc errors out that these are not orthogonal. Is there a canned routine in PETSc to orthogonalize a bunch of Vecs? [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Vector 0 must be orthogonal to vector 2, inner product is 0.612391 [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!e0btVnW2NyDoumIgYMZpmOJ8Vvkh18HjVCEY3LyPW9fAsr4jD7MqRxxBLgh-q49bw3jvM0FZc2A_s2U6RxXpfK06Dg$ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.17.5, unknown [0]PETSC ERROR: ./acoustic_streaming_hier_integrator_2d on a darwin-dbg named APSB-MacBook-Pro-16.local by amneetb Mon Nov 4 18:01:09 2024 [0]PETSC ERROR: Configure options --CC=mpicc --CXX=mpicxx --FC=mpif90 --PETSC_ARCH=darwin-dbg --with-debugging=1 --download-hypre=1 --with-x=0 -download-mumps -download-scalapack -download-parmetis -download-metis -download-ptscotch [0]PETSC ERROR: #1 MatNullSpaceCreate() at /Users/amneetb/Softwares/PETSc-Gitlab/PETSc/src/mat/interface/matnull.c:271 [0]PETSC ERROR: #2 resetMatNearNullspace() at ../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp:697 P=00000:Program abort called in file ``../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp'' at line 697 P=00000:ERROR MESSAGE: P=00000: Abort trap: 6 On Mon, Nov 4, 2024 at 10:11?AM Jed Brown wrote: > Unless the problem is entirely floating (the true null space is all six > rigid body modes), then they will be different, so yes, you'll typically > have two MatNullSpace objects. > > Amneet Bhalla writes: > > > Hi Jed, > > > > Do I need to create two separate MattNullSpace objects if I want to use > > both MatSetNullSpace() and MatSetNearNullSpace()? > > > > Thanks, > > > > > > On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: > > > >> Pierre Jolivet writes: > >> > >> >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: > >> >> > >> >> Interesting. I have seen hypre do fine on elasticity, but do you know > >> if boomeramg (classical) uses these vectors or is there a smoothed > >> aggregation solver in hypre? > >> > > >> > I?m not sure it is precisely ?standard? smoothed aggregation, see > bottom > >> paragraph of > >> > https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ > >> > I?ve never made it to work, but I know some do. 
> >> > A while back, Stefano gave me this pointer as well: > >> > https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ > >> > >> It's still classical AMG, and in my experience, struggles on very thin > >> structures (e.g., aspect ratio 1000 cantilever beams) when compared to > SA. > >> However, it can be quite competitive for many structures. I found that > the > >> "MFEM elasticity suite", which is based on Baker et al 2010, gave rather > >> poor results. This is a configuration that works on GPUs and gives good > >> convergence and performance for elasticity: > >> > >> > >> > https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ > >> > >> In the above issue, I was only using BoomerAMG as a coarse level for > p-MG > >> so all the options have a `-mg_coarse_` prefix; here are those options > >> without the prefix: > >> > >> -pc_hypre_boomeramg_coarsen_type pmis > >> -pc_hypre_boomeramg_interp_type ext+i > >> -pc_hypre_boomeramg_no_CF > >> -pc_hypre_boomeramg_P_max 6 > >> -pc_hypre_boomeramg_print_statistics 1 > >> -pc_hypre_boomeramg_relax_type_down Chebyshev > >> -pc_hypre_boomeramg_relax_type_up Chebyshev > >> -pc_hypre_boomeramg_strong_threshold 0.5 > >> -pc_type hypre > >> > > > > > > -- > > --Amneet > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Nov 5 14:40:53 2024 From: jed at jedbrown.org (Jed Brown) Date: Tue, 05 Nov 2024 13:40:53 -0700 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> <87wmhi27it.fsf@jedbrown.org> Message-ID: <87jzdhxvl6.fsf@jedbrown.org> The code snippet I shared contained orthogonalization. There isn't a VecQR or VecOrthogonalize, though such a utility would be useful. Right-looking modified Gram-Schmidt would be fine for that purpose, though Cholesky QR(2) may be a bit faster. Amneet Bhalla writes: > I set the rigid body null vectors but PETSc errors out that these are not > orthogonal. Is there a canned routine in PETSc to orthogonalize a bunch of > Vecs? > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Vector 0 must be orthogonal to vector 2, inner product is > 0.612391 > > [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ZflPyOP4IDILfa8HbHnR6mnArgOdC6dZ_mtE9devYndPR0QSR4S1__A-ax9BF-jpScXFYeYXsw3fTEv4TSk$ for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.17.5, unknown > > [0]PETSC ERROR: ./acoustic_streaming_hier_integrator_2d on a darwin-dbg > named APSB-MacBook-Pro-16.local by amneetb Mon Nov 4 18:01:09 2024 > > [0]PETSC ERROR: Configure options --CC=mpicc --CXX=mpicxx --FC=mpif90 > --PETSC_ARCH=darwin-dbg --with-debugging=1 --download-hypre=1 --with-x=0 > -download-mumps -download-scalapack -download-parmetis -download-metis > -download-ptscotch > > [0]PETSC ERROR: #1 MatNullSpaceCreate() at > /Users/amneetb/Softwares/PETSc-Gitlab/PETSc/src/mat/interface/matnull.c:271 > > [0]PETSC ERROR: #2 resetMatNearNullspace() at > ../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp:697 > > P=00000:Program abort called in file > ``../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp'' > at line 697 > > P=00000:ERROR MESSAGE: > > P=00000: > Abort trap: 6 > > On Mon, Nov 4, 2024 at 10:11?AM Jed Brown wrote: > >> Unless the problem is entirely floating (the true null space is all six >> rigid body modes), then they will be different, so yes, you'll typically >> have two MatNullSpace objects. >> >> Amneet Bhalla writes: >> >> > Hi Jed, >> > >> > Do I need to create two separate MattNullSpace objects if I want to use >> > both MatSetNullSpace() and MatSetNearNullSpace()? >> > >> > Thanks, >> > >> > >> > On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: >> > >> >> Pierre Jolivet writes: >> >> >> >> >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: >> >> >> >> >> >> Interesting. I have seen hypre do fine on elasticity, but do you know >> >> if boomeramg (classical) uses these vectors or is there a smoothed >> >> aggregation solver in hypre? >> >> > >> >> > I?m not sure it is precisely ?standard? smoothed aggregation, see >> bottom >> >> paragraph of >> >> >> https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ >> >> > I?ve never made it to work, but I know some do. >> >> > A while back, Stefano gave me this pointer as well: >> >> >> https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ >> >> >> >> It's still classical AMG, and in my experience, struggles on very thin >> >> structures (e.g., aspect ratio 1000 cantilever beams) when compared to >> SA. >> >> However, it can be quite competitive for many structures. I found that >> the >> >> "MFEM elasticity suite", which is based on Baker et al 2010, gave rather >> >> poor results. 
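Since PETSc offers no canned Vec orthogonalization routine, a small sketch of the Gram-Schmidt pass Jed suggests, applied to the array of null-space vectors before they are handed to MatNullSpaceCreate(), which requires an orthonormal set; written against a recent PETSc (older versions return 0 instead of PETSC_SUCCESS):

    static PetscErrorCode OrthonormalizeVecs(PetscInt n, Vec v[])
    {
      PetscFunctionBeginUser;
      for (PetscInt i = 0; i < n; ++i) {
        for (PetscInt j = 0; j < i; ++j) {
          PetscScalar dot;

          PetscCall(VecDot(v[i], v[j], &dot));   /* projection onto the already-normalized v[j] */
          PetscCall(VecAXPY(v[i], -dot, v[j]));  /* remove that component */
        }
        PetscCall(VecNormalize(v[i], NULL));
      }
      PetscFunctionReturn(PETSC_SUCCESS);
    }

For rigid body modes specifically, MatNullSpaceCreateRigidBody() builds an orthonormal set of translation and rotation vectors directly from a coordinate Vec, and the result can be attached with MatSetNearNullSpace().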
This is a configuration that works on GPUs and gives good >> >> convergence and performance for elasticity: >> >> >> >> >> >> >> https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ >> >> >> >> In the above issue, I was only using BoomerAMG as a coarse level for >> p-MG >> >> so all the options have a `-mg_coarse_` prefix; here are those options >> >> without the prefix: >> >> >> >> -pc_hypre_boomeramg_coarsen_type pmis >> >> -pc_hypre_boomeramg_interp_type ext+i >> >> -pc_hypre_boomeramg_no_CF >> >> -pc_hypre_boomeramg_P_max 6 >> >> -pc_hypre_boomeramg_print_statistics 1 >> >> -pc_hypre_boomeramg_relax_type_down Chebyshev >> >> -pc_hypre_boomeramg_relax_type_up Chebyshev >> >> -pc_hypre_boomeramg_strong_threshold 0.5 >> >> -pc_type hypre >> >> >> > >> > >> > -- >> > --Amneet >> > > > -- > --Amneet From e.t.a.vanderweide at utwente.nl Wed Nov 6 10:36:02 2024 From: e.t.a.vanderweide at utwente.nl (Weide, Edwin van der (UT-ET)) Date: Wed, 6 Nov 2024 16:36:02 +0000 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation Message-ID: Hi, I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. // Set up the matrix free evaluation of the Jacobian times a vector // by setting the appropriate function in snes. PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nEqns, nEqns, this, &mJac)); PetscCall(MatShellSetOperation(mJac, MATOP_MULT, (void (*)(void))JacobianTimesVector)); PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); // Set the function to be used as preconditioner for the krylov solver. KSP ksp; PC pc; PetscCall(SNESGetKSP(mSnes, &ksp)); PetscCall(KSPGetPC(ksp, &pc)); PetscCall(PCSetType(pc, PCSHELL)); PetscCall(PCSetApplicationContext(pc, this)); PetscCall(PCShellSetApply(pc, Preconditioner)); For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. 
[0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 In the function SNESSetUpMatrices the source looks as follows 784 } else if (!snes->jacobian_pre) { 785 PetscDS prob; 786 Mat J, B; 787 PetscBool hasPrec = PETSC_FALSE; 788 789 J = snes->jacobian; 790 PetscCall(DMGetDS(dm, &prob)); 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); 794 PetscCall(DMCreateMatrix(snes->dm, &B)); 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); 796 PetscCall(MatDestroy(&J)); 797 PetscCall(MatDestroy(&B)); 798 } It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? If needed, I can provide the source code for which this problem occurs. Thanks, Edwin --------------------------------------------------- Edwin van der Weide Department of Mechanical Engineering University of Twente Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 6 10:52:44 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Nov 2024 11:52:44 -0500 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. Barry > On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users wrote: > > Hi, > > I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. 
> > // Set up the matrix free evaluation of the Jacobian times a vector > // by setting the appropriate function in snes. > PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > nEqns, nEqns, this, &mJac)); > PetscCall(MatShellSetOperation(mJac, MATOP_MULT, > (void (*)(void))JacobianTimesVector)); > > PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); > > // Set the function to be used as preconditioner for the krylov solver. > KSP ksp; > PC pc; > PetscCall(SNESGetKSP(mSnes, &ksp)); > PetscCall(KSPGetPC(ksp, &pc)); > PetscCall(PCSetType(pc, PCSHELL)); > PetscCall(PCSetApplicationContext(pc, this)); > PetscCall(PCShellSetApply(pc, Preconditioner)); > > For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. > > [0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 > [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 > [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 > [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 > [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 > [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 > [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 > [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 > [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 > [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 > [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 > [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 > [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 > [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 > [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 > > In the function SNESSetUpMatrices the source looks as follows > > 784 } else if (!snes->jacobian_pre) { > 785 PetscDS prob; > 786 Mat J, B; > 787 PetscBool hasPrec = PETSC_FALSE; > 788 > 789 J = snes->jacobian; > 790 PetscCall(DMGetDS(dm, &prob)); > 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); > 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); > 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); > 794 PetscCall(DMCreateMatrix(snes->dm, &B)); > 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); > 796 PetscCall(MatDestroy(&J)); > 797 PetscCall(MatDestroy(&B)); > 798 } > > It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. > > Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? 
> > If needed, I can provide the source code for which this problem occurs. > Thanks, > > Edwin > --------------------------------------------------- > Edwin van der Weide > Department of Mechanical Engineering > University of Twente > Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.t.a.vanderweide at utwente.nl Wed Nov 6 11:01:48 2024 From: e.t.a.vanderweide at utwente.nl (Weide, Edwin van der (UT-ET)) Date: Wed, 6 Nov 2024 17:01:48 +0000 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: Barry, If I do that, I get the following error [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!ZZB4be6WT2eisTFLrrHknlukvw3s7eibKfF0XMaUNTO9axbKKdE3l3C_RbgSVoOHvWH97FEJKHr6mY18J8kMv23zN-9CL9kUhgje$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ZZB4be6WT2eisTFLrrHknlukvw3s7eibKfF0XMaUNTO9axbKKdE3l3C_RbgSVoOHvWH97FEJKHr6mY18J8kMv23zN-9CLwFG0b6V$ [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 SNES callback Jacobian [0]PETSC ERROR: #2 SNESComputeJacobian() at /home/vdweide/petsc/src/snes/interface/snes.c:2966 [0]PETSC ERROR: #3 SNESSolve_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:218 [0]PETSC ERROR: #4 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4841 [0]PETSC ERROR: #5 SolveCurrentStage() at SolverClass.cpp:502 [0]PETSC ERROR: #6 main() at Condensation.cpp:20 -------------------------------------------------------------------------- So SNES tries to call the call back function for the Jacobian, but that is not provided. Hence the failure. Regards, Edwin ________________________________ From: Barry Smith Sent: Wednesday, November 6, 2024 5:52 PM To: Weide, Edwin van der (UT-ET) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. Barry On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users wrote: Hi, I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. // Set up the matrix free evaluation of the Jacobian times a vector // by setting the appropriate function in snes. PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nEqns, nEqns, this, &mJac)); PetscCall(MatShellSetOperation(mJac, MATOP_MULT, (void (*)(void))JacobianTimesVector)); PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); // Set the function to be used as preconditioner for the krylov solver. 
KSP ksp; PC pc; PetscCall(SNESGetKSP(mSnes, &ksp)); PetscCall(KSPGetPC(ksp, &pc)); PetscCall(PCSetType(pc, PCSHELL)); PetscCall(PCSetApplicationContext(pc, this)); PetscCall(PCShellSetApply(pc, Preconditioner)); For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. [0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 In the function SNESSetUpMatrices the source looks as follows 784 } else if (!snes->jacobian_pre) { 785 PetscDS prob; 786 Mat J, B; 787 PetscBool hasPrec = PETSC_FALSE; 788 789 J = snes->jacobian; 790 PetscCall(DMGetDS(dm, &prob)); 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); 794 PetscCall(DMCreateMatrix(snes->dm, &B)); 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); 796 PetscCall(MatDestroy(&J)); 797 PetscCall(MatDestroy(&B)); 798 } It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? If needed, I can provide the source code for which this problem occurs. Thanks, Edwin --------------------------------------------------- Edwin van der Weide Department of Mechanical Engineering University of Twente Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Wed Nov 6 12:03:47 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Nov 2024 13:03:47 -0500 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: You need to provide a callback function. Why? Otherwise your MatShell and PCshell have no way of knowing at what location the Jacobian is suppose to be evaluated at (in a matrix free way). That is the x for which J(x) is used. Normally one puts the x into the application context of mJac and accesses it every time the matmult is called. Similarly it needs to be accessed in application of your preconditioner. > On Nov 6, 2024, at 12:01?PM, Weide, Edwin van der (UT-ET) wrote: > > Barry, > > If I do that, I get the following error > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!elK6sfqwXIIaLqDzUlKw8Wy--UjvmZXYI0gRqu_nvUNMbIx2rMEX4aHIYXWikS4p4zHPTXyicwT9SEY8-JQt10s$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!elK6sfqwXIIaLqDzUlKw8Wy--UjvmZXYI0gRqu_nvUNMbIx2rMEX4aHIYXWikS4p4zHPTXyicwT9SEY8mXmEtXk$ > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback are not always exact. > [0]PETSC ERROR: #1 SNES callback Jacobian > [0]PETSC ERROR: #2 SNESComputeJacobian() at /home/vdweide/petsc/src/snes/interface/snes.c:2966 > [0]PETSC ERROR: #3 SNESSolve_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:218 > [0]PETSC ERROR: #4 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4841 > [0]PETSC ERROR: #5 SolveCurrentStage() at SolverClass.cpp:502 > [0]PETSC ERROR: #6 main() at Condensation.cpp:20 > -------------------------------------------------------------------------- > > So SNES tries to call the call back function for the Jacobian, but that is not provided. Hence the failure. > Regards, > > Edwin > > From: Barry Smith > > Sent: Wednesday, November 6, 2024 5:52 PM > To: Weide, Edwin van der (UT-ET) > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation > > > Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. > > Barry > > >> On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users > wrote: >> >> Hi, >> >> I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. >> >> // Set up the matrix free evaluation of the Jacobian times a vector >> // by setting the appropriate function in snes. >> PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >> nEqns, nEqns, this, &mJac)); >> PetscCall(MatShellSetOperation(mJac, MATOP_MULT, >> (void (*)(void))JacobianTimesVector)); >> >> PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); >> >> // Set the function to be used as preconditioner for the krylov solver. 
>> KSP ksp; >> PC pc; >> PetscCall(SNESGetKSP(mSnes, &ksp)); >> PetscCall(KSPGetPC(ksp, &pc)); >> PetscCall(PCSetType(pc, PCSHELL)); >> PetscCall(PCSetApplicationContext(pc, this)); >> PetscCall(PCShellSetApply(pc, Preconditioner)); >> >> For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. >> >> [0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 >> [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 >> [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 >> [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 >> [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 >> [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 >> [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 >> [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 >> [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 >> [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 >> [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 >> [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 >> [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 >> [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 >> [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 >> >> In the function SNESSetUpMatrices the source looks as follows >> >> 784 } else if (!snes->jacobian_pre) { >> 785 PetscDS prob; >> 786 Mat J, B; >> 787 PetscBool hasPrec = PETSC_FALSE; >> 788 >> 789 J = snes->jacobian; >> 790 PetscCall(DMGetDS(dm, &prob)); >> 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); >> 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); >> 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); >> 794 PetscCall(DMCreateMatrix(snes->dm, &B)); >> 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); >> 796 PetscCall(MatDestroy(&J)); >> 797 PetscCall(MatDestroy(&B)); >> 798 } >> >> It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. >> >> Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? >> >> If needed, I can provide the source code for which this problem occurs. >> Thanks, >> >> Edwin >> --------------------------------------------------- >> Edwin van der Weide >> Department of Mechanical Engineering >> University of Twente >> Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... 
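Putting the two pieces of advice together, a rough sketch; FormJacobianShell, SolverClass and the mXBase member are placeholder names, not taken from the original code. The Jacobian callback only records the current linearization point, which JacobianTimesVector and Preconditioner then read back through the shared context:

    // Called by SNES at each Newton step with the current iterate x.
    static PetscErrorCode FormJacobianShell(SNES snes, Vec x, Mat J, Mat P, void *ctx)
    {
      SolverClass *solver = (SolverClass *)ctx;
      PetscFunctionBeginUser;
      PetscCall(VecCopy(x, solver->mXBase));  // remember where J and the preconditioner are evaluated
      PetscFunctionReturn(PETSC_SUCCESS);
    }

    // Pass mJac twice so SNES has a placeholder Pmat and does not try to build a (dense) matrix,
    // and register the callback that keeps the base vector up to date.
    PetscCall(SNESSetJacobian(mSnes, mJac, mJac, FormJacobianShell, this));

Inside JacobianTimesVector the same object is available through MatShellGetContext(), since it was passed to MatCreateShell() as the shell context.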
URL: From C.Klaij at marin.nl Thu Nov 7 04:19:38 2024 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 7 Nov 2024 10:19:38 +0000 Subject: [petsc-users] null space problem with -pc_type lu on single proc Message-ID: I'm trying to solve a system with a single, non-constant, null space vector, confirmed by passing the MatNullSpaceTest. Solving the system works fine with -pc_type ilu on one or multiple procs. It also works fine with -pc_type lu on multiple procs but fails on a single proc. Any idea what could be wrong? Chris [0]PETSC ERROR: *** unknown floating point error occurred *** [0]PETSC ERROR: The specific exception can be determined by running in a debugger. When the [0]PETSC ERROR: debugger traps the signal, the exception can be found with fetestexcept(0x3f) [0]PETSC ERROR: where the result is a bitwise OR of the following flags: [0]PETSC ERROR: FE_INVALID=0x1 FE_DIVBYZERO=0x4 FE_OVERFLOW=0x8 FE_UNDERFLOW=0x10 FE_INEXACT=0x20 [0]PETSC ERROR: Try option -start_in_debugger [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 PetscDefaultFPTrap() at /cm/shared/apps/petsc/oneapi/build/src/src/sys/error/fp.c:487 [0]PETSC ERROR: #2 VecMDot_Seq() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/seq/dvec2.c:189 [0]PETSC ERROR: #3 VecMXDot_MPI_Default() at /cm/shared/apps/petsc/oneapi/build/src/include/../src/vec/vec/impls/mpi/pvecimpl.h:96 [0]PETSC ERROR: #4 VecMDot_MPI() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/mpi/pvec2.c:25 [0]PETSC ERROR: #5 VecMXDot_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1112 [0]PETSC ERROR: #6 VecMDot() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1184 [0]PETSC ERROR: #7 MatNullSpaceRemove() at /cm/shared/apps/petsc/oneapi/build/src/src/mat/interface/matnull.c:359 [0]PETSC ERROR: #8 KSP_RemoveNullSpace() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:322 [0]PETSC ERROR: #9 KSP_PCApply() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:382 [0]PETSC ERROR: #10 KSPInitialResidual() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itres.c:64 [0]PETSC ERROR: #11 KSPSolve_GMRES() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/impls/gmres/gmres.c:226 [0]PETSC ERROR: #12 KSPSolve_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:898 [0]PETSC ERROR: #13 KSPSolve() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:1070 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Floating point exception [0]PETSC ERROR: trapped floating point error [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dXxAqHgHbtticGbE-cswI2ylVIt4A7Fn68LPFMyFtpJByLQtJocy5JF-Vf5Vzr_FbcWB1l7ve_2l9CdBfaa2Fyw$ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.19.4, Jul 31, 2023 [0]PETSC ERROR: ./refresco on a named marclus3login2 by cklaij Thu Nov 7 11:03:22 2024 [0]PETSC ERROR: Configure options --prefix=/cm/shared/apps/petsc/oneapi/3.19.4-dbg --with-mpi-dir=/cm/shared/apps/intel/oneapi/mpi/2021.4.0 --with-x=0 --with-mpe=0 --with-debugging=1 --download-superlu_dist=../superlu_dist-8.1.2.tar.gz --with-blaslapack-dir=/cm/shared/apps/intel/oneapi/mkl/2021.4.0 --download-parmetis=../parmetis-4.0.3-p9.tar.gz --download-metis=../metis-5.1.0-p11.tar.gz --with-packages-build-dir=/cm/shared/apps/petsc/oneapi/build --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" FCFLAGS="-funroll-all-loops -O3 -DNDEBUG" F90FLAGS="-funroll-all-loops -O3 -DNDEBUG" FOPTFLAGS="-funroll-all-loops -O3 -DNDEBUG" Abort(72) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 72) - process 0 dr. ir. Christiaan Klaij | Senior Researcher | Research & Development T +31 317 49 33 44 | C.Klaij at marin.nl | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!dXxAqHgHbtticGbE-cswI2ylVIt4A7Fn68LPFMyFtpJByLQtJocy5JF-Vf5Vzr_FbcWB1l7ve_2l9CdBwOll570$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image376413.png Type: image/png Size: 5004 bytes Desc: image376413.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image225686.png Type: image/png Size: 487 bytes Desc: image225686.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image478264.png Type: image/png Size: 504 bytes Desc: image478264.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image765396.png Type: image/png Size: 482 bytes Desc: image765396.png URL: From stefano.zampini at gmail.com Thu Nov 7 07:20:27 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 7 Nov 2024 16:20:27 +0300 Subject: [petsc-users] null space problem with -pc_type lu on single proc In-Reply-To: References: Message-ID: the default LU solver in sequential is the PETSc one which does not support pivoting or singular problems. In parallel, it is either MUMPS or SUPERLU_DIST, depending on your configurations. MUMPS for example can handle singular problem, not sure about superlu_dist. You can run the parallel version with -ksp_view and see what is the solver package used. Supposing it is mumps, you can run the sequential code with -pc_factor_mat_solver_type mumps Il giorno gio 7 nov 2024 alle ore 13:20 Klaij, Christiaan via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > I'm trying to solve a system with a single, non-constant, null space > vector, confirmed by passing the MatNullSpaceTest. Solving the system works > fine with -pc_type ilu on one or multiple procs. It also works fine with > -pc_type lu on multiple procs but fails on a single proc. Any idea what > could be wrong? > > Chris > > [0]PETSC ERROR: *** unknown floating point error occurred *** > [0]PETSC ERROR: The specific exception can be determined by running in a > debugger. 
When the > [0]PETSC ERROR: debugger traps the signal, the exception can be found with > fetestexcept(0x3f) > [0]PETSC ERROR: where the result is a bitwise OR of the following flags: > [0]PETSC ERROR: FE_INVALID=0x1 FE_DIVBYZERO=0x4 FE_OVERFLOW=0x8 > FE_UNDERFLOW=0x10 FE_INEXACT=0x20 > [0]PETSC ERROR: Try option -start_in_debugger > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback are not always > exact. > [0]PETSC ERROR: #1 PetscDefaultFPTrap() at > /cm/shared/apps/petsc/oneapi/build/src/src/sys/error/fp.c:487 > [0]PETSC ERROR: #2 VecMDot_Seq() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/seq/dvec2.c:189 > [0]PETSC ERROR: #3 VecMXDot_MPI_Default() at > /cm/shared/apps/petsc/oneapi/build/src/include/../src/vec/vec/impls/mpi/pvecimpl.h:96 > [0]PETSC ERROR: #4 VecMDot_MPI() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/mpi/pvec2.c:25 > [0]PETSC ERROR: #5 VecMXDot_Private() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1112 > [0]PETSC ERROR: #6 VecMDot() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1184 > [0]PETSC ERROR: #7 MatNullSpaceRemove() at > /cm/shared/apps/petsc/oneapi/build/src/src/mat/interface/matnull.c:359 > [0]PETSC ERROR: #8 KSP_RemoveNullSpace() at > /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:322 > [0]PETSC ERROR: #9 KSP_PCApply() at > /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:382 > [0]PETSC ERROR: #10 KSPInitialResidual() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itres.c:64 > [0]PETSC ERROR: #11 KSPSolve_GMRES() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/impls/gmres/gmres.c:226 > [0]PETSC ERROR: #12 KSPSolve_Private() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:898 > [0]PETSC ERROR: #13 KSPSolve() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:1070 > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Floating point exception > [0]PETSC ERROR: trapped floating point error > [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!YEL_7P5zQIFYD7bhZCjy9YmnEJKPrk8uuZ-HgzitChAgA3TpOY2MKoRQcBcC6LuKxc3x8WMoVQ62WS53A7XCLS4niHz5JWI$ > > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.19.4, Jul 31, 2023 > [0]PETSC ERROR: ./refresco on a named marclus3login2 by cklaij Thu Nov 7 > 11:03:22 2024 > [0]PETSC ERROR: Configure options > --prefix=/cm/shared/apps/petsc/oneapi/3.19.4-dbg > --with-mpi-dir=/cm/shared/apps/intel/oneapi/mpi/2021.4.0 --with-x=0 > --with-mpe=0 --with-debugging=1 > --download-superlu_dist=../superlu_dist-8.1.2.tar.gz > --with-blaslapack-dir=/cm/shared/apps/intel/oneapi/mkl/2021.4.0 > --download-parmetis=../parmetis-4.0.3-p9.tar.gz > --download-metis=../metis-5.1.0-p11.tar.gz > --with-packages-build-dir=/cm/shared/apps/petsc/oneapi/build --with-ssl=0 > --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 > -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" > COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" > CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" > FCFLAGS="-funroll-all-loops -O3 -DNDEBUG" F90FLAGS="-funroll-all-loops -O3 > -DNDEBUG" FOPTFLAGS="-funroll-all-loops -O3 -DNDEBUG" > Abort(72) on node 0 (rank 0 in comm 0): application called > MPI_Abort(MPI_COMM_WORLD, 72) - process 0 > dr. ir.???? Christiaan Klaij > | Senior Researcher | Research & Development > T +31 317 49 33 44 <+31%20317%2049%2033%2044> | C.Klaij at marin.nl | > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!YEL_7P5zQIFYD7bhZCjy9YmnEJKPrk8uuZ-HgzitChAgA3TpOY2MKoRQcBcC6LuKxc3x8WMoVQ62WS53A7XCLS4nR49qdbI$ > > [image: Facebook] > > [image: LinkedIn] > > [image: YouTube] > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image376413.png Type: image/png Size: 5004 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image225686.png Type: image/png Size: 487 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image478264.png Type: image/png Size: 504 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image765396.png Type: image/png Size: 482 bytes Desc: not available URL: From C.Klaij at marin.nl Thu Nov 7 08:42:27 2024 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 7 Nov 2024 14:42:27 +0000 Subject: [petsc-users] null space problem with -pc_type lu on single proc In-Reply-To: References: Message-ID: Thanks for explaining, Stefano. I'm using superlu so -pc_factor_mat_solver_type superlu_dist did the trick. Chris ________________________________________ From: Stefano Zampini Sent: Thursday, November 7, 2024 2:20 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] null space problem with -pc_type lu on single proc the default LU solver in sequential is the PETSc one which does not support pivoting or singular problems. In parallel, it is either MUMPS or SUPERLU_DIST, depending on your configurations. MUMPS for example can handle singular problem, not sure about superlu_dist. You can run the parallel version with -ksp_view and see what is the solver package used. Supposing it is mumps, you can run the sequential code with -pc_factor_mat_solver_type mumps Il giorno gio 7 nov 2024 alle ore 13:20 Klaij, Christiaan via petsc-users > ha scritto: I'm trying to solve a system with a single, non-constant, null space vector, confirmed by passing the MatNullSpaceTest. 
Solving the system works fine with -pc_type ilu on one or multiple procs. It also works fine with -pc_type lu on multiple procs but fails on a single proc. Any idea what could be wrong? Chris [0]PETSC ERROR: *** unknown floating point error occurred *** [0]PETSC ERROR: The specific exception can be determined by running in a debugger. When the [0]PETSC ERROR: debugger traps the signal, the exception can be found with fetestexcept(0x3f) [0]PETSC ERROR: where the result is a bitwise OR of the following flags: [0]PETSC ERROR: FE_INVALID=0x1 FE_DIVBYZERO=0x4 FE_OVERFLOW=0x8 FE_UNDERFLOW=0x10 FE_INEXACT=0x20 [0]PETSC ERROR: Try option -start_in_debugger [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 PetscDefaultFPTrap() at /cm/shared/apps/petsc/oneapi/build/src/src/sys/error/fp.c:487 [0]PETSC ERROR: #2 VecMDot_Seq() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/seq/dvec2.c:189 [0]PETSC ERROR: #3 VecMXDot_MPI_Default() at /cm/shared/apps/petsc/oneapi/build/src/include/../src/vec/vec/impls/mpi/pvecimpl.h:96 [0]PETSC ERROR: #4 VecMDot_MPI() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/mpi/pvec2.c:25 [0]PETSC ERROR: #5 VecMXDot_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1112 [0]PETSC ERROR: #6 VecMDot() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1184 [0]PETSC ERROR: #7 MatNullSpaceRemove() at /cm/shared/apps/petsc/oneapi/build/src/src/mat/interface/matnull.c:359 [0]PETSC ERROR: #8 KSP_RemoveNullSpace() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:322 [0]PETSC ERROR: #9 KSP_PCApply() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:382 [0]PETSC ERROR: #10 KSPInitialResidual() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itres.c:64 [0]PETSC ERROR: #11 KSPSolve_GMRES() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/impls/gmres/gmres.c:226 [0]PETSC ERROR: #12 KSPSolve_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:898 [0]PETSC ERROR: #13 KSPSolve() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:1070 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Floating point exception [0]PETSC ERROR: trapped floating point error [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!Z_V3Mo9VHcVE5kKG_z_TQ_jgrgcAJyQUPP1-I1OsT7PQgkbn1rnE5ORYi5TxwOPmLGapf_gkooDG9DZkYMM5VeI$ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.19.4, Jul 31, 2023 [0]PETSC ERROR: ./refresco on a named marclus3login2 by cklaij Thu Nov 7 11:03:22 2024 [0]PETSC ERROR: Configure options --prefix=/cm/shared/apps/petsc/oneapi/3.19.4-dbg --with-mpi-dir=/cm/shared/apps/intel/oneapi/mpi/2021.4.0 --with-x=0 --with-mpe=0 --with-debugging=1 --download-superlu_dist=../superlu_dist-8.1.2.tar.gz --with-blaslapack-dir=/cm/shared/apps/intel/oneapi/mkl/2021.4.0 --download-parmetis=../parmetis-4.0.3-p9.tar.gz --download-metis=../metis-5.1.0-p11.tar.gz --with-packages-build-dir=/cm/shared/apps/petsc/oneapi/build --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" FCFLAGS="-funroll-all-loops -O3 -DNDEBUG" F90FLAGS="-funroll-all-loops -O3 -DNDEBUG" FOPTFLAGS="-funroll-all-loops -O3 -DNDEBUG" Abort(72) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 72) - process 0 [cid:ii_19306c695d9c0e2969a1] dr. ir.???? Christiaan Klaij | Senior Researcher | Research & Development T +31 317 49 33 44 | C.Klaij at marin.nl | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!Z_V3Mo9VHcVE5kKG_z_TQ_jgrgcAJyQUPP1-I1OsT7PQgkbn1rnE5ORYi5TxwOPmLGapf_gkooDG9DZkiSv2kr8$ [Facebook] [LinkedIn] [YouTube] -- Stefano -------------- next part -------------- A non-text attachment was scrubbed... Name: image376413.png Type: image/png Size: 5004 bytes Desc: image376413.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image225686.png Type: image/png Size: 487 bytes Desc: image225686.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image478264.png Type: image/png Size: 504 bytes Desc: image478264.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image765396.png Type: image/png Size: 482 bytes Desc: image765396.png URL: From p.khurana22 at imperial.ac.uk Thu Nov 7 09:08:29 2024 From: p.khurana22 at imperial.ac.uk (Khurana, Parv) Date: Thu, 7 Nov 2024 15:08:29 +0000 Subject: [petsc-users] Expected weak scaling behaviour for AMG libraries? In-Reply-To: References: Message-ID: Hello Mark and Mathew, Apologies for the delay in reply (I was gone for a vacation). Really appreciate the prompt response. I am now planning to redo these tests with the load balancing suggestions you have provided. Would you suggest any load balancing options to use as default when dealing with unstructured meshes in general? I use PETSc as an external linear solver for my software, where I supply a Poisson system discretised using 3D simplical elements and FEM - which are solved using AMG. I observed bad weak scaling behaviour for my application for 20k DOF/rank, which prompted me to test something similar only in PETSc. I choose ex12 instead of ex56 because it uses 3D FEM. I am not sure if I can make ex56 work for tetrahedrons out of the box. Maybe ex13 is more suited as Mark mentioned. On point 3,4 from Mathew: The plot below is from the numbers extracted from the -log_view option for all the runs. I have attached a sample log file from my runs, and pasted a sample output in the email. 
------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage KSPSolve 2 1.0 1.4079e-01 1.0 2.14e+07 2.0 1.2e+03 1.1e+04 4.4e+01 2 4 26 16 17 2 4 26 16 18 875 SNESSolve 1 1.0 2.9310e+00 1.0 1.69e+08 1.1 1.7e+03 2.0e+04 6.1e+01 46 46 37 38 23 46 46 37 38 25 445 PCApply 23 1.0 1.2774e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Thanks and Best, Parv ________________________________ From: Mark Adams Sent: 31 October 2024 11:30 To: Matthew Knepley Cc: Khurana, Parv ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Expected weak scaling behaviour for AMG libraries? This email from mfadams at lbl.gov originates from outside Imperial. Do not click on links and attachments unless you recognise the sender. If you trust the sender, add them to your safe senders list to disable email stamping for this address. As Matt said snes ex56 is better because it does a convergence test that refines the grid. You need/want these two parameters to have the same arg (eg, 2,2,1): -dm_plex_box_faces 2,2,1 -petscpartitioner_simple_process_grid 2,2,1. This will put one cell per process. Then you use: -max_conv_its N, to specify the N levels of refinement to do. It will run the 2,2,1 first then a 4,4,2, etc., N times. /src/snes/tests/ex13.c is designed for benchmarking and it uses '-petscpartitioner_simple_node_grid 1,1,1 [default]' to give you a two level partitioner. You need to have dm_plex_box_faces_i = petscpartitioner_simple_process_grid_i * petscpartitioner_simple_node_grid_i Again, you should put one cell per process (NP = product of dm_plex_box_faces args) and use -dm_refine N to get a single solve. Mark On Wed, Oct 30, 2024 at 11:02?PM Matthew Knepley > wrote: On Wed, Oct 30, 2024 at 4:13?PM Khurana, Parv > wrote: Hello PETSc Community, I am trying to understand the scaling behaviour of AMG methods in PETSc (Hypre for now) and how many DOFs/Rank are needed for a performant AMG solve. I?m currently conducting weak scaling tests using src/snes/tutorials/ex12.c in 3D, applying Dirichlet BCs with FEM at P=1. The tests keep DOFs per processor constant while increasing the mesh size and processor count, specifically: * 20000 and 80000 DOF/RANK configurations. * Running SNES twice, using GMRES with a tolerance of 1e-5 and preconditioning with Hypre-BoomerAMG. A couple of quick points in order to make sure that there is no confusion: 1) Partitioner type "simple" is for the CI. It is a very bad partition, and should not be used for timing. The default is ParMetis which should be good enough. 2) You start out with 6^3 = 216 elements, distribute that, and then refine it. This will be _really_ bad load balance on all arrangement except the divisors of 216. You usually want to start out with something bigger at the later stages. You can use -dm_refine_pre to refine before distribution. 3) It is not clear you are using the timing for just the solver (SNESSolve). 
It could be that extraneous things are taking time. When asking questions like this, please always send the output of -log_view for timing, and at least -ksp_monitor_true_residial for convergence. 4) SNES ex56 is the example we use for GAMG scalability testing Thanks, Matt Unfortunately, parallel efficiency degrades noticeably with increased processor counts. Are there any insights or rules of thumb for using AMG more effectively? I have been looking at this issue for a while now and would love to engage in a further discussion. Please find below the weak scaling results and the options I use to run the tests. [cid:ii_192e0800b4dcb971f161] #Run type -run_type full -petscpartitioner_type simple #Mesh settings -dm_plex_dim 3 -dm_plex_simplex 1 -dm_refine 5 #Varied this -dm_plex_box_faces 6,6,6 #BCs and FEM space -bc_type dirichlet -petscspace_degree 1 #Solver settings -snes_max_it 2 -ksp_type gmres -ksp_rtol 1.0e-5 #Same settings as what we use for LOR -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_coarsen_type hmis -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi -pc_hypre_boomeramg_strong_threshold 0.7 -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_P_max 2 -pc_hypre_boomeramg_truncfactor 0.3 Best, Parv -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f47A4_a_tRPU1XvaYrgWMAp2uGFajfIIWf6QG4FERzGhIyI7U-eiYao8U73sCFqUwb_u9HrBY8TMcMT4qnKeIzOyDtzWTvkU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 119488 bytes Desc: image.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc_scale.l144491.pbs-6 Type: application/octet-stream Size: 24704 bytes Desc: petsc_scale.l144491.pbs-6 URL: From e.t.a.vanderweide at utwente.nl Thu Nov 7 11:21:19 2024 From: e.t.a.vanderweide at utwente.nl (Weide, Edwin van der (UT-ET)) Date: Thu, 7 Nov 2024 17:21:19 +0000 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: Yes, this works. Thanks a lot for your help. Regards, Edwin ________________________________ From: Barry Smith Sent: Wednesday, November 6, 2024 7:03 PM To: Weide, Edwin van der (UT-ET) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation You need to provide a callback function. Why? Otherwise your MatShell and PCshell have no way of knowing at what location the Jacobian is suppose to be evaluated at (in a matrix free way). That is the x for which J(x) is used. Normally one puts the x into the application context of mJac and accesses it every time the matmult is called. Similarly it needs to be accessed in application of your preconditioner. 
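A minimal sketch of what this looks like, with mJac the shell Jacobian from the setup quoted below (the names AppCtx, FormJacobianShell and xcurrent are illustrative only, they are not part of the PETSc API). The Jacobian callback assembles nothing; it only records the linearization point in the context that the shell matrix and the shell preconditioner already share:

typedef struct {
  Vec xcurrent; /* state x at which J(x)v and the preconditioner act; created once beforehand, e.g. with VecDuplicate */
  /* ... whatever else the matmult and preconditioner routines need ... */
} AppCtx;

/* Jacobian callback: no matrix entries are formed, only the current state is stored. */
static PetscErrorCode FormJacobianShell(SNES snes, Vec x, Mat J, Mat P, void *ptr)
{
  AppCtx *ctx = (AppCtx *)ptr;

  PetscFunctionBeginUser;
  PetscCall(VecCopy(x, ctx->xcurrent));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* at setup time, instead of passing a null callback: */
PetscCall(SNESSetJacobian(mSnes, mJac, mJac, FormJacobianShell, &ctx));

The matmult routine then reads ctx->xcurrent through MatShellGetContext(), and the shell preconditioner reads it through whatever context was attached to the PCSHELL (PCShellGetContext() or PCGetApplicationContext(), depending on how it was set), so both always act at the state SNES is currently linearizing around.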
On Nov 6, 2024, at 12:01?PM, Weide, Edwin van der (UT-ET) wrote: Barry, If I do that, I get the following error [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!dfJ_L4Ay2QlfGz0JU6uAfD7Zy_6a7x9D0bisHMG1T53fTy_gWL3Q8fzzLvWZ_TBK37FFmDYerLywkFOLtMnU1rISfF_VA3gxsDTm$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dfJ_L4Ay2QlfGz0JU6uAfD7Zy_6a7x9D0bisHMG1T53fTy_gWL3Q8fzzLvWZ_TBK37FFmDYerLywkFOLtMnU1rISfF_VAwRW4W44$ [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 SNES callback Jacobian [0]PETSC ERROR: #2 SNESComputeJacobian() at /home/vdweide/petsc/src/snes/interface/snes.c:2966 [0]PETSC ERROR: #3 SNESSolve_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:218 [0]PETSC ERROR: #4 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4841 [0]PETSC ERROR: #5 SolveCurrentStage() at SolverClass.cpp:502 [0]PETSC ERROR: #6 main() at Condensation.cpp:20 -------------------------------------------------------------------------- So SNES tries to call the call back function for the Jacobian, but that is not provided. Hence the failure. Regards, Edwin ________________________________ From: Barry Smith > Sent: Wednesday, November 6, 2024 5:52 PM To: Weide, Edwin van der (UT-ET) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. Barry On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users > wrote: Hi, I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. // Set up the matrix free evaluation of the Jacobian times a vector // by setting the appropriate function in snes. PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nEqns, nEqns, this, &mJac)); PetscCall(MatShellSetOperation(mJac, MATOP_MULT, (void (*)(void))JacobianTimesVector)); PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); // Set the function to be used as preconditioner for the krylov solver. KSP ksp; PC pc; PetscCall(SNESGetKSP(mSnes, &ksp)); PetscCall(KSPGetPC(ksp, &pc)); PetscCall(PCSetType(pc, PCSHELL)); PetscCall(PCSetApplicationContext(pc, this)); PetscCall(PCShellSetApply(pc, Preconditioner)); For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. 
[0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 In the function SNESSetUpMatrices the source looks as follows 784 } else if (!snes->jacobian_pre) { 785 PetscDS prob; 786 Mat J, B; 787 PetscBool hasPrec = PETSC_FALSE; 788 789 J = snes->jacobian; 790 PetscCall(DMGetDS(dm, &prob)); 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); 794 PetscCall(DMCreateMatrix(snes->dm, &B)); 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); 796 PetscCall(MatDestroy(&J)); 797 PetscCall(MatDestroy(&B)); 798 } It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? If needed, I can provide the source code for which this problem occurs. Thanks, Edwin --------------------------------------------------- Edwin van der Weide Department of Mechanical Engineering University of Twente Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Nov 8 08:11:30 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Nov 2024 09:11:30 -0500 Subject: [petsc-users] Expected weak scaling behaviour for AMG libraries? In-Reply-To: References: Message-ID: On Thu, Nov 7, 2024 at 10:08?AM Khurana, Parv wrote: > Hello Mark and Mathew, > > Apologies for the delay in reply (I was gone for a vacation). Really > appreciate the prompt response. > > I am now planning to redo these tests with the load balancing suggestions > you have provided. *Would you suggest any load balancing options to use > as default when dealing with unstructured meshes in general*? I > The default load balancing should be good. We do not use it in CI tests because it is not reproducible across machines. 
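As a rough sketch (assuming PETSc was configured with ParMetis, keeping the rest of your ex12 options unchanged; the 2+3 refinement split is only an example), the mesh and partitioning part of the options would become:

#Mesh settings
-dm_plex_dim 3
-dm_plex_simplex 1
-dm_plex_box_faces 6,6,6
-dm_refine_pre 2   #refine before distribution for better balance
-dm_refine 3       #refine after distribution
#no -petscpartitioner_type simple, so the configured default (ParMetis) is used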
When you use -plexpartitioner_type simple you are turning off the default load balancing. Don't do that. > use PETSc as an external linear solver for my software, where I supply a > Poisson system discretised using 3D simplical elements and FEM - which are > solved using AMG. I observed bad weak scaling behaviour for my application > for 20k DOF/rank, which prompted me to test something similar only in > PETSc. > > I choose ex12 instead of ex56 because it uses 3D FEM. I am not sure if I > can make ex56 work for tetrahedrons out of the box. Maybe ex13 is more > suited as Mark mentioned. > > On point 3,4 from Mathew: > The plot below is from the numbers extracted from the -log_view option for > all the runs. I have attached a sample log file from my runs, and pasted a > sample output in the email. > Yes, this has the simple partitioning, which is not what you want. Thanks, Matt > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > KSPSolve 2 1.0 1.4079e-01 1.0 2.14e+07 2.0 1.2e+03 1.1e+04 > 4.4e+01 2 4 26 16 17 2 4 26 16 18 875 > SNESSolve 1 1.0 2.9310e+00 1.0 1.69e+08 1.1 1.7e+03 2.0e+04 > 6.1e+01 46 46 37 38 23 46 46 37 38 25 445 > PCApply 23 1.0 1.2774e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > > ------------------------------------------------------------------------------------------------------------------------ > > Thanks and Best, > Parv > > ------------------------------ > *From:* Mark Adams > *Sent:* 31 October 2024 11:30 > *To:* Matthew Knepley > *Cc:* Khurana, Parv ; petsc-users at mcs.anl.gov > > *Subject:* Re: [petsc-users] Expected weak scaling behaviour for AMG > libraries? > > > This email from mfadams at lbl.gov originates from outside Imperial. Do not > click on links and attachments unless you recognise the sender. If you > trust the sender, add them to your safe senders list > to disable email > stamping for this address. > > > As Matt said snes ex56 is better because it does a convergence test that > refines the grid. You need/want these two parameters to have the same arg > (eg, 2,2,1): -dm_plex_box_faces 2,2,1 -petscpartitioner_simple_process_grid > 2,2,1. > This will put one cell per process. > > Then you use: -max_conv_its N, to specify the N levels of refinement to > do. It will run the 2,2,1 first then a 4,4,2, etc., N times. > > /src/snes/tests/ex13.c is designed for benchmarking and it uses > '-petscpartitioner_simple_node_grid 1,1,1 [default]' to give you a two > level partitioner. > You need to have dm_plex_box_faces_i = > petscpartitioner_simple_process_grid_i * > petscpartitioner_simple_node_grid_i > Again, you should put one cell per process (NP = product of > dm_plex_box_faces args) and use -dm_refine N to get a single solve. 
> > Mark > > > > On Wed, Oct 30, 2024 at 11:02?PM Matthew Knepley > wrote: > > On Wed, Oct 30, 2024 at 4:13?PM Khurana, Parv > wrote: > > Hello PETSc Community, > I am trying to understand the scaling behaviour of AMG methods in PETSc > (Hypre for now) and how many DOFs/Rank are needed for a performant AMG > solve. > I?m currently conducting weak scaling tests using > src/snes/tutorials/ex12.c in 3D, applying Dirichlet BCs with FEM at P=1. > The tests keep DOFs per processor constant while increasing the mesh size > and processor count, specifically: > > - *20000 and 80000 DOF/RANK* configurations. > - Running SNES twice, using GMRES with a tolerance of 1e-5 and > preconditioning with Hypre-BoomerAMG. > > A couple of quick points in order to make sure that there is no confusion: > > 1) Partitioner type "simple" is for the CI. It is a very bad partition, > and should not be used for timing. The default is ParMetis which should be > good enough. > > 2) You start out with 6^3 = 216 elements, distribute that, and then refine > it. This will be _really_ bad load balance on all arrangement except the > divisors of 216. You usually want to start out with something bigger at the > later stages. You can use -dm_refine_pre to refine before distribution. > > 3) It is not clear you are using the timing for just the solver > (SNESSolve). It could be that extraneous things are taking time. When > asking questions like this, please always send the output of -log_view for > timing, and at least -ksp_monitor_true_residial for convergence. > > 4) SNES ex56 is the example we use for GAMG scalability testing > > Thanks, > > Matt > > Unfortunately, parallel efficiency degrades noticeably with increased > processor counts. Are there any insights or rules of thumb for using AMG > more effectively? I have been looking at this issue for a while > now and would love to engage in a further discussion. Please find below the > weak scaling results and the options I use to run the tests. > *#Run type* > -run_type full > -petscpartitioner_type simple > > *#Mesh settings* > -dm_plex_dim 3 > -dm_plex_simplex 1 > -dm_refine 5 #Varied this > -dm_plex_box_faces 6,6,6 > > *#BCs and FEM space* > -bc_type dirichlet > -petscspace_degree 1 > > *#Solver settings* > -snes_max_it 2 > -ksp_type gmres > -ksp_rtol 1.0e-5 > #Same settings as what we use for LOR > -pc_type hypre > -pc_hypre_type boomeramg > -pc_hypre_boomeramg_coarsen_type hmis > -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi > -pc_hypre_boomeramg_strong_threshold 0.7 > -pc_hypre_boomeramg_interp_type ext+i > -pc_hypre_boomeramg_P_max 2 > -pc_hypre_boomeramg_truncfactor 0.3 > > Best, > Parv > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aRkIkjdOvZx9uPuKeRCMxZ-OzP2IYC81-tJZcPczdDkmsiYvbB5fa0eNLr_hJmfRf73BE9wcUyW78CaVeYVe$ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aRkIkjdOvZx9uPuKeRCMxZ-OzP2IYC81-tJZcPczdDkmsiYvbB5fa0eNLr_hJmfRf73BE9wcUyW78CaVeYVe$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 119488 bytes Desc: not available URL: From edoardo.alinovi at gmail.com Fri Nov 8 11:45:55 2024 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Fri, 8 Nov 2024 18:45:55 +0100 Subject: [petsc-users] Some hypre settings for 3D fully coupled incompressible NS Message-ID: Hello petsc friends, It's been a while since I am trying to find a good setup for my coupled solver. Recently, I have run a scan with Dakota (more than 1k simulations) on the Windsor body case with 7Mln cells on 36 cores on my small home server (Dell R730 with 2x2496 v4 xeon). I thought it was a good idea to share my results with the community! Here is a resume of my finding: 1) Multiplicative is faster than Schur: I have found out that Schur preconditioner is rarely faster than multiplicative despite the fact Schur keeps the number of iterations lower. I think there is a lot of room for improvement as far as FV matrices are concerned. Probably custom Shat is the way to go, but not easy to find a good one! Up to now "selfp" looks to be the only good and "ready to go" choice. 2) Vanilla fbcgs is faster than vanilla fgmres: maybe here we can tune gmres restart, I have not tried this systematically. 3) Stick with preonly: using bcgs/cg as preconditioner ksp lowers the number of iterations but it adds up a lot of overhead (even setting few iterations or mild tolerances). 4) Staging is a good idea: beyond bare iteration performance, I think that for steady state problems it worth setting a max for outer iterations in fieldsplit, as starting iterations would cost you a lot and probably you will be far from convergence anyway at the stage, so it is not a good investment pushing hard on them. 5) Here my best so far settings: # Outer solver settings "solver": "fbcgs", "preconditioner": "fieldsplit", "absTol": 1e-6, "relTol": 0.01, # Field split KSP and PC "fieldsplit_u_pc_type": "bjacobi", "fieldsplit_p_pc_type": "hypre", "fieldsplit_u_ksp_type": "preonly", "fieldsplit_p_ksp_type": "preonly", ! HYPRE PC options "fieldsplit_p_pc_hypre_boomeramg_strong_threshold": 0.05, "fieldsplit_p_pc_hypre_boomeramg_coarsen_type": "PMIS", "fieldsplit_p_pc_hypre_boomeramg_truncfactor": 0.3, "fieldsplit_p_pc_hypre_boomeramg_no_cf": 0, "fieldsplit_p_pc_hypre_boomeramg_agg_nl": 1, "fieldsplit_p_pc_hypre_boomeramg_agg_num_paths": 1, "fieldsplit_p_pc_hypre_boomeramg_P_max": 0, "fieldsplit_p_pc_hypre_boomeramg_max_levels": 30, "fieldsplit_p_pc_hypre_boomeramg_relax_type_all": "backward-SOR/Jacobi", "fieldsplit_p_pc_hypre_boomeramg_interp_type": "ext+i", "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_down": 0, "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_up": 2, "fieldsplit_p_pc_hypre_boomeramg_cycle_type": "v" I have a question for Barry/Jed/Matt. I have noted that most of the commercial solvers use what I define as "SAMG with ILU smoother". I am wondering if there's a way to reproduce this in Petsc. I have tried PCPATCH to test VANKA, but I am not really able to use that PC as I am not using DMplex. With this recipe I am not miles away from Fluent on the same problem. Yet, I am wondering why commercial solvers do not use fieldsplit. Hope this can be helpful and of course I am happy to collaborate on this topic if someone outhere is willing to! Cheers, Edoardo -------------- next part -------------- An HTML attachment was scrubbed... 
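On the "SAMG with ILU smoother" question above: a possible starting point, sketched under the assumption that PETSc's native GAMG is acceptable in place of hypre for the pressure block (the option names compose with the same "fieldsplit_p_" prefix used in the settings above; the values are a first guess, not a tuned recipe):

"fieldsplit_p_pc_type": "gamg",
"fieldsplit_p_pc_gamg_agg_nsmooths": 1,
"fieldsplit_p_mg_levels_ksp_type": "richardson",
"fieldsplit_p_mg_levels_pc_type": "bjacobi",
"fieldsplit_p_mg_levels_sub_pc_type": "ilu",

Alternatively, BoomerAMG can be kept and given hypre's parallel ILU smoother (Euclid) on the levels via "fieldsplit_p_pc_hypre_boomeramg_smooth_type": "Euclid", again untried here.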
URL: From marcos.vanella at nist.gov Fri Nov 8 14:18:43 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Fri, 8 Nov 2024 20:18:43 +0000 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes Message-ID: Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: Currently Loaded Modules: 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 Where: g: built for GPU Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. Cannot compile CUDA with nvcc. ********************************************************************************************* I have nvcc in my path: $ nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2024 NVIDIA Corporation Built on Thu_Jun__6_02:26:10_PDT_2024 Cuda compilation tools, release 12.5, V12.5.82 Build cuda_12.5.r12.5/compiler.34385749_0 I remember being able to do this cross compilation in polaris. Any help is most appreciated, Marcos -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Fri Nov 8 14:23:44 2024 From: balay.anl at fastmail.org (Satish Balay) Date: Fri, 8 Nov 2024 14:23:44 -0600 (CST) Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: <463afc6d-3776-a73c-d8c3-fef672620ca1@fastmail.org> Can you send configure.log for this failure? Satish On Fri, 8 Nov 2024, Vanella, Marcos (Fed) via petsc-users wrote: > Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. 
When configuring PETSc in the following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 > > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is most appreciated, > Marcos > From junchao.zhang at gmail.com Fri Nov 8 14:25:01 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 8 Nov 2024 14:25:01 -0600 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Hi, Marcos Could you attach the configure.log? --Junchao Zhang On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi all, does anyone have experience compiling PETSc with gnu openmpi and > cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc > with these libraries. I would need gnu openmpi to compile my code (fortran > std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I > have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC > 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. 
When configuring PETSc in the > following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" > FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 > --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda > --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 > --with-make-np=8 > > > ============================================================================================= > Configuring PETSc to compile on your system > > ============================================================================================= > TESTING: checkCUDACompiler from > config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does > not work. > Cannot compile CUDA with nvcc. > > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is > most appreciated, > Marcos > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Fri Nov 8 14:34:37 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Fri, 8 Nov 2024 20:34:37 +0000 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Hi Satish and Junchao, this is what I'm getting at the end of the configure.log. I guess there is an incompatibility among nvcc and the gcc version I'm using. ... ============================================================================================= TESTING: checkCUDACompiler from config.setCompilers(/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py:1541) Locate a functional CUDA compiler Checking for program /opt/apps/xalt/xalt/bin/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/one-sided/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/collective/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/pt2pt/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/startup/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/bin/nvcc...not found Checking for program /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/nvcc...found Defined make macro "CUDAC" to "nvcc" Executing: nvcc -c -o /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.o -I/tmp/petsc-qlfa8fb8/config.setCompilers /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.cu stdout: In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, from : /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! 
gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. | ^~~~~ Possible ERROR while running compiler: exit code 1 stderr: In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, from : /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. | ^~~~~ Source: #include "confdefs.h" #include "conffix.h" int main(void) { return 0; } Error testing CUDA compiler: Cannot compile CUDA with nvcc. Deleting "CUDAC" ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. Cannot compile CUDA with nvcc. 
********************************************************************************************* File "/home1/09805/mnv/Software/petsc/config/configure.py", line 461, in petsc_configure framework.configure(out = sys.stdout) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1460, in configure self.processChildren() File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1448, in processChildren self.serialEvaluation(self.childGraph) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1423, in serialEvaluation child.configure() File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 2846, in configure self.executeTest(getattr(self,LANG.join(('check','Compiler')))) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/base.py", line 138, in executeTest ret = test(*args,**kargs) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1544, in checkCUDACompiler for compiler in self.generateCUDACompilerGuesses(): File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1527, in generateCUDACompilerGuesses raise RuntimeError('CUDA compiler you provided with -with-cudac='+self.argDB['with-cudac']+' cannot be found or does not work.'+'\n'+self.mesg) ================================================================================ Finishing configure run at Fri, 08 Nov 2024 14:28:04 -0600 ================================================================================ ________________________________ From: Junchao Zhang Sent: Friday, November 8, 2024 3:25 PM To: Vanella, Marcos (Fed) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes Hi, Marcos Could you attach the configure.log? --Junchao Zhang On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users > wrote: Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: Currently Loaded Modules: 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 Where: g: built for GPU Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. 
Cannot compile CUDA with nvcc. ********************************************************************************************* I have nvcc in my path: $ nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2024 NVIDIA Corporation Built on Thu_Jun__6_02:26:10_PDT_2024 Cuda compilation tools, release 12.5, V12.5.82 Build cuda_12.5.r12.5/compiler.34385749_0 I remember being able to do this cross compilation in polaris. Any help is most appreciated, Marcos -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Nov 8 14:47:39 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 8 Nov 2024 14:47:39 -0600 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Yes, the error message indicates gcc-14 is not supported by cuda-12.5. According to https://urldefense.us/v3/__https://docs.nvidia.com/cuda/archive/12.5.0/cuda-installation-guide-linux/index.html*host-compiler-support-policy__;Iw!!G_uCfscf7eWS!b5iD6t-BE_J3Y7Z9ZeZ_fcD0Hz0TeQ1wpbE5vpuBJGpIqXz6nFj4mtmBLxTQlbByrjwjGJpFnDhDc5A6zQccRWlqqyJy$ , it supports up to gcc-13.2. Perhaps the best approach is to ask your sys admin to install compatible gcc and cuda. --Junchao Zhang On Fri, Nov 8, 2024 at 2:34?PM Vanella, Marcos (Fed) wrote: > > Hi Satish and Junchao, this is what I'm getting at the end of the configure.log. I guess there is an incompatibility among nvcc and the gcc version I'm using. > > > ... > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py:1541) > Locate a functional CUDA compiler > Checking for program /opt/apps/xalt/xalt/bin/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/one-sided/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/collective/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/pt2pt/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/startup/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/bin/nvcc...not found > Checking for program /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/nvcc...found > Defined make macro "CUDAC" to "nvcc" > Executing: nvcc -c -o /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.o -I/tmp/petsc-qlfa8fb8/config.setCompilers /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.cu > stdout: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. 
> | ^~~~~ > Possible ERROR while running compiler: exit code 1 > stderr: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > | ^~~~~ > Source: > #include "confdefs.h" > #include "conffix.h" > > int main(void) { > return 0; > } > > Error testing CUDA compiler: Cannot compile CUDA with nvcc. > Deleting "CUDAC" > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > File "/home1/09805/mnv/Software/petsc/config/configure.py", line 461, in petsc_configure > framework.configure(out = sys.stdout) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1460, in configure > self.processChildren() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1448, in processChildren > self.serialEvaluation(self.childGraph) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1423, in serialEvaluation > child.configure() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 2846, in configure > self.executeTest(getattr(self,LANG.join(('check','Compiler')))) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/base.py", line 138, in executeTest > ret = test(*args,**kargs) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1544, in checkCUDACompiler > for compiler in self.generateCUDACompilerGuesses(): > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1527, in generateCUDACompilerGuesses > raise RuntimeError('CUDA compiler you provided with -with-cudac='+self.argDB['with-cudac']+' cannot be found or does not work.'+'\n'+self.mesg) > ================================================================================ > Finishing configure run at Fri, 08 Nov 2024 14:28:04 -0600 > ================================================================================ > ________________________________ > From: Junchao Zhang > Sent: Friday, November 8, 2024 3:25 PM > To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes > > Hi, Marcos > Could you attach the configure.log? 
> --Junchao Zhang > > > On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users wrote: > > Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 > > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is most appreciated, > Marcos From marcos.vanella at nist.gov Fri Nov 8 14:49:33 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Fri, 8 Nov 2024 20:49:33 +0000 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Thank you Junchao, we'll work on this compatibility issue. Best, Marcos ________________________________ From: Junchao Zhang Sent: Friday, November 8, 2024 3:47 PM To: Vanella, Marcos (Fed) Cc: Satish Balay ; petsc-users at mcs.anl.gov ; Victor Eijkhout Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes Yes, the error message indicates gcc-14 is not supported by cuda-12.5. 
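If a supported host compiler cannot be installed right away, one possible stopgap (untested here, and explicitly "use at your own risk" per the nvcc message above) is to forward nvcc's override flag through configure, e.g. by appending

  CUDAFLAGS="-allow-unsupported-compiler"

to the ./configure line you already use. An officially supported host compiler remains the safer route.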
According to https://urldefense.us/v3/__https://gcc02.safelinks.protection.outlook.com/?url=https*3A*2F*2Fdocs.nvidia.com*2Fcuda*2Farchive*2F12.5.0*2Fcuda-installation-guide-linux*2Findex.html*23host-compiler-support-policy&data=05*7C02*7Cmarcos.vanella*40nist.gov*7C7136c6301354458874f708dd00369e37*7C2ab5d82fd8fa4797a93e054655c61dec*7C0*7C0*7C638666956782708272*7CUnknown*7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ*3D*3D*7C0*7C*7C*7C&sdata=ZCZOH7YEqdGU2NSHnn4Shuly*2BG2*2BPci6Gcm5yoVcJKY*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUlJSUlJSUlJQ!!G_uCfscf7eWS!dE1UqBggdaLlQuHbpbc4p7cYAz9EvrpKKIDn3RKUBr-E2D7rcj5OO0krQHrW75551ETLudAIb2qjBr_qhCg0XKvqOmcTHhJu$ , it supports up to gcc-13.2. Perhaps the best approach is to ask your sys admin to install compatible gcc and cuda. --Junchao Zhang On Fri, Nov 8, 2024 at 2:34?PM Vanella, Marcos (Fed) wrote: > > Hi Satish and Junchao, this is what I'm getting at the end of the configure.log. I guess there is an incompatibility among nvcc and the gcc version I'm using. > > > ... > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py:1541) > Locate a functional CUDA compiler > Checking for program /opt/apps/xalt/xalt/bin/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/one-sided/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/collective/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/pt2pt/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/startup/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/bin/nvcc...not found > Checking for program /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/nvcc...found > Defined make macro "CUDAC" to "nvcc" > Executing: nvcc -c -o /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.o -I/tmp/petsc-qlfa8fb8/config.setCompilers /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.cu > stdout: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > | ^~~~~ > Possible ERROR while running compiler: exit code 1 > stderr: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! 
The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > | ^~~~~ > Source: > #include "confdefs.h" > #include "conffix.h" > > int main(void) { > return 0; > } > > Error testing CUDA compiler: Cannot compile CUDA with nvcc. > Deleting "CUDAC" > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > File "/home1/09805/mnv/Software/petsc/config/configure.py", line 461, in petsc_configure > framework.configure(out = sys.stdout) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1460, in configure > self.processChildren() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1448, in processChildren > self.serialEvaluation(self.childGraph) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1423, in serialEvaluation > child.configure() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 2846, in configure > self.executeTest(getattr(self,LANG.join(('check','Compiler')))) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/base.py", line 138, in executeTest > ret = test(*args,**kargs) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1544, in checkCUDACompiler > for compiler in self.generateCUDACompilerGuesses(): > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1527, in generateCUDACompilerGuesses > raise RuntimeError('CUDA compiler you provided with -with-cudac='+self.argDB['with-cudac']+' cannot be found or does not work.'+'\n'+self.mesg) > ================================================================================ > Finishing configure run at Fri, 08 Nov 2024 14:28:04 -0600 > ================================================================================ > ________________________________ > From: Junchao Zhang > Sent: Friday, November 8, 2024 3:25 PM > To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes > > Hi, Marcos > Could you attach the configure.log? > --Junchao Zhang > > > On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users wrote: > > Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. 
I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 > > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is most appreciated, > Marcos -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Nov 9 09:51:31 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 9 Nov 2024 10:51:31 -0500 Subject: [petsc-users] Spring 2025 PETSc Annual Users Meeting in Buffalo, New York registration is now open References: <2FBFCF80-8A3E-4A00-B0D1-9C067130AD77@petsc.dev> Message-ID: The Spring 2025 PETSc Annual Users Meeting will be held May 20-21, 2025, in Buffalo, New York. The meeting website https://urldefense.us/v3/__https://petsc.org/community/meetings/2025/__;!!G_uCfscf7eWS!fgUt5YviY0nuj_9WJekiYDOGrC9_D639zfhCyGjT3m4IONdLS0cUjB85MobixTn1iEbILSWiGeY0kNjUOhJmOeQ$ is now available. Please register, submit a presentation, and start making your travel plans. Buffalo is a very short distance from Niagara Falls, so come early and enjoy the Falls. We will hold a PETSc tutorial the day before the meeting on Monday, May 19, 2025. Student travel funding is available to attend the meeting; apply now when you register. Student registration is free. As always, thanks for your support and interest in the PETSc community, Barry -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Sat Nov 9 11:58:56 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 9 Nov 2024 12:58:56 -0500 Subject: [petsc-users] Correction for Spring 2025 PETSc Annual Users Meeting in Buffalo, New York registration is now open References: Message-ID: Correction: please use https://urldefense.us/v3/__https://petsc.org/release/community/meetings/2025/__;!!G_uCfscf7eWS!YsRJ9A0Fy25e0yjmpE6WJQHEmFURYU6fMLVeE3H1sed3DlRO-ImLLBLnagBdvP4NIw2QpjW6ow24sa-nWr8Gkc4$ to access the website. Sorry for the double mail. Barry > Begin forwarded message: > > From: Barry Smith > Subject: Spring 2025 PETSc Annual Users Meeting in Buffalo, New York registration is now open > Date: November 9, 2024 at 10:51:31?AM EST > To: Petsc-users , petsc-announce at mcs.anl.gov, petsc-dev > Message-Id: > > > The Spring 2025 PETSc Annual Users Meeting will be held May 20-21, 2025, in Buffalo, New York. > > The meeting website https://urldefense.us/v3/__https://petsc.org/community/meetings/2025/__;!!G_uCfscf7eWS!YsRJ9A0Fy25e0yjmpE6WJQHEmFURYU6fMLVeE3H1sed3DlRO-ImLLBLnagBdvP4NIw2QpjW6ow24sa-nBoPU4Oc$ is now available. Please register, submit a presentation, and start making your travel plans. Buffalo is a very short distance from Niagara Falls, so come early and enjoy the Falls. > > We will hold a PETSc tutorial the day before the meeting on Monday, May 19, 2025. > > Student travel funding is available to attend the meeting; apply now when you register. Student registration is free. > > As always, thanks for your support and interest in the PETSc community, > > Barry > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Nov 12 22:26:14 2024 From: jed at jedbrown.org (Jed Brown) Date: Tue, 12 Nov 2024 21:26:14 -0700 Subject: [petsc-users] Some hypre settings for 3D fully coupled incompressible NS In-Reply-To: References: Message-ID: <87cyizsqs9.fsf@jedbrown.org> To answer your question, I think the commercial solvers are not using "SAMG with ILU smoother" on the velocity-pressure coupled system, but a splitting technique or Schur-like reduction with that applied to the pressure system and AMG with ILU smoothers (or straight ILU) applied to the momentum system. ILU smoothing is often chosen when there is strong anisotropy (usually from boundary layers) not captured in the coarsening or the transport-dominated part prevents effective smoothing using more typical smoothers. ILU theory and smoothing properties are not very nice, but it is still often pragmatic. See -pc_hypre_boomeramg_smooth_type ilu (which activates more sub-options) if you want to try that out while sticking with hypre. It's worth checking whether making the `u` solver stronger has any significant impact on convergence, and distinguishing the accuracy impact of using multiplicative/selfp (even with an accurate solve of that subsystem) separately from the approximation incurred by using `preonly` (which you'll almost always want to do). You may have already sorted this out in your empirical study. Edoardo alinovi writes: > Hello petsc friends, > > It's been a while since I am trying to find a good setup for my coupled > solver. > > Recently, I have run a scan with Dakota (more than 1k simulations) on the > Windsor body case with 7Mln cells on 36 cores on my small home server (Dell > R730 with 2x2496 v4 xeon). I thought it was a good idea to share my results > with the community! 
> > Here is a resume of my finding: > > 1) Multiplicative is faster than Schur: I have found out that Schur > preconditioner is rarely faster than multiplicative despite the fact Schur > keeps the number of iterations lower. I think there is a lot of room for > improvement as far as FV matrices are concerned. Probably custom Shat is > the way to go, but not easy to find a good one! Up to now "selfp" looks to > be the only good and "ready to go" choice. > > 2) Vanilla fbcgs is faster than vanilla fgmres: maybe here we can tune > gmres restart, I have not tried this systematically. > > 3) Stick with preonly: using bcgs/cg as preconditioner ksp lowers the > number of iterations but it adds up a lot of overhead (even setting few > iterations or mild tolerances). > > 4) Staging is a good idea: beyond bare iteration performance, I think that > for steady state problems it worth setting a max for outer iterations in > fieldsplit, as starting iterations would cost you a lot and probably you > will be far from convergence anyway at the stage, so it is not a good > investment pushing hard on them. > > 5) Here my best so far settings: > > # Outer solver settings > "solver": "fbcgs", > "preconditioner": "fieldsplit", > "absTol": 1e-6, > "relTol": 0.01, > > # Field split KSP and PC > "fieldsplit_u_pc_type": "bjacobi", > "fieldsplit_p_pc_type": "hypre", > "fieldsplit_u_ksp_type": "preonly", > "fieldsplit_p_ksp_type": "preonly", > > ! HYPRE PC options > "fieldsplit_p_pc_hypre_boomeramg_strong_threshold": 0.05, > "fieldsplit_p_pc_hypre_boomeramg_coarsen_type": "PMIS", > "fieldsplit_p_pc_hypre_boomeramg_truncfactor": 0.3, > "fieldsplit_p_pc_hypre_boomeramg_no_cf": 0, > "fieldsplit_p_pc_hypre_boomeramg_agg_nl": 1, > "fieldsplit_p_pc_hypre_boomeramg_agg_num_paths": 1, > "fieldsplit_p_pc_hypre_boomeramg_P_max": 0, > "fieldsplit_p_pc_hypre_boomeramg_max_levels": 30, > "fieldsplit_p_pc_hypre_boomeramg_relax_type_all": > "backward-SOR/Jacobi", > "fieldsplit_p_pc_hypre_boomeramg_interp_type": "ext+i", > "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_down": 0, > "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_up": 2, > "fieldsplit_p_pc_hypre_boomeramg_cycle_type": "v" > > I have a question for Barry/Jed/Matt. I have noted that most of the > commercial solvers use what I define as "SAMG with ILU smoother". I am > wondering if there's a way to reproduce this in Petsc. I have tried > PCPATCH to test VANKA, but I am not really able to use that PC as I am not > using DMplex. With this recipe I am not miles away from Fluent on the same > problem. Yet, I am wondering why commercial solvers do not use fieldsplit. > > Hope this can be helpful and of course I am happy to collaborate on this > topic if someone outhere is willing to! > > Cheers, > > Edoardo From pjool at dtu.dk Thu Nov 14 08:39:12 2024 From: pjool at dtu.dk (=?iso-8859-1?Q?Peder_J=F8rgensgaard_Olesen?=) Date: Thu, 14 Nov 2024 14:39:12 +0000 Subject: [petsc-users] VecPow clarification Message-ID: Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. 
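(For reference, the loop-based workaround mentioned above amounts to roughly the following sketch, assuming a complex-scalar build, with v the input vector, p the exponent, and u a vector with the same layout:)

#include <petscvec.h>

static PetscErrorCode VecPowComplexByLoop(Vec v, PetscScalar p, Vec u)
{
  const PetscScalar *va;
  PetscScalar       *ua;
  PetscInt           i, n;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(v, &n));
  PetscCall(VecGetArrayRead(v, &va));
  PetscCall(VecGetArray(u, &ua));
  for (i = 0; i < n; i++) ua[i] = PetscPowComplex(va[i], p); /* elementwise v[i]^p */
  PetscCall(VecRestoreArray(u, &ua));
  PetscCall(VecRestoreArrayRead(v, &va));
  PetscFunctionReturn(PETSC_SUCCESS);
}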
Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. Best, Peder -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Thu Nov 14 08:56:52 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 14 Nov 2024 17:56:52 +0300 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: That is a very old bug! Can you make an MR to just call PetscPowScalar in a loop here https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/vec/vec/utils/projection.c*L1022__;Iw!!G_uCfscf7eWS!bbSdSMnU5KpH03jHI7aV5j4WLGQ3yPvxWzR6Lwr14QLy_7EJ2MNT-qhL6J1x6z3vpF6M5GQk9lMQkMoLJ_JNSn1mqwqict4$ ? Thanks Il giorno gio 14 nov 2024 alle ore 17:39 Peder J?rgensgaard Olesen via petsc-users ha scritto: > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to > compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the > face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() > and a loop with PetscPowComplex(). The source designated in the docs > (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to > PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of > 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with > negative entries) raised to any other integer power, the results would not > be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might > be the rationale for VecPow working the way it does. > > Best, > Peder > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Nov 14 09:01:36 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Nov 2024 10:01:36 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: On Thu, Nov 14, 2024 at 9:39?AM Peder J?rgensgaard Olesen via petsc-users < petsc-users at mcs.anl.gov> wrote: > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to > compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the > face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() > and a loop with PetscPowComplex(). The source designated in the docs > (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to > PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of > 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with > negative entries) raised to any other integer power, the results would not > be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might > be the rationale for VecPow working the way it does. > This is indeed wrong. It was coded only for real numbers. We will fix it. Thanks for reporting this, Matt > Best, > Peder > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bNQOcaOJC5gkTat8nR3TNhd8LdtJY9sMS6rBMYVNwUdmQE2UkCPoXt7GmCWMleJs9EAJr_rfIaO2WqqNrQe-$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 14 09:14:13 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 14 Nov 2024 10:14:13 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: <29A8F44F-B393-4FF9-98B1-4685853A066C@petsc.dev> Good question. Looking at the commit logs from 2014 I see move tao vector operations over to Vec directory, fix a couple names and calling sequences So previously, the function was problem-specific for Tao (which did not support complex numbers) used inside the semi-smooth methods where roots of negative numbers should be mapped to infinity (indicating a "bad" domain point). When we merged the Tao and PETSc source code base just blindly copied over the code without realizing it was not a general-purpose VecPow() and that it did not make sense for complex numbers. I should rework it for general use without breaking the use in Tao. Thanks for pointing out the problem. Barry > On Nov 14, 2024, at 9:39?AM, Peder J?rgensgaard Olesen via petsc-users wrote: > > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. > > Best, > Peder -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 14 09:29:17 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 14 Nov 2024 10:29:17 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: <759E86BE-0061-46A1-B8BA-2A3AE57FCC8E@petsc.dev> I see that currently VecPow is used only in a small number of places including: ksp/ksp/utils/lmvm/diagbrdn/diagbrdn.c: PetscCall(VecPow(ldb->U, ldb->beta - 1)); I am unsure if the usage here requires the special handling of negative numbers. I was wrong and it is not used in the semi-smooth code, that access the vector elements directly. If could be we can strip out all the special infinity cases completely. Barry > On Nov 14, 2024, at 9:56?AM, Stefano Zampini wrote: > > That is a very old bug! Can you make an MR to just call PetscPowScalar in a loop here https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/vec/vec/utils/projection.c*L1022__;Iw!!G_uCfscf7eWS!b2Nfr7i01ZnyfVjweWqi87it8hcCDv0s6MtIsQhmSU8s4dq4jBPi-Ca87RRTwS20Srwyh9wQdhXcsbR6MVC8WSU$ ? > > Thanks > > Il giorno gio 14 nov 2024 alle ore 17:39 Peder J?rgensgaard Olesen via petsc-users > ha scritto: >> Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. 
From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. >> >> That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). >> >> While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. >> >> Best, >> Peder > > > > -- > Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From liw23 at rpi.edu Thu Nov 14 14:21:19 2024 From: liw23 at rpi.edu (Li, Weichao) Date: Thu, 14 Nov 2024 20:21:19 +0000 Subject: [petsc-users] Fail to install petsc4py with CUDA Message-ID: Hi, thanks for your help, I want to use petsc4py with CUDA follow the instructions from https://urldefense.us/v3/__https://github.com/caidao22/pnode?tab=readme-ov-file__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RAjXGkjY$ git clone https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RzgYyUHQ$ cd petsc ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py If I do not use CUDA it works, if I use CUDA ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py --with-cuda=1 Then make check, there has some errors and when I run my code get the error. Cannnot import PETSc correctly. I attach the make.log and configue.log. Thanks. Traceback (most recent call last): File "/opt/dino/share/DINo_parallel_fabric/train.py", line 85, in from pnode import petsc_adjoint as odeint File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/__init__.py", line 3, in from . import petsc_adjoint File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/petsc_adjoint.py", line 6, in from petsc4py import PETSc File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/PETSc.py", line 4, in PETSc = ImportPETSc(ARCH) File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 33, in ImportPETSc return Import('petsc4py', 'PETSc', path, arch) File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 100, in Import module = import_module(pkg, name, path, arch) File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 77, in import_module module = importlib.util.module_from_spec(spec) ImportError: /opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/libpetsc.so.3.022: undefined symbol: cusparseSpMV_preprocess, version libcusparse.so.12 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: text/x-log Size: 169782 bytes Desc: make.log URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: text/x-log Size: 1626964 bytes Desc: configure.log URL: From balay.anl at fastmail.org Thu Nov 14 14:27:56 2024 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 14 Nov 2024 14:27:56 -0600 (CST) Subject: [petsc-users] Fail to install petsc4py with CUDA In-Reply-To: References: Message-ID: This issue is also posted at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/issues/1672__;!!G_uCfscf7eWS!Y0YGMtJ1zE_1JyNSpmN0S6SSGzQrjzQt9diycTWSwIjBjne7KZKf1UK0SguHuz-MHWJB5z8otU7tmRb5zkw3wwcn2Yo$ Lets continue follow-up on the issue tracker - not the mailing list. Satish On Thu, 14 Nov 2024, Li, Weichao wrote: > Hi, thanks for your help, I want to use petsc4py with CUDA follow the instructions from https://urldefense.us/v3/__https://github.com/caidao22/pnode?tab=readme-ov-file__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RAjXGkjY$ > > git clone https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RzgYyUHQ$ > cd petsc > ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py > > If I do not use CUDA it works, if I use CUDA > ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py --with-cuda=1 > > > Then make check, there has some errors and when I run my code get the error. Cannnot import PETSc > correctly. I attach the make.log and configue.log. Thanks. > > > Traceback (most recent call last): > File "/opt/dino/share/DINo_parallel_fabric/train.py", line 85, in > from pnode import petsc_adjoint as odeint > File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/__init__.py", line 3, in > from . import petsc_adjoint > File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/petsc_adjoint.py", line 6, in > from petsc4py import PETSc > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/PETSc.py", line 4, in > PETSc = ImportPETSc(ARCH) > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 33, in ImportPETSc > return Import('petsc4py', 'PETSc', path, arch) > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 100, in Import > module = import_module(pkg, name, path, arch) > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 77, in import_module > module = importlib.util.module_from_spec(spec) > ImportError: /opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/libpetsc.so.3.022: undefined symbol: cusparseSpMV_preprocess, version libcusparse.so.12 > > > From diegomagela at usp.br Thu Nov 14 15:40:58 2024 From: diegomagela at usp.br (Diego Magela Lemos) Date: Thu, 14 Nov 2024 18:40:58 -0300 Subject: [petsc-users] Steps to solve time step second-order differential problem Message-ID: For a second-order differential problem defined as M u_tt + C u_t + K u = F(t) what would be the steps to solve this problem using TS? P.S.: I already have the matrices M, C, and K implemented as Mat objects. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Thu Nov 14 16:22:13 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Nov 2024 15:22:13 -0700 Subject: [petsc-users] Steps to solve time step second-order differential problem In-Reply-To: References: Message-ID: <87ikspe9re.fsf@jedbrown.org> You can either rewrite as a first-order system or use TSSetI2Function (see examples) with TSALPHA2. https://urldefense.us/v3/__https://petsc.org/release/manualpages/TS/TSSetI2Function/*tsseti2function__;Iw!!G_uCfscf7eWS!a0HRe5m1TFfnxS6ZDGPkzSdsxgzubcDgMzFOPD-04IN5oE7gYM8QZnR9wcBCLVAWC47fej6mbRTAVAodtGc$ Diego Magela Lemos via petsc-users writes: > For a second-order differential problem defined as > > M u_tt + C u_t + K u = F(t) > > what would be the steps to solve this problem using TS? > > P.S.: I already have the matrices M, C, and K implemented as Mat objects. From bsmith at petsc.dev Fri Nov 15 11:19:25 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 15 Nov 2024 12:19:25 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8012__;!!G_uCfscf7eWS!Y4coa74DKQZeRI4FvkVAj8dOc1biQzDAjzXBDqgLJKStN2JFpB7w-WYstcURUd-AykeTfuH7q6YeUPfb3eIQD2s$ > On Nov 14, 2024, at 9:39?AM, Peder J?rgensgaard Olesen via petsc-users wrote: > > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. > > Best, > Peder -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Nov 19 03:56:18 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 19 Nov 2024 09:56:18 +0000 Subject: [petsc-users] Ghost particles for DMSWARM (or similar) In-Reply-To: <55056A2B-85E9-4896-9B4B-869A14F4B2C8@us.es> References: <8FBAC7A5-B6AE-4B21-8FEB-52BE1C04A265@us.es> <1B9B1277-9566-444C-9DA8-7ED17684FE01@us.es> <24337E11-33D2-4FFC-89E2-12520AD487FF@us.es> <681E96BD-62A4-4566-A13E-E034B2F19D54@us.es> <7698089E-0909-429F-9E89-9D1AD636ACBF@us.es> <562B2CA0-7462-4AF2-AAF4-E44DDD00B222@us.es> <5C293345-E026-436B-B4D0-E5DC109A0701@us.es> <55056A2B-85E9-4896-9B4B-869A14F4B2C8@us.es> Message-ID: <98E34D64-80AD-4DF8-BCD1-94E8F29FB3DB@us.es> Dear all: Just to wrap this thread up. 
The easiest way to update any variable of the ghost particles is using the following lines of code: PetscCall(VecCreateGhostWithArray(PETSC_COMM_WORLD, n_dof_local, PETSC_DETERMINE, n_dof_ghost, idx_dof_ghost, X_ptr, &X)); PetscCall(VecGhostUpdateBegin(X, INSERT_VALUES, SCATTER_FORWARD)); PetscCall(VecGhostUpdateEnd(X, INSERT_VALUES, SCATTER_FORWARD)); Where X_ptr is the local pointer coming from: DMSwarmGetField Best, Miguel On 1 Oct 2024, at 19:42, MIGUEL MOLINOS PEREZ wrote: Wow, thank you Dave that?s awesome, let me know if there's anything I can help you with! Miguel On Oct 1, 2024, at 7:20?PM, Dave May wrote: On Tue, 1 Oct 2024 at 08:56, MIGUEL MOLINOS PEREZ > wrote: Hi Dave, Would something like that work? Yes, this should work! Any idea on where to look so I can try to implement it myself? I am adding support for this right now. Best, Miguel On Oct 1, 2024, at 5:22?PM, Dave May > wrote: Hi Miguel, On Tue 1. Oct 2024 at 07:56, MIGUEL MOLINOS PEREZ > wrote: Thank you Matt, it works! The implementation is straightforward: - 1? Define the paddle regions using DMGetLocalBoundingBox with the background DMDA mesh as an auxiliary mesh for the domain-partitioning. - 2? Create an integer to count the local number of particles to be used as ghost particle for other processors (N_ghost). One particle can be counted more than one time. At the same time, fill two arrays: - one with the index of the "main particle? (local particle), - and the other with the target rank of the "main particle?. - 3? Create the new particles using DMSwarmAddNPoints - 4? Fill the new particles with the information of the ?main particle? but set the internal variable DMSwarmField_rank with the target rank. - 5? Call DMSwarmMigrate(*,PETSC_TRUE). Therefore, we send the ghost particles to the corresponding processors and we delete them from the original processor. - 6? Do stuff? - 7? Delete ghost particles. This is very easy, we just have to call DMSwarmRemovePoint N_ghost times. I think this can be easily implemented as closed routine for the DMSwarm class. The remaining question is: how to do the communication between the ?original" particle and the ghost particles? For instance, if we update some particle variable (locally) inside of a SNES context, this same variable should be updated in the ghost particles at the other processors. I think what you are asking about is an operation similar to VecGhostUpdate{Begin,End}(). In the case of a DMSwarm I?m not sure how to define the InsertMode = ADD_VALUES? Some swarm fields do not make sense to be added. INSERT_VALUES is fine. One solution might be to have something like this DMSwarmCollectViewUpdateGhostOwners(DM dm, InsertMode mode, PetscInt nfields, const char *fieldNames[]); where one can specify the insert mode and the fields on which the insert mode will apply. Would something like that work? Cheers, Dave PS: Hope this helps someone in the future :-) On Sep 27, 2024, at 10:50?AM, MIGUEL MOLINOS PEREZ > wrote: Thank you Matt, let me give it try. Miguel On Sep 27, 2024, at 3:44?AM, Matthew Knepley > wrote: On Thu, Sep 26, 2024 at 7:18?PM MIGUEL MOLINOS PEREZ > wrote: I see, you mean: Create the ghost particles at the local cell with the same properties as particle 1 (duplicate the original particle) but different value DMSwarmField_rank. Then, call DMSwarmMigrate(*,PETSC_FALSE) so we do the migration and delete the local copies of the particle 1. Right? Yep. I think it will work, from what I know about BASIC. 
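As a rough sketch of that duplicate-and-retag step (swarm, neighbour_rank, and the index of the source particle are placeholders here, and copying the user-registered fields of the source particle into the new slot is elided):

PetscInt *rank_field, np_local;

PetscCall(DMSwarmAddNPoints(swarm, 1));             /* append one ghost copy */
PetscCall(DMSwarmGetLocalSize(swarm, &np_local));
PetscCall(DMSwarmGetField(swarm, DMSwarmField_rank, NULL, NULL, (void **)&rank_field));
rank_field[np_local - 1] = neighbour_rank;          /* send the copy to this rank */
PetscCall(DMSwarmRestoreField(swarm, DMSwarmField_rank, NULL, NULL, (void **)&rank_field));
/* ... fill the remaining fields of slot np_local-1 from the source particle ... */
PetscCall(DMSwarmMigrate(swarm, PETSC_TRUE));       /* PETSC_TRUE removes the sent copies locally */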
Thanks, Matt Thanks, Miguel On Sep 26, 2024, at 11:09?PM, Matthew Knepley > wrote: On Thu, Sep 26, 2024 at 11:20?AM MIGUEL MOLINOS PEREZ > wrote: Thank you Matt. Okey, let me have a careful look to the DMSwarmMigrate_Push_Basic implementation to see if there is some workaround. The idea of adding new particles is interesting. However, in that case, we need to initialize the new (ghost) particles using the fields of the ?real? particle, right? This can be done using something like: VecGhostUpdateBegin(Vec globalout,InsertMode ADD_VALUES, ScatterMode SCATTER_REVERSE); VecGhostUpdateEnd(Vec globalout,InsertMode ADD_VALUES, ScatterMode SCATTER_REVERSE); for the particle fields (?). I think we can just copy from the local particle. For example, suppose I decide that particle 1 should go to rank 5, 12, and 27. Then I first set p1.rank = 5, then I add two new particles with the same values as particle 1, but with rank = 12 and 27. Then when I call migrate, it will move these three particles to the correct processes, and delete the original particles and the copies from the local set. Thanks, Matt Thanks, Miguel On Sep 26, 2024, at 3:53?PM, Matthew Knepley > wrote: On Thu, Sep 26, 2024 at 6:31?AM MIGUEL MOLINOS PEREZ > wrote: Hi Matt et al, I?ve been working on the scheme that you proposed to create ghost particles (atoms in my case), and it works! With a couple of caveats: -1? In general the overlap particles will be migrate from their own rank to more than one neighbor rank, this is specially relevant for those located close to the corners. Therefore, you'll need to call DMSwarmMigrate several times (27 times for 3D cells), during the migration process. That is terrible. Let's just fix DMSwarmMigrate to have a mode that sends the particle to all overlapping neighbors at once. It can't be that hard. -2? You need to set DMSWARM_MIGRATE_BASIC. Otherwise the proposed algorithm will not work at all! Oh, I should have thought of that. Sorry. I can help code up that extension. Can you take a quick look at the BASIC code? Right now, we just use the rank attached to the particle to send it. We could have an arrays of ranks, but that seems crazy, and would blow up particle storage. How about just adding new particles with the other ranks right before migration? Thanks, Matt Hope this helps to other folks! I have a follow-up question about periodic bcc on this context, should I open a new thread of keep posting here? Thanks, Miguel On Aug 7, 2024, at 4:22?AM, MIGUEL MOLINOS PEREZ > wrote: Thanks Matt, I think I'll start by making a small program as a proof of concept. Then, if it works I'll implement it in my code and I'll be happy to share it too :-) Miguel On Aug 4, 2024, at 3:30?AM, Matthew Knepley > wrote: On Fri, Aug 2, 2024 at 7:15?PM MIGUEL MOLINOS PEREZ > wrote: Thanks again Matt, that makes a lot more sense !! Just to check that we are on the same page. You are saying: 1. create a field define a field called "owner rank" for each particle. 2. Identify the phantom particles and modify the internal variable defined by the DMSwarmField_rank variable. 3. Call DMSwarmMigrate(*,PETSC_FALSE), do the calculations using the new local vector including the ghost particles. 4. Then, once the calculations are done, rename the DMSwarmField_rank variable using the "owner rank" variable and call DMSwarmMigrate(*,PETSC_FALSE) once again. I don't think we need this last step. We can just remove those ghost particles for the next step I think. 
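(The removal itself is cheap: assuming the ghost copies sit at the tail of the local list and their count n_ghost is tracked by the application, something like the following sketch should do.)

for (PetscInt k = 0; k < n_ghost; ++k) PetscCall(DMSwarmRemovePoint(swarm)); /* each call drops the last local point */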
Thanks, Matt Thank you, Miguel On Aug 2, 2024, at 5:33?PM, Matthew Knepley > wrote: On Fri, Aug 2, 2024 at 11:15?AM MIGUEL MOLINOS PEREZ > wrote: Thank you Matt for your time, What you describe seems to me the ideal approach. 1) Add a particle field 'ghost' that identifies ghost vs owned particles. I think it needs options OWNED, OVERLAP, and GHOST This means, locally, I need to allocate Nlocal + ghost particles (duplicated) for my model? I would do it another way. I would allocate the particles with no overlap and set them up. Then I would identify the halo particles, mark them as OVERLAP, call DMSwarmMigrate(), and mark the migrated particles as GHOST, then unmark the OVERLAP particles. Shoot! That marking will not work since we cannot tell the difference between particles we received and particles we sent. Okay, instead of the `ghost` field we need an `owner rank` field. So then we 1) Setup the non-overlapping particles 2) Identify the halo particles 3) Change the `rank`, but not the `owner rank` 4) Call DMSwarmMigrate() Now we can identify ghost particles by the `owner rank` If that so, how to do the communication between the ghost particles living in the rank i and their ?real? counterpart in the rank j. Algo, as an alternative, what about: 1) Use an IS tag which contains, for each rank, a list of the global index of the neighbors particles outside of the rank. 2) Use VecCreateGhost to create a new vector which contains extra local space for the ghost components of the vector. 3) Use VecScatterCreate, VecScatterBegin, and VecScatterEnd to do the transference of data between a vector obtained with DMSwarmCreateGlobalVectorFromField 4) Do necessary computations using the vectors created with VecCreateGhost. This is essentially what Migrate() does. I was trying to reuse the code. Thanks, Matt Thanks, Miguel On Aug 2, 2024, at 8:58?AM, Matthew Knepley > wrote: On Thu, Aug 1, 2024 at 4:40?PM MIGUEL MOLINOS PEREZ > wrote: This Message Is From an External Sender This message came from outside your organization. Dear all, I am implementing a Molecular Dynamics (MD) code using the DMSWARM interface. In the MD simulations we evaluate on each particle (atoms) some kind of scalar functional using data from the neighbouring atoms. My problem lies in the parallel implementation of the model, because sometimes, some of these neighbours lie on a different processor. This is usually solved by using ghost particles. A similar approach (with nodes instead) is already implemented for other PETSc mesh structures like DMPlexConstructGhostCells. Unfortunately, I don't see this kind of constructs for DMSWARM. Am I missing something? I this could be done by applying a buffer region by exploiting the background DMDA mesh that I already use to do domain decomposition. Then using the buffer region of each cell to locate the ghost particles and finally using VecCreateGhost. Is this feasible? Or is there an easier approach using other PETSc functions. This is feasible, but it would be good to develop a set of best practices, since we have been mainly focused on the case of non-redundant particles. Here is how I think I would do what you want. 1) Add a particle field 'ghost' that identifies ghost vs owned particles. I think it needs options OWNED, OVERLAP, and GHOST 2) At some interval identify particles that should be sent to other processes as ghosts. I would call these "overlap particles". The determination seems application specific, so I would leave this determination to the user right now. 
We do two things to these particles a) Mark chosen particles as OVERLAP b) Change rank to process we are sending to 3) Call DMSwarmMigrate with PETSC_FALSE for the particle deletion flag 4) Mark OVERLAP particles as GHOST when they arrive There is one problem in the above algorithm. It does not allow sending particles to multiple ranks. We would have to do this in phases right now, or make a small adjustment to the interface allowing replication of particles when a set of ranks is specified. THanks, Matt Thank you, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Nov 19 05:14:33 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 19 Nov 2024 11:14:33 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions Message-ID: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? 
Thanks, Miguel [cid:534265d3-3f18-41cd-8006-539cb06751f9 at eurprd01.prod.exchangelabs.com] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2024-11-19 at 10.56.36.png Type: image/png Size: 233037 bytes Desc: Screenshot 2024-11-19 at 10.56.36.png URL: From zhjx960203 at gmail.com Tue Nov 19 03:35:10 2024 From: zhjx960203 at gmail.com (=?UTF-8?B?5byg5bO75rqq?=) Date: Tue, 19 Nov 2024 17:35:10 +0800 Subject: [petsc-users] Why can't I save a diagonal matrix into a binary file via petsc4py? Message-ID: My version is 3.22.0 from "petsc4py.__version__" Here is the code: ```python from petsc4py import PETSc import numpy as np vec1 = PETSc.Vec().createWithArray(np.arange(30)) mat1 = PETSc.Mat().createDiagonal(vec1) mat1.assemble() viewer = PETSc.Viewer().createBinary("abcde.bin", "w") mat1.view(viewer) viewer.destroy() ``` I find it runs without any error, but no file generated However, if I convert mat1 to AIJ, it can be saved successfully -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Nov 19 10:59:59 2024 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Nov 2024 16:59:59 +0000 Subject: [petsc-users] Why can't I save a diagonal matrix into a binary file via petsc4py? In-Reply-To: References: Message-ID: The implementation is here: https://urldefense.us/v3/__https://petsc.org/release/src/mat/impls/diagonal/diagonal.c.html*MatView_Diagonal__;Iw!!G_uCfscf7eWS!ej9LRPwh7-zzKBJDEUfh0gSmHxGO7nitFAZJGu1s7mi0htPg8OUAqyV04o_0YYgqOCkjSAKzvEij27AKN9-3Mzk2$ You can see that it only handles the ascii case. It should also check the case of binary viewer, then call VecView() or convert to MATAIJ and call MatView(). Do you want to contribute a merge request? https://urldefense.us/v3/__https://petsc.org/release/developers/contributing/__;!!G_uCfscf7eWS!ej9LRPwh7-zzKBJDEUfh0gSmHxGO7nitFAZJGu1s7mi0htPg8OUAqyV04o_0YYgqOCkjSAKzvEij27AKN6f4qvXn$ Jose > El 19 nov 2024, a las 10:35, ??? escribi?: > > My version is 3.22.0 from "petsc4py.__version__" > Here is the code: > ```python > from petsc4py import PETSc > import numpy as np > > vec1 = PETSc.Vec().createWithArray(np.arange(30)) > mat1 = PETSc.Mat().createDiagonal(vec1) > mat1.assemble() > viewer = PETSc.Viewer().createBinary("abcde.bin", "w") > mat1.view(viewer) > viewer.destroy() > ``` > > I find it runs without any error, but no file generated > However, if I convert mat1 to AIJ, it can be saved successfully From bsmith at petsc.dev Tue Nov 19 11:55:40 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 19 Nov 2024 12:55:40 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> Message-ID: <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry > On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: > > Dear all: > > It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. > > I have used this together with the DMSWarm discretization. 
And as you can see the number of particles per rank is not evenly distributed: > 210 420 366 732 420 840 732 1464 > > Am I missing something? > > Thanks, > Miguel > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 04:49:03 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 10:49:03 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> Message-ID: <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 06:06:24 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 12:06:24 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> Message-ID: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: atoms-3D.cpp Type: application/octet-stream Size: 26402 bytes Desc: atoms-3D.cpp URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Mg-hcp-cube-x17-x10-x10.dump Type: application/octet-stream Size: 436301 bytes Desc: Mg-hcp-cube-x17-x10-x10.dump URL: From bsmith at petsc.dev Wed Nov 20 11:36:08 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 12:36:08 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> Message-ID: <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry > On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: > > Dear Barry, > > Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. > > Thanks, > Miguel > > > > > >> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >> >> Hi Bary: >> >> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. 
>> >> Thanks, >> Miguel >> >>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>> >>> >>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>> >>> Can you please send a reproducible example? >>> >>> Thanks >>> >>> Barry >>> >>> >>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>> >>>> Dear all: >>>> >>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>> >>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>> 210 420 366 732 420 840 732 1464 >>>> >>>> Am I missing something? >>>> >>>> Thanks, >>>> Miguel >>>> >>>> >>>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 11:48:47 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 17:48:47 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> Message-ID: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. 
The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 20 11:52:30 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 12:52:30 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> Message-ID: <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> What do you mean by discretization size, and how do I see it in the code? Barry > On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: > > Sorry, I meant that the discretisation size is not constant across the edges of the cube. > > Miguel > >> On 20 Nov 2024, at 18:36, Barry Smith wrote: >> >> >> I am sorry, I don't understand the problem. 
When I run by default with -da_view I get >> >> Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 >> Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 >> Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 >> Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 >> Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 >> Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 >> >> which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 >> >> When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get >> >> $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view >> DM Object: 8 MPI processes >> type: da >> Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 >> Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 >> Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 >> Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 >> Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 >> Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 >> >> so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. >> >> Could you please let me know what the problem is that I should be seeing. >> >> Barry >> >> >>> On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: >>> >>> Dear Barry, >>> >>> Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. >>> >>> Thanks, >>> Miguel >>> >>> >>> >>> >>> >>>> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >>>> >>>> Hi Bary: >>>> >>>> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. >>>> >>>> Thanks, >>>> Miguel >>>> >>>>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>>>> >>>>> >>>>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>>>> >>>>> Can you please send a reproducible example? >>>>> >>>>> Thanks >>>>> >>>>> Barry >>>>> >>>>> >>>>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>> >>>>>> Dear all: >>>>>> >>>>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? 
using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>>>> >>>>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>>>> 210 420 366 732 420 840 732 1464 >>>>>> >>>>>> Am I missing something? >>>>>> >>>>>> Thanks, >>>>>> Miguel >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 11:56:27 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 17:56:27 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> Message-ID: <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. 
The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 20 12:54:26 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 13:54:26 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> Message-ID: <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. 
Is this related to DMSWARM in particular? > On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: > > I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. > > This is not in the code, I just impose the number of elements per edge. > > Thank you, > Miguel > >> On 20 Nov 2024, at 18:52, Barry Smith wrote: >> >> >> What do you mean by discretization size, and how do I see it in the code? >> >> Barry >> >> >>> On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: >>> >>> Sorry, I meant that the discretisation size is not constant across the edges of the cube. >>> >>> Miguel >>> >>>> On 20 Nov 2024, at 18:36, Barry Smith wrote: >>>> >>>> >>>> I am sorry, I don't understand the problem. When I run by default with -da_view I get >>>> >>>> Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 >>>> Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 >>>> Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 >>>> Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 >>>> Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 >>>> Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 >>>> >>>> which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 >>>> >>>> When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get >>>> >>>> $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view >>>> DM Object: 8 MPI processes >>>> type: da >>>> Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 >>>> Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 >>>> Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 >>>> Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 >>>> Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 >>>> Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 >>>> >>>> so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. >>>> >>>> Could you please let me know what the problem is that I should be seeing. 
>>>> >>>> Barry >>>> >>>> >>>>> On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: >>>>> >>>>> Dear Barry, >>>>> >>>>> Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. >>>>> >>>>> Thanks, >>>>> Miguel >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >>>>>> >>>>>> Hi Bary: >>>>>> >>>>>> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. >>>>>> >>>>>> Thanks, >>>>>> Miguel >>>>>> >>>>>>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>>>>>> >>>>>>> >>>>>>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>>>>>> >>>>>>> Can you please send a reproducible example? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>>>> >>>>>>>> Dear all: >>>>>>>> >>>>>>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>>>>>> >>>>>>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>>>>>> 210 420 366 732 420 840 732 1464 >>>>>>>> >>>>>>>> Am I missing something? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Miguel >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 13:38:20 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 19:38:20 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> Message-ID: <14500884-FB4B-4872-9E06-207FE6482187@us.es> Yes, I use the vertex (nodes) of the elements. I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. 
Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. 
And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 20 15:56:56 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 16:56:56 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <14500884-FB4B-4872-9E06-207FE6482187@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> Message-ID: <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> > On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: > > Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? > > I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. > > Thanks, > Miguel > > > >> On 20 Nov 2024, at 19:54, Barry Smith wrote: >> >> >> Are you considering your degrees of freedom as vertex or cell-centered? >> >> Say three "elements" per edge. >> >> If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic >> >> If cell-centered then each cell has width 1/3 for both periodic and not periodic >> >> but in both cases you can think of the discretization size as constant along the whole cube edge. >> >> Is this related to DMSWARM in particular? >> >>> On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: >>> >>> I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. >>> >>> This is not in the code, I just impose the number of elements per edge. >>> >>> Thank you, >>> Miguel >>> >>>> On 20 Nov 2024, at 18:52, Barry Smith wrote: >>>> >>>> >>>> What do you mean by discretization size, and how do I see it in the code? >>>> >>>> Barry >>>> >>>> >>>>> On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: >>>>> >>>>> Sorry, I meant that the discretisation size is not constant across the edges of the cube. >>>>> >>>>> Miguel >>>>> >>>>>> On 20 Nov 2024, at 18:36, Barry Smith wrote: >>>>>> >>>>>> >>>>>> I am sorry, I don't understand the problem. 
When I run by default with -da_view I get >>>>>> >>>>>> Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 >>>>>> Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 >>>>>> Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 >>>>>> Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 >>>>>> Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 >>>>>> Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 >>>>>> >>>>>> which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 >>>>>> >>>>>> When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get >>>>>> >>>>>> $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view >>>>>> DM Object: 8 MPI processes >>>>>> type: da >>>>>> Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 >>>>>> Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 >>>>>> Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 >>>>>> Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 >>>>>> Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 >>>>>> Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 >>>>>> >>>>>> so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. >>>>>> >>>>>> Could you please let me know what the problem is that I should be seeing. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>>> On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>>> >>>>>>> Dear Barry, >>>>>>> >>>>>>> Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. >>>>>>> >>>>>>> Thanks, >>>>>>> Miguel >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >>>>>>>> >>>>>>>> Hi Bary: >>>>>>>> >>>>>>>> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Miguel >>>>>>>> >>>>>>>>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>>>>>>>> >>>>>>>>> Can you please send a reproducible example? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>>>>>> >>>>>>>>>> Dear all: >>>>>>>>>> >>>>>>>>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>>>>>>>> >>>>>>>>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>>>>>>>> 210 420 366 732 420 840 732 1464 >>>>>>>>>> >>>>>>>>>> Am I missing something? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Miguel >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 16:40:26 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 22:40:26 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> Message-ID: <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> I see? that might be the problem. I?ll check it tomorrow. Thank you! Miguel On 20 Nov 2024, at 22:57, Barry Smith wrote: ? On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? 
Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. 
I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From 12431140 at mail.sustech.edu.cn Thu Nov 21 06:11:32 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 20:11:32 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES Message-ID: I am using the Newton iteration to solve a nonlinear 1D heat equation problem by using FEM. I attached my source code named "SNES_heat.cpp"  when I run the code   0 SNES Function norm 1.206289245288e+01   1 SNES Function norm 7.128802192789e+00   2 SNES Function norm 6.608812909525e+00 you can find that it only iterate 3 steps, and then do all the function evaluation and finally just stop the program.  I think it is not reasonble. I check my code, it is correct if I set it as a linear problem. it means my Jacobian and Residual function is correct. But when I set it as a nonlinear, the residual seems reduces as not expected.  I doubt that whether my understanding of the newton iteration is different from SNES's newton iteration process. David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?   -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: SNES_heat.cpp Type: application/octet-stream Size: 35836 bytes Desc: not available URL: From jed at jedbrown.org Thu Nov 21 09:05:27 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 21 Nov 2024 08:05:27 -0700 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: <87a5dsbpag.fsf@jedbrown.org> You should add VecZeroEntries(f) near the top of your FormFunction (it's currently accumulating into whatever was there last) and MatZeroEntries(B) to FormJacobian. I reduced to nElem = 5 for ease of viewing. With these changes, I see quadratic convergence but the problem is still nonlinear. To explore further, consider using these diagnostics ./SNES_heat -{snes,ksp}_monitor -{snes,ksp}_converged_reason -snes_linesearch_monitor -ksp_view_mat with and without -snes_fd. For readability, I would suggest consistency in "u" vs "x". "David Jiawei LUO LIANG" <12431140 at mail.sustech.edu.cn> writes: > I am using the Newton iteration to solve a nonlinear 1D heat equation problem by using FEM. > > > I attached my source code named "SNES_heat.cpp"  > > > when I run the code > >   0 SNES Function norm 1.206289245288e+01 > >   1 SNES Function norm 7.128802192789e+00 > >   2 SNES Function norm 6.608812909525e+00 > > > > you can find that it only iterate 3 steps, and then do all the function evaluation and finally just stop the program.  > > > I think it is not reasonble. I check my code, it is correct if I set it as a linear problem. it means my Jacobian and Residual function is correct. > > > But when I set it as a nonlinear, the residual seems reduces as not expected.  > > > I doubt that whether my understanding of the newton iteration is different from SNES's newton iteration process. > > > > > > > > > David Jiawei LUO LIANG > > > > ??????/??/???/2024 > > > > ?????????????1088? 
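A minimal sketch of the zeroing fix Jed describes above, written against the standard SNES callback signatures. FormFunction, FormJacobian, and AppCtx are placeholder names and the assembly bodies are elided; this is illustrative, not the actual contents of SNES_heat.cpp.

```c
#include <petscsnes.h>

typedef struct {
  PetscInt nelem; /* placeholder for whatever problem data the real code carries */
} AppCtx;

/* Residual callback: zero f before accumulating element contributions,
   otherwise each call adds on top of whatever the previous call left behind. */
PetscErrorCode FormFunction(SNES snes, Vec u, Vec f, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(VecZeroEntries(f));
  /* ... loop over elements, VecSetValues(f, ..., ADD_VALUES) ... */
  PetscCall(VecAssemblyBegin(f));
  PetscCall(VecAssemblyEnd(f));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* Jacobian callback: the same issue applies to the assembled matrix B. */
PetscErrorCode FormJacobian(SNES snes, Vec u, Mat J, Mat B, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(MatZeroEntries(B));
  /* ... loop over elements, MatSetValues(B, ..., ADD_VALUES) ... */
  PetscCall(MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY));
  if (J != B) {
    PetscCall(MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY));
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

The diagnostics mentioned later in this thread (-snes_fd and -snes_test_jacobian) are a quick way to confirm whether a hand-coded Jacobian like this one is consistent with the residual.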
> > > > >   From bsmith at petsc.dev Thu Nov 21 09:19:51 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Nov 2024 10:19:51 -0500 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: <6A4CB22C-6D05-497C-9A57-E5AB8B7C073F@petsc.dev> Start with https://urldefense.us/v3/__https://petsc.org/release/faq/*why-is-newton-s-method-snes-not-converging-or-converges-slowly__;Iw!!G_uCfscf7eWS!au7FVXP89CeLcvEPaqyMevQ8XXBThUgOilXB2BskyYlAyPKwckhOPoT_TGVv_IKuZQTSFDRMPe3F09zTuhtno2k$ Next use -snes_test_jacobian - compare the user provided Jacobian with one computed via finite differences to check for errors. If a threshold is given, display only those entries whose difference is greater than the threshold. -snes_test_jacobian_view - display the user provided Jacobian, the finite difference Jacobian and the difference between them to help users detect the location of errors in the user provided Jacobian. There are many, many reasons Newton can fail, usually they are due to bugs in the function evaluation or Jacobian evaluation. Occasionly they are due to it being a very difficult non-linear problem. You first need to use the tools above to verify there are no bugs anywhere. Barry > On Nov 21, 2024, at 7:11?AM, David Jiawei LUO LIANG <12431140 at mail.sustech.edu.cn> wrote: > > I am using the Newton iteration to solve a nonlinear 1D heat equation problem by using FEM. > > I attached my source code named "SNES_heat.cpp" > > when I run the code > 0 SNES Function norm 1.206289245288e+01 > 1 SNES Function norm 7.128802192789e+00 > 2 SNES Function norm 6.608812909525e+00 > > you can find that it only iterate 3 steps, and then do all the function evaluation and finally just stop the program. > > I think it is not reasonble. I check my code, it is correct if I set it as a linear problem. it means my Jacobian and Residual function is correct. > > But when I set it as a nonlinear, the residual seems reduces as not expected. > > I doubt that whether my understanding of the newton iteration is different from SNES's newton iteration process. > > > > > > David Jiawei LUO LIANG > ??????/??/???/2024 > ?????????????1088? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:16:54 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 23:16:54 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: <87a5dsbpag.fsf@jedbrown.org> References: <87a5dsbpag.fsf@jedbrown.org> Message-ID: Thank you Jed. It works, and the result is identical to the exact solution!  Hope you best! David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?       ------------------ Original ------------------ From:  "Jed Brown" From knepley at gmail.com Thu Nov 21 09:21:29 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Nov 2024 10:21:29 -0500 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: On Thu, Nov 21, 2024 at 8:57?AM David Jiawei LUO LIANG < 12431140 at mail.sustech.edu.cn> wrote: > I am using the Newton iteration to solve a nonlinear 1D heat equation > problem by using FEM. 
> > I attached my source code named "SNES_heat.cpp" > > when I run the code > > 0 SNES Function norm 1.206289245288e+01 > > 1 SNES Function norm 7.128802192789e+00 > > 2 SNES Function norm 6.608812909525e+00 > > you can find that it only iterate 3 steps, and then do all the function > evaluation and finally just stop the program. > > I think it is not reasonble. I check my code, it is correct if I set it as > a linear problem. it means my Jacobian and Residual function is correct. > > But when I set it as a nonlinear, the residual seems reduces as not > expected. > > I doubt that whether my understanding of the newton iteration is different > from SNES's newton iteration process. > Here is what happens with the code as it is: master *:~/Downloads/tmp/Liang$ ./SNES_heat -snes_monitor -ksp_converged_reason -snes_converged_reason -pc_type lu -snes_view -snes_linesearch_monitor pp 1 nElem 10 nqp 2 n_np 11 n_en 2 n_eq 10 qp: 0.57735 -0.57735 wq: 1 1 IEN: 1 2 3 4 5 6 7 8 9 10 2 3 4 5 6 7 8 9 10 11 x_coor: 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 ID: 1 2 3 4 5 6 7 8 9 10 0 0 SNES Function norm 1.206289245288e+01 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: Using full step: fnorm 1.206289245288e+01 gnorm 7.128802192789e+00 1 SNES Function norm 7.128802192789e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: Using full step: fnorm 7.128802192789e+00 gnorm 6.608812909525e+00 2 SNES Function norm 6.608812909525e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: gnorm after quadratic fit 1.265375106867e+01 Line search: Cubic step no good, shrinking lambda, current gnorm 1.328962011911e+01 lambda=1.7500506382162818e-02 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275802797864e+01 lambda=1.7500506382162819e-03 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327920917220e+01 lambda=1.7500506382162821e-04 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275906891232e+01 lambda=1.7500506382162820e-05 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910508109e+01 lambda=1.7500506382162821e-06 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907932147e+01 lambda=1.7500506382162822e-07 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910404018e+01 lambda=1.7500506382162823e-08 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907942556e+01 lambda=1.7500506382162823e-09 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910402977e+01 lambda=1.7500506382162824e-10 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907942660e+01 lambda=1.7500506382162825e-11 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910402966e+01 lambda=1.7500506382162826e-12 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907942661e+01 lambda=1.7500506382162828e-13 Line search: unable to find good step length! After 12 tries Line search: fnorm=6.6088129095253478e+00, gnorm=1.2759079426614502e+01, ynorm=5.3714153713436097e-01, minlambda=9.9999999999999998e-13, lambda=1.7500506382162828e-13, initial slope=-4.3676408073108860e+01 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 2 Usually, we suspect that the Jacobian is incorrect in this case. 
Thus we can have it formed automatically, master *:~/Downloads/tmp/Liang$ ./SNES_heat -snes_monitor -ksp_converged_reason -snes_converged_reason -snes_fd -pc_type lu -snes_view -snes_linesearch_monitor pp 1 nElem 10 nqp 2 n_np 11 n_en 2 n_eq 10 qp: 0.57735 -0.57735 wq: 1 1 IEN: 1 2 3 4 5 6 7 8 9 10 2 3 4 5 6 7 8 9 10 11 x_coor: 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 ID: 1 2 3 4 5 6 7 8 9 10 0 0 SNES Function norm 1.206289245288e+01 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: Scaling step by 1.837216392007e-47 old ynorm 5.443016970405e+54 Line search: gnorm after quadratic fit 3.704240795372e+16 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704614960636e+16 lambda=1.0000000000000002e-02 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611219026e+16 lambda=1.0000000000000002e-03 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256438e+16 lambda=1.0000000000000003e-04 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256064e+16 lambda=1.0000000000000004e-05 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000004e-06 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000005e-07 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000005e-08 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000005e-09 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000006e-10 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000006e-11 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000006e-12 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000007e-13 Line search: unable to find good step length! After 12 tries Line search: fnorm=1.2062892452882465e+01, gnorm=3.7046112560677824e+16, ynorm=1.0000000000000000e+08, minlambda=9.9999999999999998e-13, lambda=1.0000000000000007e-13, initial slope=-4.6079597780656769e+00 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 0 So it is clear that the Jacobian do not match. Moreover, it appears that Newton is not going to converge from this initial guess. It suggests that the residual is wrong somehow. I suggest coding up a MMS to prove to yourself that the residual is correct. Thanks, Matt > David Jiawei LUO LIANG > > ??????/??/???/2024 > > ?????????????1088? > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bqnXB8rtlhm_qgLy5xeRj_mY4Rqfdgmupvjaqg3sArtduMag3ojG26K4cpDZok4CHJJwjxsl6911GOeurhJ1$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:28:56 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 23:28:56 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: Hi Matt,     Yes, the residual and Jacobin function are both incorrect.  Both of the Vec f and Mat B haven't initialized as zeros. 
Jed caught that bug, thanks Jed. Anyway, thank you for your method to debug my program for the next time bug.. Hope you the best! David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?       ------------------ Original ------------------ From:  "Matthew Knepley" From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:31:45 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 23:31:45 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: <6A4CB22C-6D05-497C-9A57-E5AB8B7C073F@petsc.dev> References: <6A4CB22C-6D05-497C-9A57-E5AB8B7C073F@petsc.dev> Message-ID: Hi Barry, The problem is I forgot (or say that I didn't know) to initialize the Vec f in residual function and Mat B in Jacobian function. Anyway, thanks for sharing me the link, it is helpful for debugging the program next time.  Hope you the best! David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?       ------------------ Original ------------------ From:  "Barry Smith" From knepley at gmail.com Thu Nov 21 09:37:37 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Nov 2024 10:37:37 -0500 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: <87a5dsbpag.fsf@jedbrown.org> References: <87a5dsbpag.fsf@jedbrown.org> Message-ID: One more suggestion email. I solve the linear version myself in ts/tutorials/ex45.c Thanks, Matt On Thu, Nov 21, 2024 at 10:35?AM Jed Brown wrote: > You should add VecZeroEntries(f) near the top of your FormFunction (it's > currently accumulating into whatever was there last) and MatZeroEntries(B) > to FormJacobian. > > I reduced to nElem = 5 for ease of viewing. With these changes, I see > quadratic convergence but the problem is still nonlinear. To explore > further, consider using these diagnostics > > ./SNES_heat -{snes,ksp}_monitor -{snes,ksp}_converged_reason > -snes_linesearch_monitor -ksp_view_mat > > with and without -snes_fd. > > For readability, I would suggest consistency in "u" vs "x". > > "David Jiawei LUO LIANG" <12431140 at mail.sustech.edu.cn> writes: > > > I am using the Newton iteration to solve a nonlinear 1D heat equation > problem by using FEM. > > > > > > I attached my source code named "SNES_heat.cpp"  > > > > > > when I run the code > > > >   0 SNES Function norm 1.206289245288e+01 > > > >   1 SNES Function norm 7.128802192789e+00 > > > >   2 SNES Function norm 6.608812909525e+00 > > > > > > > > you can find that it only iterate 3 steps, and then do all the function > evaluation and finally just stop the program.  > > > > > > I think it is not reasonble. I check my code, it is correct if I set it > as a linear problem. it means my Jacobian and Residual function is correct. > > > > > > But when I set it as a nonlinear, the residual seems reduces as not > expected.  > > > > > > I doubt that whether my understanding of the newton iteration is > different from SNES's newton iteration process. > > > > > > > > > > > > > > > > > > David Jiawei LUO LIANG > > > > > > > > ??????/??/???/2024 > > > > > > > > ?????????????1088? > > > > > > > > > >   > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z_i28V-z4i339qw9rz21qC1sKBK1hY750356Y2SekU_d3pHw-mdIgh0mJCT_Qp5HPuu0XkvutxpF0oHTYsJE$
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:45:04 2024
From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=)
Date: Thu, 21 Nov 2024 23:45:04 +0800
Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES
In-Reply-To:
References: <87a5dsbpag.fsf@jedbrown.org>
Message-ID:

Coincidentally, Matt, I am going to write the 2D transient heat problem next. I do not yet understand very well how the DM mesh works, so this is a good chance to learn DM by studying your example.

All the best!

David Jiawei LUO LIANG

------------------ Original ------------------
From: "Matthew Knepley"

From d.scott at epcc.ed.ac.uk Fri Nov 22 10:35:45 2024
From: d.scott at epcc.ed.ac.uk (David Scott)
Date: Fri, 22 Nov 2024 16:35:45 +0000
Subject: [petsc-users] Memory Used When Reading petscrc
Message-ID: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk>

Hello,

I am using the options mechanism of PETSc to configure my CFD code. I have introduced options describing the size of the domain etc. I have noticed that this consumes a lot of memory, and that the amount of memory used scales linearly with the number of MPI processes. This restricts the number of MPI processes that I can use.

Is there anything that I can do about this, or do I need to configure my code in a different way?

I have attached some code extracted from my application which demonstrates this, along with the output from running it on 2 MPI processes.

Best wishes,

David Scott

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
-------------- next part --------------
!! @author Prashant Valluri, Lennon O Naraigh, Iain Bethune,
!! David Scott, Toni Collis, Peter Spelt.
!! @version $Revision: 252 $
!! @copyright (c) 2013-2020, Prashant Valluri, Lennon O Naraigh,
!! Iain Bethune, David Scott, Toni Collis, Peter Spelt.
!! This program is distributed under the BSD Licence See LICENCE.txt
!! for details.
module test_configuration_options #include #include "petsc/finclude/petsc.h" use petsc implicit none PetscScalar :: dx PetscScalar :: dy PetscScalar :: dz PetscScalar :: dt logical :: boiling logical :: boiling_variant logical :: evaporation logical, SAVE :: periodic(3) DMBoundaryType, SAVE :: boundary(3) enum, bind(C) enumerator :: BC enumerator :: cyclic, neumann, dirichlet, quasi_dirichlet, inlet, outlet end enum integer(kind(BC)), SAVE :: bc_temp integer(kind(BC)), SAVE :: x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv, x_upper_bc_P integer(kind(BC)), SAVE :: x_upper_bc_u, x_upper_bc_v, x_upper_bc_w integer(kind(BC)), SAVE :: x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv, x_lower_bc_P integer(kind(BC)), SAVE :: x_lower_bc_u, x_lower_bc_v, x_lower_bc_w integer(kind(BC)), SAVE :: y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv, y_upper_bc_P integer(kind(BC)), SAVE :: y_upper_bc_u, y_upper_bc_v, y_upper_bc_w integer(kind(BC)), SAVE :: y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv, y_lower_bc_P integer(kind(BC)), SAVE :: y_lower_bc_u, y_lower_bc_v, y_lower_bc_w integer(kind(BC)), SAVE :: z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv, z_upper_bc_P integer(kind(BC)), SAVE :: z_upper_bc_u, z_upper_bc_v, z_upper_bc_w integer(kind(BC)), SAVE :: z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv, z_lower_bc_P integer(kind(BC)), SAVE :: z_lower_bc_u, z_lower_bc_v, z_lower_bc_w double precision, SAVE :: x_upper_bc_T_value, x_upper_bc_Cl_value, x_upper_bc_Cv_value, x_upper_bc_P_value double precision, SAVE :: x_lower_bc_T_value, x_lower_bc_Cl_value, x_lower_bc_Cv_value, x_lower_bc_P_value double precision, SAVE :: y_upper_bc_T_value, y_upper_bc_Cl_value, y_upper_bc_Cv_value, y_upper_bc_P_value double precision, SAVE :: y_lower_bc_T_value, y_lower_bc_Cl_value, y_lower_bc_Cv_value, y_lower_bc_P_value double precision, SAVE :: z_upper_bc_T_value, z_upper_bc_Cl_value, z_upper_bc_Cv_value, z_upper_bc_P_value double precision, SAVE :: z_lower_bc_T_value, z_lower_bc_Cl_value, z_lower_bc_Cv_value, z_lower_bc_P_value integer, parameter :: max_option_name_length = 30 integer, parameter :: max_msg_length = 2**max_option_name_length + 8 ! 
8 for 'Modified' contains subroutine read_initial_configuration_options(global_dim_x, global_dim_y, global_dim_z, & Re, Pe, We, Fr, Bod, Ja, mu_plus, mu_minus, mu_vap, rho_plus, rho_minus, rho_vap, & cp_plus, cp_minus, cp_vap, k_plus, k_minus, k_vap, beta_plus, beta_minus, beta_vap, & dpdx, gx, gz, epn, dTdx, T_ref, & Pref, Apsat, Bpsat, Cpsat, molMassRatio, PeT, PeMD, PeMDI, & x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv, x_upper_bc_u, x_upper_bc_v, x_upper_bc_w, & x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv, x_lower_bc_u, x_lower_bc_v, x_lower_bc_w, & y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv, y_upper_bc_u, y_upper_bc_v, y_upper_bc_w, & y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv, y_lower_bc_u, y_lower_bc_v, y_lower_bc_w, & z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv, z_upper_bc_u, z_upper_bc_v, z_upper_bc_w, & z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv, z_lower_bc_u, z_lower_bc_v, z_lower_bc_w, & liquid_limit, gaseous_limit, ierr) implicit none PetscInt, intent(out) :: global_dim_x, global_dim_y, global_dim_z double precision, intent(out) :: Re, Pe, We, Fr, Bod, Ja double precision, intent(out) :: mu_plus, mu_minus, mu_vap double precision, intent(out) :: rho_plus, rho_minus, rho_vap double precision, intent(out) :: cp_plus, cp_minus, cp_vap double precision, intent(out) :: k_plus, k_minus, k_vap double precision, intent(out) :: beta_plus, beta_minus, beta_vap double precision, intent(out) :: dpdx double precision, intent(out) :: gx, gz double precision, intent(out) :: epn double precision, intent(out) :: dTdx double precision, intent(out) :: T_ref double precision, intent(out) :: Pref double precision, intent(out) :: Apsat, Bpsat, Cpsat double precision, intent(out) :: molMassRatio double precision, intent(out) :: PeT double precision, intent(out) :: PeMD double precision, intent(out) :: PeMDI integer(kind(BC)), intent(out) :: x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv integer(kind(BC)), intent(out) :: x_upper_bc_u, x_upper_bc_v, x_upper_bc_w integer(kind(BC)), intent(out) :: x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv integer(kind(BC)), intent(out) :: x_lower_bc_u, x_lower_bc_v, x_lower_bc_w integer(kind(BC)), intent(out) :: y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv integer(kind(BC)), intent(out) :: y_upper_bc_u, y_upper_bc_v, y_upper_bc_w integer(kind(BC)), intent(out) :: y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv integer(kind(BC)), intent(out) :: y_lower_bc_u, y_lower_bc_v, y_lower_bc_w integer(kind(BC)), intent(out) :: z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv integer(kind(BC)), intent(out) :: z_upper_bc_u, z_upper_bc_v, z_upper_bc_w integer(kind(BC)), intent(out) :: z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv integer(kind(BC)), intent(out) :: z_lower_bc_u, z_lower_bc_v, z_lower_bc_w double precision, intent(out) :: liquid_limit double precision, intent(out) :: gaseous_limit PetscErrorCode, intent(out) :: ierr double precision :: MM_minus, MM_vap double precision :: Pr, Sc double precision :: PeCahnHilliardModifier ! double precision :: PeDiffusionModifier double precision :: Grav, alpha double precision :: pi = 4.0d0*atan(1.0d0) character(len = max_option_name_length) :: option_name character(len = max_msg_length) :: msg character(len = max_option_name_length) :: phenomenon logical :: found option_name = '-phenomenon' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, phenomenon, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(phenomenon) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (phenomenon .eq. 
'boiling') then boiling = .true. else boiling = .false. end if if (phenomenon .eq. 'boiling_variant') then boiling_variant = .true. else boiling_variant = .false. end if if (phenomenon .eq. 'evaporation') then evaporation = .true. else evaporation = .false. end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-global_dim_x' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, global_dim_x, found, ierr) if (found) then write(msg, *) option_name, '=', global_dim_x call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-global_dim_y' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, global_dim_y, found, ierr) if (found) then write(msg, *) option_name, '=', global_dim_y call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-global_dim_z' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, global_dim_z, found, ierr) if (found) then write(msg, *) option_name, '=', global_dim_z call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if(global_dim_z .le. 1) then write(msg, *) 'global_dim_z must be greater than 1.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-dt' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, dt, found, ierr) if (found) then write(msg, *) option_name, '=', dt call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (dt .gt. 0.0d0)) then write(msg, *) 'Error:', dt, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if dz = 1.0d0/dble(global_dim_z) dx = dz dy = dz write(msg, *) 'dx =', dx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'dy =', dy call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'dz =', dz call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'Lx =', dx*global_dim_x call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'Ly =', dy*global_dim_y call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'Lz =', dz*global_dim_z call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) option_name = '-epn' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, epn, found, ierr) if (found) then if (.not. (epn .gt. 0.0d0)) then epn = 0.5*dz if (boiling) then ! We need to allow for different values of Pe. PeCahnHilliardModifier = 1.0d0/(epn*epn) ! PeDiffusionModifier = 1.0d0/(epn*epn) else PeCahnHilliardModifier = 1.0d0/epn ! PeDiffusionModifier = 1.0d0/epn end if else PeCahnHilliardModifier = 1.0d0 ! 
PeDiffusionModifier = 1.0d0 end if write(msg, *) option_name, '=', epn call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Re' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Re, found, ierr) if (found) then write(msg, *) option_name, '=', Re call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Re .gt. 0.0d0)) then write(msg, *) 'Error:', Re, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Pe' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Pe, found, ierr) if (found) then write(msg, *) option_name, '=', Pe call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Pe .gt. 0.0d0)) then write(msg, *) 'Error:', Pe, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else Pe = Pe * PeCahnHilliardModifier write(msg, *) 'Modified Pe =', Pe call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Pr' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Pr, found, ierr) if (found) then write(msg, *) option_name, '=', Pr call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Pr .gt. 0.0d0)) then write(msg, *) 'Error:', Pr, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else PeT = Re*Pr write(msg, *) 'PeT =', PeT call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-We' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, We, found, ierr) if (found) then write(msg, *) option_name, '=', We call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (We .gt. 0.0d0)) then write(msg, *) 'Error:', We, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Fr' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Fr, found, ierr) if (found) then write(msg, *) option_name, '=', Fr call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Fr .gt. 0.0d0)) then write(msg, *) 'Error:', Fr, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Bod' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Bod, found, ierr) if (found) then write(msg, *) option_name, '=', Bod call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Bod .gt. 0.0d0)) then write(msg, *) 'Error:', Bod, 'is not a valid value.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Sc' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Sc, found, ierr) if (found) then write(msg, *) option_name, '=', Sc call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Sc .ge. 0.0d0)) then write(msg, *) 'Error:', Sc, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else PeMD = Re*Sc PeMDI = PeMD write(msg, *) 'PeMD =', PeMD call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'PeMDI =', PeMDI call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Ja' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Ja, found, ierr) if (found) then write(msg, *) option_name, '=', Ja call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Ja .gt. 0.0d0)) then write(msg, *) 'Error:', Ja, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-dTdx' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, dTdx, found, ierr) if (found) then write(msg, *) option_name, '=', dTdx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-T_ref' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, T_ref, found, ierr) if (found) then write(msg, *) option_name, '=', T_ref call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Pref' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Pref, found, ierr) if (found) then write(msg, *) option_name, '=', Pref call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Pref .ge. 0.0d0)) then write(msg, *) 'Error:', Pref, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Apsat' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Apsat, found, ierr) if (found) then write(msg, *) option_name, '=', Apsat call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Apsat .ge. 0.0d0)) then write(msg, *) 'Error:', Apsat, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Bpsat' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Bpsat, found, ierr) if (found) then write(msg, *) option_name, '=', Bpsat call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Bpsat .ge. 0.0d0)) then write(msg, *) 'Error:', Bpsat, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cpsat' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cpsat, found, ierr) if (found) then write(msg, *) option_name, '=', Cpsat call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Cpsat .ge. 0.0d0)) then write(msg, *) 'Error:', Cpsat, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-MM_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, MM_minus, found, ierr) if (found) then write(msg, *) option_name, '=', MM_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (MM_minus .gt. 0.0d0)) then write(msg, *) 'Error:', MM_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found so using the value 1.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) MM_minus = 1.0d0 end if option_name = '-MM_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, MM_vap, found, ierr) if (found) then write(msg, *) option_name, '=', MM_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (MM_vap .gt. 0.0d0)) then write(msg, *) 'Error:', MM_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found so using the value 0.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) MM_vap = 0.0d0 end if molMassRatio = MM_vap / MM_minus write(msg, *) 'molMassRatio =', molMassRatio call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) option_name = '-mu_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, mu_plus, found, ierr) if (found) then write(msg, *) option_name, '=', mu_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (mu_plus .gt. 0.0d0)) then write(msg, *) 'Error:', mu_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-mu_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, mu_minus, found, ierr) if (found) then write(msg, *) option_name, '=', mu_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (mu_minus .gt. 0.0d0)) then write(msg, *) 'Error:', mu_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-mu_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, mu_vap, found, ierr) if (found) then write(msg, *) option_name, '=', mu_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (mu_vap .gt. 0.0d0)) then write(msg, *) 'Error:', mu_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-rho_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, rho_plus, found, ierr) if (found) then write(msg, *) option_name, '=', rho_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (rho_plus .gt. 0.0d0)) then write(msg, *) 'Error:', rho_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-rho_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, rho_minus, found, ierr) if (found) then write(msg, *) option_name, '=', rho_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (rho_minus .gt. 0.0d0)) then write(msg, *) 'Error:', rho_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-rho_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, rho_vap, found, ierr) if (found) then write(msg, *) option_name, '=', rho_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (rho_vap .gt. 0.0d0)) then write(msg, *) 'Error:', rho_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-cp_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, cp_plus, found, ierr) if (found) then write(msg, *) option_name, '=', cp_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (cp_plus .gt. 0.0d0)) then write(msg, *) 'Error:', cp_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-cp_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, cp_minus, found, ierr) if (found) then write(msg, *) option_name, '=', cp_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (cp_minus .gt. 0.0d0)) then write(msg, *) 'Error:', cp_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-cp_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, cp_vap, found, ierr) if (found) then write(msg, *) option_name, '=', cp_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (cp_vap .gt. 0.0d0)) then write(msg, *) 'Error:', cp_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-k_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, k_plus, found, ierr) if (found) then write(msg, *) option_name, '=', k_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (k_plus .gt. 0.0d0)) then write(msg, *) 'Error:', k_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-k_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, k_minus, found, ierr) if (found) then write(msg, *) option_name, '=', k_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (k_minus .gt. 0.0d0)) then write(msg, *) 'Error:', k_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-k_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, k_vap, found, ierr) if (found) then write(msg, *) option_name, '=', k_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (k_vap .gt. 0.0d0)) then write(msg, *) 'Error:', k_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-beta_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, beta_plus, found, ierr) if (found) then write(msg, *) option_name, '=', beta_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (beta_plus .ge. 0.0d0)) then write(msg, *) 'Error:', beta_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-beta_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, beta_minus, found, ierr) if (found) then write(msg, *) option_name, '=', beta_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (beta_minus .ge. 0.0d0)) then write(msg, *) 'Error:', beta_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-beta_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, beta_vap, found, ierr) if (found) then write(msg, *) option_name, '=', beta_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (beta_vap .ge. 0.0d0)) then write(msg, *) 'Error:', beta_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-dpdx' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, dpdx, found, ierr) if (found) then write(msg, *) option_name, '=', dpdx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (dpdx .gt. 0.0d0) then write(msg, *) 'Counter current flow.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Grav' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Grav, found, ierr) if (found) then write(msg, *) option_name, '=', Grav call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Grav .ge. 0.0d0)) then write(msg, *) 'Error: Grav must be non-negative.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-alpha' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, alpha, found, ierr) if (found) then write(msg, *) option_name, '=', alpha call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((-pi .le. alpha) .and. (alpha .le. pi))) then write(msg, *) 'Error:', alpha, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if gz = Grav*sin(alpha) if (gz == -1.0d0) then gx = 0.0d0 ! In this case cos(alpha) = -3.8285686989269494E-016 else gx = Grav*cos(alpha) end if write(msg, *) 'gz =', gz call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'gx =', gx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! X BCs. 
option_name = '-x_upper_bc_T' call get_BC(option_name, x_upper_bc_T, x_upper_bc_T_value, 1) option_name = '-x_upper_bc_Cl' call get_BC_with_check(option_name, x_upper_bc_Cl, x_upper_bc_Cl_value, 1) option_name = '-x_upper_bc_Cv' call get_BC_with_check(option_name, x_upper_bc_Cv, x_upper_bc_Cv_value, 1) option_name = '-x_upper_bc_P' call get_BC_with_check(option_name, x_upper_bc_P, x_upper_bc_P_value, 1) option_name = '-x_upper_bc_u' call get_BC_with_check_uvw(option_name, x_upper_bc_u, 1) option_name = '-x_upper_bc_v' call get_BC_with_check_uvw(option_name, x_upper_bc_v, 1) option_name = '-x_upper_bc_w' call get_BC_with_check_uvw(option_name, x_upper_bc_w, 1) option_name = '-x_lower_bc_T' call get_BC_with_check(option_name, x_lower_bc_T, x_lower_bc_T_value, 1) option_name = '-x_lower_bc_Cl' call get_BC_with_check(option_name, x_lower_bc_Cl, x_lower_bc_Cl_value, 1) option_name = '-x_lower_bc_Cv' call get_BC_with_check(option_name, x_lower_bc_Cv, x_lower_bc_Cv_value, 1) option_name = '-x_lower_bc_P' call get_BC_with_check(option_name, x_lower_bc_P, x_lower_bc_P_value, 1) option_name = '-x_lower_bc_u' call get_BC_with_check_uvw(option_name, x_lower_bc_u, 1) option_name = '-x_lower_bc_v' call get_BC_with_check_uvw(option_name, x_lower_bc_v, 1) option_name = '-x_lower_bc_w' call get_BC_with_check_uvw(option_name, x_lower_bc_w, 1) ! Y BCs option_name = '-y_upper_bc_T' call get_BC(option_name, y_upper_bc_T, y_upper_bc_T_value, 2) option_name = '-y_upper_bc_Cl' call get_BC_with_check(option_name, y_upper_bc_Cl, y_upper_bc_Cl_value, 2) option_name = '-y_upper_bc_Cv' call get_BC_with_check(option_name, y_upper_bc_Cv, y_upper_bc_Cv_value, 2) option_name = '-y_upper_bc_P' call get_BC_with_check(option_name, y_upper_bc_P, y_upper_bc_P_value, 2) option_name = '-y_upper_bc_u' call get_BC_with_check_uvw(option_name, y_upper_bc_u, 2) option_name = '-y_upper_bc_v' call get_BC_with_check_uvw(option_name, y_upper_bc_v, 2) option_name = '-y_upper_bc_w' call get_BC_with_check_uvw(option_name, y_upper_bc_w, 2) option_name = '-y_lower_bc_T' call get_BC_with_check(option_name, y_lower_bc_T, y_lower_bc_T_value, 2) option_name = '-y_lower_bc_Cl' call get_BC_with_check(option_name, y_lower_bc_Cl, y_lower_bc_Cl_value, 2) option_name = '-y_lower_bc_Cv' call get_BC_with_check(option_name, y_lower_bc_Cv, y_lower_bc_Cv_value, 2) option_name = '-y_lower_bc_P' call get_BC_with_check(option_name, y_lower_bc_P, y_lower_bc_P_value, 2) option_name = '-y_lower_bc_u' call get_BC_with_check_uvw(option_name, y_lower_bc_u, 2) option_name = '-y_lower_bc_v' call get_BC_with_check_uvw(option_name, y_lower_bc_v, 2) option_name = '-y_lower_bc_w' call get_BC_with_check_uvw(option_name, y_lower_bc_w, 2) ! Z BCs. 
option_name = '-z_upper_bc_T' call get_BC(option_name, z_upper_bc_T, z_upper_bc_T_value, 3) option_name = '-z_upper_bc_Cl' call get_BC_with_check(option_name, z_upper_bc_Cl, z_upper_bc_Cl_value, 3) option_name = '-z_upper_bc_Cv' call get_BC_with_check(option_name, z_upper_bc_Cv, z_upper_bc_Cv_value, 3) option_name = '-z_upper_bc_P' call get_BC_with_check(option_name, z_upper_bc_P, z_upper_bc_P_value, 3) option_name = '-z_upper_bc_u' call get_BC_with_check_uvw(option_name, z_upper_bc_u, 3) option_name = '-z_upper_bc_v' call get_BC_with_check_uvw(option_name, z_upper_bc_v, 3) option_name = '-z_upper_bc_w' call get_BC_with_check_uvw(option_name, z_upper_bc_w, 3) option_name = '-z_lower_bc_T' call get_BC_with_check(option_name, z_lower_bc_T, z_lower_bc_T_value, 3) option_name = '-z_lower_bc_Cl' call get_BC_with_check(option_name, z_lower_bc_Cl, z_lower_bc_Cl_value, 3) option_name = '-z_lower_bc_Cv' call get_BC_with_check(option_name, z_lower_bc_Cv, z_lower_bc_Cv_value, 3) option_name = '-z_lower_bc_P' call get_BC_with_check(option_name, z_lower_bc_P, z_lower_bc_P_value, 3) option_name = '-z_lower_bc_u' call get_BC_with_check_uvw(option_name, z_lower_bc_u, 3) option_name = '-z_lower_bc_v' call get_BC_with_check_uvw(option_name, z_lower_bc_v, 3) option_name = '-z_lower_bc_w' call get_BC_with_check_uvw(option_name, z_lower_bc_w, 3) option_name = '-liquid_limit' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, liquid_limit, found, ierr) if (found) then write(msg, *) option_name, '=', liquid_limit call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((liquid_limit >= 0.0d0) .and. (liquid_limit <= 1.0d0))) then write(msg, *) 'Error:', liquid_limit, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-gaseous_limit' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, gaseous_limit, found, ierr) if (found) then write(msg, *) option_name, '=', gaseous_limit call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((gaseous_limit >= 0.0d0) .and. (gaseous_limit <= 1.0d0))) then write(msg, *) 'Error:', gaseous_limit, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (.not. (gaseous_limit < liquid_limit)) then write(msg, *) 'Error:', gaseous_limit, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine read_initial_configuration_options subroutine read_run_time_configuration_options(num_procs_x, num_procs_y, num_procs_z, & imex_Cl, imex_Cv, imex_T, imex_u, imex_v, imex_w, & petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T, & petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p, & Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on, & u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on, & iter_pres_first_100, iter_pres, iter_u, iter_v, iter_w, iter_dim, iter_Cv, & iter_T, Cl_write_frequency, Cv_write_frequency, u_write_frequency, v_write_frequency, & w_write_frequency, T_write_frequency, backup_frequency, num_timesteps, ierr) implicit none PetscInt, intent(out) :: num_procs_x, num_procs_y, num_procs_z character(len = max_option_name_length), intent(out) :: imex_Cl, imex_Cv, imex_T character(len = max_option_name_length), intent(out) :: imex_u, imex_v, imex_w logical, intent(out) :: petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T logical, intent(out) :: petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p ! To monitor or not to monitor. logical, intent(out) :: Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on logical, intent(out) :: u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on ! Pressure solver configuration PetscInt, intent(out) :: iter_pres_first_100, iter_pres ! Momentum equation solver configuration. PetscInt, intent(out) :: iter_u, iter_v, iter_w ! DIM equation solver configuration. PetscInt, intent(out) :: iter_dim ! Vapour equation solver configuration. PetscInt, intent(out) :: iter_Cv ! Temperature equation solver configuration. PetscInt, intent(out) :: iter_T ! Cl HDF5 file output frequency. PetscInt, intent(out) :: Cl_write_frequency PetscInt, intent(out) :: Cv_write_frequency ! u HDF5 file output frequency. PetscInt, intent(out) :: u_write_frequency ! v HDF5 file output frequency. PetscInt, intent(out) :: v_write_frequency ! w HDF5 file output frequency. PetscInt, intent(out) :: w_write_frequency ! T HDF5 file output frequency PetscInt, intent(out) :: T_write_frequency ! backup (restart) file output frequency PetscInt, intent(out) :: backup_frequency ! Number of timesteps. PetscInt, intent(out) :: num_timesteps PetscErrorCode, intent(out) :: ierr integer, parameter :: max_msg_length = 2*max_option_name_length character(len = max_option_name_length) :: option_name character(len = max_msg_length) :: msg logical :: found option_name = '-num_procs_x' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_procs_x, found, ierr) if (found) then write(msg, *) option_name, '=', num_procs_x call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-num_procs_y' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_procs_y, found, ierr) if (found) then write(msg, *) option_name, '=', num_procs_y call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-num_procs_z' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_procs_z, found, ierr) if (found) then write(msg, *) option_name, '=', num_procs_z call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_Cl' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_Cl, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_Cl) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_Cl .eq. 'CNAB2') .or. (imex_Cl .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_Cl, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_Cv' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_Cv, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_Cv) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_Cv .eq. 'CNAB2') .or. (imex_Cv .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_Cv, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_T' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_T, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_T) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_T .eq. 'CNAB2') .or. (imex_T .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_T, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_u' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_u, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_u) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_u .eq. 'CNAB3') .or. (imex_u .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_u, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_v' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_v, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_v) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_v .eq. 'CNAB3') .or. (imex_v .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_v, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_w' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_w, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_w) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_w .eq. 'CNAB3') .or. (imex_w .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_w, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_Cl' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_Cl, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_Cl call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_Cv' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_Cv, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_Cv call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_T' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_T, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_T call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_u' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_u, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_u call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_v' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_v, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_v call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_w' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_w, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_w call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_p' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_p, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_p call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cl_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cl_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', Cl_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cv_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cv_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', Cv_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-T_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, T_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', T_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-u_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, u_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', u_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-v_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, v_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', v_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-w_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, w_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', w_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-p_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, p_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', p_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_pres_first_100' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_pres_first_100, found, ierr) if (found) then write(msg, *) option_name, '=', iter_pres_first_100 call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_pres' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_pres, found, ierr) if (found) then write(msg, *) option_name, '=', iter_pres call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_u' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_u, found, ierr) if (found) then write(msg, *) option_name, '=', iter_u call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_v' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_v, found, ierr) if (found) then write(msg, *) option_name, '=', iter_v call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_w' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_w, found, ierr) if (found) then write(msg, *) option_name, '=', iter_w call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_dim' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_dim, found, ierr) if (found) then write(msg, *) option_name, '=', iter_dim call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_Cv' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_Cv, found, ierr) if (found) then write(msg, *) option_name, '=', iter_Cv call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_T' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_T, found, ierr) if (found) then write(msg, *) option_name, '=', iter_T call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cl_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cl_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', Cl_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cv_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cv_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', Cv_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-u_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, u_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', u_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-v_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, v_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', v_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-w_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, w_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', w_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-T_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, T_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', T_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-backup_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, backup_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', backup_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-num_timesteps' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_timesteps, found, ierr) if (found) then write(msg, *) option_name, '=', num_timesteps call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine read_run_time_configuration_options subroutine get_BC(option_name, b_c, dirichlet_value, dim) implicit none ! Arguments. character(len = max_option_name_length), intent(in) :: option_name integer(kind(BC)), intent(out) :: b_c double precision, intent(out) :: dirichlet_value integer, intent(in) :: dim ! Local variables. character(len = max_option_name_length) :: boundary_condition character(len = max_option_name_length) :: dirichlet_option_name logical :: found character(len = max_msg_length) :: msg PetscErrorCode :: ierr call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, & boundary_condition, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(boundary_condition) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! Default values. periodic(dim) = .false. boundary(dim) = DM_BOUNDARY_NONE if (boundary_condition .eq. 'periodic') then periodic(dim) = .true. 
boundary(dim) = DM_BOUNDARY_PERIODIC b_c = cyclic else if (boundary_condition .eq. 'neumann') then b_c = neumann else if (boundary_condition .eq. 'dirichlet') then dirichlet_option_name = trim(option_name) // '_value' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, dirichlet_option_name, & dirichlet_value, found, ierr) if (found) then write(msg, *) dirichlet_option_name, '= ', dirichlet_value call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) b_c = dirichlet else b_c = quasi_dirichlet end if else write(msg, *) 'Error: ' // trim(boundary_condition) // ' is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine get_BC subroutine get_BC_with_check(option_name, b_c, dirichlet_value, dim) implicit none ! Arguments. character(len = max_option_name_length), intent(in) :: option_name integer(kind(BC)), intent(out) :: b_c double precision, intent(out) :: dirichlet_value integer, intent(in) :: dim ! Local variables. character(len = max_option_name_length) :: boundary_condition character(len = max_option_name_length) :: dirichlet_option_name logical :: found character(len = max_msg_length) :: msg PetscErrorCode :: ierr logical :: periodic_val call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, & boundary_condition, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(boundary_condition) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! Default values. periodic_val = .false. if (boundary_condition .eq. 'periodic') then periodic_val = .true. b_c = cyclic else if (boundary_condition .eq. 'neumann') then b_c = neumann else if (boundary_condition .eq. 'dirichlet') then dirichlet_option_name = trim(option_name) // '_value' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, dirichlet_option_name, & dirichlet_value, found, ierr) if (found) then write(msg, *) dirichlet_option_name, '= ', dirichlet_value call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) b_c = dirichlet else b_c = quasi_dirichlet end if else if (boundary_condition .eq. 'inlet') then if (option_name .ne. 'x_lower_bc_u') then write(msg, *) 'Error: inlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if bc_temp = inlet b_c = inlet else if (boundary_condition .eq. 'outlet') then if (option_name .ne. 'x_upper_bc_u') then write(msg, *) 'Error: outlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (bc_temp .ne. inlet) then write(msg, *) 'Error: outlet is not a valid value for x_upper_bc_u as x_lower_bc_u is not an inlet.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if b_c = outlet else write(msg, *) 'Error: ' // trim(boundary_condition) // ' is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (periodic_val .neqv. periodic(dim)) then write(msg, *) 'Error: ' // 'cannot have a single periodic BC in dimenstion', dim call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine get_BC_with_check subroutine get_BC_with_check_uvw(option_name, b_c, dim) implicit none ! Arguments. character(len = max_option_name_length), intent(in) :: option_name integer(kind(BC)), intent(out) :: b_c integer, intent(in) :: dim ! Local variables. character(len = max_option_name_length) :: boundary_condition logical :: found character(len = max_msg_length) :: msg PetscErrorCode :: ierr logical :: periodic_val call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, & boundary_condition, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(boundary_condition) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! Default values. periodic_val = .false. if (boundary_condition .eq. 'periodic') then periodic_val = .true. b_c = cyclic else if (boundary_condition .eq. 'neumann') then b_c = neumann else if (boundary_condition .eq. 'dirichlet') then b_c = dirichlet else if (boundary_condition .eq. 'inlet') then if (option_name .ne. 'x_lower_bc_u') then write(msg, *) 'Error: inlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if bc_temp = inlet b_c = inlet else if (boundary_condition .eq. 'outlet') then if (option_name .ne. 'x_upper_bc_u') then write(msg, *) 'Error: outlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (bc_temp .ne. inlet) then write(msg, *) 'Error: outlet is not a valid value for x_upper_bc_u as x_lower_bc_u is not an inlet.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if b_c = outlet else write(msg, *) 'Error: ' // trim(boundary_condition) // ' is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (periodic_val .neqv. periodic(dim)) then write(msg, *) 'Error: ' // 'cannot have a single periodic BC in dimenstion', dim call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine get_BC_with_check_uvw end module test_configuration_options -------------- next part -------------- program test_program use petsc use test_configuration_options implicit none #include "petsc/finclude/petsc.h" PetscInt :: global_dim_x, global_dim_y, global_dim_z double precision :: gz, gx PetscInt :: num_procs_x, num_procs_y, num_procs_z double precision :: liquid_limit double precision :: gaseous_limit PetscErrorCode :: ierr character(len = max_option_name_length) :: imex_Cl, imex_Cv, imex_T character(len = max_option_name_length) :: imex_u, imex_v, imex_w ! Choices of PETSc or original solvers. logical :: petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T logical :: petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p ! To monitor or not to monitor. logical :: Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on logical :: u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on ! Pressure solver configuration PetscInt :: iter_pres_first_100, iter_pres ! Momentum equation solver configuration. PetscInt :: iter_u, iter_v, iter_w ! DIM equation solver configuration. PetscInt :: iter_dim PetscInt :: iter_Cv ! Temperature equation solver configuration. PetscInt :: iter_T ! Cl HDF5 file output frequency. 
PetscInt :: Cl_write_frequency PetscInt :: Cv_write_frequency ! u HDF5 file output frequency. PetscInt :: u_write_frequency ! v HDF5 file output frequency. PetscInt :: v_write_frequency ! w HDF5 file output frequency. PetscInt :: w_write_frequency ! T HDF5 file output frequency. PetscInt :: T_write_frequency ! backup (restart) file output frequency. PetscInt :: backup_frequency ! Number of timesteps. PetscInt :: num_timesteps double precision :: Re, Pe, PeT, We, Fr, Bod, Ja, Fr2, Fr2BodByRe double precision :: dpdx double precision :: mu_minus, mu_plus, mu_vap double precision :: rho_minus, rho_plus, rho_vap double precision :: cp_minus, cp_plus, cp_vap double precision :: rhocp_minus, rhocp_plus, rhocp_vap double precision :: k_plus, k_minus, k_vap double precision :: beta_plus, beta_minus, beta_vap double precision :: epn double precision :: T_ref double precision :: Pref double precision :: Apsat, Bpsat, Cpsat double precision :: molMassRatio double precision :: PeMD double precision :: PeMDI double precision :: dTdx PetscLogDouble :: mem integer :: mpi_err ! ****************************************************************************************** call PetscInitialize(PETSC_NULL_CHARACTER, ierr) call PetscMemoryGetCurrentUsage(mem, ierr) write(*, *) 'mem0 = ', mem call MPI_Barrier(PETSC_COMM_WORLD, mpi_err) call read_initial_configuration_options(global_dim_x, global_dim_y, global_dim_z, & Re, Pe, We, Fr, Bod, Ja, mu_plus, mu_minus, mu_vap, rho_plus, rho_minus, rho_vap, & cp_plus, cp_minus, cp_vap, k_plus, k_minus, k_vap, beta_plus, beta_minus, beta_vap, & dpdx, gx, gz, epn, dTdx, T_ref, & Pref, Apsat, Bpsat, Cpsat, molMassRatio, PeT, PeMD, PeMDI, & x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv, x_upper_bc_u, x_upper_bc_v, x_upper_bc_w, & x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv, x_lower_bc_u, x_lower_bc_v, x_lower_bc_w, & y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv, y_upper_bc_u, y_upper_bc_v, y_upper_bc_w, & y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv, y_lower_bc_u, y_lower_bc_v, y_lower_bc_w, & z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv, z_upper_bc_u, z_upper_bc_v, z_upper_bc_w, & z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv, z_lower_bc_u, z_lower_bc_v, z_lower_bc_w, & liquid_limit, gaseous_limit, ierr) call PetscMemoryGetCurrentUsage(mem, ierr) write(*, *) 'mem1 = ', mem call MPI_Barrier(PETSC_COMM_WORLD, mpi_err) call read_run_time_configuration_options(num_procs_x, num_procs_y, num_procs_z, & imex_Cl, imex_CV, imex_T, imex_U, imex_v, imex_w, & petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T, & petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p, & Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on, & u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on, & iter_pres_first_100, iter_pres, iter_u, iter_v, iter_w, iter_dim, iter_Cv, & iter_T, Cl_write_frequency, Cv_write_frequency, u_write_frequency, v_write_frequency, & w_write_frequency, T_write_frequency, backup_frequency, num_timesteps, ierr) call PetscMemoryGetCurrentUsage(mem, ierr) write(*, *) 'mem2 = ', mem call MPI_Barrier(PETSC_COMM_WORLD, mpi_err) call PetscFinalize(ierr) end program test_program -------------- next part -------------- -global_dim_x 240 -global_dim_y 240 -global_dim_z 320 -phenomenon boiling_variant -liquid_limit 0.9 -gaseous_limit 0.1 -epn 0.0 # This value causes epn to be computed by the TPLS program. -Re 221.46 -Pe 1.0 # The Peclet number for the Cahn-Hilliard equation. It is modified in the code. 
-Pr 8.4 -Ja 0.18 -Fr 1.01 -We 1.01 -Bod 1.0 -Sc 1.0 -Pref 1.0 # Relates partial pressure to mole fraction. -Apsat 1.0 -Bpsat 1.0 -Cpsat 1.0 -T_substrate 1.0 # Bespoke option. -T_ref 0.0 # Used to specify the saturation temperature (a.k.a. T_bulk). -th_layer 0.36 # Bespoke option. -dTdx 0.0 -Radius 0.5 # Bespoke option. -MM_minus 1.0 -MM_vap 1.0 # Properties of the liquid. -rho_plus 91.07 -mu_plus 32.6 -k_plus 3.94 -cp_plus 1.23 -beta_plus 1.0 # Properties of the inert gas. Boiling so UNUSED. -rho_minus 1.0 -mu_minus 1.0 -k_minus 1.0 -cp_minus 1.0 -beta_minus 1.0 # Properties of vapour corresponding to the liquid. -rho_vap 1.0 -mu_vap 1.0 -k_vap 1.0 -cp_vap 1.0 -beta_vap 1.0 -height 0.0 # Bespoke option. -dpdx 0.0 -Grav 1.0 -alpha -1.570796326794897 -dt 0.0005 -x_upper_bc_T periodic -x_upper_bc_Cl periodic -x_upper_bc_Cv periodic -x_upper_bc_P periodic -x_upper_bc_u periodic -x_upper_bc_v periodic -x_upper_bc_w periodic -x_lower_bc_T periodic -x_lower_bc_Cl periodic -x_lower_bc_Cv periodic -x_lower_bc_P periodic -x_lower_bc_u periodic -x_lower_bc_v periodic -x_lower_bc_w periodic -y_upper_bc_T periodic -y_upper_bc_Cl periodic -y_upper_bc_Cv periodic -y_upper_bc_P periodic -y_upper_bc_u periodic -y_upper_bc_v periodic -y_upper_bc_w periodic -y_lower_bc_T periodic -y_lower_bc_Cl periodic -y_lower_bc_Cv periodic -y_lower_bc_P periodic -y_lower_bc_u periodic -y_lower_bc_v periodic -y_lower_bc_w periodic -z_upper_bc_T dirichlet # Fixed temperature. -z_upper_bc_T_value 0.0 # Set to T_ref (i.e. T_sat). -z_upper_bc_Cl neumann # dCl/dz = 0. -z_upper_bc_Cv dirichlet # Initialise Cv to 1 everywhere. -z_upper_bc_P neumann # dP/dz = rho_wgrid*gz. -z_upper_bc_u dirichlet # Set to 0 initially, no slip. -z_upper_bc_v dirichlet # Set to 0 initially, no slip. -z_upper_bc_w neumann # dw/dz = 0, in or out flow is allowed. -z_lower_bc_T dirichlet # Fixed temperature. -z_lower_bc_T_value 1.0 # Set to T_substrate. -z_lower_bc_Cl dirichlet # Fixed composition determined by initialisation. -z_lower_bc_Cv dirichlet # Initialise Cv to 1 everywhere. -z_lower_bc_P neumann # dP/dz = rho_wgrid*gz. -z_lower_bc_u dirichlet # Set to 0 initially, no slip. -z_lower_bc_v dirichlet # Set to 0 initially, no slip. -z_lower_bc_w dirichlet # Set to 0 initially, no in or out flow. 
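# Run-time configuration options follow (process grid, solver choices, iteration
# limits, output frequencies); PETSc solver options such as -p_ksp_type come after them.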
-num_procs_x 2 -num_procs_y 2 -num_procs_z 2 -petsc_solver_u FALSE -petsc_solver_v FALSE -petsc_solver_w FALSE -petsc_solver_Cl FALSE -petsc_solver_Cv FALSE -petsc_solver_T FALSE -petsc_solver_p FALSE -imex_u CNAB3 -imex_v CNAB3 -imex_w CNAB3 -imex_Cl SBDF -imex_Cv CNAB -imex_T CNAB -u_monitoring_on FALSE -v_monitoring_on FALSE -w_monitoring_on FALSE -p_monitoring_on FALSE -Cl_monitoring_on FALSE -Cv_monitoring_on FALSE -T_monitoring_on FALSE -iter_pres_first_100 1000 -iter_pres 1000 -iter_u 30 -iter_v 30 -iter_w 30 -iter_dim 40 -iter_T 40 -Cl_write_frequency 250 -Cv_write_frequency 2500 -T_write_frequency 2500 -u_write_frequency 2500 -v_write_frequency 2500 -w_write_frequency 2500 -backup_frequency 10000 -num_timesteps 10000 -u_ksp_rtol 0.0000001 -u_ksp_view_final_residual -v_ksp_rtol 0.0000001 -v_ksp_view_final_residual -w_ksp_rtol 0.0000001 -w_ksp_view_final_residual -p_ksp_rtol 0.0000001 -p_ksp_type minres -p_pc_type sor -p_pc_sor_omega 1.5 -p_ksp_view_final_residual -options_left -------------- next part -------------- mem0 = 16420864.000000000 mem0 = 16117760.000000000 -phenomenon = boiling_variant -global_dim_x = 240 -global_dim_y = 240 -global_dim_z = 320 -dt = 5.0000000000000001E-004 dx = 3.1250000000000002E-003 dy = 3.1250000000000002E-003 dz = 3.1250000000000002E-003 Lx = 0.75000000000000000 Ly = 0.75000000000000000 Lz = 1.0000000000000000 -epn = 1.5625000000000001E-003 -Re = 221.46000000000001 -Pe = 1.0000000000000000 Modified Pe = 640.00000000000000 -Pr = 8.4000000000000004 PeT = 1860.2640000000001 -We = 1.0100000000000000 -Fr = 1.0100000000000000 -Bod = 1.0000000000000000 -Sc = 1.0000000000000000 PeMD = 221.46000000000001 PeMDI = 221.46000000000001 -Ja = 0.17999999999999999 -dTdx = 0.0000000000000000 -T_ref = 0.0000000000000000 -Pref = 1.0000000000000000 -Apsat = 1.0000000000000000 -Bpsat = 1.0000000000000000 -Cpsat = 1.0000000000000000 -MM_minus = 1.0000000000000000 -MM_vap = 1.0000000000000000 molMassRatio = 1.0000000000000000 -mu_plus = 32.600000000000001 -mu_minus = 1.0000000000000000 -mu_vap = 1.0000000000000000 -rho_plus = 91.069999999999993 -rho_minus = 1.0000000000000000 -rho_vap = 1.0000000000000000 -cp_plus = 1.2300000000000000 -cp_minus = 1.0000000000000000 -cp_vap = 1.0000000000000000 -k_plus = 3.9399999999999999 -k_minus = 1.0000000000000000 -k_vap = 1.0000000000000000 -beta_plus = 1.0000000000000000 -beta_minus = 1.0000000000000000 -beta_vap = 1.0000000000000000 -dpdx = 0.0000000000000000 -Grav = 1.0000000000000000 -alpha = -1.5707963267948970 gz = -1.0000000000000000 gx = 0.0000000000000000 -x_upper_bc_T = periodic -x_upper_bc_Cl = periodic -x_upper_bc_Cv = periodic -x_upper_bc_P = periodic -x_upper_bc_u = periodic -x_upper_bc_v = periodic -x_upper_bc_w = periodic -x_lower_bc_T = periodic -x_lower_bc_Cl = periodic -x_lower_bc_Cv = periodic -x_lower_bc_P = periodic -x_lower_bc_u = periodic -x_lower_bc_v = periodic -x_lower_bc_w = periodic -y_upper_bc_T = periodic -y_upper_bc_Cl = periodic -y_upper_bc_Cv = periodic -y_upper_bc_P = periodic -y_upper_bc_u = periodic -y_upper_bc_v = periodic -y_upper_bc_w = periodic -y_lower_bc_T = periodic -y_lower_bc_Cl = periodic -y_lower_bc_Cv = periodic -y_lower_bc_P = periodic -y_lower_bc_u = periodic -y_lower_bc_v = periodic -y_lower_bc_w = periodic -z_upper_bc_T = dirichlet -z_upper_bc_T_value = 0.0000000000000000 -z_upper_bc_Cl = neumann -z_upper_bc_Cv = dirichlet -z_upper_bc_P = neumann -z_upper_bc_u = dirichlet -z_upper_bc_v = dirichlet -z_upper_bc_w = neumann -z_lower_bc_T = dirichlet -z_lower_bc_T_value = 
1.0000000000000000 -z_lower_bc_Cl = dirichlet -z_lower_bc_Cv = dirichlet -z_lower_bc_P = neumann -z_lower_bc_u = dirichlet -z_lower_bc_v = dirichlet -z_lower_bc_w = dirichlet -liquid_limit = 0.90000000000000002 mem1 = 4311490560.0000000 -gaseous_limit = 0.10000000000000001 mem1 = 4311826432.0000000 -num_procs_x = 2 -num_procs_y = 2 -num_procs_z = 2 -imex_Cl = SBDF -imex_Cv = CNAB Error:CNAB is not a valid value. -imex_T = CNAB Error:CNAB is not a valid value. -imex_u = CNAB3 -imex_v = CNAB3 -imex_w = CNAB3 -petsc_solver_Cl = F -petsc_solver_Cv = F -petsc_solver_T = F -petsc_solver_u = F -petsc_solver_v = F -petsc_solver_w = F -petsc_solver_p = F -Cl_monitoring_on = F -Cv_monitoring_on = F -T_monitoring_on = F -u_monitoring_on = F -v_monitoring_on = F -w_monitoring_on = F -p_monitoring_on = F -iter_pres_first_100 = 1000 -iter_pres = 1000 -iter_u = 30 -iter_v = 30 -iter_w = 30 -iter_dim = 40 Error:-iter_Cv not found. -iter_T = 40 -Cl_write_frequency = 250 -Cv_write_frequency = 2500 -u_write_frequency = 2500 -v_write_frequency = 2500 -w_write_frequency = 2500 -T_write_frequency = 2500 -backup_frequency = 10000 -num_timesteps = 10000 mem2 = 4311490560.0000000 mem2 = 4311826432.0000000 Summary of Memory Usage in PETSc Current process memory: total 8.6236e+09 max 4.3120e+09 min 4.3116e+09 Current space PetscMalloc()ed: total 3.2736e+04 max 1.6368e+04 min 1.6368e+04 Run with -memory_view to get maximum memory usage #PETSc Option Table entries: -alpha -1.570796326794897 # (source: file) -Apsat 1.0 # (source: file) -backup_frequency 10000 # (source: file) -beta_minus 1.0 # (source: file) -beta_plus 1.0 # (source: file) -beta_vap 1.0 # (source: file) -Bod 1.0 # (source: file) -Bpsat 1.0 # (source: file) -Cl_monitoring_on FALSE # (source: file) -Cl_write_frequency 250 # (source: file) -cp_minus 1.0 # (source: file) -cp_plus 1.23 # (source: file) -cp_vap 1.0 # (source: file) -Cpsat 1.0 # (source: file) -Cv_monitoring_on FALSE # (source: file) -Cv_write_frequency 2500 # (source: file) -d initial_state # (source: command line) -dpdx 0.0 # (source: file) -dt 0.0005 # (source: file) -dTdx 0.0 # (source: file) -epn 0.0 # (source: file) -Fr 1.01 # (source: file) -gaseous_limit 0.1 # (source: file) -global_dim_x 240 # (source: file) -global_dim_y 240 # (source: file) -global_dim_z 320 # (source: file) -Grav 1.0 # (source: file) -height 0.0 # (source: file) -imex_Cl SBDF # (source: file) -imex_Cv CNAB # (source: file) -imex_T CNAB # (source: file) -imex_u CNAB3 # (source: file) -imex_v CNAB3 # (source: file) -imex_w CNAB3 # (source: file) -iter_dim 40 # (source: file) -iter_pres 1000 # (source: file) -iter_pres_first_100 1000 # (source: file) -iter_T 40 # (source: file) -iter_u 30 # (source: file) -iter_v 30 # (source: file) -iter_w 30 # (source: file) -Ja 0.18 # (source: file) -k_minus 1.0 # (source: file) -k_plus 3.94 # (source: file) -k_vap 1.0 # (source: file) -liquid_limit 0.9 # (source: file) -malloc_debug true # (source: command line) -malloc_dump # (source: command line) -malloc_view # (source: command line) -memory_view # (source: command line) -MM_minus 1.0 # (source: file) -MM_vap 1.0 # (source: file) -mu_minus 1.0 # (source: file) -mu_plus 32.6 # (source: file) -mu_vap 1.0 # (source: file) -num_procs_x 2 # (source: file) -num_procs_y 2 # (source: file) -num_procs_z 2 # (source: file) -num_timesteps 10000 # (source: file) -on_error_malloc_dump # (source: command line) -options_left # (source: file) -p_ksp_rtol 0.0000001 # (source: file) -p_ksp_type minres # (source: file) -p_ksp_view_final_residual 
# (source: file) -p_monitoring_on FALSE # (source: file) -p_pc_sor_omega 1.5 # (source: file) -p_pc_type sor # (source: file) -Pe 1.0 # (source: file) -petsc_solver_Cl FALSE # (source: file) -petsc_solver_Cv FALSE # (source: file) -petsc_solver_p FALSE # (source: file) -petsc_solver_T FALSE # (source: file) -petsc_solver_u FALSE # (source: file) -petsc_solver_v FALSE # (source: file) -petsc_solver_w FALSE # (source: file) -phenomenon boiling_variant # (source: file) -Pr 8.4 # (source: file) -Pref 1.0 # (source: file) -Radius 0.5 # (source: file) -Re 221.46 # (source: file) -rho_minus 1.0 # (source: file) -rho_plus 91.07 # (source: file) -rho_vap 1.0 # (source: file) -Sc 1.0 # (source: file) -T_monitoring_on FALSE # (source: file) -T_ref 0.0 # (source: file) -T_substrate 1.0 # (source: file) -T_write_frequency 2500 # (source: file) -th_layer 0.36 # (source: file) -u_ksp_rtol 0.0000001 # (source: file) -u_ksp_view_final_residual # (source: file) -u_monitoring_on FALSE # (source: file) -u_write_frequency 2500 # (source: file) -v_ksp_rtol 0.0000001 # (source: file) -v_ksp_view_final_residual # (source: file) -v_monitoring_on FALSE # (source: file) -v_write_frequency 2500 # (source: file) -w_ksp_rtol 0.0000001 # (source: file) -w_ksp_view_final_residual # (source: file) -w_monitoring_on FALSE # (source: file) -w_write_frequency 2500 # (source: file) -We 1.01 # (source: file) -x_lower_bc_Cl periodic # (source: file) -x_lower_bc_Cv periodic # (source: file) -x_lower_bc_P periodic # (source: file) -x_lower_bc_T periodic # (source: file) -x_lower_bc_u periodic # (source: file) -x_lower_bc_v periodic # (source: file) -x_lower_bc_w periodic # (source: file) -x_upper_bc_Cl periodic # (source: file) -x_upper_bc_Cv periodic # (source: file) -x_upper_bc_P periodic # (source: file) -x_upper_bc_T periodic # (source: file) -x_upper_bc_u periodic # (source: file) -x_upper_bc_v periodic # (source: file) -x_upper_bc_w periodic # (source: file) -y_lower_bc_Cl periodic # (source: file) -y_lower_bc_Cv periodic # (source: file) -y_lower_bc_P periodic # (source: file) -y_lower_bc_T periodic # (source: file) -y_lower_bc_u periodic # (source: file) -y_lower_bc_v periodic # (source: file) -y_lower_bc_w periodic # (source: file) -y_upper_bc_Cl periodic # (source: file) -y_upper_bc_Cv periodic # (source: file) -y_upper_bc_P periodic # (source: file) -y_upper_bc_T periodic # (source: file) -y_upper_bc_u periodic # (source: file) -y_upper_bc_v periodic # (source: file) -y_upper_bc_w periodic # (source: file) -z_lower_bc_Cl dirichlet # (source: file) -z_lower_bc_Cv dirichlet # (source: file) -z_lower_bc_P neumann # (source: file) -z_lower_bc_T dirichlet # (source: file) -z_lower_bc_T_value 1.0 # (source: file) -z_lower_bc_u dirichlet # (source: file) -z_lower_bc_v dirichlet # (source: file) -z_lower_bc_w dirichlet # (source: file) -z_upper_bc_Cl neumann # (source: file) -z_upper_bc_Cv dirichlet # (source: file) -z_upper_bc_P neumann # (source: file) -z_upper_bc_T dirichlet # (source: file) -z_upper_bc_T_value 0.0 # (source: file) -z_upper_bc_u dirichlet # (source: file) -z_upper_bc_v dirichlet # (source: file) -z_upper_bc_w neumann # (source: file) #End of PETSc Option Table entries WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! There are 17 unused database options. 
They are: Option left: name:-d value: initial_state source: command line Option left: name:-height value: 0.0 source: file Option left: name:-on_error_malloc_dump (no value) source: command line Option left: name:-p_ksp_rtol value: 0.0000001 source: file Option left: name:-p_ksp_type value: minres source: file Option left: name:-p_ksp_view_final_residual (no value) source: file Option left: name:-p_pc_sor_omega value: 1.5 source: file Option left: name:-p_pc_type value: sor source: file Option left: name:-Radius value: 0.5 source: file Option left: name:-T_substrate value: 1.0 source: file Option left: name:-th_layer value: 0.36 source: file Option left: name:-u_ksp_rtol value: 0.0000001 source: file Option left: name:-u_ksp_view_final_residual (no value) source: file Option left: name:-v_ksp_rtol value: 0.0000001 source: file Option left: name:-v_ksp_view_final_residual (no value) source: file Option left: name:-w_ksp_rtol value: 0.0000001 source: file Option left: name:-w_ksp_view_final_residual (no value) source: file [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4312375296 [0] Memory usage sorted by function [0] 1 144 PetscBTCreate() [0] 4 128 PetscCommDuplicate() [0] 4 64 PetscFunctionListCreate_Private() [0] 2 528 PetscIntStackCreate() [0] 2 2064 PetscLogClassArrayCreate() [0] 2 2064 PetscLogEventArrayCreate() [0] 1 32 PetscLogRegistryCreate() [0] 2 80 PetscLogStageArrayCreate() [0] 1 48 PetscLogStateCreate() [0] 1 16 PetscOptionsHelpPrintedCreate() [0] 1 32 PetscPushSignalHandler() [0] 4 20096 PetscSegBufferCreate() [0] 190 8688 PetscStrallocpy() [0] 12 26144 PetscStrreplace() [0] 2 1312 PetscViewerCreate() [0] 2 224 PetscViewerCreate_ASCII() [0] 14 368 petscoptionsgetbool_() [0] 22 480 petscoptionsgetint_() [0] 43 768 petscoptionsgetreal_() [0] 49 784 petscoptionsgetstring_() [0] 140 7632 petscprintf_() [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4311990272 [1] Memory usage sorted by function [1] 1 144 PetscBTCreate() [1] 4 128 PetscCommDuplicate() [1] 4 64 PetscFunctionListCreate_Private() [1] 2 528 PetscIntStackCreate() [1] 2 2064 PetscLogClassArrayCreate() [1] 2 2064 PetscLogEventArrayCreate() [1] 1 32 PetscLogRegistryCreate() [1] 2 80 PetscLogStageArrayCreate() [1] 1 48 PetscLogStateCreate() [1] 1 16 PetscOptionsHelpPrintedCreate() [1] 1 32 PetscPushSignalHandler() [1] 4 20096 PetscSegBufferCreate() [1] 190 8688 PetscStrallocpy() [1] 12 26144 PetscStrreplace() [1] 2 1312 PetscViewerCreate() [1] 2 224 PetscViewerCreate_ASCII() [1] 14 368 petscoptionsgetbool_() [1] 22 480 petscoptionsgetint_() [1] 43 768 petscoptionsgetreal_() [1] 49 784 petscoptionsgetstring_() [1] 140 7632 petscprintf_() From knepley at gmail.com Fri Nov 22 10:53:47 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 22 Nov 2024 11:53:47 -0500 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: On Fri, Nov 22, 2024 at 11:36?AM David Scott wrote: > Hello, > > I am using the options mechanism of PETSc to configure my CFD code. I > have introduced options describing the size of the domain etc. I have > noticed that this consumes a lot of memory. I have found that the amount > of memory used scales linearly with the number of MPI processes used. > This restricts the number of MPI processes that I can use. 
>

There are two statements:

1) The memory scales linearly with P

2) This uses a lot of memory

Let's deal with 1) first. This seems to be trivially true. If I want every process to have
access to a given option value, that option value must be in the memory of every process.
The only alternative would be to communicate with some process in order to get values.
Few codes seem to be willing to make this tradeoff, and we do not offer it.

Now 2). Looking at the source, for each option we store a PetscOptionItem, which I count
as having size 37 bytes (12 pointers/ints and a char). However, there is data behind every
pointer, like the name, help text, available values (sometimes), I could see it being as large
as 4K. Suppose it is. If I had 256 options, that would be 1M. Is this a large amount of memory?

The way I read the SLURM output, 29K was malloced. Is this a large amount of memory?

I am trying to get an idea of the scale.

  Thanks,

     Matt

> Is there anything that I can do about this or do I need to configure my
> code in a different way?
>
> I have attached some code extracted from my application which
> demonstrates this along with the output from a running it on 2 MPI
> processes.
>
> Best wishes,
>
> David Scott
> The University of Edinburgh is a charitable body, registered in Scotland,
> with registration number SC005336. Is e buidheann carthannais a th' ann an
> Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aDTOdIHWWilf4sShnRrU9KcJ987GlIrJ71v1EcIH4zje2tKZ7EBoEBD2TqNejin_X3-7DKujGeq-pXHvyHqF$
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From d.scott at epcc.ed.ac.uk Fri Nov 22 11:56:53 2024
From: d.scott at epcc.ed.ac.uk (David Scott)
Date: Fri, 22 Nov 2024 17:56:53 +0000
Subject: [petsc-users] Memory Used When Reading petscrc
In-Reply-To:
References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk>
Message-ID:

Matt,

Thanks for the quick response.

Yes 1) is trivially true.

With regard to 2), from the SLURM output:
[0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4312375296
[1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4311990272
Yes only 29KB was malloced but the total figure was 4GB per process.

Looking at
 mem0 = 16420864.000000000
 mem0 = 16117760.000000000
 mem1 = 4311490560.0000000
 mem1 = 4311826432.0000000
 mem2 = 4311490560.0000000
 mem2 = 4311826432.0000000
mem0 is written after PetscInitialize.
mem1 is written roughly half way through the options being read.
mem2 is written on completion of the options being read.

The code does very little other than read configuration options. Why is so much memory used?

I do not understand what is going on and I may have expressed myself badly but I do have a problem as I certainly cannot use anywhere near 128 processes on a node with 128GB of RAM before I get an OOM error. (The code runs successfully on 32 processes but not 64.)

Regards,

David

On 22/11/2024 16:53, Matthew Knepley wrote:
> This email was sent to you by someone outside the University.
> You should only click on links or attachments if you are certain that the
> email is genuine and the content is safe.
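For reference, a minimal sketch of how the two figures being compared here can be printed side by side from within the program itself; it assumes the Fortran bindings for PetscMallocGetCurrentUsage and PetscMemoryGetCurrentUsage (the counters that should be behind the "Current" lines of -memory_view) are available in the PETSc build:

      program memory_probe_sketch
      use petsc
      implicit none
#include "petsc/finclude/petsc.h"
      PetscLogDouble :: malloced, rss
      PetscErrorCode :: ierr

      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
      ! Bytes currently obtained through PetscMalloc() on this rank.
      call PetscMallocGetCurrentUsage(malloced, ierr)
      ! Resident size of the whole process: binary, shared libraries,
      ! heap and stack, not just PETSc allocations.
      call PetscMemoryGetCurrentUsage(rss, ierr)
      write(*, *) 'PetscMalloc()ed bytes =', malloced, ' process memory =', rss
      call PetscFinalize(ierr)
      end program memory_probe_sketch

A large gap between the two numbers points at memory obtained outside PetscMalloc(), as discussed below.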
> On Fri, Nov 22, 2024 at 11:36?AM David Scott > wrote: > > Hello, > > I am using the options mechanism of PETSc to configure my CFD code. I > have introduced options describing the size of the domain etc. I have > noticed that this consumes a lot of memory. I have found that the > amount > of memory used scales linearly with the number of MPI processes used. > This restricts the number of MPI processes that I can use. > > > There are two statements: > > 1) The memory?scales linearly with P > > 2) This uses a lot of memory > > Let's deal with 1) first. This seems to be trivially true. If I want > every process to have > access to a given option value, that option value must be in the > memory of every process. > The only alternative would be to communicate with some process in > order to get values. > Few codes seem to be willing to make this tradeoff, and we do not > offer it. > > Now 2). Looking at the source, for each option we store > a?PetscOptionItem, which I count > as having size 37 bytes (12 pointers/ints and a char). However, there > is data behind every > pointer, like the name, help text, available values (sometimes), I > could see it being as large > as 4K. Suppose it is. If I had 256 options, that would be 1M. Is this > a large amount of memory? > > The way I read the SLURM output, 29K was malloced. Is this a large > amount of memory? > > I am trying to get an idea of the scale. > > ? Thanks, > > ? ? ? Matt > > Is there anything that I can do about this or do I need to > configure my > code in a different way? > > I have attached some code extracted from my application which > demonstrates this along with the output from a running it on 2 MPI > processes. > > Best wishes, > > David Scott > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. Is e buidheann > carthannais a th? ann an Oilthigh Dh?n ?ideann, cl?raichte an > Alba, ?ireamh cl?raidh SC005336. > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZSsmcyvmT1HYNbUdssH9wNf_bUXn64WJkcv6TgscRIX6mEcDPKI4LxvsUWu9JcgeYQjCchlmOm8y7thpEupDiyaLluA$ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Nov 22 16:10:43 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 22 Nov 2024 17:10:43 -0500 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: On Fri, Nov 22, 2024 at 12:57?PM David Scott wrote: > Matt, > > Thanks for the quick response. > > Yes 1) is trivially true. > > With regard to 2), from the SLURM output: > [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process > 4312375296 > [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process > 4311990272 > Yes only 29KB was malloced but the total figure was 4GB per process. > > Looking at > mem0 = 16420864.000000000 > mem0 = 16117760.000000000 > mem1 = 4311490560.0000000 > mem1 = 4311826432.0000000 > mem2 = 4311490560.0000000 > mem2 = 4311826432.0000000 > mem0 is written after PetscInitialize. > mem1 is written roughly half way through the options being read. > mem2 is written on completion of the options being read. > > The code does very little other than read configuration options. 
Why is so > much memory used? > This is not due to options processing, as that would fall under Petsc malloc allocations. I believe we are measuring this using RSS which includes the binary, all shared libraries which are paged in, and stack/heap allocations. I think you are seeing the shared libraries come in. You might be able to see all the libraries that come in using strace. Thanks, Matt > I do not understand what is going on and I may have expressed myself badly > but I do have a problem as I certainly cannot use anywhere near 128 > processes on a node with 128GB of RAM before I get an OOM error. (The code > runs successfully on 32 processes but not 64.) > > Regards, > > David > > On 22/11/2024 16:53, Matthew Knepley wrote: > > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that the > email is genuine and the content is safe. > On Fri, Nov 22, 2024 at 11:36?AM David Scott > wrote: > >> Hello, >> >> I am using the options mechanism of PETSc to configure my CFD code. I >> have introduced options describing the size of the domain etc. I have >> noticed that this consumes a lot of memory. I have found that the amount >> of memory used scales linearly with the number of MPI processes used. >> This restricts the number of MPI processes that I can use. >> > > There are two statements: > > 1) The memory scales linearly with P > > 2) This uses a lot of memory > > Let's deal with 1) first. This seems to be trivially true. If I want every > process to have > access to a given option value, that option value must be in the memory of > every process. > The only alternative would be to communicate with some process in order to > get values. > Few codes seem to be willing to make this tradeoff, and we do not offer it. > > Now 2). Looking at the source, for each option we store a PetscOptionItem, > which I count > as having size 37 bytes (12 pointers/ints and a char). However, there is > data behind every > pointer, like the name, help text, available values (sometimes), I could > see it being as large > as 4K. Suppose it is. If I had 256 options, that would be 1M. Is this a > large amount of memory? > > The way I read the SLURM output, 29K was malloced. Is this a large amount > of memory? > > I am trying to get an idea of the scale. > > Thanks, > > Matt > > >> Is there anything that I can do about this or do I need to configure my >> code in a different way? >> >> I have attached some code extracted from my application which >> demonstrates this along with the output from a running it on 2 MPI >> processes. >> >> Best wishes, >> >> David Scott >> The University of Edinburgh is a charitable body, registered in Scotland, >> with registration number SC005336. Is e buidheann carthannais a th? ann an >> Oilthigh Dh?n ?ideann, cl?raichte an Alba, ?ireamh cl?raidh SC005336. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZLozvsSwhcAXnBdZ4m-ImNu-aXxMX8SpsHB7SM320hlG3hZEq3hw7UvNnb4c2hzs2_t_rAlLN5oIdM6Uw7ol$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZLozvsSwhcAXnBdZ4m-ImNu-aXxMX8SpsHB7SM320hlG3hZEq3hw7UvNnb4c2hzs2_t_rAlLN5oIdM6Uw7ol$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.scott at epcc.ed.ac.uk Fri Nov 22 19:16:22 2024 From: d.scott at epcc.ed.ac.uk (David Scott) Date: Sat, 23 Nov 2024 01:16:22 +0000 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: OK. I had started to wonder if that was the case. I'll do some further investigation. Thanks, David On 22/11/2024 22:10, Matthew Knepley wrote: > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that > the email is genuine and the content is safe. > On Fri, Nov 22, 2024 at 12:57?PM David Scott > wrote: > > Matt, > > Thanks for the quick response. > > Yes 1) is trivially true. > > With regard to 2), from the SLURM output: > [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire > process 4312375296 > [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire > process 4311990272 > Yes only 29KB was malloced but the total figure was 4GB per process. > > Looking at > ?mem0 =??? 16420864.000000000 > ?mem0 =??? 16117760.000000000 > ?mem1 =??? 4311490560.0000000 > ?mem1 =??? 4311826432.0000000 > ?mem2 =??? 4311490560.0000000 > ?mem2 =??? 4311826432.0000000 > mem0 is written after PetscInitialize. > mem1 is written roughly half way through the options being read. > mem2 is written on completion of the options being read. > > The code does very little other than read configuration options. > Why is so much memory used? > > > This is not due to options processing, as that would fall under Petsc > malloc allocations. I believe we are measuring this > using RSS which includes the binary, all shared libraries which are > paged in, and stack/heap allocations. I think you are > seeing the shared libraries come in. You might be able to see all the > libraries that come in using strace. > > ? Thanks, > > ? ? ?Matt > > I do not understand what is going on and I may have expressed > myself badly but I do have a problem as I certainly cannot use > anywhere near 128 processes on a node with 128GB of RAM before I > get an OOM error. (The code runs successfully on 32 processes but > not 64.) > > Regards, > > David > > On 22/11/2024 16:53, Matthew Knepley wrote: >> This email was sent to you by someone outside the University. >> You should only click on links or attachments if you are certain >> that the email is genuine and the content is safe. >> On Fri, Nov 22, 2024 at 11:36?AM David Scott >> wrote: >> >> Hello, >> >> I am using the options mechanism of PETSc to configure my CFD >> code. I >> have introduced options describing the size of the domain >> etc. I have >> noticed that this consumes a lot of memory. I have found that >> the amount >> of memory used scales linearly with the number of MPI >> processes used. >> This restricts the number of MPI processes that I can use. >> >> >> There are two statements: >> >> 1) The memory?scales linearly with P >> >> 2) This uses a lot of memory >> >> Let's deal with 1) first. This seems to be trivially true. If I >> want every process to have >> access to a given option value, that option value must be in the >> memory of every process. >> The only alternative would be to communicate with some process in >> order to get values. 
>> Few codes seem to be willing to make this tradeoff, and we do not >> offer it. >> >> Now 2). Looking at the source, for each option we store >> a?PetscOptionItem, which I count >> as having size 37 bytes (12 pointers/ints and a char). However, >> there is data behind every >> pointer, like the name, help text, available values (sometimes), >> I could see it being as large >> as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >> this a large amount of memory? >> >> The way I read the SLURM output, 29K was malloced. Is this a >> large amount of memory? >> >> I am trying to get an idea of the scale. >> >> ? Thanks, >> >> ? ? ? Matt >> >> Is there anything that I can do about this or do I need to >> configure my >> code in a different way? >> >> I have attached some code extracted from my application which >> demonstrates this along with the output from a running it on >> 2 MPI >> processes. >> >> Best wishes, >> >> David Scott >> The University of Edinburgh is a charitable body, registered >> in Scotland, with registration number SC005336. Is e >> buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >> cl?raichte an Alba, ?ireamh cl?raidh SC005336. >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Nov 24 23:27:45 2024 From: jed at jedbrown.org (Jed Brown) Date: Sun, 24 Nov 2024 22:27:45 -0700 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: <87h67v3msu.fsf@jedbrown.org> You're clearly doing almost all your allocation *not* using PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh yourself, you might be allocating a global amount on each rank, instead of strictly using scalable data structures (i.e., always partitioned). My favorite tool for understanding memory use is heaptrack. https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!agSDvRnjou_irVa09mE8tn11M8EkGEsPjrHe8yzMxmZyJkn-U6e0AxubboUT6qOgDuK4nIlW9w1Xr4TxxNk$ David Scott writes: > OK. > > I had started to wonder if that was the case. I'll do some further > investigation. > > Thanks, > > David > > On 22/11/2024 22:10, Matthew Knepley wrote: >> This email was sent to you by someone outside the University. >> You should only click on links or attachments if you are certain that >> the email is genuine and the content is safe. >> On Fri, Nov 22, 2024 at 12:57?PM David Scott >> wrote: >> >> Matt, >> >> Thanks for the quick response. >> >> Yes 1) is trivially true. 
>> >> With regard to 2), from the SLURM output: >> [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >> process 4312375296 >> [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >> process 4311990272 >> Yes only 29KB was malloced but the total figure was 4GB per process. >> >> Looking at >> ?mem0 =??? 16420864.000000000 >> ?mem0 =??? 16117760.000000000 >> ?mem1 =??? 4311490560.0000000 >> ?mem1 =??? 4311826432.0000000 >> ?mem2 =??? 4311490560.0000000 >> ?mem2 =??? 4311826432.0000000 >> mem0 is written after PetscInitialize. >> mem1 is written roughly half way through the options being read. >> mem2 is written on completion of the options being read. >> >> The code does very little other than read configuration options. >> Why is so much memory used? >> >> >> This is not due to options processing, as that would fall under Petsc >> malloc allocations. I believe we are measuring this >> using RSS which includes the binary, all shared libraries which are >> paged in, and stack/heap allocations. I think you are >> seeing the shared libraries come in. You might be able to see all the >> libraries that come in using strace. >> >> ? Thanks, >> >> ? ? ?Matt >> >> I do not understand what is going on and I may have expressed >> myself badly but I do have a problem as I certainly cannot use >> anywhere near 128 processes on a node with 128GB of RAM before I >> get an OOM error. (The code runs successfully on 32 processes but >> not 64.) >> >> Regards, >> >> David >> >> On 22/11/2024 16:53, Matthew Knepley wrote: >>> This email was sent to you by someone outside the University. >>> You should only click on links or attachments if you are certain >>> that the email is genuine and the content is safe. >>> On Fri, Nov 22, 2024 at 11:36?AM David Scott >>> wrote: >>> >>> Hello, >>> >>> I am using the options mechanism of PETSc to configure my CFD >>> code. I >>> have introduced options describing the size of the domain >>> etc. I have >>> noticed that this consumes a lot of memory. I have found that >>> the amount >>> of memory used scales linearly with the number of MPI >>> processes used. >>> This restricts the number of MPI processes that I can use. >>> >>> >>> There are two statements: >>> >>> 1) The memory?scales linearly with P >>> >>> 2) This uses a lot of memory >>> >>> Let's deal with 1) first. This seems to be trivially true. If I >>> want every process to have >>> access to a given option value, that option value must be in the >>> memory of every process. >>> The only alternative would be to communicate with some process in >>> order to get values. >>> Few codes seem to be willing to make this tradeoff, and we do not >>> offer it. >>> >>> Now 2). Looking at the source, for each option we store >>> a?PetscOptionItem, which I count >>> as having size 37 bytes (12 pointers/ints and a char). However, >>> there is data behind every >>> pointer, like the name, help text, available values (sometimes), >>> I could see it being as large >>> as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >>> this a large amount of memory? >>> >>> The way I read the SLURM output, 29K was malloced. Is this a >>> large amount of memory? >>> >>> I am trying to get an idea of the scale. >>> >>> ? Thanks, >>> >>> ? ? ? Matt >>> >>> Is there anything that I can do about this or do I need to >>> configure my >>> code in a different way? 
>>> >>> I have attached some code extracted from my application which >>> demonstrates this along with the output from a running it on >>> 2 MPI >>> processes. >>> >>> Best wishes, >>> >>> David Scott >>> The University of Edinburgh is a charitable body, registered >>> in Scotland, with registration number SC005336. Is e >>> buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >>> cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which their experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >> From d.scott at epcc.ed.ac.uk Mon Nov 25 02:32:19 2024 From: d.scott at epcc.ed.ac.uk (David Scott) Date: Mon, 25 Nov 2024 08:32:19 +0000 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: <87h67v3msu.fsf@jedbrown.org> References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> <87h67v3msu.fsf@jedbrown.org> Message-ID: <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> I'll have a look at heaptrack. The code that I am looking at the moment does not create a mesh. All it does is read a petscrc file. Thanks, David On 25/11/2024 05:27, Jed Brown wrote: > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that the email is genuine and the content is safe. > > You're clearly doing almost all your allocation *not* using PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh yourself, you might be allocating a global amount on each rank, instead of strictly using scalable data structures (i.e., always partitioned). > > My favorite tool for understanding memory use is heaptrack. > > https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!bM8Vs5Ljq0ZJOl_Zl88PpU1JJWw39UMiu50wgyt0zhG4ax6DxOvabmaDYbKrrCATTeWrKDmDR5C-3bDziLRcXp30NMQ$ > > David Scott writes: > >> OK. >> >> I had started to wonder if that was the case. I'll do some further >> investigation. >> >> Thanks, >> >> David >> >> On 22/11/2024 22:10, Matthew Knepley wrote: >>> This email was sent to you by someone outside the University. >>> You should only click on links or attachments if you are certain that >>> the email is genuine and the content is safe. >>> On Fri, Nov 22, 2024 at 12:57?PM David Scott >>> wrote: >>> >>> Matt, >>> >>> Thanks for the quick response. >>> >>> Yes 1) is trivially true. >>> >>> With regard to 2), from the SLURM output: >>> [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>> process 4312375296 >>> [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>> process 4311990272 >>> Yes only 29KB was malloced but the total figure was 4GB per process. 
>>> >>> Looking at >>> mem0 = 16420864.000000000 >>> mem0 = 16117760.000000000 >>> mem1 = 4311490560.0000000 >>> mem1 = 4311826432.0000000 >>> mem2 = 4311490560.0000000 >>> mem2 = 4311826432.0000000 >>> mem0 is written after PetscInitialize. >>> mem1 is written roughly half way through the options being read. >>> mem2 is written on completion of the options being read. >>> >>> The code does very little other than read configuration options. >>> Why is so much memory used? >>> >>> >>> This is not due to options processing, as that would fall under Petsc >>> malloc allocations. I believe we are measuring this >>> using RSS which includes the binary, all shared libraries which are >>> paged in, and stack/heap allocations. I think you are >>> seeing the shared libraries come in. You might be able to see all the >>> libraries that come in using strace. >>> >>> Thanks, >>> >>> Matt >>> >>> I do not understand what is going on and I may have expressed >>> myself badly but I do have a problem as I certainly cannot use >>> anywhere near 128 processes on a node with 128GB of RAM before I >>> get an OOM error. (The code runs successfully on 32 processes but >>> not 64.) >>> >>> Regards, >>> >>> David >>> >>> On 22/11/2024 16:53, Matthew Knepley wrote: >>>> This email was sent to you by someone outside the University. >>>> You should only click on links or attachments if you are certain >>>> that the email is genuine and the content is safe. >>>> On Fri, Nov 22, 2024 at 11:36?AM David Scott >>>> wrote: >>>> >>>> Hello, >>>> >>>> I am using the options mechanism of PETSc to configure my CFD >>>> code. I >>>> have introduced options describing the size of the domain >>>> etc. I have >>>> noticed that this consumes a lot of memory. I have found that >>>> the amount >>>> of memory used scales linearly with the number of MPI >>>> processes used. >>>> This restricts the number of MPI processes that I can use. >>>> >>>> >>>> There are two statements: >>>> >>>> 1) The memory scales linearly with P >>>> >>>> 2) This uses a lot of memory >>>> >>>> Let's deal with 1) first. This seems to be trivially true. If I >>>> want every process to have >>>> access to a given option value, that option value must be in the >>>> memory of every process. >>>> The only alternative would be to communicate with some process in >>>> order to get values. >>>> Few codes seem to be willing to make this tradeoff, and we do not >>>> offer it. >>>> >>>> Now 2). Looking at the source, for each option we store >>>> a PetscOptionItem, which I count >>>> as having size 37 bytes (12 pointers/ints and a char). However, >>>> there is data behind every >>>> pointer, like the name, help text, available values (sometimes), >>>> I could see it being as large >>>> as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >>>> this a large amount of memory? >>>> >>>> The way I read the SLURM output, 29K was malloced. Is this a >>>> large amount of memory? >>>> >>>> I am trying to get an idea of the scale. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Is there anything that I can do about this or do I need to >>>> configure my >>>> code in a different way? >>>> >>>> I have attached some code extracted from my application which >>>> demonstrates this along with the output from a running it on >>>> 2 MPI >>>> processes. >>>> >>>> Best wishes, >>>> >>>> David Scott >>>> The University of Edinburgh is a charitable body, registered >>>> in Scotland, with registration number SC005336. Is e >>>> buidheann carthannais a th? 
ann an Oilthigh Dh?n ?ideann, >>>> cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to >>>> which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>> From Fabian.Jakub at physik.uni-muenchen.de Mon Nov 25 02:45:30 2024 From: Fabian.Jakub at physik.uni-muenchen.de (Fabian.Jakub) Date: Mon, 25 Nov 2024 09:45:30 +0100 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> <87h67v3msu.fsf@jedbrown.org> <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> Message-ID: test_configuration_options.F90:l.55 max_msg_length is quite large.... I guess the pow() is a typo. Cheers, Fabian On 11/25/24 09:32, David Scott wrote: > I'll have a look at heaptrack. > > The code that I am looking at the moment does not create a mesh. All it > does is read a petscrc file. > > Thanks, > > David > > On 25/11/2024 05:27, Jed Brown wrote: >> This email was sent to you by someone outside the University. >> You should only click on links or attachments if you are certain that >> the email is genuine and the content is safe. >> >> You're clearly doing almost all your allocation *not* using >> PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh >> yourself, you might be allocating a global amount on each rank, >> instead of strictly using scalable data structures (i.e., always >> partitioned). >> >> My favorite tool for understanding memory use is heaptrack. >> >> https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!bM8Vs5Ljq0ZJOl_Zl88PpU1JJWw39UMiu50wgyt0zhG4ax6DxOvabmaDYbKrrCATTeWrKDmDR5C-3bDziLRcXp30NMQ$ >> David Scott writes: >> >>> OK. >>> >>> I had started to wonder if that was the case. I'll do some further >>> investigation. >>> >>> Thanks, >>> >>> David >>> >>> On 22/11/2024 22:10, Matthew Knepley wrote: >>>> This email was sent to you by someone outside the University. >>>> You should only click on links or attachments if you are certain that >>>> the email is genuine and the content is safe. >>>> On Fri, Nov 22, 2024 at 12:57?PM David Scott >>>> wrote: >>>> >>>> ???? Matt, >>>> >>>> ???? Thanks for the quick response. >>>> >>>> ???? Yes 1) is trivially true. >>>> >>>> ???? With regard to 2), from the SLURM output: >>>> ???? [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>> ???? process 4312375296 >>>> ???? [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>> ???? process 4311990272 >>>> ???? Yes only 29KB was malloced but the total figure was 4GB per >>>> process. >>>> >>>> ???? Looking at >>>> ????? mem0 =??? 16420864.000000000 >>>> ????? mem0 =??? 16117760.000000000 >>>> ????? mem1 =??? 4311490560.0000000 >>>> ????? mem1 =??? 4311826432.0000000 >>>> ????? mem2 =??? 
4311490560.0000000 >>>> ????? mem2 =??? 4311826432.0000000 >>>> ???? mem0 is written after PetscInitialize. >>>> ???? mem1 is written roughly half way through the options being read. >>>> ???? mem2 is written on completion of the options being read. >>>> >>>> ???? The code does very little other than read configuration options. >>>> ???? Why is so much memory used? >>>> >>>> >>>> This is not due to options processing, as that would fall under Petsc >>>> malloc allocations. I believe we are measuring this >>>> using RSS which includes the binary, all shared libraries which are >>>> paged in, and stack/heap allocations. I think you are >>>> seeing the shared libraries come in. You might be able to see all the >>>> libraries that come in using strace. >>>> >>>> ?? Thanks, >>>> >>>> ????? Matt >>>> >>>> ???? I do not understand what is going on and I may have expressed >>>> ???? myself badly but I do have a problem as I certainly cannot use >>>> ???? anywhere near 128 processes on a node with 128GB of RAM before I >>>> ???? get an OOM error. (The code runs successfully on 32 processes but >>>> ???? not 64.) >>>> >>>> ???? Regards, >>>> >>>> ???? David >>>> >>>> ???? On 22/11/2024 16:53, Matthew Knepley wrote: >>>>> ???? This email was sent to you by someone outside the University. >>>>> ???? You should only click on links or attachments if you are certain >>>>> ???? that the email is genuine and the content is safe. >>>>> ???? On Fri, Nov 22, 2024 at 11:36?AM David Scott >>>>> ???? wrote: >>>>> >>>>> ???????? Hello, >>>>> >>>>> ???????? I am using the options mechanism of PETSc to configure my CFD >>>>> ???????? code. I >>>>> ???????? have introduced options describing the size of the domain >>>>> ???????? etc. I have >>>>> ???????? noticed that this consumes a lot of memory. I have found that >>>>> ???????? the amount >>>>> ???????? of memory used scales linearly with the number of MPI >>>>> ???????? processes used. >>>>> ???????? This restricts the number of MPI processes that I can use. >>>>> >>>>> >>>>> ???? There are two statements: >>>>> >>>>> ???? 1) The memory scales linearly with P >>>>> >>>>> ???? 2) This uses a lot of memory >>>>> >>>>> ???? Let's deal with 1) first. This seems to be trivially true. If I >>>>> ???? want every process to have >>>>> ???? access to a given option value, that option value must be in the >>>>> ???? memory of every process. >>>>> ???? The only alternative would be to communicate with some process in >>>>> ???? order to get values. >>>>> ???? Few codes seem to be willing to make this tradeoff, and we do not >>>>> ???? offer it. >>>>> >>>>> ???? Now 2). Looking at the source, for each option we store >>>>> ???? a PetscOptionItem, which I count >>>>> ???? as having size 37 bytes (12 pointers/ints and a char). However, >>>>> ???? there is data behind every >>>>> ???? pointer, like the name, help text, available values (sometimes), >>>>> ???? I could see it being as large >>>>> ???? as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >>>>> ???? this a large amount of memory? >>>>> >>>>> ???? The way I read the SLURM output, 29K was malloced. Is this a >>>>> ???? large amount of memory? >>>>> >>>>> ???? I am trying to get an idea of the scale. >>>>> >>>>> ?????? Thanks, >>>>> >>>>> ?????????? Matt >>>>> >>>>> ???????? Is there anything that I can do about this or do I need to >>>>> ???????? configure my >>>>> ???????? code in a different way? >>>>> >>>>> ???????? I have attached some code extracted from my application which >>>>> ???????? 
demonstrates this along with the output from a running it on >>>>> ???????? 2 MPI >>>>> ???????? processes. >>>>> >>>>> ???????? Best wishes, >>>>> >>>>> ???????? David Scott >>>>> ???????? The University of Edinburgh is a charitable body, registered >>>>> ???????? in Scotland, with registration number SC005336. Is e >>>>> ???????? buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >>>>> ???????? cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>>>> >>>>> >>>>> >>>>> ???? -- >>>>> ???? What most experimenters take for granted before they begin their >>>>> ???? experiments is infinitely more interesting than any results to >>>>> ???? which their experiments lead. >>>>> ???? -- Norbert Wiener >>>>> >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>> > From d.scott at epcc.ed.ac.uk Mon Nov 25 03:49:02 2024 From: d.scott at epcc.ed.ac.uk (David Scott) Date: Mon, 25 Nov 2024 09:49:02 +0000 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> <87h67v3msu.fsf@jedbrown.org> <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> Message-ID: Fabian, That is indeed a typo. Thanks very much for pointing it out. Cheers, David On 25/11/2024 08:45, Fabian.Jakub via petsc-users wrote: > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that > the email is genuine and the content is safe. > > test_configuration_options.F90:l.55 > max_msg_length is quite large.... I guess the pow() is a typo. > Cheers, > Fabian > > > On 11/25/24 09:32, David Scott wrote: >> I'll have a look at heaptrack. >> >> The code that I am looking at the moment does not create a mesh. All it >> does is read a petscrc file. >> >> Thanks, >> >> David >> >> On 25/11/2024 05:27, Jed Brown wrote: >>> This email was sent to you by someone outside the University. >>> You should only click on links or attachments if you are certain that >>> the email is genuine and the content is safe. >>> >>> You're clearly doing almost all your allocation *not* using >>> PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh >>> yourself, you might be allocating a global amount on each rank, >>> instead of strictly using scalable data structures (i.e., always >>> partitioned). >>> >>> My favorite tool for understanding memory use is heaptrack. >>> >>> https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!bM8Vs5Ljq0ZJOl_Zl88PpU1JJWw39UMiu50wgyt0zhG4ax6DxOvabmaDYbKrrCATTeWrKDmDR5C-3bDziLRcXp30NMQ$ >>> >>> David Scott writes: >>> >>>> OK. >>>> >>>> I had started to wonder if that was the case. I'll do some further >>>> investigation. >>>> >>>> Thanks, >>>> >>>> David >>>> >>>> On 22/11/2024 22:10, Matthew Knepley wrote: >>>>> This email was sent to you by someone outside the University. >>>>> You should only click on links or attachments if you are certain that >>>>> the email is genuine and the content is safe. 
>>>>> On Fri, Nov 22, 2024 at 12:57?PM David Scott >>>>> wrote: >>>>> >>>>> ???? Matt, >>>>> >>>>> ???? Thanks for the quick response. >>>>> >>>>> ???? Yes 1) is trivially true. >>>>> >>>>> ???? With regard to 2), from the SLURM output: >>>>> ???? [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>>> ???? process 4312375296 >>>>> ???? [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>>> ???? process 4311990272 >>>>> ???? Yes only 29KB was malloced but the total figure was 4GB per >>>>> process. >>>>> >>>>> ???? Looking at >>>>> ????? mem0 =??? 16420864.000000000 >>>>> ????? mem0 =??? 16117760.000000000 >>>>> ????? mem1 =??? 4311490560.0000000 >>>>> ????? mem1 =??? 4311826432.0000000 >>>>> ????? mem2 =??? 4311490560.0000000 >>>>> ????? mem2 =??? 4311826432.0000000 >>>>> ???? mem0 is written after PetscInitialize. >>>>> ???? mem1 is written roughly half way through the options being read. >>>>> ???? mem2 is written on completion of the options being read. >>>>> >>>>> ???? The code does very little other than read configuration options. >>>>> ???? Why is so much memory used? >>>>> >>>>> >>>>> This is not due to options processing, as that would fall under Petsc >>>>> malloc allocations. I believe we are measuring this >>>>> using RSS which includes the binary, all shared libraries which are >>>>> paged in, and stack/heap allocations. I think you are >>>>> seeing the shared libraries come in. You might be able to see all the >>>>> libraries that come in using strace. >>>>> >>>>> ?? Thanks, >>>>> >>>>> ????? Matt >>>>> >>>>> ???? I do not understand what is going on and I may have expressed >>>>> ???? myself badly but I do have a problem as I certainly cannot use >>>>> ???? anywhere near 128 processes on a node with 128GB of RAM before I >>>>> ???? get an OOM error. (The code runs successfully on 32 processes >>>>> but >>>>> ???? not 64.) >>>>> >>>>> ???? Regards, >>>>> >>>>> ???? David >>>>> >>>>> ???? On 22/11/2024 16:53, Matthew Knepley wrote: >>>>>> ???? This email was sent to you by someone outside the University. >>>>>> ???? You should only click on links or attachments if you are >>>>>> certain >>>>>> ???? that the email is genuine and the content is safe. >>>>>> ???? On Fri, Nov 22, 2024 at 11:36?AM David Scott >>>>>> ???? wrote: >>>>>> >>>>>> ???????? Hello, >>>>>> >>>>>> ???????? I am using the options mechanism of PETSc to configure >>>>>> my CFD >>>>>> ???????? code. I >>>>>> ???????? have introduced options describing the size of the domain >>>>>> ???????? etc. I have >>>>>> ???????? noticed that this consumes a lot of memory. I have found >>>>>> that >>>>>> ???????? the amount >>>>>> ???????? of memory used scales linearly with the number of MPI >>>>>> ???????? processes used. >>>>>> ???????? This restricts the number of MPI processes that I can use. >>>>>> >>>>>> >>>>>> ???? There are two statements: >>>>>> >>>>>> ???? 1) The memory scales linearly with P >>>>>> >>>>>> ???? 2) This uses a lot of memory >>>>>> >>>>>> ???? Let's deal with 1) first. This seems to be trivially true. If I >>>>>> ???? want every process to have >>>>>> ???? access to a given option value, that option value must be in >>>>>> the >>>>>> ???? memory of every process. >>>>>> ???? The only alternative would be to communicate with some >>>>>> process in >>>>>> ???? order to get values. >>>>>> ???? Few codes seem to be willing to make this tradeoff, and we >>>>>> do not >>>>>> ???? offer it. >>>>>> >>>>>> ???? Now 2). Looking at the source, for each option we store >>>>>> ???? 
a PetscOptionItem, which I count >>>>>> ???? as having size 37 bytes (12 pointers/ints and a char). However, >>>>>> ???? there is data behind every >>>>>> ???? pointer, like the name, help text, available values >>>>>> (sometimes), >>>>>> ???? I could see it being as large >>>>>> ???? as 4K. Suppose it is. If I had 256 options, that would be >>>>>> 1M. Is >>>>>> ???? this a large amount of memory? >>>>>> >>>>>> ???? The way I read the SLURM output, 29K was malloced. Is this a >>>>>> ???? large amount of memory? >>>>>> >>>>>> ???? I am trying to get an idea of the scale. >>>>>> >>>>>> ?????? Thanks, >>>>>> >>>>>> ?????????? Matt >>>>>> >>>>>> ???????? Is there anything that I can do about this or do I need to >>>>>> ???????? configure my >>>>>> ???????? code in a different way? >>>>>> >>>>>> ???????? I have attached some code extracted from my application >>>>>> which >>>>>> ???????? demonstrates this along with the output from a running >>>>>> it on >>>>>> ???????? 2 MPI >>>>>> ???????? processes. >>>>>> >>>>>> ???????? Best wishes, >>>>>> >>>>>> ???????? David Scott >>>>>> ???????? The University of Edinburgh is a charitable body, >>>>>> registered >>>>>> ???????? in Scotland, with registration number SC005336. Is e >>>>>> ???????? buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >>>>>> ???????? cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>>>>> >>>>>> >>>>>> >>>>>> ???? -- >>>>>> ???? What most experimenters take for granted before they begin >>>>>> their >>>>>> ???? experiments is infinitely more interesting than any results to >>>>>> ???? which their experiments lead. >>>>>> ???? -- Norbert Wiener >>>>>> >>>>>> >>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>>>> >>>>>> >>>>>> >>>>> > >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which >>>>> their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>>> >>>>> >>>> > >> > From bba at bgs.ac.uk Mon Nov 25 05:08:09 2024 From: bba at bgs.ac.uk (Brian Bainbridge - BGS) Date: Mon, 25 Nov 2024 11:08:09 +0000 Subject: [petsc-users] Unable to configure errors Message-ID: Hi there, I have downloaded petsc-.3.22.1 via git, but when I try to configure with the Intel compilers I get this message: [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-mpi-dir=/home/bba/intel/oneapi/mpi/2021.14/ ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- MPI compiler wrappers in /home/bba/intel/oneapi/mpi/2021.14/bin cannot be found or do not work. 
See https://urldefense.us/v3/__https://petsc.org/release/faq/*invalid-mpi-compilers__;Iw!!G_uCfscf7eWS!bfBTBf0cnZ3REM41nHH1ecadhkhaOZ6Rj28ZmTt_QbyHuPvU3zdMDIVstg8C0aON_dJ-vLme96MmkQ_-tKw$ ********************************************************************************************* But: [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin cpuinfo hydra_nameserver IMB-MPI1 IMB-MT IMB-P2P impi_cpuinfo mpicc mpiexec mpif77 mpifc mpigxx mpiicpc mpiicx mpiifx mpitune_fast hydra_bstrap_proxy hydra_pmi_proxy IMB-MPI1-GPU IMB-NBC IMB-RMA impi_info mpicxx mpiexec.hydra mpif90 mpigcc mpiicc mpiicpx mpiifort mpirun So it should work, but it doesn't! I tried to use --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx but that doesn't work: [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- C compiler you provided with -with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx cannot be found or does not work. If the above linker messages do not indicate failure of the compiler you can rerun with the option --ignoreLinkOutput=1 ********************************************************************************************* [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx The compiler is there, can you please help me to configure the petsc please? Regards, Brian This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefano.zampini at gmail.com Mon Nov 25 11:11:18 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Mon, 25 Nov 2024 20:11:18 +0300 Subject: [petsc-users] Unable to configure errors In-Reply-To: References: Message-ID: Send configure.log On Mon, Nov 25, 2024, 20:09 Brian Bainbridge - BGS via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi there, > > I have downloaded petsc-.3.22.1 via git, but when I try to configure with > the Intel compilers I get this message: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc > --with-mpi-dir=/home/bba/intel/oneapi/mpi/2021.14/ > > ============================================================================================= > Configuring PETSc to compile on your system > > ============================================================================================= > TESTING: checkCCompiler from > config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > MPI compiler wrappers in /home/bba/intel/oneapi/mpi/2021.14/bin cannot > be found or do not > work. See https://urldefense.us/v3/__https://petsc.org/release/faq/*invalid-mpi-compilers__;Iw!!G_uCfscf7eWS!YuP2JPBRwPMjW-R0DlunxcaYcHx0zjy7uFz7aEsu_9Gj38klN7GMx7OmcFwbU9-V3EPvLhVEp9S0DLbjdZVNjgJ5KgODGVQ$ > > > ********************************************************************************************* > > But: > > [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin > cpuinfo hydra_nameserver IMB-MPI1 IMB-MT IMB-P2P > impi_cpuinfo mpicc mpiexec mpif77 mpifc mpigxx mpiicpc > mpiicx mpiifx mpitune_fast > hydra_bstrap_proxy hydra_pmi_proxy IMB-MPI1-GPU IMB-NBC IMB-RMA > impi_info mpicxx mpiexec.hydra mpif90 mpigcc mpiicc mpiicpx > mpiifort mpirun > > So it should work, but it doesn't! I tried to use > --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > but that doesn't work: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc > --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > ============================================================================================= > Configuring PETSc to compile on your system > > ============================================================================================= > TESTING: checkCCompiler from > config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > C compiler you provided with > -with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > cannot be found or does not work. > If the above linker messages do not indicate failure of the compiler you > can rerun with > the option --ignoreLinkOutput=1 > > ********************************************************************************************* > > [bba at kwvmxbridgeHPC petsc]$ ls > /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > The compiler is there, can you please help me to configure the petsc > please? 
> > Regards, > Brian > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Mon Nov 25 11:13:32 2024 From: balay.anl at fastmail.org (Satish Balay) Date: Mon, 25 Nov 2024 11:13:32 -0600 (CST) Subject: [petsc-users] Unable to configure errors In-Reply-To: References: Message-ID: <8863bfc9-3983-31c0-447d-30a5d7ef9966@fastmail.org> Please check: "Intel MPI" section of https://urldefense.us/v3/__https://petsc.org/release/install/install/*mpi__;Iw!!G_uCfscf7eWS!efneLerzMMPsXXWgEwlvn6Pdk_8YCIiz8n_VrMaeIB4E4gWzmx9BZKrY5NGR48MVdQ60BU43oXs8ZhXN5BIN9zu690s$ - Likely you need to correctly set I_MPI_CC etc. for the MPI compiler wrappers to work. If you still have issues - send configure.log from this failure. Satish On Mon, 25 Nov 2024, Brian Bainbridge - BGS via petsc-users wrote: > Hi there, > > I have downloaded petsc-.3.22.1 via git, but when I try to configure with the Intel compilers I get this message: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-mpi-dir=/home/bba/intel/oneapi/mpi/2021.14/ > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > MPI compiler wrappers in /home/bba/intel/oneapi/mpi/2021.14/bin cannot be found or do not > work. See https://urldefense.us/v3/__https://petsc.org/release/faq/*invalid-mpi-compilers__;Iw!!G_uCfscf7eWS!bfBTBf0cnZ3REM41nHH1ecadhkhaOZ6Rj28ZmTt_QbyHuPvU3zdMDIVstg8C0aON_dJ-vLme96MmkQ_-tKw$ > ********************************************************************************************* > > But: > > [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin > cpuinfo hydra_nameserver IMB-MPI1 IMB-MT IMB-P2P impi_cpuinfo mpicc mpiexec mpif77 mpifc mpigxx mpiicpc mpiicx mpiifx mpitune_fast > hydra_bstrap_proxy hydra_pmi_proxy IMB-MPI1-GPU IMB-NBC IMB-RMA impi_info mpicxx mpiexec.hydra mpif90 mpigcc mpiicc mpiicpx mpiifort mpirun > > So it should work, but it doesn't! 
I tried to use --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > but that doesn't work: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > C compiler you provided with -with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > cannot be found or does not work. > If the above linker messages do not indicate failure of the compiler you can rerun with > the option --ignoreLinkOutput=1 > ********************************************************************************************* > > [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > The compiler is there, can you please help me to configure the petsc please? > > Regards, > Brian > > > > > This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. > > From matthew.thomas1 at anu.edu.au Tue Nov 26 19:06:21 2024 From: matthew.thomas1 at anu.edu.au (Matthew Thomas) Date: Wed, 27 Nov 2024 01:06:21 +0000 Subject: [petsc-users] Problem with MatMPIAIJSetPreallocation Message-ID: Hello, When I use MatMPIAIJSetPreallocation I get an argument out of range error, as below, [15]PETSC ERROR: Argument out of range [15]PETSC ERROR: New nonzero at (8,421) caused a malloc. Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check However, when I run the code again with identical parameters without MatMPIAIJSetPreallocation, there is no non-zero at the location that caused the error (8, 421). Using -mat_view I can see that row 8 only contains a single value at column 8, which is expected and what I have allocated for. I have also checked that every time I call MatSetValues that this location is not being set. I am very confident my dnnz and onnz arrays have been set correctly, do you have any idea why this new non-zero is created? I am using Petsc version 3.22.1 and Slepc version 3.22.1 with fortran. This issue does not occur with small number of processors, (1-8), however, this error is consistent when I use >8 processors. Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... 
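For reference, a minimal sketch of the preallocation call sequence under discussion (the poster's code is Fortran; this is the analogous C sequence, and nlocal, d_nnz and o_nnz are assumptions standing in for values the caller computes):

    /* Hedged sketch, not the poster's code: per-row preallocation of a MATMPIAIJ matrix. */
    Mat A;
    PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
    PetscCall(MatSetSizes(A, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE));
    PetscCall(MatSetType(A, MATMPIAIJ));
    /* d_nnz/o_nnz give, per locally owned row, the nonzero counts in the diagonal and
       off-diagonal blocks; the scalar arguments are ignored when the arrays are supplied */
    PetscCall(MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz));
    /* to keep running (at a performance cost) instead of erroring when a row was underestimated:
       MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);                              */
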
URL: From matthew.thomas1 at anu.edu.au Tue Nov 26 20:41:39 2024 From: matthew.thomas1 at anu.edu.au (Matthew Thomas) Date: Wed, 27 Nov 2024 02:41:39 +0000 Subject: [petsc-users] Problem with MatMPIAIJSetPreallocation In-Reply-To: References: Message-ID: <4D8F95C1-E7C9-4353-A107-ABE1A8583526@anu.edu.au> Hello, I found a type elsewhere in my code and have fixed this issue. Thanks, Matt On 27 Nov 2024, at 12:06?PM, Matthew Thomas wrote: Hello, When I use MatMPIAIJSetPreallocation I get an argument out of range error, as below, [15]PETSC ERROR: Argument out of range [15]PETSC ERROR: New nonzero at (8,421) caused a malloc. Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check However, when I run the code again with identical parameters without MatMPIAIJSetPreallocation, there is no non-zero at the location that caused the error (8, 421). Using -mat_view I can see that row 8 only contains a single value at column 8, which is expected and what I have allocated for. I have also checked that every time I call MatSetValues that this location is not being set. I am very confident my dnnz and onnz arrays have been set correctly, do you have any idea why this new non-zero is created? I am using Petsc version 3.22.1 and Slepc version 3.22.1 with fortran. This issue does not occur with small number of processors, (1-8), however, this error is consistent when I use >8 processors. Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 27 08:47:29 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 27 Nov 2024 14:47:29 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> Message-ID: <0E25EA57-DE67-440A-955E-F380DC3BF7A6@us.es> Dear Barry: You were right!! The problem is I am using the background DMDA mesh for the domain partitioning of the DMSWarm as in ?dm/tutorials/swarm_ex3.c?. And then ?DMGetNeighbors? to locate the neighbour ranks, including those in the other side of the domain when I am using periodic bcc. Therefore, if I define the background DMDA to use periodic bcc the particle domain partitioning is uneven but I can locate precisely the periodic ranks. Thanks, Miguel On 20 Nov 2024, at 23:40, MIGUEL MOLINOS PEREZ wrote: I see? that might be the problem. I?ll check it tomorrow. Thank you! Miguel On 20 Nov 2024, at 22:57, Barry Smith wrote: ? On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? 
Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. 
Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 27 09:38:07 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 27 Nov 2024 15:38:07 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <0E25EA57-DE67-440A-955E-F380DC3BF7A6@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> <0E25EA57-DE67-440A-955E-F380DC3BF7A6@us.es> Message-ID: <7C4C298F-6E28-44DA-ADCF-10050154808D@us.es> I forgot to mention that the solution to the problem is to increase the number of divisions while keeping constant the number of processes. Sorry for the silly question! Thanks, Miguel On 27 Nov 2024, at 15:47, Miguel Molinos wrote: Dear Barry: You were right!! The problem is I am using the background DMDA mesh for the domain partitioning of the DMSWarm as in ?dm/tutorials/swarm_ex3.c?. And then ?DMGetNeighbors? to locate the neighbour ranks, including those in the other side of the domain when I am using periodic bcc. Therefore, if I define the background DMDA to use periodic bcc the particle domain partitioning is uneven but I can locate precisely the periodic ranks. Thanks, Miguel On 20 Nov 2024, at 23:40, MIGUEL MOLINOS PEREZ wrote: I see? that might be the problem. I?ll check it tomorrow. Thank you! Miguel On 20 Nov 2024, at 22:57, Barry Smith wrote: ? On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. 
If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. 
Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From qiyuelu1 at gmail.com Fri Nov 29 20:56:14 2024 From: qiyuelu1 at gmail.com (Qiyue Lu) Date: Fri, 29 Nov 2024 20:56:14 -0600 Subject: [petsc-users] MatZeroRows costly while applying 1st-kind Boundary Conditions Message-ID: Hello, In the MPI context, after assembling the distributed matrix A (matmpiaij) and the right-hand-side b, I am trying to apply the 1st kind boundary condition using MatZeroRows() and VecSetValues(), for A and b respectively. The pseudo-code is: ========= for (int key = 0; key < BCNodes_Length; key++){ // retrieving the global row position pos = BCNodes[key]; // Set all elements in that row 0 except the one on the diagonal to be 1.0 MatZeroRows(A, 1, &pos, 1.0, NULL, NULL); } MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); ========= For BCNodes_Length = 10^4, the FOR loop timing is 8 seconds. For BCNodes_Length = 15*10^4, the FOR loop timing is 3000 seconds. I am using two computational nodes and each having 12 cores. My questions are: 1) Is the timing plausible? Is the MatZeroRows() function so costly? 2) Any suggestions to apply the 1st kind boundary conditions for a better performance? Thanks, Qiyue Lu -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Nov 29 21:57:12 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 29 Nov 2024 22:57:12 -0500 Subject: [petsc-users] MatZeroRows costly while applying 1st-kind Boundary Conditions In-Reply-To: References: Message-ID: <9EF8F6C3-5169-4FE2-984D-29D335798CB5@petsc.dev> You need to call MatZeroRows() once; passing all the rows you want zeroed, instead of once for each row. If you are running in parallel each MPI process should call MatZeroRows() once passing in a list of rows to be zeroed. Each process can pass in different rows than the other processes. BTW: You do not need to call MatAssemblyBegin/End() after MatZeroRows() Barry > On Nov 29, 2024, at 9:56?PM, Qiyue Lu wrote: > > Hello, > In the MPI context, after assembling the distributed matrix A (matmpiaij) and the right-hand-side b, I am trying to apply the 1st kind boundary condition using MatZeroRows() and VecSetValues(), for A and b respectively. 
> The pseudo-code is:
> =========
> for (int key = 0; key < BCNodes_Length; key++){
> // retrieving the global row position
> pos = BCNodes[key];
> // Set all elements in that row 0 except the one on the diagonal to be 1.0
> MatZeroRows(A, 1, &pos, 1.0, NULL, NULL);
> }
> MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
> =========
> 
> For BCNodes_Length = 10^4, the FOR loop timing is 8 seconds.
> For BCNodes_Length = 15*10^4, the FOR loop timing is 3000 seconds.
> I am using two computational nodes and each having 12 cores.
> 
> My questions are:
> 1) Is the timing plausible? Is the MatZeroRows() function so costly?
> 2) Any suggestions to apply the 1st kind boundary conditions for a better performance?
> 
> Thanks,
> Qiyue Lu -------------- next part -------------- An HTML attachment was scrubbed... URL: 
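For completeness, a minimal sketch of the batched approach Barry describes (not from the thread itself; BCNodes, BCNodes_Length and a prescribed value bc_value are assumptions carried over from the pseudo-code above, and each rank passes only the rows it owns):

    /* Hedged sketch: one collective MatZeroRows() call per rank instead of a loop. */
    PetscInt    *rows;
    PetscScalar *vals;
    PetscCall(PetscMalloc2(BCNodes_Length, &rows, BCNodes_Length, &vals));
    for (PetscInt key = 0; key < BCNodes_Length; key++) {
      rows[key] = BCNodes[key];   /* global row index owned by this rank */
      vals[key] = bc_value;       /* prescribed Dirichlet value          */
    }
    /* zero all listed rows at once, placing 1.0 on their diagonals;
       no MatAssemblyBegin/End is needed afterwards                   */
    PetscCall(MatZeroRows(A, BCNodes_Length, rows, 1.0, NULL, NULL));
    /* set the matching right-hand-side entries */
    PetscCall(VecSetValues(b, BCNodes_Length, rows, vals, INSERT_VALUES));
    PetscCall(VecAssemblyBegin(b));
    PetscCall(VecAssemblyEnd(b));
    PetscCall(PetscFree2(rows, vals));

MatZeroRows() also accepts optional solution and right-hand-side vectors as its last two arguments, in which case it adjusts the corresponding entries of b itself and the VecSetValues() step above is not needed.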