From daniel.pino_munoz at mines-paristech.fr Mon Nov 4 04:23:35 2024 From: daniel.pino_munoz at mines-paristech.fr (Daniel Pino Munoz) Date: Mon, 4 Nov 2024 11:23:35 +0100 Subject: [petsc-users] Example of SNES using matrix free jacobian In-Reply-To: References: <161ca7d4-82c4-4696-b123-4d441eb903da@mines-paristech.fr> <7096B68A-67A7-41AA-9DAF-A3FE78C8679E@petsc.dev> <4257d3ef-9a69-4b3c-a435-f8f9fe77f50c@mines-paristech.fr> <78522DC0-7204-42C0-8A0C-47E2CA55AB80@petsc.dev> <40d13023-ede8-4226-bdd9-a8a495ef4daa@mines-paristech.fr> Message-ID: Dear all, The problem was the context. The context was not properly set, and yet for some reason it was running correctly in Debug mode. Now this problem has been solved. Thank you Barry for your help. Best, ? Daniel On 29/10/2024 23:04, Barry Smith wrote: > Run in the debugger, even though there will not be full debugging support with the optimizations it should still provide you some information > > >> On Oct 29, 2024, at 4:35?PM, Daniel Pino Munoz wrote: >> >> That's what I thought, so I replaced its content by: >> >> VecZeroEntries(f); >> >> and the result is the same... >> >> >> On 29/10/2024 21:31, Barry Smith wrote: >>> This >>> [0]PETSC ERROR: #1 SNES callback function >>> indicates the crash is in your computeResidual function and thus you need to debug your function >>> >>> Barry >>> >>> >>>> On Oct 29, 2024, at 4:28?PM, Daniel Pino Munoz wrote: >>>> >>>> I ran it with -malloc_debug and it does not change anything. >>>> >>>> The output is the following: >>>> >>>> he absolute tolerance is 0.001 >>>> The relative tolerance is 0.001 >>>> The divergence tolerance is 10000 >>>> The maximum iterations is 10000 >>>> Initial load ! >>>> [0]PETSC ERROR: ------------------------------------------------------------------------ >>>> [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range >>>> [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger >>>> [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!dkbZ6vbH8gLYOkcINufnJg_JyJcRHtNge8M_c8f-iuQ3J8EP-KVV58zSZ1cBNyyStNlGaXrJUcwnob5EdBnPDpfAiOKcpEClnwS21YZuJw$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dkbZ6vbH8gLYOkcINufnJg_JyJcRHtNge8M_c8f-iuQ3J8EP-KVV58zSZ1cBNyyStNlGaXrJUcwnob5EdBnPDpfAiOKcpEClnwTJP97sJQ$ >>>> [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ >>>> [0]PETSC ERROR: The line numbers in the error traceback are not always exact. >>>> [0]PETSC ERROR: #1 SNES callback function >>>> [0]PETSC ERROR: #2 SNESComputeFunction() at /home/daniel-pino/Software/Dependencies/petsc/src/snes/interface/snes.c:2489 >>>> [0]PETSC ERROR: #3 SNESSolve_KSPONLY() at /home/daniel-pino/Software/Dependencies/petsc/src/snes/impls/ksponly/ksponly.c:27 >>>> [0]PETSC ERROR: #4 SNESSolve() at /home/daniel-pino/Software/Dependencies/petsc/src/snes/interface/snes.c:4841 >>>> -------------------------------------------------------------------------- >>>> MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD >>>> Proc: [[27669,1],0] >>>> Errorcode: 59 >>>> >>>> NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. >>>> You may or may not see output from other processes, depending on >>>> exactly when Open MPI kills them. 
>>>> -------------------------------------------------------------------------- >>>> >>>> >>>> On 29/10/2024 20:17, Barry Smith wrote: >>>>> Hmm, cut and paste the output when it crashes. >>>>> >>>>> Also run with -malloc_debug and see what happens >>>>> >>>>> >>>>> Barry >>>>> >>>>> >>>>>> On Oct 29, 2024, at 3:13 PM, Daniel Pino Munoz wrote: >>>>>> >>>>>> Hi Barry, >>>>>> >>>>>> Thanks for getting back to me! >>>>>> >>>>>> I tried replacing KSPSetOperators(ksp, J, J); by SNESSetJacobian(snes, J, J, MatMFFDComputeJacobian) >>>>>> >>>>>> and I get the same result: it works in Debug mode but not in Release. I also ran valgrind and it did not catch any memory problem. >>>>>> >>>>>> Any ideas? >>>>>> >>>>>> PS: You are right regarding the number of iterations of the non-preconditioned problem. In the previous version of the code, which only used a KSP, I already had to set -ksp_gmres_restart 100. But thanks for the heads up. >>>>>> >>>>>> Best, >>>>>> >>>>>> Daniel >>>>>> >>>>>> On 29/10/2024 20:01, Barry Smith wrote: >>>>>>> Don't call >>>>>>> >>>>>>> KSPSetOperators(ksp, J, J); >>>>>>> >>>>>>> >>>>>>> instead call >>>>>>> >>>>>>> SNESSetJacobian(snes, J, J, MatMFFDComputeJacobian) >>>>>>> >>>>>>> but I am not sure that would explain the crash. >>>>>>> >>>>>>> BTW: since you are applying no preconditioner, if the matrix is ill-conditioned it may take many iterations or fail to converge. You can try something like -ksp_gmres_restart 100 or a similar value to try to improve convergence (the default is 30). >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On Oct 29, 2024, at 12:37 PM, Daniel Pino Munoz wrote: >>>>>>>> >>>>>>>> Dear all, >>>>>>>> >>>>>>>> I have a linear problem that I am currently solving with a matrix-free KSP. >>>>>>>> >>>>>>>> I would like to move on to a nonlinear problem, so I figured I could start by solving the same linear problem using SNES. So I am setting up the problem as follows: >>>>>>>> >>>>>>>> SNESCreate(PETSC_COMM_WORLD, &snes); >>>>>>>> MatCreateShell(PETSC_COMM_WORLD, n_dofs, n_dofs, PETSC_DETERMINE, PETSC_DETERMINE, &ctx, &J); >>>>>>>> MatCreateShell(PETSC_COMM_WORLD, n_dofs, n_dofs, PETSC_DETERMINE, PETSC_DETERMINE, &ctx, &B); >>>>>>>> MatShellSetOperation(J, MATOP_MULT, (void (*)(void))(Multiplication)); >>>>>>>> MatCreateVecs(J, &x_sol, &b); >>>>>>>> VecDuplicate(x_sol, &r); >>>>>>>> SNESSetFromOptions(snes); >>>>>>>> SNESSetFunction(snes, r, &(computeResidual), &ctx); >>>>>>>> SNESSetUseMatrixFree(snes, PETSC_FALSE, PETSC_TRUE); >>>>>>>> SNESGetLineSearch(snes, &linesearch); >>>>>>>> SNESGetKSP(snes, &ksp); >>>>>>>> KSPSetOperators(ksp, J, J); >>>>>>>> KSPSetInitialGuessNonzero(ksp, PETSC_TRUE); >>>>>>>> >>>>>>>> I tested it with a small problem (compiled in Debug) and it works. >>>>>>>> >>>>>>>> When I compiled it in Release, it crashes with a segfault. I tried running the Debug version through valgrind, but even for a small problem it is too slow. So I was wondering if you guys could see any rookie mistake in the lines I used above? >>>>>>>> >>>>>>>> Otherwise, is there any example that uses a SNES combined with a matrix-free KSP operator? 
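For reference, a minimal sketch of the built-in way to combine SNES with a matrix-free Krylov operator, using MatCreateSNESMF() so that PETSc applies J(u)*v by finite differencing of the residual; computeResidual, ctx and n_local are placeholders for the application's own routine, context and local size, and no preconditioner is applied, matching the setup above:

    SNES snes;
    Vec  x, r;
    Mat  J;
    KSP  ksp;
    PC   pc;

    PetscCall(SNESCreate(PETSC_COMM_WORLD, &snes));
    PetscCall(VecCreateMPI(PETSC_COMM_WORLD, n_local, PETSC_DETERMINE, &x));
    PetscCall(VecDuplicate(x, &r));
    PetscCall(SNESSetFunction(snes, r, computeResidual, &ctx)); /* nonlinear residual F(u) */
    PetscCall(MatCreateSNESMF(snes, &J));                       /* J applies J(u)*v by differencing F */
    PetscCall(SNESSetJacobian(snes, J, J, MatMFFDComputeJacobian, NULL));
    PetscCall(SNESGetKSP(snes, &ksp));
    PetscCall(KSPGetPC(ksp, &pc));
    PetscCall(PCSetType(pc, PCNONE));                           /* unpreconditioned Krylov solve */
    PetscCall(SNESSetFromOptions(snes));
    PetscCall(SNESSolve(snes, NULL, x));

The same behaviour can be requested from the command line with -snes_mf; with a user MatShell instead of MatCreateSNESMF(), the shell's multiply routine must itself know the current linearization point.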
>>>>>>>> >>>>>>>> Thank you, >>>>>>>> >>>>>>>> Daniel >>>>>>>> From mail2amneet at gmail.com Mon Nov 4 11:36:15 2024 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Mon, 4 Nov 2024 09:36:15 -0800 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: <87v7x8pagg.fsf@jedbrown.org> References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> Message-ID: Hi Jed, Do I need to create two separate MattNullSpace objects if I want to use both MatSetNullSpace() and MatSetNearNullSpace()? Thanks, On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: > Pierre Jolivet writes: > > >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: > >> > >> Interesting. I have seen hypre do fine on elasticity, but do you know > if boomeramg (classical) uses these vectors or is there a smoothed > aggregation solver in hypre? > > > > I?m not sure it is precisely ?standard? smoothed aggregation, see bottom > paragraph of > https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ > > I?ve never made it to work, but I know some do. > > A while back, Stefano gave me this pointer as well: > https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ > > It's still classical AMG, and in my experience, struggles on very thin > structures (e.g., aspect ratio 1000 cantilever beams) when compared to SA. > However, it can be quite competitive for many structures. I found that the > "MFEM elasticity suite", which is based on Baker et al 2010, gave rather > poor results. This is a configuration that works on GPUs and gives good > convergence and performance for elasticity: > > > https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ > > In the above issue, I was only using BoomerAMG as a coarse level for p-MG > so all the options have a `-mg_coarse_` prefix; here are those options > without the prefix: > > -pc_hypre_boomeramg_coarsen_type pmis > -pc_hypre_boomeramg_interp_type ext+i > -pc_hypre_boomeramg_no_CF > -pc_hypre_boomeramg_P_max 6 > -pc_hypre_boomeramg_print_statistics 1 > -pc_hypre_boomeramg_relax_type_down Chebyshev > -pc_hypre_boomeramg_relax_type_up Chebyshev > -pc_hypre_boomeramg_strong_threshold 0.5 > -pc_type hypre > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Mon Nov 4 12:48:08 2024 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Mon, 4 Nov 2024 19:48:08 +0100 Subject: [petsc-users] Advice on setting BCs in a PLEX Message-ID: Dear users, I have been using petsc's KSP for over 20 years and I am considering using DMPLEX to replace my own data structure in a mixed FEM/FVM CFD code. To do so, I am trying to understand DMPLEX by writing a 1D code that solves u_t = u_xx using PSEUDOTS. In order to prescribe Dirichlet BCs at the endpoints of the 1D box, I use PetscSectionSetConstraintDof when building the PetscSection. 
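As an illustration only (not the attached code), a sketch of such a section for a 1D mesh with one unknown per vertex and the two endpoint vertices constrained; here the endpoints are assumed to be the first and last vertex of the depth-0 stratum, whereas a general code would locate them through the "marker" or "Face Sets" label:

    PetscSection s;
    PetscInt     pStart, pEnd, vStart, vEnd;
    PetscInt     comp0 = 0;                                   /* the single component is the constrained one */

    PetscCall(DMPlexGetChart(dm, &pStart, &pEnd));
    PetscCall(DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd));  /* vertices */
    PetscCall(PetscSectionCreate(PETSC_COMM_WORLD, &s));
    PetscCall(PetscSectionSetChart(s, pStart, pEnd));
    for (PetscInt v = vStart; v < vEnd; ++v) PetscCall(PetscSectionSetDof(s, v, 1));
    PetscCall(PetscSectionSetConstraintDof(s, vStart, 1));    /* left Dirichlet vertex  */
    PetscCall(PetscSectionSetConstraintDof(s, vEnd - 1, 1));  /* right Dirichlet vertex */
    PetscCall(PetscSectionSetUp(s));
    PetscCall(PetscSectionSetConstraintIndices(s, vStart, &comp0));
    PetscCall(PetscSectionSetConstraintIndices(s, vEnd - 1, &comp0));
    PetscCall(DMSetLocalSection(dm, s));
    PetscCall(PetscSectionDestroy(&s));

With such a section in place, DMCreateGlobalVector() omits the two constrained dofs (hence a global vector of size 9 for 11 vertices in the attached log), while local vectors obtained from DMGlobalToLocal() still carry them.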
> A global vector is missing both the shared dofs which are not owned by > this process, as well as /constrained/ dofs. These constraints > represent essential (Dirichlet) boundary conditions. They are dofs > that have a given fixed value, so they are present in local vectors > for assembly purposes, but absent from global vectors since they are > never solved for during algebraic solves. My global Vec has indeed two entries less than the local one. When initializing the solution or evaluating the rhs function I transfer data from the global to local representation, do the calculation, then transfer back. I am doing something wrong, though, which shows up in the attached log file. My simple code is also attached. Thank you for your advice. Aldo -- Dr. Aldo Bonfiglioli Associate professor of Fluid Machines Scuola di Ingegneria Universita' della Basilicata V.le dell'Ateneo lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!cWgDpzKW9KGEkqxu775gDU488CBQ5itLl2l_HUsDKcPstbcuSm0_bZPHkMciXajTi3539fuMPe8TD1YZuirDfmSBMSXXxrQZWns$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- DM Object: box 1 MPI process type: plex box in 1 dimension: Number of 0-cells per rank: 11 Number of 1-cells per rank: 10 Labels: marker: 1 strata with value/size (1 (2)) Face Sets: 2 strata with value/size (1 (1), 2 (1)) depth: 2 strata with value/size (0 (11), 1 (10)) celltype: 2 strata with value/size (0 (11), 1 (10)) [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Point 10: Global dof 0 != 1 size - number of constraints [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc! [0]PETSC ERROR: Option left: name:-ts_max_steps value: 10 source: command line [0]PETSC ERROR: Option left: name:-ts_monitor (no value) source: command line [0]PETSC ERROR: See https://petsc.org/release/faq/ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.22.0, Sep 28, 2024 [0]PETSC ERROR: ./ex15f90 with 1 MPI process(es) and PETSC_ARCH linux_gnu on DESKTOP-04AR6V6.lan by abonfi Mon Nov 4 19:28:00 2024 [0]PETSC ERROR: Configure options: --with-mpi=1 --with-mpi-dir=/usr/lib64/mpi/gcc/mpich --with-shared-libraries=1 --with-debugging=1 --with-blas-lib=/usr/lib64/libblas.a --with-lapack-lib=/usr/lib64/liblapack.a --with-triangle=1 --download-triangle=yes [0]PETSC ERROR: #1 PetscSFSetGraphSection() at /home/abonfi/src/petsc-3.22.0/src/vec/is/sf/utils/sfutils.c:179 [0]PETSC ERROR: #2 DMCreateSectionSF() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:4783 [0]PETSC ERROR: #3 DMGetSectionSF() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:4722 [0]PETSC ERROR: #4 DMGlobalToLocalBegin() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:2851 [0]PETSC ERROR: #5 DMGlobalToLocal() at /home/abonfi/src/petsc-3.22.0/src/dm/interface/dm.c:2812 [0]PETSC ERROR: #6 ex15f90.F90:280 Abort(62) on node 0 (rank 0 in comm 16): application called MPI_Abort(MPI_COMM_SELF, 62) - process 0 there are 11 entries in depth: 0 there are 10 entries in depth: 1 Setting up a section on 11 meshpoints with 1 dofs The DM is marked Point 10 is a boundary point Point 20 is a boundary point Array u has type : seq (Global) Array u has size : 9 coordVec has size 11 -------------- next part -------------- A non-text attachment was scrubbed... Name: ex15f90.F90 Type: text/x-fortran Size: 14065 bytes Desc: not available URL: From jed at jedbrown.org Mon Nov 4 12:11:54 2024 From: jed at jedbrown.org (Jed Brown) Date: Mon, 04 Nov 2024 11:11:54 -0700 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> Message-ID: <87wmhi27it.fsf@jedbrown.org> Unless the problem is entirely floating (the true null space is all six rigid body modes), then they will be different, so yes, you'll typically have two MatNullSpace objects. Amneet Bhalla writes: > Hi Jed, > > Do I need to create two separate MattNullSpace objects if I want to use > both MatSetNullSpace() and MatSetNearNullSpace()? > > Thanks, > > > On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: > >> Pierre Jolivet writes: >> >> >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: >> >> >> >> Interesting. I have seen hypre do fine on elasticity, but do you know >> if boomeramg (classical) uses these vectors or is there a smoothed >> aggregation solver in hypre? >> > >> > I?m not sure it is precisely ?standard? smoothed aggregation, see bottom >> paragraph of >> https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ >> > I?ve never made it to work, but I know some do. >> > A while back, Stefano gave me this pointer as well: >> https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ >> >> It's still classical AMG, and in my experience, struggles on very thin >> structures (e.g., aspect ratio 1000 cantilever beams) when compared to SA. >> However, it can be quite competitive for many structures. 
I found that the >> "MFEM elasticity suite", which is based on Baker et al 2010, gave rather >> poor results. This is a configuration that works on GPUs and gives good >> convergence and performance for elasticity: >> >> >> https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ >> >> In the above issue, I was only using BoomerAMG as a coarse level for p-MG >> so all the options have a `-mg_coarse_` prefix; here are those options >> without the prefix: >> >> -pc_hypre_boomeramg_coarsen_type pmis >> -pc_hypre_boomeramg_interp_type ext+i >> -pc_hypre_boomeramg_no_CF >> -pc_hypre_boomeramg_P_max 6 >> -pc_hypre_boomeramg_print_statistics 1 >> -pc_hypre_boomeramg_relax_type_down Chebyshev >> -pc_hypre_boomeramg_relax_type_up Chebyshev >> -pc_hypre_boomeramg_strong_threshold 0.5 >> -pc_type hypre >> > > > -- > --Amneet From knepley at gmail.com Mon Nov 4 17:20:27 2024 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 5 Nov 2024 00:20:27 +0100 Subject: [petsc-users] Advice on setting BCs in a PLEX In-Reply-To: References: Message-ID: On Mon, Nov 4, 2024 at 7:10?PM Aldo Bonfiglioli wrote: > Dear users, > > I have been using petsc's KSP for over 20 years and I am considering using > DMPLEX to replace my own data structure in a mixed FEM/FVM CFD code. > > To do so, I am trying to understand DMPLEX by writing a 1D code that > solves u_t = u_xx using PSEUDOTS. > > In order to prescribe Dirichlet BCs at the endpoints of the 1D box, I use > PetscSectionSetConstraintDof when building the PetscSection. > > A global vector is missing both the shared dofs which are not owned by > this process, as well as *constrained* dofs. These constraints represent > essential (Dirichlet) boundary conditions. They are dofs that have a given > fixed value, so they are present in local vectors for assembly purposes, > but absent from global vectors since they are never solved for during > algebraic solves. > > > > My global Vec has indeed two entries less than the local one. > > When initializing the solution or evaluating the rhs function I transfer > data from the global to local representation, do the calculation, then > transfer back. > > I am doing something wrong, though, which shows up in the attached log > file. > > My simple code is also attached. > > Hi Aldo, I think we are very close to working. The PetscSection only care about the sizes of things, like vectors, and it seems like that is right. Now you want the boundary values to be put in your vectors (I think). There needs to be a routine to stick in the boundary values, which is DMPlexInsertBoundaryValues(). You can change this function by calling PetscObjectFunctionCompose() for "DMPlexInsertBoundaryValues_C". Does that make sense? Thanks, Matt > Thank you for your advice. > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Machines > Scuola di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Yef4Gll06T0STqTT9tclyiR99EATrA8nCwlcq7LOmCo4TPWg3fYGMZS0Ohp-XcuAHGtnTiIEQOhfmV77lCcX$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
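A rough sketch of the override Matt describes; the composed-function name is taken from his message, the composing call is spelled PetscObjectComposeFunction() in current PETSc, the custom routine is assumed to need the same calling sequence as DMPlexInsertBoundaryValues() itself, and MyInsertBC with its two boundary values is a placeholder:

    static PetscErrorCode MyInsertBC(DM dm, PetscBool insertEssential, Vec locX, PetscReal time, Vec faceGeomFVM, Vec cellGeomFVM, Vec gradFVM)
    {
      PetscSection s;
      PetscScalar *x;
      PetscInt     vStart, vEnd, off;

      PetscFunctionBeginUser;
      PetscCall(DMGetLocalSection(dm, &s));
      PetscCall(DMPlexGetDepthStratum(dm, 0, &vStart, &vEnd));
      PetscCall(VecGetArray(locX, &x));
      PetscCall(PetscSectionGetOffset(s, vStart, &off));
      x[off] = 1.0;                                 /* left Dirichlet value (placeholder) */
      PetscCall(PetscSectionGetOffset(s, vEnd - 1, &off));
      x[off] = 0.0;                                 /* right Dirichlet value (placeholder) */
      PetscCall(VecRestoreArray(locX, &x));
      PetscFunctionReturn(PETSC_SUCCESS);
    }

    /* after building the DM: */
    PetscCall(PetscObjectComposeFunction((PetscObject)dm, "DMPlexInsertBoundaryValues_C", MyInsertBC));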
-- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Yef4Gll06T0STqTT9tclyiR99EATrA8nCwlcq7LOmCo4TPWg3fYGMZS0Ohp-XcuAHGtnTiIEQOhfmSraytgu$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mail2amneet at gmail.com Mon Nov 4 20:03:23 2024 From: mail2amneet at gmail.com (Amneet Bhalla) Date: Mon, 4 Nov 2024 18:03:23 -0800 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: <87wmhi27it.fsf@jedbrown.org> References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> <87wmhi27it.fsf@jedbrown.org> Message-ID: I set the rigid body null vectors but PETSc errors out that these are not orthogonal. Is there a canned routine in PETSc to orthogonalize a bunch of Vecs? [0]PETSC ERROR: Invalid argument [0]PETSC ERROR: Vector 0 must be orthogonal to vector 2, inner product is 0.612391 [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!e0btVnW2NyDoumIgYMZpmOJ8Vvkh18HjVCEY3LyPW9fAsr4jD7MqRxxBLgh-q49bw3jvM0FZc2A_s2U6RxXpfK06Dg$ for trouble shooting. [0]PETSC ERROR: Petsc Release Version 3.17.5, unknown [0]PETSC ERROR: ./acoustic_streaming_hier_integrator_2d on a darwin-dbg named APSB-MacBook-Pro-16.local by amneetb Mon Nov 4 18:01:09 2024 [0]PETSC ERROR: Configure options --CC=mpicc --CXX=mpicxx --FC=mpif90 --PETSC_ARCH=darwin-dbg --with-debugging=1 --download-hypre=1 --with-x=0 -download-mumps -download-scalapack -download-parmetis -download-metis -download-ptscotch [0]PETSC ERROR: #1 MatNullSpaceCreate() at /Users/amneetb/Softwares/PETSc-Gitlab/PETSc/src/mat/interface/matnull.c:271 [0]PETSC ERROR: #2 resetMatNearNullspace() at ../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp:697 P=00000:Program abort called in file ``../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp'' at line 697 P=00000:ERROR MESSAGE: P=00000: Abort trap: 6 On Mon, Nov 4, 2024 at 10:11?AM Jed Brown wrote: > Unless the problem is entirely floating (the true null space is all six > rigid body modes), then they will be different, so yes, you'll typically > have two MatNullSpace objects. > > Amneet Bhalla writes: > > > Hi Jed, > > > > Do I need to create two separate MattNullSpace objects if I want to use > > both MatSetNullSpace() and MatSetNearNullSpace()? > > > > Thanks, > > > > > > On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: > > > >> Pierre Jolivet writes: > >> > >> >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: > >> >> > >> >> Interesting. I have seen hypre do fine on elasticity, but do you know > >> if boomeramg (classical) uses these vectors or is there a smoothed > >> aggregation solver in hypre? > >> > > >> > I?m not sure it is precisely ?standard? smoothed aggregation, see > bottom > >> paragraph of > >> > https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ > >> > I?ve never made it to work, but I know some do. 
> >> > A while back, Stefano gave me this pointer as well: > >> > https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ > >> > >> It's still classical AMG, and in my experience, struggles on very thin > >> structures (e.g., aspect ratio 1000 cantilever beams) when compared to > SA. > >> However, it can be quite competitive for many structures. I found that > the > >> "MFEM elasticity suite", which is based on Baker et al 2010, gave rather > >> poor results. This is a configuration that works on GPUs and gives good > >> convergence and performance for elasticity: > >> > >> > >> > https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ > >> > >> In the above issue, I was only using BoomerAMG as a coarse level for > p-MG > >> so all the options have a `-mg_coarse_` prefix; here are those options > >> without the prefix: > >> > >> -pc_hypre_boomeramg_coarsen_type pmis > >> -pc_hypre_boomeramg_interp_type ext+i > >> -pc_hypre_boomeramg_no_CF > >> -pc_hypre_boomeramg_P_max 6 > >> -pc_hypre_boomeramg_print_statistics 1 > >> -pc_hypre_boomeramg_relax_type_down Chebyshev > >> -pc_hypre_boomeramg_relax_type_up Chebyshev > >> -pc_hypre_boomeramg_strong_threshold 0.5 > >> -pc_type hypre > >> > > > > > > -- > > --Amneet > -- --Amneet -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Nov 5 14:40:53 2024 From: jed at jedbrown.org (Jed Brown) Date: Tue, 05 Nov 2024 13:40:53 -0700 Subject: [petsc-users] Rigid body nullspace for Stokes operator In-Reply-To: References: <875xpas35p.fsf@jedbrown.org> <87bjz1qtk6.fsf@jedbrown.org> <87y124q31n.fsf@jedbrown.org> <96BE4263-8C34-4A2E-91B4-305F94FFCAB4@joliv.et> <87v7x8pagg.fsf@jedbrown.org> <87wmhi27it.fsf@jedbrown.org> Message-ID: <87jzdhxvl6.fsf@jedbrown.org> The code snippet I shared contained orthogonalization. There isn't a VecQR or VecOrthogonalize, though such a utility would be useful. Right-looking modified Gram-Schmidt would be fine for that purpose, though Cholesky QR(2) may be a bit faster. Amneet Bhalla writes: > I set the rigid body null vectors but PETSc errors out that these are not > orthogonal. Is there a canned routine in PETSc to orthogonalize a bunch of > Vecs? > > [0]PETSC ERROR: Invalid argument > > [0]PETSC ERROR: Vector 0 must be orthogonal to vector 2, inner product is > 0.612391 > > [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ZflPyOP4IDILfa8HbHnR6mnArgOdC6dZ_mtE9devYndPR0QSR4S1__A-ax9BF-jpScXFYeYXsw3fTEv4TSk$ for trouble shooting. 
> > [0]PETSC ERROR: Petsc Release Version 3.17.5, unknown > > [0]PETSC ERROR: ./acoustic_streaming_hier_integrator_2d on a darwin-dbg > named APSB-MacBook-Pro-16.local by amneetb Mon Nov 4 18:01:09 2024 > > [0]PETSC ERROR: Configure options --CC=mpicc --CXX=mpicxx --FC=mpif90 > --PETSC_ARCH=darwin-dbg --with-debugging=1 --download-hypre=1 --with-x=0 > -download-mumps -download-scalapack -download-parmetis -download-metis > -download-ptscotch > > [0]PETSC ERROR: #1 MatNullSpaceCreate() at > /Users/amneetb/Softwares/PETSc-Gitlab/PETSc/src/mat/interface/matnull.c:271 > > [0]PETSC ERROR: #2 resetMatNearNullspace() at > ../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp:697 > > P=00000:Program abort called in file > ``../../../IBAMR/ibtk/lib/../src/solvers/impls/PETScKrylovLinearSolver.cpp'' > at line 697 > > P=00000:ERROR MESSAGE: > > P=00000: > Abort trap: 6 > > On Mon, Nov 4, 2024 at 10:11?AM Jed Brown wrote: > >> Unless the problem is entirely floating (the true null space is all six >> rigid body modes), then they will be different, so yes, you'll typically >> have two MatNullSpace objects. >> >> Amneet Bhalla writes: >> >> > Hi Jed, >> > >> > Do I need to create two separate MattNullSpace objects if I want to use >> > both MatSetNullSpace() and MatSetNearNullSpace()? >> > >> > Thanks, >> > >> > >> > On Thu, Oct 31, 2024 at 8:18?AM Jed Brown wrote: >> > >> >> Pierre Jolivet writes: >> >> >> >> >> On 31 Oct 2024, at 2:47?PM, Mark Adams wrote: >> >> >> >> >> >> Interesting. I have seen hypre do fine on elasticity, but do you know >> >> if boomeramg (classical) uses these vectors or is there a smoothed >> >> aggregation solver in hypre? >> >> > >> >> > I?m not sure it is precisely ?standard? smoothed aggregation, see >> bottom >> >> paragraph of >> >> >> https://urldefense.us/v3/__https://hypre.readthedocs.io/en/latest/solvers-boomeramg.html*amg-for-systems-of-pdes__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0yzixutBg$ >> >> > I?ve never made it to work, but I know some do. >> >> > A while back, Stefano gave me this pointer as well: >> >> >> https://urldefense.us/v3/__https://github.com/mfem/mfem/blob/17955e114020af340e9a06a66ebef43e05012d9c/linalg/hypre.cpp*L5245__;Iw!!G_uCfscf7eWS!fdj4AzKhAcDmM1x_ZQ8gxqWeX6BAKY9urnvATMpT7hC8lw77ak7tqxqXGIX3PMg2wYA5PGu7EzyCW0wVEi33Pw$ >> >> >> >> It's still classical AMG, and in my experience, struggles on very thin >> >> structures (e.g., aspect ratio 1000 cantilever beams) when compared to >> SA. >> >> However, it can be quite competitive for many structures. I found that >> the >> >> "MFEM elasticity suite", which is based on Baker et al 2010, gave rather >> >> poor results. 
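Since PETSc offers no canned Vec orthogonalization routine, a small sketch of the Gram-Schmidt pass Jed suggests, applied to the array of null-space vectors before they are handed to MatNullSpaceCreate(), which requires an orthonormal set; written against a recent PETSc (older versions return 0 instead of PETSC_SUCCESS):

    static PetscErrorCode OrthonormalizeVecs(PetscInt n, Vec v[])
    {
      PetscFunctionBeginUser;
      for (PetscInt i = 0; i < n; ++i) {
        for (PetscInt j = 0; j < i; ++j) {
          PetscScalar dot;

          PetscCall(VecDot(v[i], v[j], &dot));   /* projection onto the already-normalized v[j] */
          PetscCall(VecAXPY(v[i], -dot, v[j]));  /* remove that component */
        }
        PetscCall(VecNormalize(v[i], NULL));
      }
      PetscFunctionReturn(PETSC_SUCCESS);
    }

For rigid body modes specifically, MatNullSpaceCreateRigidBody() builds an orthonormal set of translation and rotation vectors directly from a coordinate Vec, and the result can be attached with MatSetNearNullSpace().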
This is a configuration that works on GPUs and gives good >> >> convergence and performance for elasticity: >> >> >> >> >> >> >> https://urldefense.us/v3/__https://github.com/hypre-space/hypre/issues/601*issuecomment-1069426997__;Iw!!G_uCfscf7eWS!arUVBVKKcYs1M5OhNqqRZl2b2o0NIUkG7fV_22qBbg-ssHhhHazhkpMbYNjCOTN66Sfbk-VZilfox9bxDf0$ >> >> >> >> In the above issue, I was only using BoomerAMG as a coarse level for >> p-MG >> >> so all the options have a `-mg_coarse_` prefix; here are those options >> >> without the prefix: >> >> >> >> -pc_hypre_boomeramg_coarsen_type pmis >> >> -pc_hypre_boomeramg_interp_type ext+i >> >> -pc_hypre_boomeramg_no_CF >> >> -pc_hypre_boomeramg_P_max 6 >> >> -pc_hypre_boomeramg_print_statistics 1 >> >> -pc_hypre_boomeramg_relax_type_down Chebyshev >> >> -pc_hypre_boomeramg_relax_type_up Chebyshev >> >> -pc_hypre_boomeramg_strong_threshold 0.5 >> >> -pc_type hypre >> >> >> > >> > >> > -- >> > --Amneet >> > > > -- > --Amneet From e.t.a.vanderweide at utwente.nl Wed Nov 6 10:36:02 2024 From: e.t.a.vanderweide at utwente.nl (Weide, Edwin van der (UT-ET)) Date: Wed, 6 Nov 2024 16:36:02 +0000 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation Message-ID: Hi, I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. // Set up the matrix free evaluation of the Jacobian times a vector // by setting the appropriate function in snes. PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nEqns, nEqns, this, &mJac)); PetscCall(MatShellSetOperation(mJac, MATOP_MULT, (void (*)(void))JacobianTimesVector)); PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); // Set the function to be used as preconditioner for the krylov solver. KSP ksp; PC pc; PetscCall(SNESGetKSP(mSnes, &ksp)); PetscCall(KSPGetPC(ksp, &pc)); PetscCall(PCSetType(pc, PCSHELL)); PetscCall(PCSetApplicationContext(pc, this)); PetscCall(PCShellSetApply(pc, Preconditioner)); For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. 
[0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 In the function SNESSetUpMatrices the source looks as follows 784 } else if (!snes->jacobian_pre) { 785 PetscDS prob; 786 Mat J, B; 787 PetscBool hasPrec = PETSC_FALSE; 788 789 J = snes->jacobian; 790 PetscCall(DMGetDS(dm, &prob)); 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); 794 PetscCall(DMCreateMatrix(snes->dm, &B)); 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); 796 PetscCall(MatDestroy(&J)); 797 PetscCall(MatDestroy(&B)); 798 } It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? If needed, I can provide the source code for which this problem occurs. Thanks, Edwin --------------------------------------------------- Edwin van der Weide Department of Mechanical Engineering University of Twente Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 6 10:52:44 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Nov 2024 11:52:44 -0500 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. Barry > On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users wrote: > > Hi, > > I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. 
> > // Set up the matrix free evaluation of the Jacobian times a vector > // by setting the appropriate function in snes. > PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, > nEqns, nEqns, this, &mJac)); > PetscCall(MatShellSetOperation(mJac, MATOP_MULT, > (void (*)(void))JacobianTimesVector)); > > PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); > > // Set the function to be used as preconditioner for the krylov solver. > KSP ksp; > PC pc; > PetscCall(SNESGetKSP(mSnes, &ksp)); > PetscCall(KSPGetPC(ksp, &pc)); > PetscCall(PCSetType(pc, PCSHELL)); > PetscCall(PCSetApplicationContext(pc, this)); > PetscCall(PCShellSetApply(pc, Preconditioner)); > > For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. > > [0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 > [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 > [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 > [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 > [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 > [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 > [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 > [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 > [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 > [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 > [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 > [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 > [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 > [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 > [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 > > In the function SNESSetUpMatrices the source looks as follows > > 784 } else if (!snes->jacobian_pre) { > 785 PetscDS prob; > 786 Mat J, B; > 787 PetscBool hasPrec = PETSC_FALSE; > 788 > 789 J = snes->jacobian; > 790 PetscCall(DMGetDS(dm, &prob)); > 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); > 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); > 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); > 794 PetscCall(DMCreateMatrix(snes->dm, &B)); > 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); > 796 PetscCall(MatDestroy(&J)); > 797 PetscCall(MatDestroy(&B)); > 798 } > > It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. > > Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? 
> > If needed, I can provide the source code for which this problem occurs. > Thanks, > > Edwin > --------------------------------------------------- > Edwin van der Weide > Department of Mechanical Engineering > University of Twente > Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From e.t.a.vanderweide at utwente.nl Wed Nov 6 11:01:48 2024 From: e.t.a.vanderweide at utwente.nl (Weide, Edwin van der (UT-ET)) Date: Wed, 6 Nov 2024 17:01:48 +0000 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: Barry, If I do that, I get the following error [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!ZZB4be6WT2eisTFLrrHknlukvw3s7eibKfF0XMaUNTO9axbKKdE3l3C_RbgSVoOHvWH97FEJKHr6mY18J8kMv23zN-9CL9kUhgje$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!ZZB4be6WT2eisTFLrrHknlukvw3s7eibKfF0XMaUNTO9axbKKdE3l3C_RbgSVoOHvWH97FEJKHr6mY18J8kMv23zN-9CLwFG0b6V$ [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 SNES callback Jacobian [0]PETSC ERROR: #2 SNESComputeJacobian() at /home/vdweide/petsc/src/snes/interface/snes.c:2966 [0]PETSC ERROR: #3 SNESSolve_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:218 [0]PETSC ERROR: #4 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4841 [0]PETSC ERROR: #5 SolveCurrentStage() at SolverClass.cpp:502 [0]PETSC ERROR: #6 main() at Condensation.cpp:20 -------------------------------------------------------------------------- So SNES tries to call the call back function for the Jacobian, but that is not provided. Hence the failure. Regards, Edwin ________________________________ From: Barry Smith Sent: Wednesday, November 6, 2024 5:52 PM To: Weide, Edwin van der (UT-ET) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. Barry On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users wrote: Hi, I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. // Set up the matrix free evaluation of the Jacobian times a vector // by setting the appropriate function in snes. PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nEqns, nEqns, this, &mJac)); PetscCall(MatShellSetOperation(mJac, MATOP_MULT, (void (*)(void))JacobianTimesVector)); PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); // Set the function to be used as preconditioner for the krylov solver. 
KSP ksp; PC pc; PetscCall(SNESGetKSP(mSnes, &ksp)); PetscCall(KSPGetPC(ksp, &pc)); PetscCall(PCSetType(pc, PCSHELL)); PetscCall(PCSetApplicationContext(pc, this)); PetscCall(PCShellSetApply(pc, Preconditioner)); For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. [0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 In the function SNESSetUpMatrices the source looks as follows 784 } else if (!snes->jacobian_pre) { 785 PetscDS prob; 786 Mat J, B; 787 PetscBool hasPrec = PETSC_FALSE; 788 789 J = snes->jacobian; 790 PetscCall(DMGetDS(dm, &prob)); 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); 794 PetscCall(DMCreateMatrix(snes->dm, &B)); 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); 796 PetscCall(MatDestroy(&J)); 797 PetscCall(MatDestroy(&B)); 798 } It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? If needed, I can provide the source code for which this problem occurs. Thanks, Edwin --------------------------------------------------- Edwin van der Weide Department of Mechanical Engineering University of Twente Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Wed Nov 6 12:03:47 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 6 Nov 2024 13:03:47 -0500 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: You need to provide a callback function. Why? Otherwise your MatShell and PCshell have no way of knowing at what location the Jacobian is suppose to be evaluated at (in a matrix free way). That is the x for which J(x) is used. Normally one puts the x into the application context of mJac and accesses it every time the matmult is called. Similarly it needs to be accessed in application of your preconditioner. > On Nov 6, 2024, at 12:01?PM, Weide, Edwin van der (UT-ET) wrote: > > Barry, > > If I do that, I get the following error > > [0]PETSC ERROR: ------------------------------------------------------------------------ > [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range > [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger > [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!elK6sfqwXIIaLqDzUlKw8Wy--UjvmZXYI0gRqu_nvUNMbIx2rMEX4aHIYXWikS4p4zHPTXyicwT9SEY8-JQt10s$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!elK6sfqwXIIaLqDzUlKw8Wy--UjvmZXYI0gRqu_nvUNMbIx2rMEX4aHIYXWikS4p4zHPTXyicwT9SEY8mXmEtXk$ > [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback are not always exact. > [0]PETSC ERROR: #1 SNES callback Jacobian > [0]PETSC ERROR: #2 SNESComputeJacobian() at /home/vdweide/petsc/src/snes/interface/snes.c:2966 > [0]PETSC ERROR: #3 SNESSolve_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:218 > [0]PETSC ERROR: #4 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4841 > [0]PETSC ERROR: #5 SolveCurrentStage() at SolverClass.cpp:502 > [0]PETSC ERROR: #6 main() at Condensation.cpp:20 > -------------------------------------------------------------------------- > > So SNES tries to call the call back function for the Jacobian, but that is not provided. Hence the failure. > Regards, > > Edwin > > From: Barry Smith > > Sent: Wednesday, November 6, 2024 5:52 PM > To: Weide, Edwin van der (UT-ET) > > Cc: petsc-users at mcs.anl.gov > > Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation > > > Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. > > Barry > > >> On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users > wrote: >> >> Hi, >> >> I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. >> >> // Set up the matrix free evaluation of the Jacobian times a vector >> // by setting the appropriate function in snes. >> PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, >> nEqns, nEqns, this, &mJac)); >> PetscCall(MatShellSetOperation(mJac, MATOP_MULT, >> (void (*)(void))JacobianTimesVector)); >> >> PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); >> >> // Set the function to be used as preconditioner for the krylov solver. 
>> KSP ksp; >> PC pc; >> PetscCall(SNESGetKSP(mSnes, &ksp)); >> PetscCall(KSPGetPC(ksp, &pc)); >> PetscCall(PCSetType(pc, PCSHELL)); >> PetscCall(PCSetApplicationContext(pc, this)); >> PetscCall(PCShellSetApply(pc, Preconditioner)); >> >> For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. >> >> [0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 >> [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 >> [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 >> [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 >> [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 >> [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 >> [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 >> [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 >> [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 >> [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 >> [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 >> [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 >> [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 >> [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 >> [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 >> >> In the function SNESSetUpMatrices the source looks as follows >> >> 784 } else if (!snes->jacobian_pre) { >> 785 PetscDS prob; >> 786 Mat J, B; >> 787 PetscBool hasPrec = PETSC_FALSE; >> 788 >> 789 J = snes->jacobian; >> 790 PetscCall(DMGetDS(dm, &prob)); >> 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); >> 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); >> 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); >> 794 PetscCall(DMCreateMatrix(snes->dm, &B)); >> 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); >> 796 PetscCall(MatDestroy(&J)); >> 797 PetscCall(MatDestroy(&B)); >> 798 } >> >> It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. >> >> Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? >> >> If needed, I can provide the source code for which this problem occurs. >> Thanks, >> >> Edwin >> --------------------------------------------------- >> Edwin van der Weide >> Department of Mechanical Engineering >> University of Twente >> Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... 
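Putting the two pieces of advice together, a rough sketch; FormJacobianShell, SolverClass and the mXBase member are placeholder names, not taken from the original code. The Jacobian callback only records the current linearization point, which JacobianTimesVector and Preconditioner then read back through the shared context:

    // Called by SNES at each Newton step with the current iterate x.
    static PetscErrorCode FormJacobianShell(SNES snes, Vec x, Mat J, Mat P, void *ctx)
    {
      SolverClass *solver = (SolverClass *)ctx;
      PetscFunctionBeginUser;
      PetscCall(VecCopy(x, solver->mXBase));  // remember where J and the preconditioner are evaluated
      PetscFunctionReturn(PETSC_SUCCESS);
    }

    // Pass mJac twice so SNES has a placeholder Pmat and does not try to build a (dense) matrix,
    // and register the callback that keeps the base vector up to date.
    PetscCall(SNESSetJacobian(mSnes, mJac, mJac, FormJacobianShell, this));

Inside JacobianTimesVector the same object is available through MatShellGetContext(), since it was passed to MatCreateShell() as the shell context.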
URL: From C.Klaij at marin.nl Thu Nov 7 04:19:38 2024 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 7 Nov 2024 10:19:38 +0000 Subject: [petsc-users] null space problem with -pc_type lu on single proc Message-ID: I'm trying to solve a system with a single, non-constant, null space vector, confirmed by passing the MatNullSpaceTest. Solving the system works fine with -pc_type ilu on one or multiple procs. It also works fine with -pc_type lu on multiple procs but fails on a single proc. Any idea what could be wrong? Chris [0]PETSC ERROR: *** unknown floating point error occurred *** [0]PETSC ERROR: The specific exception can be determined by running in a debugger. When the [0]PETSC ERROR: debugger traps the signal, the exception can be found with fetestexcept(0x3f) [0]PETSC ERROR: where the result is a bitwise OR of the following flags: [0]PETSC ERROR: FE_INVALID=0x1 FE_DIVBYZERO=0x4 FE_OVERFLOW=0x8 FE_UNDERFLOW=0x10 FE_INEXACT=0x20 [0]PETSC ERROR: Try option -start_in_debugger [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 PetscDefaultFPTrap() at /cm/shared/apps/petsc/oneapi/build/src/src/sys/error/fp.c:487 [0]PETSC ERROR: #2 VecMDot_Seq() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/seq/dvec2.c:189 [0]PETSC ERROR: #3 VecMXDot_MPI_Default() at /cm/shared/apps/petsc/oneapi/build/src/include/../src/vec/vec/impls/mpi/pvecimpl.h:96 [0]PETSC ERROR: #4 VecMDot_MPI() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/mpi/pvec2.c:25 [0]PETSC ERROR: #5 VecMXDot_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1112 [0]PETSC ERROR: #6 VecMDot() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1184 [0]PETSC ERROR: #7 MatNullSpaceRemove() at /cm/shared/apps/petsc/oneapi/build/src/src/mat/interface/matnull.c:359 [0]PETSC ERROR: #8 KSP_RemoveNullSpace() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:322 [0]PETSC ERROR: #9 KSP_PCApply() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:382 [0]PETSC ERROR: #10 KSPInitialResidual() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itres.c:64 [0]PETSC ERROR: #11 KSPSolve_GMRES() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/impls/gmres/gmres.c:226 [0]PETSC ERROR: #12 KSPSolve_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:898 [0]PETSC ERROR: #13 KSPSolve() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:1070 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Floating point exception [0]PETSC ERROR: trapped floating point error [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dXxAqHgHbtticGbE-cswI2ylVIt4A7Fn68LPFMyFtpJByLQtJocy5JF-Vf5Vzr_FbcWB1l7ve_2l9CdBfaa2Fyw$ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.19.4, Jul 31, 2023 [0]PETSC ERROR: ./refresco on a named marclus3login2 by cklaij Thu Nov 7 11:03:22 2024 [0]PETSC ERROR: Configure options --prefix=/cm/shared/apps/petsc/oneapi/3.19.4-dbg --with-mpi-dir=/cm/shared/apps/intel/oneapi/mpi/2021.4.0 --with-x=0 --with-mpe=0 --with-debugging=1 --download-superlu_dist=../superlu_dist-8.1.2.tar.gz --with-blaslapack-dir=/cm/shared/apps/intel/oneapi/mkl/2021.4.0 --download-parmetis=../parmetis-4.0.3-p9.tar.gz --download-metis=../metis-5.1.0-p11.tar.gz --with-packages-build-dir=/cm/shared/apps/petsc/oneapi/build --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" FCFLAGS="-funroll-all-loops -O3 -DNDEBUG" F90FLAGS="-funroll-all-loops -O3 -DNDEBUG" FOPTFLAGS="-funroll-all-loops -O3 -DNDEBUG" Abort(72) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 72) - process 0 dr. ir. Christiaan Klaij | Senior Researcher | Research & Development T +31 317 49 33 44 | C.Klaij at marin.nl | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!dXxAqHgHbtticGbE-cswI2ylVIt4A7Fn68LPFMyFtpJByLQtJocy5JF-Vf5Vzr_FbcWB1l7ve_2l9CdBwOll570$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image376413.png Type: image/png Size: 5004 bytes Desc: image376413.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image225686.png Type: image/png Size: 487 bytes Desc: image225686.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image478264.png Type: image/png Size: 504 bytes Desc: image478264.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image765396.png Type: image/png Size: 482 bytes Desc: image765396.png URL: From stefano.zampini at gmail.com Thu Nov 7 07:20:27 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 7 Nov 2024 16:20:27 +0300 Subject: [petsc-users] null space problem with -pc_type lu on single proc In-Reply-To: References: Message-ID: the default LU solver in sequential is the PETSc one which does not support pivoting or singular problems. In parallel, it is either MUMPS or SUPERLU_DIST, depending on your configurations. MUMPS for example can handle singular problem, not sure about superlu_dist. You can run the parallel version with -ksp_view and see what is the solver package used. Supposing it is mumps, you can run the sequential code with -pc_factor_mat_solver_type mumps Il giorno gio 7 nov 2024 alle ore 13:20 Klaij, Christiaan via petsc-users < petsc-users at mcs.anl.gov> ha scritto: > I'm trying to solve a system with a single, non-constant, null space > vector, confirmed by passing the MatNullSpaceTest. Solving the system works > fine with -pc_type ilu on one or multiple procs. It also works fine with > -pc_type lu on multiple procs but fails on a single proc. Any idea what > could be wrong? > > Chris > > [0]PETSC ERROR: *** unknown floating point error occurred *** > [0]PETSC ERROR: The specific exception can be determined by running in a > debugger. 
When the > [0]PETSC ERROR: debugger traps the signal, the exception can be found with > fetestexcept(0x3f) > [0]PETSC ERROR: where the result is a bitwise OR of the following flags: > [0]PETSC ERROR: FE_INVALID=0x1 FE_DIVBYZERO=0x4 FE_OVERFLOW=0x8 > FE_UNDERFLOW=0x10 FE_INEXACT=0x20 > [0]PETSC ERROR: Try option -start_in_debugger > [0]PETSC ERROR: likely location of problem given in stack below > [0]PETSC ERROR: --------------------- Stack Frames > ------------------------------------ > [0]PETSC ERROR: The line numbers in the error traceback are not always > exact. > [0]PETSC ERROR: #1 PetscDefaultFPTrap() at > /cm/shared/apps/petsc/oneapi/build/src/src/sys/error/fp.c:487 > [0]PETSC ERROR: #2 VecMDot_Seq() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/seq/dvec2.c:189 > [0]PETSC ERROR: #3 VecMXDot_MPI_Default() at > /cm/shared/apps/petsc/oneapi/build/src/include/../src/vec/vec/impls/mpi/pvecimpl.h:96 > [0]PETSC ERROR: #4 VecMDot_MPI() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/mpi/pvec2.c:25 > [0]PETSC ERROR: #5 VecMXDot_Private() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1112 > [0]PETSC ERROR: #6 VecMDot() at > /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1184 > [0]PETSC ERROR: #7 MatNullSpaceRemove() at > /cm/shared/apps/petsc/oneapi/build/src/src/mat/interface/matnull.c:359 > [0]PETSC ERROR: #8 KSP_RemoveNullSpace() at > /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:322 > [0]PETSC ERROR: #9 KSP_PCApply() at > /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:382 > [0]PETSC ERROR: #10 KSPInitialResidual() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itres.c:64 > [0]PETSC ERROR: #11 KSPSolve_GMRES() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/impls/gmres/gmres.c:226 > [0]PETSC ERROR: #12 KSPSolve_Private() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:898 > [0]PETSC ERROR: #13 KSPSolve() at > /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:1070 > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Floating point exception > [0]PETSC ERROR: trapped floating point error > [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!YEL_7P5zQIFYD7bhZCjy9YmnEJKPrk8uuZ-HgzitChAgA3TpOY2MKoRQcBcC6LuKxc3x8WMoVQ62WS53A7XCLS4niHz5JWI$ > > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.19.4, Jul 31, 2023 > [0]PETSC ERROR: ./refresco on a named marclus3login2 by cklaij Thu Nov 7 > 11:03:22 2024 > [0]PETSC ERROR: Configure options > --prefix=/cm/shared/apps/petsc/oneapi/3.19.4-dbg > --with-mpi-dir=/cm/shared/apps/intel/oneapi/mpi/2021.4.0 --with-x=0 > --with-mpe=0 --with-debugging=1 > --download-superlu_dist=../superlu_dist-8.1.2.tar.gz > --with-blaslapack-dir=/cm/shared/apps/intel/oneapi/mkl/2021.4.0 > --download-parmetis=../parmetis-4.0.3-p9.tar.gz > --download-metis=../metis-5.1.0-p11.tar.gz > --with-packages-build-dir=/cm/shared/apps/petsc/oneapi/build --with-ssl=0 > --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 > -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" > COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" > CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" > FCFLAGS="-funroll-all-loops -O3 -DNDEBUG" F90FLAGS="-funroll-all-loops -O3 > -DNDEBUG" FOPTFLAGS="-funroll-all-loops -O3 -DNDEBUG" > Abort(72) on node 0 (rank 0 in comm 0): application called > MPI_Abort(MPI_COMM_WORLD, 72) - process 0 > dr. ir.???? Christiaan Klaij > | Senior Researcher | Research & Development > T +31 317 49 33 44 <+31%20317%2049%2033%2044> | C.Klaij at marin.nl | > https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!YEL_7P5zQIFYD7bhZCjy9YmnEJKPrk8uuZ-HgzitChAgA3TpOY2MKoRQcBcC6LuKxc3x8WMoVQ62WS53A7XCLS4nR49qdbI$ > > [image: Facebook] > > [image: LinkedIn] > > [image: YouTube] > > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image376413.png Type: image/png Size: 5004 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image225686.png Type: image/png Size: 487 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image478264.png Type: image/png Size: 504 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image765396.png Type: image/png Size: 482 bytes Desc: not available URL: From C.Klaij at marin.nl Thu Nov 7 08:42:27 2024 From: C.Klaij at marin.nl (Klaij, Christiaan) Date: Thu, 7 Nov 2024 14:42:27 +0000 Subject: [petsc-users] null space problem with -pc_type lu on single proc In-Reply-To: References: Message-ID: Thanks for explaining, Stefano. I'm using superlu so -pc_factor_mat_solver_type superlu_dist did the trick. Chris ________________________________________ From: Stefano Zampini Sent: Thursday, November 7, 2024 2:20 PM To: Klaij, Christiaan Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] null space problem with -pc_type lu on single proc the default LU solver in sequential is the PETSc one which does not support pivoting or singular problems. In parallel, it is either MUMPS or SUPERLU_DIST, depending on your configurations. MUMPS for example can handle singular problem, not sure about superlu_dist. You can run the parallel version with -ksp_view and see what is the solver package used. Supposing it is mumps, you can run the sequential code with -pc_factor_mat_solver_type mumps Il giorno gio 7 nov 2024 alle ore 13:20 Klaij, Christiaan via petsc-users > ha scritto: I'm trying to solve a system with a single, non-constant, null space vector, confirmed by passing the MatNullSpaceTest. 
Solving the system works fine with -pc_type ilu on one or multiple procs. It also works fine with -pc_type lu on multiple procs but fails on a single proc. Any idea what could be wrong? Chris [0]PETSC ERROR: *** unknown floating point error occurred *** [0]PETSC ERROR: The specific exception can be determined by running in a debugger. When the [0]PETSC ERROR: debugger traps the signal, the exception can be found with fetestexcept(0x3f) [0]PETSC ERROR: where the result is a bitwise OR of the following flags: [0]PETSC ERROR: FE_INVALID=0x1 FE_DIVBYZERO=0x4 FE_OVERFLOW=0x8 FE_UNDERFLOW=0x10 FE_INEXACT=0x20 [0]PETSC ERROR: Try option -start_in_debugger [0]PETSC ERROR: likely location of problem given in stack below [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 PetscDefaultFPTrap() at /cm/shared/apps/petsc/oneapi/build/src/src/sys/error/fp.c:487 [0]PETSC ERROR: #2 VecMDot_Seq() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/seq/dvec2.c:189 [0]PETSC ERROR: #3 VecMXDot_MPI_Default() at /cm/shared/apps/petsc/oneapi/build/src/include/../src/vec/vec/impls/mpi/pvecimpl.h:96 [0]PETSC ERROR: #4 VecMDot_MPI() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/impls/mpi/pvec2.c:25 [0]PETSC ERROR: #5 VecMXDot_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1112 [0]PETSC ERROR: #6 VecMDot() at /cm/shared/apps/petsc/oneapi/build/src/src/vec/vec/interface/rvector.c:1184 [0]PETSC ERROR: #7 MatNullSpaceRemove() at /cm/shared/apps/petsc/oneapi/build/src/src/mat/interface/matnull.c:359 [0]PETSC ERROR: #8 KSP_RemoveNullSpace() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:322 [0]PETSC ERROR: #9 KSP_PCApply() at /cm/shared/apps/petsc/oneapi/build/src/include/petsc/private/kspimpl.h:382 [0]PETSC ERROR: #10 KSPInitialResidual() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itres.c:64 [0]PETSC ERROR: #11 KSPSolve_GMRES() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/impls/gmres/gmres.c:226 [0]PETSC ERROR: #12 KSPSolve_Private() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:898 [0]PETSC ERROR: #13 KSPSolve() at /cm/shared/apps/petsc/oneapi/build/src/src/ksp/ksp/interface/itfunc.c:1070 [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Floating point exception [0]PETSC ERROR: trapped floating point error [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!Z_V3Mo9VHcVE5kKG_z_TQ_jgrgcAJyQUPP1-I1OsT7PQgkbn1rnE5ORYi5TxwOPmLGapf_gkooDG9DZkYMM5VeI$ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.19.4, Jul 31, 2023 [0]PETSC ERROR: ./refresco on a named marclus3login2 by cklaij Thu Nov 7 11:03:22 2024 [0]PETSC ERROR: Configure options --prefix=/cm/shared/apps/petsc/oneapi/3.19.4-dbg --with-mpi-dir=/cm/shared/apps/intel/oneapi/mpi/2021.4.0 --with-x=0 --with-mpe=0 --with-debugging=1 --download-superlu_dist=../superlu_dist-8.1.2.tar.gz --with-blaslapack-dir=/cm/shared/apps/intel/oneapi/mkl/2021.4.0 --download-parmetis=../parmetis-4.0.3-p9.tar.gz --download-metis=../metis-5.1.0-p11.tar.gz --with-packages-build-dir=/cm/shared/apps/petsc/oneapi/build --with-ssl=0 --with-shared-libraries=1 CFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" COPTFLAGS="-std=gnu11 -Wall -funroll-all-loops -O3 -DNDEBUG" CXXOPTFLAGS="-std=gnu++14 -Wall -funroll-all-loops -O3 -DNDEBUG" FCFLAGS="-funroll-all-loops -O3 -DNDEBUG" F90FLAGS="-funroll-all-loops -O3 -DNDEBUG" FOPTFLAGS="-funroll-all-loops -O3 -DNDEBUG" Abort(72) on node 0 (rank 0 in comm 0): application called MPI_Abort(MPI_COMM_WORLD, 72) - process 0 [cid:ii_19306c695d9c0e2969a1] dr. ir.???? Christiaan Klaij | Senior Researcher | Research & Development T +31 317 49 33 44 | C.Klaij at marin.nl | https://urldefense.us/v3/__http://www.marin.nl__;!!G_uCfscf7eWS!Z_V3Mo9VHcVE5kKG_z_TQ_jgrgcAJyQUPP1-I1OsT7PQgkbn1rnE5ORYi5TxwOPmLGapf_gkooDG9DZkiSv2kr8$ [Facebook] [LinkedIn] [YouTube] -- Stefano -------------- next part -------------- A non-text attachment was scrubbed... Name: image376413.png Type: image/png Size: 5004 bytes Desc: image376413.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image225686.png Type: image/png Size: 487 bytes Desc: image225686.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image478264.png Type: image/png Size: 504 bytes Desc: image478264.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image765396.png Type: image/png Size: 482 bytes Desc: image765396.png URL: From p.khurana22 at imperial.ac.uk Thu Nov 7 09:08:29 2024 From: p.khurana22 at imperial.ac.uk (Khurana, Parv) Date: Thu, 7 Nov 2024 15:08:29 +0000 Subject: [petsc-users] Expected weak scaling behaviour for AMG libraries? In-Reply-To: References: Message-ID: Hello Mark and Mathew, Apologies for the delay in reply (I was gone for a vacation). Really appreciate the prompt response. I am now planning to redo these tests with the load balancing suggestions you have provided. Would you suggest any load balancing options to use as default when dealing with unstructured meshes in general? I use PETSc as an external linear solver for my software, where I supply a Poisson system discretised using 3D simplical elements and FEM - which are solved using AMG. I observed bad weak scaling behaviour for my application for 20k DOF/rank, which prompted me to test something similar only in PETSc. I choose ex12 instead of ex56 because it uses 3D FEM. I am not sure if I can make ex56 work for tetrahedrons out of the box. Maybe ex13 is more suited as Mark mentioned. On point 3,4 from Mathew: The plot below is from the numbers extracted from the -log_view option for all the runs. I have attached a sample log file from my runs, and pasted a sample output in the email. 
------------------------------------------------------------------ PETSc Performance Summary: ------------------------------------------------------------------ ------------------------------------------------------------------------------------------------------------------------ Event Count Time (sec) Flop --- Global --- --- Stage ---- Total Max Ratio Max Ratio Max Ratio Mess AvgLen Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s ------------------------------------------------------------------------------------------------------------------------ --- Event Stage 0: Main Stage KSPSolve 2 1.0 1.4079e-01 1.0 2.14e+07 2.0 1.2e+03 1.1e+04 4.4e+01 2 4 26 16 17 2 4 26 16 18 875 SNESSolve 1 1.0 2.9310e+00 1.0 1.69e+08 1.1 1.7e+03 2.0e+04 6.1e+01 46 46 37 38 23 46 46 37 38 25 445 PCApply 23 1.0 1.2774e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 ------------------------------------------------------------------------------------------------------------------------ Thanks and Best, Parv ________________________________ From: Mark Adams Sent: 31 October 2024 11:30 To: Matthew Knepley Cc: Khurana, Parv ; petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Expected weak scaling behaviour for AMG libraries? This email from mfadams at lbl.gov originates from outside Imperial. Do not click on links and attachments unless you recognise the sender. If you trust the sender, add them to your safe senders list to disable email stamping for this address. As Matt said snes ex56 is better because it does a convergence test that refines the grid. You need/want these two parameters to have the same arg (eg, 2,2,1): -dm_plex_box_faces 2,2,1 -petscpartitioner_simple_process_grid 2,2,1. This will put one cell per process. Then you use: -max_conv_its N, to specify the N levels of refinement to do. It will run the 2,2,1 first then a 4,4,2, etc., N times. /src/snes/tests/ex13.c is designed for benchmarking and it uses '-petscpartitioner_simple_node_grid 1,1,1 [default]' to give you a two level partitioner. You need to have dm_plex_box_faces_i = petscpartitioner_simple_process_grid_i * petscpartitioner_simple_node_grid_i Again, you should put one cell per process (NP = product of dm_plex_box_faces args) and use -dm_refine N to get a single solve. Mark On Wed, Oct 30, 2024 at 11:02?PM Matthew Knepley > wrote: On Wed, Oct 30, 2024 at 4:13?PM Khurana, Parv > wrote: Hello PETSc Community, I am trying to understand the scaling behaviour of AMG methods in PETSc (Hypre for now) and how many DOFs/Rank are needed for a performant AMG solve. I?m currently conducting weak scaling tests using src/snes/tutorials/ex12.c in 3D, applying Dirichlet BCs with FEM at P=1. The tests keep DOFs per processor constant while increasing the mesh size and processor count, specifically: * 20000 and 80000 DOF/RANK configurations. * Running SNES twice, using GMRES with a tolerance of 1e-5 and preconditioning with Hypre-BoomerAMG. A couple of quick points in order to make sure that there is no confusion: 1) Partitioner type "simple" is for the CI. It is a very bad partition, and should not be used for timing. The default is ParMetis which should be good enough. 2) You start out with 6^3 = 216 elements, distribute that, and then refine it. This will be _really_ bad load balance on all arrangement except the divisors of 216. You usually want to start out with something bigger at the later stages. You can use -dm_refine_pre to refine before distribution. 3) It is not clear you are using the timing for just the solver (SNESSolve). 
It could be that extraneous things are taking time. When asking questions like this, please always send the output of -log_view for timing, and at least -ksp_monitor_true_residial for convergence. 4) SNES ex56 is the example we use for GAMG scalability testing Thanks, Matt Unfortunately, parallel efficiency degrades noticeably with increased processor counts. Are there any insights or rules of thumb for using AMG more effectively? I have been looking at this issue for a while now and would love to engage in a further discussion. Please find below the weak scaling results and the options I use to run the tests. [cid:ii_192e0800b4dcb971f161] #Run type -run_type full -petscpartitioner_type simple #Mesh settings -dm_plex_dim 3 -dm_plex_simplex 1 -dm_refine 5 #Varied this -dm_plex_box_faces 6,6,6 #BCs and FEM space -bc_type dirichlet -petscspace_degree 1 #Solver settings -snes_max_it 2 -ksp_type gmres -ksp_rtol 1.0e-5 #Same settings as what we use for LOR -pc_type hypre -pc_hypre_type boomeramg -pc_hypre_boomeramg_coarsen_type hmis -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi -pc_hypre_boomeramg_strong_threshold 0.7 -pc_hypre_boomeramg_interp_type ext+i -pc_hypre_boomeramg_P_max 2 -pc_hypre_boomeramg_truncfactor 0.3 Best, Parv -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!f47A4_a_tRPU1XvaYrgWMAp2uGFajfIIWf6QG4FERzGhIyI7U-eiYao8U73sCFqUwb_u9HrBY8TMcMT4qnKeIzOyDtzWTvkU$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: image.png Type: image/png Size: 119488 bytes Desc: image.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: petsc_scale.l144491.pbs-6 Type: application/octet-stream Size: 24704 bytes Desc: petsc_scale.l144491.pbs-6 URL: From e.t.a.vanderweide at utwente.nl Thu Nov 7 11:21:19 2024 From: e.t.a.vanderweide at utwente.nl (Weide, Edwin van der (UT-ET)) Date: Thu, 7 Nov 2024 17:21:19 +0000 Subject: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation In-Reply-To: References: Message-ID: Yes, this works. Thanks a lot for your help. Regards, Edwin ________________________________ From: Barry Smith Sent: Wednesday, November 6, 2024 7:03 PM To: Weide, Edwin van der (UT-ET) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation You need to provide a callback function. Why? Otherwise your MatShell and PCshell have no way of knowing at what location the Jacobian is suppose to be evaluated at (in a matrix free way). That is the x for which J(x) is used. Normally one puts the x into the application context of mJac and accesses it every time the matmult is called. Similarly it needs to be accessed in application of your preconditioner. 
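A minimal sketch of what this looks like, with mJac the shell Jacobian from the setup quoted below (the names AppCtx, FormJacobianShell and xcurrent are illustrative only, they are not part of the PETSc API). The Jacobian callback assembles nothing; it only records the linearization point in the context that the shell matrix and the shell preconditioner already share:

typedef struct {
  Vec xcurrent; /* state x at which J(x)v and the preconditioner act; created once beforehand, e.g. with VecDuplicate */
  /* ... whatever else the matmult and preconditioner routines need ... */
} AppCtx;

/* Jacobian callback: no matrix entries are formed, only the current state is stored. */
static PetscErrorCode FormJacobianShell(SNES snes, Vec x, Mat J, Mat P, void *ptr)
{
  AppCtx *ctx = (AppCtx *)ptr;

  PetscFunctionBeginUser;
  PetscCall(VecCopy(x, ctx->xcurrent));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* at setup time, instead of passing a null callback: */
PetscCall(SNESSetJacobian(mSnes, mJac, mJac, FormJacobianShell, &ctx));

The matmult routine then reads ctx->xcurrent through MatShellGetContext(), and the shell preconditioner reads it through whatever context was attached to the PCSHELL (PCShellGetContext() or PCGetApplicationContext(), depending on how it was set), so both always act at the state SNES is currently linearizing around.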
On Nov 6, 2024, at 12:01?PM, Weide, Edwin van der (UT-ET) wrote: Barry, If I do that, I get the following error [0]PETSC ERROR: ------------------------------------------------------------------------ [0]PETSC ERROR: Caught signal number 11 SEGV: Segmentation Violation, probably memory access out of range [0]PETSC ERROR: Try option -start_in_debugger or -on_error_attach_debugger [0]PETSC ERROR: or see https://urldefense.us/v3/__https://petsc.org/release/faq/*valgrind__;Iw!!G_uCfscf7eWS!dfJ_L4Ay2QlfGz0JU6uAfD7Zy_6a7x9D0bisHMG1T53fTy_gWL3Q8fzzLvWZ_TBK37FFmDYerLywkFOLtMnU1rISfF_VA3gxsDTm$ and https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!dfJ_L4Ay2QlfGz0JU6uAfD7Zy_6a7x9D0bisHMG1T53fTy_gWL3Q8fzzLvWZ_TBK37FFmDYerLywkFOLtMnU1rISfF_VAwRW4W44$ [0]PETSC ERROR: --------------------- Stack Frames ------------------------------------ [0]PETSC ERROR: The line numbers in the error traceback are not always exact. [0]PETSC ERROR: #1 SNES callback Jacobian [0]PETSC ERROR: #2 SNESComputeJacobian() at /home/vdweide/petsc/src/snes/interface/snes.c:2966 [0]PETSC ERROR: #3 SNESSolve_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:218 [0]PETSC ERROR: #4 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4841 [0]PETSC ERROR: #5 SolveCurrentStage() at SolverClass.cpp:502 [0]PETSC ERROR: #6 main() at Condensation.cpp:20 -------------------------------------------------------------------------- So SNES tries to call the call back function for the Jacobian, but that is not provided. Hence the failure. Regards, Edwin ________________________________ From: Barry Smith > Sent: Wednesday, November 6, 2024 5:52 PM To: Weide, Edwin van der (UT-ET) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Matrix free SNES with user provided matrix vector product and preconditioner operation Just pass mJac, mJac instead of mJac, nullptr and it will be happy. In your case, the second mJac won't be used in your preconditioner it is just a place holder so other parts of SNES won't try to create a matrix. Barry On Nov 6, 2024, at 11:36?AM, Weide, Edwin van der (UT-ET) via petsc-users > wrote: Hi, I am trying to solve a nonlinear problem with matrix-free SNES where I would like to provide both the matrix vector product and the preconditioner myself. For that purpose, I use the following construction. // Set up the matrix free evaluation of the Jacobian times a vector // by setting the appropriate function in snes. PetscCall(MatCreateShell(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nEqns, nEqns, this, &mJac)); PetscCall(MatShellSetOperation(mJac, MATOP_MULT, (void (*)(void))JacobianTimesVector)); PetscCall(SNESSetJacobian(mSnes, mJac, nullptr, nullptr, nullptr)); // Set the function to be used as preconditioner for the krylov solver. KSP ksp; PC pc; PetscCall(SNESGetKSP(mSnes, &ksp)); PetscCall(KSPGetPC(ksp, &pc)); PetscCall(PCSetType(pc, PCSHELL)); PetscCall(PCSetApplicationContext(pc, this)); PetscCall(PCShellSetApply(pc, Preconditioner)); For small problems this construction works, and it does exactly what I expect it to do. However, when I increase the problem size, I get a memory allocation failure in SNESSolve, because it looks like SNES attempts to allocate memory for a full dense matrix for the preconditioner, which is not used. This is the call stack when the error occurs. 
[0]PETSC ERROR: #1 PetscMallocAlign() at /home/vdweide/petsc/src/sys/memory/mal.c:53 [0]PETSC ERROR: #2 PetscTrMallocDefault() at /home/vdweide/petsc/src/sys/memory/mtr.c:175 [0]PETSC ERROR: #3 PetscMallocA() at /home/vdweide/petsc/src/sys/memory/mal.c:421 [0]PETSC ERROR: #4 MatSeqDenseSetPreallocation_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3357 [0]PETSC ERROR: #5 MatSeqDenseSetPreallocation() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:3338 [0]PETSC ERROR: #6 MatDuplicateNoCreate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:372 [0]PETSC ERROR: #7 MatDuplicate_SeqDense() at /home/vdweide/petsc/src/mat/impls/dense/seq/dense.c:399 [0]PETSC ERROR: #8 MatDuplicate() at /home/vdweide/petsc/src/mat/interface/matrix.c:4964 [0]PETSC ERROR: #9 DMCreateMatrix_Shell() at /home/vdweide/petsc/src/dm/impls/shell/dmshell.c:195 [0]PETSC ERROR: #10 DMCreateMatrix() at /home/vdweide/petsc/src/dm/interface/dm.c:1501 [0]PETSC ERROR: #11 SNESSetUpMatrices() at /home/vdweide/petsc/src/snes/interface/snes.c:794 [0]PETSC ERROR: #12 SNESSetUp_NEWTONLS() at /home/vdweide/petsc/src/snes/impls/ls/ls.c:290 [0]PETSC ERROR: #13 SNESSetUp() at /home/vdweide/petsc/src/snes/interface/snes.c:3395 [0]PETSC ERROR: #14 SNESSolve() at /home/vdweide/petsc/src/snes/interface/snes.c:4831 [0]PETSC ERROR: #15 SolveCurrentStage() at SolverClass.cpp:502 In the function SNESSetUpMatrices the source looks as follows 784 } else if (!snes->jacobian_pre) { 785 PetscDS prob; 786 Mat J, B; 787 PetscBool hasPrec = PETSC_FALSE; 788 789 J = snes->jacobian; 790 PetscCall(DMGetDS(dm, &prob)); 791 if (prob) PetscCall(PetscDSHasJacobianPreconditioner(prob, &hasPrec)); 792 if (J) PetscCall(PetscObjectReference((PetscObject)J)); 793 else if (hasPrec) PetscCall(DMCreateMatrix(snes->dm, &J)); 794 PetscCall(DMCreateMatrix(snes->dm, &B)); 795 PetscCall(SNESSetJacobian(snes, J ? J : B, B, NULL, NULL)); 796 PetscCall(MatDestroy(&J)); 797 PetscCall(MatDestroy(&B)); 798 } It looks like in line 794 it is attempted to create the preconditioner, because it was (intentionally) not provided. Hence my question. Is it possible to use matrix-free SNES with a user provided matrix vector product (via MatShell) and a user provided preconditioner operation without SNES allocating the memory for a dense matrix? If so, what do I need to change in the construction above to make it work? If needed, I can provide the source code for which this problem occurs. Thanks, Edwin --------------------------------------------------- Edwin van der Weide Department of Mechanical Engineering University of Twente Enschede, the Netherlands -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Nov 8 08:11:30 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 8 Nov 2024 09:11:30 -0500 Subject: [petsc-users] Expected weak scaling behaviour for AMG libraries? In-Reply-To: References: Message-ID: On Thu, Nov 7, 2024 at 10:08?AM Khurana, Parv wrote: > Hello Mark and Mathew, > > Apologies for the delay in reply (I was gone for a vacation). Really > appreciate the prompt response. > > I am now planning to redo these tests with the load balancing suggestions > you have provided. *Would you suggest any load balancing options to use > as default when dealing with unstructured meshes in general*? I > The default load balancing should be good. We do not use it in CI tests because it is not reproducible across machines. 
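As a rough sketch (assuming PETSc was configured with ParMetis, keeping the rest of your ex12 options unchanged; the 2+3 refinement split is only an example), the mesh and partitioning part of the options would become:

#Mesh settings
-dm_plex_dim 3
-dm_plex_simplex 1
-dm_plex_box_faces 6,6,6
-dm_refine_pre 2   #refine before distribution for better balance
-dm_refine 3       #refine after distribution
#no -petscpartitioner_type simple, so the configured default (ParMetis) is used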
When you use -plexpartitioner_type simple you are turning off the default load balancing. Don't do that. > use PETSc as an external linear solver for my software, where I supply a > Poisson system discretised using 3D simplical elements and FEM - which are > solved using AMG. I observed bad weak scaling behaviour for my application > for 20k DOF/rank, which prompted me to test something similar only in > PETSc. > > I choose ex12 instead of ex56 because it uses 3D FEM. I am not sure if I > can make ex56 work for tetrahedrons out of the box. Maybe ex13 is more > suited as Mark mentioned. > > On point 3,4 from Mathew: > The plot below is from the numbers extracted from the -log_view option for > all the runs. I have attached a sample log file from my runs, and pasted a > sample output in the email. > Yes, this has the simple partitioning, which is not what you want. Thanks, Matt > > ------------------------------------------------------------------ PETSc > Performance Summary: > ------------------------------------------------------------------ > > > ------------------------------------------------------------------------------------------------------------------------ > Event Count Time (sec) Flop > --- Global --- --- Stage ---- Total > Max Ratio Max Ratio Max Ratio Mess AvgLen > Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s > > ------------------------------------------------------------------------------------------------------------------------ > > --- Event Stage 0: Main Stage > > KSPSolve 2 1.0 1.4079e-01 1.0 2.14e+07 2.0 1.2e+03 1.1e+04 > 4.4e+01 2 4 26 16 17 2 4 26 16 18 875 > SNESSolve 1 1.0 2.9310e+00 1.0 1.69e+08 1.1 1.7e+03 2.0e+04 > 6.1e+01 46 46 37 38 23 46 46 37 38 25 445 > PCApply 23 1.0 1.2774e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 > 0.0e+00 2 0 0 0 0 2 0 0 0 0 0 > > ------------------------------------------------------------------------------------------------------------------------ > > Thanks and Best, > Parv > > ------------------------------ > *From:* Mark Adams > *Sent:* 31 October 2024 11:30 > *To:* Matthew Knepley > *Cc:* Khurana, Parv ; petsc-users at mcs.anl.gov > > *Subject:* Re: [petsc-users] Expected weak scaling behaviour for AMG > libraries? > > > This email from mfadams at lbl.gov originates from outside Imperial. Do not > click on links and attachments unless you recognise the sender. If you > trust the sender, add them to your safe senders list > to disable email > stamping for this address. > > > As Matt said snes ex56 is better because it does a convergence test that > refines the grid. You need/want these two parameters to have the same arg > (eg, 2,2,1): -dm_plex_box_faces 2,2,1 -petscpartitioner_simple_process_grid > 2,2,1. > This will put one cell per process. > > Then you use: -max_conv_its N, to specify the N levels of refinement to > do. It will run the 2,2,1 first then a 4,4,2, etc., N times. > > /src/snes/tests/ex13.c is designed for benchmarking and it uses > '-petscpartitioner_simple_node_grid 1,1,1 [default]' to give you a two > level partitioner. > You need to have dm_plex_box_faces_i = > petscpartitioner_simple_process_grid_i * > petscpartitioner_simple_node_grid_i > Again, you should put one cell per process (NP = product of > dm_plex_box_faces args) and use -dm_refine N to get a single solve. 
> > Mark > > > > On Wed, Oct 30, 2024 at 11:02?PM Matthew Knepley > wrote: > > On Wed, Oct 30, 2024 at 4:13?PM Khurana, Parv > wrote: > > Hello PETSc Community, > I am trying to understand the scaling behaviour of AMG methods in PETSc > (Hypre for now) and how many DOFs/Rank are needed for a performant AMG > solve. > I?m currently conducting weak scaling tests using > src/snes/tutorials/ex12.c in 3D, applying Dirichlet BCs with FEM at P=1. > The tests keep DOFs per processor constant while increasing the mesh size > and processor count, specifically: > > - *20000 and 80000 DOF/RANK* configurations. > - Running SNES twice, using GMRES with a tolerance of 1e-5 and > preconditioning with Hypre-BoomerAMG. > > A couple of quick points in order to make sure that there is no confusion: > > 1) Partitioner type "simple" is for the CI. It is a very bad partition, > and should not be used for timing. The default is ParMetis which should be > good enough. > > 2) You start out with 6^3 = 216 elements, distribute that, and then refine > it. This will be _really_ bad load balance on all arrangement except the > divisors of 216. You usually want to start out with something bigger at the > later stages. You can use -dm_refine_pre to refine before distribution. > > 3) It is not clear you are using the timing for just the solver > (SNESSolve). It could be that extraneous things are taking time. When > asking questions like this, please always send the output of -log_view for > timing, and at least -ksp_monitor_true_residial for convergence. > > 4) SNES ex56 is the example we use for GAMG scalability testing > > Thanks, > > Matt > > Unfortunately, parallel efficiency degrades noticeably with increased > processor counts. Are there any insights or rules of thumb for using AMG > more effectively? I have been looking at this issue for a while > now and would love to engage in a further discussion. Please find below the > weak scaling results and the options I use to run the tests. > *#Run type* > -run_type full > -petscpartitioner_type simple > > *#Mesh settings* > -dm_plex_dim 3 > -dm_plex_simplex 1 > -dm_refine 5 #Varied this > -dm_plex_box_faces 6,6,6 > > *#BCs and FEM space* > -bc_type dirichlet > -petscspace_degree 1 > > *#Solver settings* > -snes_max_it 2 > -ksp_type gmres > -ksp_rtol 1.0e-5 > #Same settings as what we use for LOR > -pc_type hypre > -pc_hypre_type boomeramg > -pc_hypre_boomeramg_coarsen_type hmis > -pc_hypre_boomeramg_relax_type_all symmetric-sor/jacobi > -pc_hypre_boomeramg_strong_threshold 0.7 > -pc_hypre_boomeramg_interp_type ext+i > -pc_hypre_boomeramg_P_max 2 > -pc_hypre_boomeramg_truncfactor 0.3 > > Best, > Parv > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aRkIkjdOvZx9uPuKeRCMxZ-OzP2IYC81-tJZcPczdDkmsiYvbB5fa0eNLr_hJmfRf73BE9wcUyW78CaVeYVe$ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aRkIkjdOvZx9uPuKeRCMxZ-OzP2IYC81-tJZcPczdDkmsiYvbB5fa0eNLr_hJmfRf73BE9wcUyW78CaVeYVe$ -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: image.png Type: image/png Size: 119488 bytes Desc: not available URL: From edoardo.alinovi at gmail.com Fri Nov 8 11:45:55 2024 From: edoardo.alinovi at gmail.com (Edoardo alinovi) Date: Fri, 8 Nov 2024 18:45:55 +0100 Subject: [petsc-users] Some hypre settings for 3D fully coupled incompressible NS Message-ID: Hello petsc friends, It's been a while since I am trying to find a good setup for my coupled solver. Recently, I have run a scan with Dakota (more than 1k simulations) on the Windsor body case with 7Mln cells on 36 cores on my small home server (Dell R730 with 2x2496 v4 xeon). I thought it was a good idea to share my results with the community! Here is a resume of my finding: 1) Multiplicative is faster than Schur: I have found out that Schur preconditioner is rarely faster than multiplicative despite the fact Schur keeps the number of iterations lower. I think there is a lot of room for improvement as far as FV matrices are concerned. Probably custom Shat is the way to go, but not easy to find a good one! Up to now "selfp" looks to be the only good and "ready to go" choice. 2) Vanilla fbcgs is faster than vanilla fgmres: maybe here we can tune gmres restart, I have not tried this systematically. 3) Stick with preonly: using bcgs/cg as preconditioner ksp lowers the number of iterations but it adds up a lot of overhead (even setting few iterations or mild tolerances). 4) Staging is a good idea: beyond bare iteration performance, I think that for steady state problems it worth setting a max for outer iterations in fieldsplit, as starting iterations would cost you a lot and probably you will be far from convergence anyway at the stage, so it is not a good investment pushing hard on them. 5) Here my best so far settings: # Outer solver settings "solver": "fbcgs", "preconditioner": "fieldsplit", "absTol": 1e-6, "relTol": 0.01, # Field split KSP and PC "fieldsplit_u_pc_type": "bjacobi", "fieldsplit_p_pc_type": "hypre", "fieldsplit_u_ksp_type": "preonly", "fieldsplit_p_ksp_type": "preonly", ! HYPRE PC options "fieldsplit_p_pc_hypre_boomeramg_strong_threshold": 0.05, "fieldsplit_p_pc_hypre_boomeramg_coarsen_type": "PMIS", "fieldsplit_p_pc_hypre_boomeramg_truncfactor": 0.3, "fieldsplit_p_pc_hypre_boomeramg_no_cf": 0, "fieldsplit_p_pc_hypre_boomeramg_agg_nl": 1, "fieldsplit_p_pc_hypre_boomeramg_agg_num_paths": 1, "fieldsplit_p_pc_hypre_boomeramg_P_max": 0, "fieldsplit_p_pc_hypre_boomeramg_max_levels": 30, "fieldsplit_p_pc_hypre_boomeramg_relax_type_all": "backward-SOR/Jacobi", "fieldsplit_p_pc_hypre_boomeramg_interp_type": "ext+i", "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_down": 0, "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_up": 2, "fieldsplit_p_pc_hypre_boomeramg_cycle_type": "v" I have a question for Barry/Jed/Matt. I have noted that most of the commercial solvers use what I define as "SAMG with ILU smoother". I am wondering if there's a way to reproduce this in Petsc. I have tried PCPATCH to test VANKA, but I am not really able to use that PC as I am not using DMplex. With this recipe I am not miles away from Fluent on the same problem. Yet, I am wondering why commercial solvers do not use fieldsplit. Hope this can be helpful and of course I am happy to collaborate on this topic if someone outhere is willing to! Cheers, Edoardo -------------- next part -------------- An HTML attachment was scrubbed... 
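On the "SAMG with ILU smoother" question above: a possible starting point, sketched under the assumption that PETSc's native GAMG is acceptable in place of hypre for the pressure block (the option names compose with the same "fieldsplit_p_" prefix used in the settings above; the values are a first guess, not a tuned recipe):

"fieldsplit_p_pc_type": "gamg",
"fieldsplit_p_pc_gamg_agg_nsmooths": 1,
"fieldsplit_p_mg_levels_ksp_type": "richardson",
"fieldsplit_p_mg_levels_pc_type": "bjacobi",
"fieldsplit_p_mg_levels_sub_pc_type": "ilu",

Alternatively, BoomerAMG can be kept and given hypre's parallel ILU smoother (Euclid) on the levels via "fieldsplit_p_pc_hypre_boomeramg_smooth_type": "Euclid", again untried here.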
URL: From marcos.vanella at nist.gov Fri Nov 8 14:18:43 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Fri, 8 Nov 2024 20:18:43 +0000 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes Message-ID: Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: Currently Loaded Modules: 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 Where: g: built for GPU Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. Cannot compile CUDA with nvcc. ********************************************************************************************* I have nvcc in my path: $ nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2024 NVIDIA Corporation Built on Thu_Jun__6_02:26:10_PDT_2024 Cuda compilation tools, release 12.5, V12.5.82 Build cuda_12.5.r12.5/compiler.34385749_0 I remember being able to do this cross compilation in polaris. Any help is most appreciated, Marcos -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Fri Nov 8 14:23:44 2024 From: balay.anl at fastmail.org (Satish Balay) Date: Fri, 8 Nov 2024 14:23:44 -0600 (CST) Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: <463afc6d-3776-a73c-d8c3-fef672620ca1@fastmail.org> Can you send configure.log for this failure? Satish On Fri, 8 Nov 2024, Vanella, Marcos (Fed) via petsc-users wrote: > Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. 
When configuring PETSc in the following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 > > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is most appreciated, > Marcos > From junchao.zhang at gmail.com Fri Nov 8 14:25:01 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 8 Nov 2024 14:25:01 -0600 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Hi, Marcos Could you attach the configure.log? --Junchao Zhang On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi all, does anyone have experience compiling PETSc with gnu openmpi and > cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc > with these libraries. I would need gnu openmpi to compile my code (fortran > std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I > have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC > 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. 
When configuring PETSc in the > following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" > FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 > --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda > --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 > --with-make-np=8 > > > ============================================================================================= > Configuring PETSc to compile on your system > > ============================================================================================= > TESTING: checkCUDACompiler from > config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does > not work. > Cannot compile CUDA with nvcc. > > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is > most appreciated, > Marcos > -------------- next part -------------- An HTML attachment was scrubbed... URL: From marcos.vanella at nist.gov Fri Nov 8 14:34:37 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Fri, 8 Nov 2024 20:34:37 +0000 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Hi Satish and Junchao, this is what I'm getting at the end of the configure.log. I guess there is an incompatibility among nvcc and the gcc version I'm using. ... ============================================================================================= TESTING: checkCUDACompiler from config.setCompilers(/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py:1541) Locate a functional CUDA compiler Checking for program /opt/apps/xalt/xalt/bin/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/one-sided/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/collective/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/pt2pt/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/startup/nvcc...not found Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/bin/nvcc...not found Checking for program /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/nvcc...found Defined make macro "CUDAC" to "nvcc" Executing: nvcc -c -o /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.o -I/tmp/petsc-qlfa8fb8/config.setCompilers /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.cu stdout: In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, from : /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! 
gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. | ^~~~~ Possible ERROR while running compiler: exit code 1 stderr: In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, from : /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. | ^~~~~ Source: #include "confdefs.h" #include "conffix.h" int main(void) { return 0; } Error testing CUDA compiler: Cannot compile CUDA with nvcc. Deleting "CUDAC" ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. Cannot compile CUDA with nvcc. 
********************************************************************************************* File "/home1/09805/mnv/Software/petsc/config/configure.py", line 461, in petsc_configure framework.configure(out = sys.stdout) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1460, in configure self.processChildren() File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1448, in processChildren self.serialEvaluation(self.childGraph) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1423, in serialEvaluation child.configure() File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 2846, in configure self.executeTest(getattr(self,LANG.join(('check','Compiler')))) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/base.py", line 138, in executeTest ret = test(*args,**kargs) File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1544, in checkCUDACompiler for compiler in self.generateCUDACompilerGuesses(): File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1527, in generateCUDACompilerGuesses raise RuntimeError('CUDA compiler you provided with -with-cudac='+self.argDB['with-cudac']+' cannot be found or does not work.'+'\n'+self.mesg) ================================================================================ Finishing configure run at Fri, 08 Nov 2024 14:28:04 -0600 ================================================================================ ________________________________ From: Junchao Zhang Sent: Friday, November 8, 2024 3:25 PM To: Vanella, Marcos (Fed) Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes Hi, Marcos Could you attach the configure.log? --Junchao Zhang On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users > wrote: Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: Currently Loaded Modules: 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 Where: g: built for GPU Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. 
Cannot compile CUDA with nvcc. ********************************************************************************************* I have nvcc in my path: $ nvcc --version nvcc: NVIDIA (R) Cuda compiler driver Copyright (c) 2005-2024 NVIDIA Corporation Built on Thu_Jun__6_02:26:10_PDT_2024 Cuda compilation tools, release 12.5, V12.5.82 Build cuda_12.5.r12.5/compiler.34385749_0 I remember being able to do this cross compilation in polaris. Any help is most appreciated, Marcos -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Fri Nov 8 14:47:39 2024 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Fri, 8 Nov 2024 14:47:39 -0600 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Yes, the error message indicates gcc-14 is not supported by cuda-12.5. According to https://urldefense.us/v3/__https://docs.nvidia.com/cuda/archive/12.5.0/cuda-installation-guide-linux/index.html*host-compiler-support-policy__;Iw!!G_uCfscf7eWS!b5iD6t-BE_J3Y7Z9ZeZ_fcD0Hz0TeQ1wpbE5vpuBJGpIqXz6nFj4mtmBLxTQlbByrjwjGJpFnDhDc5A6zQccRWlqqyJy$ , it supports up to gcc-13.2. Perhaps the best approach is to ask your sys admin to install compatible gcc and cuda. --Junchao Zhang On Fri, Nov 8, 2024 at 2:34?PM Vanella, Marcos (Fed) wrote: > > Hi Satish and Junchao, this is what I'm getting at the end of the configure.log. I guess there is an incompatibility among nvcc and the gcc version I'm using. > > > ... > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py:1541) > Locate a functional CUDA compiler > Checking for program /opt/apps/xalt/xalt/bin/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/one-sided/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/collective/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/pt2pt/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/startup/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/bin/nvcc...not found > Checking for program /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/nvcc...found > Defined make macro "CUDAC" to "nvcc" > Executing: nvcc -c -o /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.o -I/tmp/petsc-qlfa8fb8/config.setCompilers /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.cu > stdout: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. 
> | ^~~~~ > Possible ERROR while running compiler: exit code 1 > stderr: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > | ^~~~~ > Source: > #include "confdefs.h" > #include "conffix.h" > > int main(void) { > return 0; > } > > Error testing CUDA compiler: Cannot compile CUDA with nvcc. > Deleting "CUDAC" > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > File "/home1/09805/mnv/Software/petsc/config/configure.py", line 461, in petsc_configure > framework.configure(out = sys.stdout) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1460, in configure > self.processChildren() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1448, in processChildren > self.serialEvaluation(self.childGraph) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1423, in serialEvaluation > child.configure() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 2846, in configure > self.executeTest(getattr(self,LANG.join(('check','Compiler')))) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/base.py", line 138, in executeTest > ret = test(*args,**kargs) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1544, in checkCUDACompiler > for compiler in self.generateCUDACompilerGuesses(): > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1527, in generateCUDACompilerGuesses > raise RuntimeError('CUDA compiler you provided with -with-cudac='+self.argDB['with-cudac']+' cannot be found or does not work.'+'\n'+self.mesg) > ================================================================================ > Finishing configure run at Fri, 08 Nov 2024 14:28:04 -0600 > ================================================================================ > ________________________________ > From: Junchao Zhang > Sent: Friday, November 8, 2024 3:25 PM > To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes > > Hi, Marcos > Could you attach the configure.log? 
> --Junchao Zhang > > > On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users wrote: > > Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 > > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is most appreciated, > Marcos From marcos.vanella at nist.gov Fri Nov 8 14:49:33 2024 From: marcos.vanella at nist.gov (Vanella, Marcos (Fed)) Date: Fri, 8 Nov 2024 20:49:33 +0000 Subject: [petsc-users] Compiling PETSc in for Grace-Hopper nodes In-Reply-To: References: Message-ID: Thank you Junchao, we'll work on this compatibility issue. Best, Marcos ________________________________ From: Junchao Zhang Sent: Friday, November 8, 2024 3:47 PM To: Vanella, Marcos (Fed) Cc: Satish Balay ; petsc-users at mcs.anl.gov ; Victor Eijkhout Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes Yes, the error message indicates gcc-14 is not supported by cuda-12.5. 
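If a supported host compiler cannot be installed right away, one possible stopgap (untested here, and explicitly "use at your own risk" per the nvcc message above) is to forward nvcc's override flag through configure, e.g. by appending

  CUDAFLAGS="-allow-unsupported-compiler"

to the ./configure line you already use. An officially supported host compiler remains the safer route.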
According to https://urldefense.us/v3/__https://gcc02.safelinks.protection.outlook.com/?url=https*3A*2F*2Fdocs.nvidia.com*2Fcuda*2Farchive*2F12.5.0*2Fcuda-installation-guide-linux*2Findex.html*23host-compiler-support-policy&data=05*7C02*7Cmarcos.vanella*40nist.gov*7C7136c6301354458874f708dd00369e37*7C2ab5d82fd8fa4797a93e054655c61dec*7C0*7C0*7C638666956782708272*7CUnknown*7CTWFpbGZsb3d8eyJFbXB0eU1hcGkiOnRydWUsIlYiOiIwLjAuMDAwMCIsIlAiOiJXaW4zMiIsIkFOIjoiTWFpbCIsIldUIjoyfQ*3D*3D*7C0*7C*7C*7C&sdata=ZCZOH7YEqdGU2NSHnn4Shuly*2BG2*2BPci6Gcm5yoVcJKY*3D&reserved=0__;JSUlJSUlJSUlJSUlJSUlJSUlJSUlJSUlJSUlJQ!!G_uCfscf7eWS!dE1UqBggdaLlQuHbpbc4p7cYAz9EvrpKKIDn3RKUBr-E2D7rcj5OO0krQHrW75551ETLudAIb2qjBr_qhCg0XKvqOmcTHhJu$ , it supports up to gcc-13.2. Perhaps the best approach is to ask your sys admin to install compatible gcc and cuda. --Junchao Zhang On Fri, Nov 8, 2024 at 2:34?PM Vanella, Marcos (Fed) wrote: > > Hi Satish and Junchao, this is what I'm getting at the end of the configure.log. I guess there is an incompatibility among nvcc and the gcc version I'm using. > > > ... > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py:1541) > Locate a functional CUDA compiler > Checking for program /opt/apps/xalt/xalt/bin/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/one-sided/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/collective/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/pt2pt/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/libexec/osu-micro-benchmarks/mpi/startup/nvcc...not found > Checking for program /opt/apps/gcc14/cuda12/openmpi/5.0.5/bin/nvcc...not found > Checking for program /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/nvcc...found > Defined make macro "CUDAC" to "nvcc" > Executing: nvcc -c -o /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.o -I/tmp/petsc-qlfa8fb8/config.setCompilers /tmp/petsc-qlfa8fb8/config.setCompilers/conftest.cu > stdout: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > | ^~~~~ > Possible ERROR while running compiler: exit code 1 > stderr: > In file included from /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/cuda_runtime.h:82, > from : > /home1/apps/nvidia/Linux_aarch64/24.7/cuda/12.5/bin/../targets/sbsa-linux/include/crt/host_config.h:143:2: error: #error -- unsupported GNU version! gcc versions later than 13 are not supported! 
The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > 143 | #error -- unsupported GNU version! gcc versions later than 13 are not supported! The nvcc flag '-allow-unsupported-compiler' can be used to override this version check; however, using an unsupported host compiler may cause compilation failure or incorrect run time execution. Use at your own risk. > | ^~~~~ > Source: > #include "confdefs.h" > #include "conffix.h" > > int main(void) { > return 0; > } > > Error testing CUDA compiler: Cannot compile CUDA with nvcc. > Deleting "CUDAC" > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > File "/home1/09805/mnv/Software/petsc/config/configure.py", line 461, in petsc_configure > framework.configure(out = sys.stdout) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1460, in configure > self.processChildren() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1448, in processChildren > self.serialEvaluation(self.childGraph) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/framework.py", line 1423, in serialEvaluation > child.configure() > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 2846, in configure > self.executeTest(getattr(self,LANG.join(('check','Compiler')))) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/base.py", line 138, in executeTest > ret = test(*args,**kargs) > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1544, in checkCUDACompiler > for compiler in self.generateCUDACompilerGuesses(): > File "/home1/09805/mnv/Software/petsc/config/BuildSystem/config/setCompilers.py", line 1527, in generateCUDACompilerGuesses > raise RuntimeError('CUDA compiler you provided with -with-cudac='+self.argDB['with-cudac']+' cannot be found or does not work.'+'\n'+self.mesg) > ================================================================================ > Finishing configure run at Fri, 08 Nov 2024 14:28:04 -0600 > ================================================================================ > ________________________________ > From: Junchao Zhang > Sent: Friday, November 8, 2024 3:25 PM > To: Vanella, Marcos (Fed) > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] Compiling PETSc in for Grace-Hopper nodes > > Hi, Marcos > Could you attach the configure.log? > --Junchao Zhang > > > On Fri, Nov 8, 2024 at 2:19?PM Vanella, Marcos (Fed) via petsc-users wrote: > > Hi all, does anyone have experience compiling PETSc with gnu openmpi and cross compiling with cuda nvcc on these systems? > we have access to Vista, a machine in TACC and was trying to build PETSc with these libraries. 
I would need gnu openmpi to compile my code (fortran std 2018), and would like to keep the same cpu compiler/openmpi for PETSc.I have the following modules loaded: > > Currently Loaded Modules: > 1) ucc/1.3.0 2) ucx/1.17.0 3) cmake/3.29.5 4) xalt/3.1 5) TACC 6) gcc/14.2.0 7) cuda/12.5 (g) 8) openmpi/5.0.5 > > Where: > g: built for GPU > > Here mpicc points to the gcc compiler, etc. When configuring PETSc in the following form I get nvcc not working: > > $ ./configure COPTFLAGS="-O2 -g" CXXOPTFLAGS="-O2 -g" FOPTFLAGS="-O2 -g" FCOPTFLAGS="-O2 -g" CUDAOPTFLAGS="-O2 -g" --with-debugging=1 --with-cc=mpicc --with-cxx=mpicxx --with-fc=mpifort --with-cuda --with-cudac=nvcc --with-cuda-arch=90 --download-fblaslapack=1 --with-make-np=8 > > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCUDACompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1541) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > CUDA compiler you provided with -with-cudac=nvcc cannot be found or does not work. > Cannot compile CUDA with nvcc. > ********************************************************************************************* > > I have nvcc in my path: > > $ nvcc --version > nvcc: NVIDIA (R) Cuda compiler driver > Copyright (c) 2005-2024 NVIDIA Corporation > Built on Thu_Jun__6_02:26:10_PDT_2024 > Cuda compilation tools, release 12.5, V12.5.82 > Build cuda_12.5.r12.5/compiler.34385749_0 > > I remember being able to do this cross compilation in polaris. Any help is most appreciated, > Marcos -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Sat Nov 9 09:51:31 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 9 Nov 2024 10:51:31 -0500 Subject: [petsc-users] Spring 2025 PETSc Annual Users Meeting in Buffalo, New York registration is now open References: <2FBFCF80-8A3E-4A00-B0D1-9C067130AD77@petsc.dev> Message-ID: The Spring 2025 PETSc Annual Users Meeting will be held May 20-21, 2025, in Buffalo, New York. The meeting website https://urldefense.us/v3/__https://petsc.org/community/meetings/2025/__;!!G_uCfscf7eWS!fgUt5YviY0nuj_9WJekiYDOGrC9_D639zfhCyGjT3m4IONdLS0cUjB85MobixTn1iEbILSWiGeY0kNjUOhJmOeQ$ is now available. Please register, submit a presentation, and start making your travel plans. Buffalo is a very short distance from Niagara Falls, so come early and enjoy the Falls. We will hold a PETSc tutorial the day before the meeting on Monday, May 19, 2025. Student travel funding is available to attend the meeting; apply now when you register. Student registration is free. As always, thanks for your support and interest in the PETSc community, Barry -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From bsmith at petsc.dev Sat Nov 9 11:58:56 2024 From: bsmith at petsc.dev (Barry Smith) Date: Sat, 9 Nov 2024 12:58:56 -0500 Subject: [petsc-users] Correction for Spring 2025 PETSc Annual Users Meeting in Buffalo, New York registration is now open References: Message-ID: Correction: please use https://urldefense.us/v3/__https://petsc.org/release/community/meetings/2025/__;!!G_uCfscf7eWS!YsRJ9A0Fy25e0yjmpE6WJQHEmFURYU6fMLVeE3H1sed3DlRO-ImLLBLnagBdvP4NIw2QpjW6ow24sa-nWr8Gkc4$ to access the website. Sorry for the double mail. Barry > Begin forwarded message: > > From: Barry Smith > Subject: Spring 2025 PETSc Annual Users Meeting in Buffalo, New York registration is now open > Date: November 9, 2024 at 10:51:31?AM EST > To: Petsc-users , petsc-announce at mcs.anl.gov, petsc-dev > Message-Id: > > > The Spring 2025 PETSc Annual Users Meeting will be held May 20-21, 2025, in Buffalo, New York. > > The meeting website https://urldefense.us/v3/__https://petsc.org/community/meetings/2025/__;!!G_uCfscf7eWS!YsRJ9A0Fy25e0yjmpE6WJQHEmFURYU6fMLVeE3H1sed3DlRO-ImLLBLnagBdvP4NIw2QpjW6ow24sa-nBoPU4Oc$ is now available. Please register, submit a presentation, and start making your travel plans. Buffalo is a very short distance from Niagara Falls, so come early and enjoy the Falls. > > We will hold a PETSc tutorial the day before the meeting on Monday, May 19, 2025. > > Student travel funding is available to attend the meeting; apply now when you register. Student registration is free. > > As always, thanks for your support and interest in the PETSc community, > > Barry > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Tue Nov 12 22:26:14 2024 From: jed at jedbrown.org (Jed Brown) Date: Tue, 12 Nov 2024 21:26:14 -0700 Subject: [petsc-users] Some hypre settings for 3D fully coupled incompressible NS In-Reply-To: References: Message-ID: <87cyizsqs9.fsf@jedbrown.org> To answer your question, I think the commercial solvers are not using "SAMG with ILU smoother" on the velocity-pressure coupled system, but a splitting technique or Schur-like reduction with that applied to the pressure system and AMG with ILU smoothers (or straight ILU) applied to the momentum system. ILU smoothing is often chosen when there is strong anisotropy (usually from boundary layers) not captured in the coarsening or the transport-dominated part prevents effective smoothing using more typical smoothers. ILU theory and smoothing properties are not very nice, but it is still often pragmatic. See -pc_hypre_boomeramg_smooth_type ilu (which activates more sub-options) if you want to try that out while sticking with hypre. It's worth checking whether making the `u` solver stronger has any significant impact on convergence, and distinguishing the accuracy impact of using multiplicative/selfp (even with an accurate solve of that subsystem) separately from the approximation incurred by using `preonly` (which you'll almost always want to do). You may have already sorted this out in your empirical study. Edoardo alinovi writes: > Hello petsc friends, > > It's been a while since I am trying to find a good setup for my coupled > solver. > > Recently, I have run a scan with Dakota (more than 1k simulations) on the > Windsor body case with 7Mln cells on 36 cores on my small home server (Dell > R730 with 2x2496 v4 xeon). I thought it was a good idea to share my results > with the community! 
> > Here is a resume of my finding: > > 1) Multiplicative is faster than Schur: I have found out that Schur > preconditioner is rarely faster than multiplicative despite the fact Schur > keeps the number of iterations lower. I think there is a lot of room for > improvement as far as FV matrices are concerned. Probably custom Shat is > the way to go, but not easy to find a good one! Up to now "selfp" looks to > be the only good and "ready to go" choice. > > 2) Vanilla fbcgs is faster than vanilla fgmres: maybe here we can tune > gmres restart, I have not tried this systematically. > > 3) Stick with preonly: using bcgs/cg as preconditioner ksp lowers the > number of iterations but it adds up a lot of overhead (even setting few > iterations or mild tolerances). > > 4) Staging is a good idea: beyond bare iteration performance, I think that > for steady state problems it worth setting a max for outer iterations in > fieldsplit, as starting iterations would cost you a lot and probably you > will be far from convergence anyway at the stage, so it is not a good > investment pushing hard on them. > > 5) Here my best so far settings: > > # Outer solver settings > "solver": "fbcgs", > "preconditioner": "fieldsplit", > "absTol": 1e-6, > "relTol": 0.01, > > # Field split KSP and PC > "fieldsplit_u_pc_type": "bjacobi", > "fieldsplit_p_pc_type": "hypre", > "fieldsplit_u_ksp_type": "preonly", > "fieldsplit_p_ksp_type": "preonly", > > ! HYPRE PC options > "fieldsplit_p_pc_hypre_boomeramg_strong_threshold": 0.05, > "fieldsplit_p_pc_hypre_boomeramg_coarsen_type": "PMIS", > "fieldsplit_p_pc_hypre_boomeramg_truncfactor": 0.3, > "fieldsplit_p_pc_hypre_boomeramg_no_cf": 0, > "fieldsplit_p_pc_hypre_boomeramg_agg_nl": 1, > "fieldsplit_p_pc_hypre_boomeramg_agg_num_paths": 1, > "fieldsplit_p_pc_hypre_boomeramg_P_max": 0, > "fieldsplit_p_pc_hypre_boomeramg_max_levels": 30, > "fieldsplit_p_pc_hypre_boomeramg_relax_type_all": > "backward-SOR/Jacobi", > "fieldsplit_p_pc_hypre_boomeramg_interp_type": "ext+i", > "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_down": 0, > "fieldsplit_p_pc_hypre_boomeramg_grid_sweeps_up": 2, > "fieldsplit_p_pc_hypre_boomeramg_cycle_type": "v" > > I have a question for Barry/Jed/Matt. I have noted that most of the > commercial solvers use what I define as "SAMG with ILU smoother". I am > wondering if there's a way to reproduce this in Petsc. I have tried > PCPATCH to test VANKA, but I am not really able to use that PC as I am not > using DMplex. With this recipe I am not miles away from Fluent on the same > problem. Yet, I am wondering why commercial solvers do not use fieldsplit. > > Hope this can be helpful and of course I am happy to collaborate on this > topic if someone outhere is willing to! > > Cheers, > > Edoardo From pjool at dtu.dk Thu Nov 14 08:39:12 2024 From: pjool at dtu.dk (=?iso-8859-1?Q?Peder_J=F8rgensgaard_Olesen?=) Date: Thu, 14 Nov 2024 14:39:12 +0000 Subject: [petsc-users] VecPow clarification Message-ID: Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. 
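(For reference, the loop-based workaround mentioned above amounts to roughly the following sketch, assuming a complex-scalar build, with v the input vector, p the exponent, and u a vector with the same layout:)

#include <petscvec.h>

static PetscErrorCode VecPowComplexByLoop(Vec v, PetscScalar p, Vec u)
{
  const PetscScalar *va;
  PetscScalar       *ua;
  PetscInt           i, n;

  PetscFunctionBeginUser;
  PetscCall(VecGetLocalSize(v, &n));
  PetscCall(VecGetArrayRead(v, &va));
  PetscCall(VecGetArray(u, &ua));
  for (i = 0; i < n; i++) ua[i] = PetscPowComplex(va[i], p); /* elementwise v[i]^p */
  PetscCall(VecRestoreArray(u, &ua));
  PetscCall(VecRestoreArrayRead(v, &va));
  PetscFunctionReturn(PETSC_SUCCESS);
}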
Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. Best, Peder -------------- next part -------------- An HTML attachment was scrubbed... URL: From stefano.zampini at gmail.com Thu Nov 14 08:56:52 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Thu, 14 Nov 2024 17:56:52 +0300 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: That is a very old bug! Can you make an MR to just call PetscPowScalar in a loop here https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/vec/vec/utils/projection.c*L1022__;Iw!!G_uCfscf7eWS!bbSdSMnU5KpH03jHI7aV5j4WLGQ3yPvxWzR6Lwr14QLy_7EJ2MNT-qhL6J1x6z3vpF6M5GQk9lMQkMoLJ_JNSn1mqwqict4$ ? Thanks Il giorno gio 14 nov 2024 alle ore 17:39 Peder J?rgensgaard Olesen via petsc-users ha scritto: > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to > compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the > face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() > and a loop with PetscPowComplex(). The source designated in the docs > (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to > PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of > 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with > negative entries) raised to any other integer power, the results would not > be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might > be the rationale for VecPow working the way it does. > > Best, > Peder > -- Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Nov 14 09:01:36 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 14 Nov 2024 10:01:36 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: On Thu, Nov 14, 2024 at 9:39?AM Peder J?rgensgaard Olesen via petsc-users < petsc-users at mcs.anl.gov> wrote: > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to > compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the > face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() > and a loop with PetscPowComplex(). The source designated in the docs > (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to > PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of > 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with > negative entries) raised to any other integer power, the results would not > be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might > be the rationale for VecPow working the way it does. > This is indeed wrong. It was coded only for real numbers. We will fix it. Thanks for reporting this, Matt > Best, > Peder > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bNQOcaOJC5gkTat8nR3TNhd8LdtJY9sMS6rBMYVNwUdmQE2UkCPoXt7GmCWMleJs9EAJr_rfIaO2WqqNrQe-$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 14 09:14:13 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 14 Nov 2024 10:14:13 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: <29A8F44F-B393-4FF9-98B1-4685853A066C@petsc.dev> Good question. Looking at the commit logs from 2014 I see move tao vector operations over to Vec directory, fix a couple names and calling sequences So previously, the function was problem-specific for Tao (which did not support complex numbers) used inside the semi-smooth methods where roots of negative numbers should be mapped to infinity (indicating a "bad" domain point). When we merged the Tao and PETSc source code base just blindly copied over the code without realizing it was not a general-purpose VecPow() and that it did not make sense for complex numbers. I should rework it for general use without breaking the use in Tao. Thanks for pointing out the problem. Barry > On Nov 14, 2024, at 9:39?AM, Peder J?rgensgaard Olesen via petsc-users wrote: > > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. > > Best, > Peder -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Thu Nov 14 09:29:17 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 14 Nov 2024 10:29:17 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: <759E86BE-0061-46A1-B8BA-2A3AE57FCC8E@petsc.dev> I see that currently VecPow is used only in a small number of places including: ksp/ksp/utils/lmvm/diagbrdn/diagbrdn.c: PetscCall(VecPow(ldb->U, ldb->beta - 1)); I am unsure if the usage here requires the special handling of negative numbers. I was wrong and it is not used in the semi-smooth code, that access the vector elements directly. If could be we can strip out all the special infinity cases completely. Barry > On Nov 14, 2024, at 9:56?AM, Stefano Zampini wrote: > > That is a very old bug! Can you make an MR to just call PetscPowScalar in a loop here https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/blob/main/src/vec/vec/utils/projection.c*L1022__;Iw!!G_uCfscf7eWS!b2Nfr7i01ZnyfVjweWqi87it8hcCDv0s6MtIsQhmSU8s4dq4jBPi-Ca87RRTwS20Srwyh9wQdhXcsbR6MVC8WSU$ ? > > Thanks > > Il giorno gio 14 nov 2024 alle ore 17:39 Peder J?rgensgaard Olesen via petsc-users > ha scritto: >> Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. 
From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. >> >> That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). >> >> While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. >> >> Best, >> Peder > > > > -- > Stefano -------------- next part -------------- An HTML attachment was scrubbed... URL: From liw23 at rpi.edu Thu Nov 14 14:21:19 2024 From: liw23 at rpi.edu (Li, Weichao) Date: Thu, 14 Nov 2024 20:21:19 +0000 Subject: [petsc-users] Fail to install petsc4py with CUDA Message-ID: Hi, thanks for your help, I want to use petsc4py with CUDA follow the instructions from https://urldefense.us/v3/__https://github.com/caidao22/pnode?tab=readme-ov-file__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RAjXGkjY$ git clone https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RzgYyUHQ$ cd petsc ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py If I do not use CUDA it works, if I use CUDA ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py --with-cuda=1 Then make check, there has some errors and when I run my code get the error. Cannnot import PETSc correctly. I attach the make.log and configue.log. Thanks. Traceback (most recent call last): File "/opt/dino/share/DINo_parallel_fabric/train.py", line 85, in from pnode import petsc_adjoint as odeint File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/__init__.py", line 3, in from . import petsc_adjoint File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/petsc_adjoint.py", line 6, in from petsc4py import PETSc File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/PETSc.py", line 4, in PETSc = ImportPETSc(ARCH) File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 33, in ImportPETSc return Import('petsc4py', 'PETSc', path, arch) File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 100, in Import module = import_module(pkg, name, path, arch) File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 77, in import_module module = importlib.util.module_from_spec(spec) ImportError: /opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/libpetsc.so.3.022: undefined symbol: cusparseSpMV_preprocess, version libcusparse.so.12 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: make.log Type: text/x-log Size: 169782 bytes Desc: make.log URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: configure.log Type: text/x-log Size: 1626964 bytes Desc: configure.log URL: From balay.anl at fastmail.org Thu Nov 14 14:27:56 2024 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 14 Nov 2024 14:27:56 -0600 (CST) Subject: [petsc-users] Fail to install petsc4py with CUDA In-Reply-To: References: Message-ID: This issue is also posted at https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/issues/1672__;!!G_uCfscf7eWS!Y0YGMtJ1zE_1JyNSpmN0S6SSGzQrjzQt9diycTWSwIjBjne7KZKf1UK0SguHuz-MHWJB5z8otU7tmRb5zkw3wwcn2Yo$ Lets continue follow-up on the issue tracker - not the mailing list. Satish On Thu, 14 Nov 2024, Li, Weichao wrote: > Hi, thanks for your help, I want to use petsc4py with CUDA follow the instructions from https://urldefense.us/v3/__https://github.com/caidao22/pnode?tab=readme-ov-file__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RAjXGkjY$ > > git clone https://urldefense.us/v3/__https://gitlab.com/petsc/petsc.git__;!!G_uCfscf7eWS!bQq3ISq79y5ZF67ko1BF1T_P37kAaWIzpwJSfJR_K4OWgj3IKF1qIcQiTYBoBWi9zBqQiTWXny2RzgYyUHQ$ > cd petsc > ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py > > If I do not use CUDA it works, if I use CUDA > ./configure PETSC_ARCH=arch-linux-opt --with-debugging=0 --download-petsc4py --with-cuda=1 > > > Then make check, there has some errors and when I run my code get the error. Cannnot import PETSc > correctly. I attach the make.log and configue.log. Thanks. > > > Traceback (most recent call last): > File "/opt/dino/share/DINo_parallel_fabric/train.py", line 85, in > from pnode import petsc_adjoint as odeint > File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/__init__.py", line 3, in > from . import petsc_adjoint > File "/opt/Dino_parallel/lib/python3.8/site-packages/pnode/petsc_adjoint.py", line 6, in > from petsc4py import PETSc > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/PETSc.py", line 4, in > PETSc = ImportPETSc(ARCH) > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 33, in ImportPETSc > return Import('petsc4py', 'PETSc', path, arch) > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 100, in Import > module = import_module(pkg, name, path, arch) > File "/opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/petsc4py/lib/__init__.py", line 77, in import_module > module = importlib.util.module_from_spec(spec) > ImportError: /opt/dino/share/DINo_parallel_fabric/petsc/arch-linux-opt/lib/libpetsc.so.3.022: undefined symbol: cusparseSpMV_preprocess, version libcusparse.so.12 > > > From diegomagela at usp.br Thu Nov 14 15:40:58 2024 From: diegomagela at usp.br (Diego Magela Lemos) Date: Thu, 14 Nov 2024 18:40:58 -0300 Subject: [petsc-users] Steps to solve time step second-order differential problem Message-ID: For a second-order differential problem defined as M u_tt + C u_t + K u = F(t) what would be the steps to solve this problem using TS? P.S.: I already have the matrices M, C, and K implemented as Mat objects. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From jed at jedbrown.org Thu Nov 14 16:22:13 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 14 Nov 2024 15:22:13 -0700 Subject: [petsc-users] Steps to solve time step second-order differential problem In-Reply-To: References: Message-ID: <87ikspe9re.fsf@jedbrown.org> You can either rewrite as a first-order system or use TSSetI2Function (see examples) with TSALPHA2. https://urldefense.us/v3/__https://petsc.org/release/manualpages/TS/TSSetI2Function/*tsseti2function__;Iw!!G_uCfscf7eWS!a0HRe5m1TFfnxS6ZDGPkzSdsxgzubcDgMzFOPD-04IN5oE7gYM8QZnR9wcBCLVAWC47fej6mbRTAVAodtGc$ Diego Magela Lemos via petsc-users writes: > For a second-order differential problem defined as > > M u_tt + C u_t + K u = F(t) > > what would be the steps to solve this problem using TS? > > P.S.: I already have the matrices M, C, and K implemented as Mat objects. From bsmith at petsc.dev Fri Nov 15 11:19:25 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 15 Nov 2024 12:19:25 -0500 Subject: [petsc-users] VecPow clarification In-Reply-To: References: Message-ID: https://urldefense.us/v3/__https://gitlab.com/petsc/petsc/-/merge_requests/8012__;!!G_uCfscf7eWS!Y4coa74DKQZeRI4FvkVAj8dOc1biQzDAjzXBDqgLJKStN2JFpB7w-WYstcURUd-AykeTfuH7q6YeUPfb3eIQD2s$ > On Nov 14, 2024, at 9:39?AM, Peder J?rgensgaard Olesen via petsc-users wrote: > > Given a vector containing roots of unity, v[i] = exp(i*k*x[i]) I wanted to compute the vector u[i]=exp(i*n*k*x[i]), for some real number n. From the face of it this should be easily achieved with VecPow, as u[i] = v[i]^n. > > That didn't work as expected, though I got around it using VecGetArray() and a loop with PetscPowComplex(). The source designated in the docs (src/vec/vec/utils/projection.c) reveals that VecPow() maps v[i] to PETSC_INFINITY when the PetscRealPart(v[i]) < 0, unless the power is any of 0, ?0.5, ?1 or ?2. Even in the simple case of a purely real vector (with negative entries) raised to any other integer power, the results would not be what one might reasonably expect from the description of VecPow(). > > While I do have a solution suiting my need, I'm left wondering what might be the rationale for VecPow working the way it does. > > Best, > Peder -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Nov 19 03:56:18 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 19 Nov 2024 09:56:18 +0000 Subject: [petsc-users] Ghost particles for DMSWARM (or similar) In-Reply-To: <55056A2B-85E9-4896-9B4B-869A14F4B2C8@us.es> References: <8FBAC7A5-B6AE-4B21-8FEB-52BE1C04A265@us.es> <1B9B1277-9566-444C-9DA8-7ED17684FE01@us.es> <24337E11-33D2-4FFC-89E2-12520AD487FF@us.es> <681E96BD-62A4-4566-A13E-E034B2F19D54@us.es> <7698089E-0909-429F-9E89-9D1AD636ACBF@us.es> <562B2CA0-7462-4AF2-AAF4-E44DDD00B222@us.es> <5C293345-E026-436B-B4D0-E5DC109A0701@us.es> <55056A2B-85E9-4896-9B4B-869A14F4B2C8@us.es> Message-ID: <98E34D64-80AD-4DF8-BCD1-94E8F29FB3DB@us.es> Dear all: Just to wrap this thread up. 
The easiest way to update any variable of the ghost particles is using the following lines of code: PetscCall(VecCreateGhostWithArray(PETSC_COMM_WORLD, n_dof_local, PETSC_DETERMINE, n_dof_ghost, idx_dof_ghost, X_ptr, &X)); PetscCall(VecGhostUpdateBegin(X, INSERT_VALUES, SCATTER_FORWARD)); PetscCall(VecGhostUpdateEnd(X, INSERT_VALUES, SCATTER_FORWARD)); Where X_ptr is the local pointer coming from: DMSwarmGetField Best, Miguel On 1 Oct 2024, at 19:42, MIGUEL MOLINOS PEREZ wrote: Wow, thank you Dave that?s awesome, let me know if there's anything I can help you with! Miguel On Oct 1, 2024, at 7:20?PM, Dave May wrote: On Tue, 1 Oct 2024 at 08:56, MIGUEL MOLINOS PEREZ > wrote: Hi Dave, Would something like that work? Yes, this should work! Any idea on where to look so I can try to implement it myself? I am adding support for this right now. Best, Miguel On Oct 1, 2024, at 5:22?PM, Dave May > wrote: Hi Miguel, On Tue 1. Oct 2024 at 07:56, MIGUEL MOLINOS PEREZ > wrote: Thank you Matt, it works! The implementation is straightforward: - 1? Define the paddle regions using DMGetLocalBoundingBox with the background DMDA mesh as an auxiliary mesh for the domain-partitioning. - 2? Create an integer to count the local number of particles to be used as ghost particle for other processors (N_ghost). One particle can be counted more than one time. At the same time, fill two arrays: - one with the index of the "main particle? (local particle), - and the other with the target rank of the "main particle?. - 3? Create the new particles using DMSwarmAddNPoints - 4? Fill the new particles with the information of the ?main particle? but set the internal variable DMSwarmField_rank with the target rank. - 5? Call DMSwarmMigrate(*,PETSC_TRUE). Therefore, we send the ghost particles to the corresponding processors and we delete them from the original processor. - 6? Do stuff? - 7? Delete ghost particles. This is very easy, we just have to call DMSwarmRemovePoint N_ghost times. I think this can be easily implemented as closed routine for the DMSwarm class. The remaining question is: how to do the communication between the ?original" particle and the ghost particles? For instance, if we update some particle variable (locally) inside of a SNES context, this same variable should be updated in the ghost particles at the other processors. I think what you are asking about is an operation similar to VecGhostUpdate{Begin,End}(). In the case of a DMSwarm I?m not sure how to define the InsertMode = ADD_VALUES? Some swarm fields do not make sense to be added. INSERT_VALUES is fine. One solution might be to have something like this DMSwarmCollectViewUpdateGhostOwners(DM dm, InsertMode mode, PetscInt nfields, const char *fieldNames[]); where one can specify the insert mode and the fields on which the insert mode will apply. Would something like that work? Cheers, Dave PS: Hope this helps someone in the future :-) On Sep 27, 2024, at 10:50?AM, MIGUEL MOLINOS PEREZ > wrote: Thank you Matt, let me give it try. Miguel On Sep 27, 2024, at 3:44?AM, Matthew Knepley > wrote: On Thu, Sep 26, 2024 at 7:18?PM MIGUEL MOLINOS PEREZ > wrote: I see, you mean: Create the ghost particles at the local cell with the same properties as particle 1 (duplicate the original particle) but different value DMSwarmField_rank. Then, call DMSwarmMigrate(*,PETSC_FALSE) so we do the migration and delete the local copies of the particle 1. Right? Yep. I think it will work, from what I know about BASIC. 
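As a rough sketch of that duplicate-and-retag step (swarm, neighbour_rank, and the index of the source particle are placeholders here, and copying the user-registered fields of the source particle into the new slot is elided):

PetscInt *rank_field, np_local;

PetscCall(DMSwarmAddNPoints(swarm, 1));             /* append one ghost copy */
PetscCall(DMSwarmGetLocalSize(swarm, &np_local));
PetscCall(DMSwarmGetField(swarm, DMSwarmField_rank, NULL, NULL, (void **)&rank_field));
rank_field[np_local - 1] = neighbour_rank;          /* send the copy to this rank */
PetscCall(DMSwarmRestoreField(swarm, DMSwarmField_rank, NULL, NULL, (void **)&rank_field));
/* ... fill the remaining fields of slot np_local-1 from the source particle ... */
PetscCall(DMSwarmMigrate(swarm, PETSC_TRUE));       /* PETSC_TRUE removes the sent copies locally */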
Thanks, Matt Thanks, Miguel On Sep 26, 2024, at 11:09?PM, Matthew Knepley > wrote: On Thu, Sep 26, 2024 at 11:20?AM MIGUEL MOLINOS PEREZ > wrote: Thank you Matt. Okey, let me have a careful look to the DMSwarmMigrate_Push_Basic implementation to see if there is some workaround. The idea of adding new particles is interesting. However, in that case, we need to initialize the new (ghost) particles using the fields of the ?real? particle, right? This can be done using something like: VecGhostUpdateBegin(Vec globalout,InsertMode ADD_VALUES, ScatterMode SCATTER_REVERSE); VecGhostUpdateEnd(Vec globalout,InsertMode ADD_VALUES, ScatterMode SCATTER_REVERSE); for the particle fields (?). I think we can just copy from the local particle. For example, suppose I decide that particle 1 should go to rank 5, 12, and 27. Then I first set p1.rank = 5, then I add two new particles with the same values as particle 1, but with rank = 12 and 27. Then when I call migrate, it will move these three particles to the correct processes, and delete the original particles and the copies from the local set. Thanks, Matt Thanks, Miguel On Sep 26, 2024, at 3:53?PM, Matthew Knepley > wrote: On Thu, Sep 26, 2024 at 6:31?AM MIGUEL MOLINOS PEREZ > wrote: Hi Matt et al, I?ve been working on the scheme that you proposed to create ghost particles (atoms in my case), and it works! With a couple of caveats: -1? In general the overlap particles will be migrate from their own rank to more than one neighbor rank, this is specially relevant for those located close to the corners. Therefore, you'll need to call DMSwarmMigrate several times (27 times for 3D cells), during the migration process. That is terrible. Let's just fix DMSwarmMigrate to have a mode that sends the particle to all overlapping neighbors at once. It can't be that hard. -2? You need to set DMSWARM_MIGRATE_BASIC. Otherwise the proposed algorithm will not work at all! Oh, I should have thought of that. Sorry. I can help code up that extension. Can you take a quick look at the BASIC code? Right now, we just use the rank attached to the particle to send it. We could have an arrays of ranks, but that seems crazy, and would blow up particle storage. How about just adding new particles with the other ranks right before migration? Thanks, Matt Hope this helps to other folks! I have a follow-up question about periodic bcc on this context, should I open a new thread of keep posting here? Thanks, Miguel On Aug 7, 2024, at 4:22?AM, MIGUEL MOLINOS PEREZ > wrote: Thanks Matt, I think I'll start by making a small program as a proof of concept. Then, if it works I'll implement it in my code and I'll be happy to share it too :-) Miguel On Aug 4, 2024, at 3:30?AM, Matthew Knepley > wrote: On Fri, Aug 2, 2024 at 7:15?PM MIGUEL MOLINOS PEREZ > wrote: Thanks again Matt, that makes a lot more sense !! Just to check that we are on the same page. You are saying: 1. create a field define a field called "owner rank" for each particle. 2. Identify the phantom particles and modify the internal variable defined by the DMSwarmField_rank variable. 3. Call DMSwarmMigrate(*,PETSC_FALSE), do the calculations using the new local vector including the ghost particles. 4. Then, once the calculations are done, rename the DMSwarmField_rank variable using the "owner rank" variable and call DMSwarmMigrate(*,PETSC_FALSE) once again. I don't think we need this last step. We can just remove those ghost particles for the next step I think. 
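(The removal itself is cheap: assuming the ghost copies sit at the tail of the local list and their count n_ghost is tracked by the application, something like the following sketch should do.)

for (PetscInt k = 0; k < n_ghost; ++k) PetscCall(DMSwarmRemovePoint(swarm)); /* each call drops the last local point */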
Thanks, Matt Thank you, Miguel On Aug 2, 2024, at 5:33?PM, Matthew Knepley > wrote: On Fri, Aug 2, 2024 at 11:15?AM MIGUEL MOLINOS PEREZ > wrote: Thank you Matt for your time, What you describe seems to me the ideal approach. 1) Add a particle field 'ghost' that identifies ghost vs owned particles. I think it needs options OWNED, OVERLAP, and GHOST This means, locally, I need to allocate Nlocal + ghost particles (duplicated) for my model? I would do it another way. I would allocate the particles with no overlap and set them up. Then I would identify the halo particles, mark them as OVERLAP, call DMSwarmMigrate(), and mark the migrated particles as GHOST, then unmark the OVERLAP particles. Shoot! That marking will not work since we cannot tell the difference between particles we received and particles we sent. Okay, instead of the `ghost` field we need an `owner rank` field. So then we 1) Setup the non-overlapping particles 2) Identify the halo particles 3) Change the `rank`, but not the `owner rank` 4) Call DMSwarmMigrate() Now we can identify ghost particles by the `owner rank` If that so, how to do the communication between the ghost particles living in the rank i and their ?real? counterpart in the rank j. Algo, as an alternative, what about: 1) Use an IS tag which contains, for each rank, a list of the global index of the neighbors particles outside of the rank. 2) Use VecCreateGhost to create a new vector which contains extra local space for the ghost components of the vector. 3) Use VecScatterCreate, VecScatterBegin, and VecScatterEnd to do the transference of data between a vector obtained with DMSwarmCreateGlobalVectorFromField 4) Do necessary computations using the vectors created with VecCreateGhost. This is essentially what Migrate() does. I was trying to reuse the code. Thanks, Matt Thanks, Miguel On Aug 2, 2024, at 8:58?AM, Matthew Knepley > wrote: On Thu, Aug 1, 2024 at 4:40?PM MIGUEL MOLINOS PEREZ > wrote: This Message Is From an External Sender This message came from outside your organization. Dear all, I am implementing a Molecular Dynamics (MD) code using the DMSWARM interface. In the MD simulations we evaluate on each particle (atoms) some kind of scalar functional using data from the neighbouring atoms. My problem lies in the parallel implementation of the model, because sometimes, some of these neighbours lie on a different processor. This is usually solved by using ghost particles. A similar approach (with nodes instead) is already implemented for other PETSc mesh structures like DMPlexConstructGhostCells. Unfortunately, I don't see this kind of constructs for DMSWARM. Am I missing something? I this could be done by applying a buffer region by exploiting the background DMDA mesh that I already use to do domain decomposition. Then using the buffer region of each cell to locate the ghost particles and finally using VecCreateGhost. Is this feasible? Or is there an easier approach using other PETSc functions. This is feasible, but it would be good to develop a set of best practices, since we have been mainly focused on the case of non-redundant particles. Here is how I think I would do what you want. 1) Add a particle field 'ghost' that identifies ghost vs owned particles. I think it needs options OWNED, OVERLAP, and GHOST 2) At some interval identify particles that should be sent to other processes as ghosts. I would call these "overlap particles". The determination seems application specific, so I would leave this determination to the user right now. 
We do two things to these particles a) Mark chosen particles as OVERLAP b) Change rank to process we are sending to 3) Call DMSwarmMigrate with PETSC_FALSE for the particle deletion flag 4) Mark OVERLAP particles as GHOST when they arrive There is one problem in the above algorithm. It does not allow sending particles to multiple ranks. We would have to do this in phases right now, or make a small adjustment to the interface allowing replication of particles when a set of ranks is specified. THanks, Matt Thank you, Miguel -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bdHPrpwBJb1e2EwnUBsc12AsmFX-skqxMCtvkOWcXrM01q6SnXY4rnaqzZaUWI0rbyp3pCAiM_O5KKZZ1hWeSA$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Tue Nov 19 05:14:33 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Tue, 19 Nov 2024 11:14:33 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions Message-ID: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? 
Thanks, Miguel [cid:534265d3-3f18-41cd-8006-539cb06751f9 at eurprd01.prod.exchangelabs.com] -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: Screenshot 2024-11-19 at 10.56.36.png Type: image/png Size: 233037 bytes Desc: Screenshot 2024-11-19 at 10.56.36.png URL: From zhjx960203 at gmail.com Tue Nov 19 03:35:10 2024 From: zhjx960203 at gmail.com (=?UTF-8?B?5byg5bO75rqq?=) Date: Tue, 19 Nov 2024 17:35:10 +0800 Subject: [petsc-users] Why can't I save a diagonal matrix into a binary file via petsc4py? Message-ID: My version is 3.22.0 from "petsc4py.__version__" Here is the code: ```python from petsc4py import PETSc import numpy as np vec1 = PETSc.Vec().createWithArray(np.arange(30)) mat1 = PETSc.Mat().createDiagonal(vec1) mat1.assemble() viewer = PETSc.Viewer().createBinary("abcde.bin", "w") mat1.view(viewer) viewer.destroy() ``` I find it runs without any error, but no file generated However, if I convert mat1 to AIJ, it can be saved successfully -------------- next part -------------- An HTML attachment was scrubbed... URL: From jroman at dsic.upv.es Tue Nov 19 10:59:59 2024 From: jroman at dsic.upv.es (Jose E. Roman) Date: Tue, 19 Nov 2024 16:59:59 +0000 Subject: [petsc-users] Why can't I save a diagonal matrix into a binary file via petsc4py? In-Reply-To: References: Message-ID: The implementation is here: https://urldefense.us/v3/__https://petsc.org/release/src/mat/impls/diagonal/diagonal.c.html*MatView_Diagonal__;Iw!!G_uCfscf7eWS!ej9LRPwh7-zzKBJDEUfh0gSmHxGO7nitFAZJGu1s7mi0htPg8OUAqyV04o_0YYgqOCkjSAKzvEij27AKN9-3Mzk2$ You can see that it only handles the ascii case. It should also check the case of binary viewer, then call VecView() or convert to MATAIJ and call MatView(). Do you want to contribute a merge request? https://urldefense.us/v3/__https://petsc.org/release/developers/contributing/__;!!G_uCfscf7eWS!ej9LRPwh7-zzKBJDEUfh0gSmHxGO7nitFAZJGu1s7mi0htPg8OUAqyV04o_0YYgqOCkjSAKzvEij27AKN6f4qvXn$ Jose > El 19 nov 2024, a las 10:35, ??? escribi?: > > My version is 3.22.0 from "petsc4py.__version__" > Here is the code: > ```python > from petsc4py import PETSc > import numpy as np > > vec1 = PETSc.Vec().createWithArray(np.arange(30)) > mat1 = PETSc.Mat().createDiagonal(vec1) > mat1.assemble() > viewer = PETSc.Viewer().createBinary("abcde.bin", "w") > mat1.view(viewer) > viewer.destroy() > ``` > > I find it runs without any error, but no file generated > However, if I convert mat1 to AIJ, it can be saved successfully From bsmith at petsc.dev Tue Nov 19 11:55:40 2024 From: bsmith at petsc.dev (Barry Smith) Date: Tue, 19 Nov 2024 12:55:40 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> Message-ID: <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry > On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: > > Dear all: > > It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. > > I have used this together with the DMSWarm discretization. 
And as you can see the number of particles per rank is not evenly distributed: > 210 420 366 732 420 840 732 1464 > > Am I missing something? > > Thanks, > Miguel > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 04:49:03 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 10:49:03 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> Message-ID: <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 06:06:24 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 12:06:24 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> Message-ID: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: atoms-3D.cpp Type: application/octet-stream Size: 26402 bytes Desc: atoms-3D.cpp URL: -------------- next part -------------- A non-text attachment was scrubbed... 
Name: Mg-hcp-cube-x17-x10-x10.dump Type: application/octet-stream Size: 436301 bytes Desc: Mg-hcp-cube-x17-x10-x10.dump URL: From bsmith at petsc.dev Wed Nov 20 11:36:08 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 12:36:08 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> Message-ID: <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry > On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: > > Dear Barry, > > Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. > > Thanks, > Miguel > > > > > >> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >> >> Hi Bary: >> >> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. 
>> >> Thanks, >> Miguel >> >>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>> >>> >>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>> >>> Can you please send a reproducible example? >>> >>> Thanks >>> >>> Barry >>> >>> >>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>> >>>> Dear all: >>>> >>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>> >>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>> 210 420 366 732 420 840 732 1464 >>>> >>>> Am I missing something? >>>> >>>> Thanks, >>>> Miguel >>>> >>>> >>>> >>> >> > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 11:48:47 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 17:48:47 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> Message-ID: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. 
The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 20 11:52:30 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 12:52:30 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> Message-ID: <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> What do you mean by discretization size, and how do I see it in the code? Barry > On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: > > Sorry, I meant that the discretisation size is not constant across the edges of the cube. > > Miguel > >> On 20 Nov 2024, at 18:36, Barry Smith wrote: >> >> >> I am sorry, I don't understand the problem. 
When I run by default with -da_view I get >> >> Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 >> Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 >> Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 >> Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 >> Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 >> Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 >> >> which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 >> >> When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get >> >> $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view >> DM Object: 8 MPI processes >> type: da >> Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 >> Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 >> Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 >> Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 >> Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 >> Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 >> Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 >> >> so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. >> >> Could you please let me know what the problem is that I should be seeing. >> >> Barry >> >> >>> On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: >>> >>> Dear Barry, >>> >>> Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. >>> >>> Thanks, >>> Miguel >>> >>> >>> >>> >>> >>>> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >>>> >>>> Hi Bary: >>>> >>>> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. >>>> >>>> Thanks, >>>> Miguel >>>> >>>>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>>>> >>>>> >>>>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>>>> >>>>> Can you please send a reproducible example? >>>>> >>>>> Thanks >>>>> >>>>> Barry >>>>> >>>>> >>>>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>> >>>>>> Dear all: >>>>>> >>>>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? 
using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>>>> >>>>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>>>> 210 420 366 732 420 840 732 1464 >>>>>> >>>>>> Am I missing something? >>>>>> >>>>>> Thanks, >>>>>> Miguel >>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 11:56:27 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 17:56:27 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> Message-ID: <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. 
The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 20 12:54:26 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 13:54:26 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> Message-ID: <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. 
Is this related to DMSWARM in particular? > On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: > > I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. > > This is not in the code, I just impose the number of elements per edge. > > Thank you, > Miguel > >> On 20 Nov 2024, at 18:52, Barry Smith wrote: >> >> >> What do you mean by discretization size, and how do I see it in the code? >> >> Barry >> >> >>> On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: >>> >>> Sorry, I meant that the discretisation size is not constant across the edges of the cube. >>> >>> Miguel >>> >>>> On 20 Nov 2024, at 18:36, Barry Smith wrote: >>>> >>>> >>>> I am sorry, I don't understand the problem. When I run by default with -da_view I get >>>> >>>> Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 >>>> Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 >>>> Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 >>>> Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 >>>> Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 >>>> Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 >>>> >>>> which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 >>>> >>>> When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get >>>> >>>> $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view >>>> DM Object: 8 MPI processes >>>> type: da >>>> Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 >>>> Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 >>>> Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 >>>> Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 >>>> Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 >>>> Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 >>>> Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 >>>> >>>> so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. >>>> >>>> Could you please let me know what the problem is that I should be seeing. 
>>>> >>>> Barry >>>> >>>> >>>>> On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: >>>>> >>>>> Dear Barry, >>>>> >>>>> Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. >>>>> >>>>> Thanks, >>>>> Miguel >>>>> >>>>> >>>>> >>>>> >>>>> >>>>>> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >>>>>> >>>>>> Hi Bary: >>>>>> >>>>>> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. >>>>>> >>>>>> Thanks, >>>>>> Miguel >>>>>> >>>>>>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>>>>>> >>>>>>> >>>>>>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>>>>>> >>>>>>> Can you please send a reproducible example? >>>>>>> >>>>>>> Thanks >>>>>>> >>>>>>> Barry >>>>>>> >>>>>>> >>>>>>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>>>> >>>>>>>> Dear all: >>>>>>>> >>>>>>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>>>>>> >>>>>>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>>>>>> 210 420 366 732 420 840 732 1464 >>>>>>>> >>>>>>>> Am I missing something? >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Miguel >>>>>>>> >>>>>>>> >>>>>>>> >>>>>>> >>>>>> >>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 13:38:20 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 19:38:20 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> Message-ID: <14500884-FB4B-4872-9E06-207FE6482187@us.es> Yes, I use the vertex (nodes) of the elements. I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. 
Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. 
And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Nov 20 15:56:56 2024 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 20 Nov 2024 16:56:56 -0500 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <14500884-FB4B-4872-9E06-207FE6482187@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> Message-ID: <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> > On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: > > Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? > > I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. > > Thanks, > Miguel > > > >> On 20 Nov 2024, at 19:54, Barry Smith wrote: >> >> >> Are you considering your degrees of freedom as vertex or cell-centered? >> >> Say three "elements" per edge. >> >> If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic >> >> If cell-centered then each cell has width 1/3 for both periodic and not periodic >> >> but in both cases you can think of the discretization size as constant along the whole cube edge. >> >> Is this related to DMSWARM in particular? >> >>> On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: >>> >>> I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. >>> >>> This is not in the code, I just impose the number of elements per edge. >>> >>> Thank you, >>> Miguel >>> >>>> On 20 Nov 2024, at 18:52, Barry Smith wrote: >>>> >>>> >>>> What do you mean by discretization size, and how do I see it in the code? >>>> >>>> Barry >>>> >>>> >>>>> On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: >>>>> >>>>> Sorry, I meant that the discretisation size is not constant across the edges of the cube. >>>>> >>>>> Miguel >>>>> >>>>>> On 20 Nov 2024, at 18:36, Barry Smith wrote: >>>>>> >>>>>> >>>>>> I am sorry, I don't understand the problem. 
When I run by default with -da_view I get >>>>>> >>>>>> Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 >>>>>> Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 >>>>>> Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 >>>>>> Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 >>>>>> Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 >>>>>> Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 >>>>>> >>>>>> which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 >>>>>> >>>>>> When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get >>>>>> >>>>>> $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view >>>>>> DM Object: 8 MPI processes >>>>>> type: da >>>>>> Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 >>>>>> Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 >>>>>> Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 >>>>>> Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 >>>>>> Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 >>>>>> Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 >>>>>> Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 >>>>>> X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 >>>>>> >>>>>> so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. >>>>>> >>>>>> Could you please let me know what the problem is that I should be seeing. >>>>>> >>>>>> Barry >>>>>> >>>>>> >>>>>>> On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>>> >>>>>>> Dear Barry, >>>>>>> >>>>>>> Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. >>>>>>> >>>>>>> Thanks, >>>>>>> Miguel >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>> >>>>>>>> On 20 Nov 2024, at 11:48, Miguel Molinos wrote: >>>>>>>> >>>>>>>> Hi Bary: >>>>>>>> >>>>>>>> I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. 
>>>>>>>> >>>>>>>> Thanks, >>>>>>>> Miguel >>>>>>>> >>>>>>>>> On 19 Nov 2024, at 18:55, Barry Smith wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>> I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC >>>>>>>>> >>>>>>>>> Can you please send a reproducible example? >>>>>>>>> >>>>>>>>> Thanks >>>>>>>>> >>>>>>>>> Barry >>>>>>>>> >>>>>>>>> >>>>>>>>>> On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: >>>>>>>>>> >>>>>>>>>> Dear all: >>>>>>>>>> >>>>>>>>>> It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. >>>>>>>>>> >>>>>>>>>> I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: >>>>>>>>>> 210 420 366 732 420 840 732 1464 >>>>>>>>>> >>>>>>>>>> Am I missing something? >>>>>>>>>> >>>>>>>>>> Thanks, >>>>>>>>>> Miguel >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 20 16:40:26 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 20 Nov 2024 22:40:26 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> Message-ID: <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> I see? that might be the problem. I?ll check it tomorrow. Thank you! Miguel On 20 Nov 2024, at 22:57, Barry Smith wrote: ? On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? 
Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. 
I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From 12431140 at mail.sustech.edu.cn Thu Nov 21 06:11:32 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 20:11:32 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES Message-ID: I am using the Newton iteration to solve a nonlinear 1D heat equation problem by using FEM. I attached my source code named "SNES_heat.cpp"  when I run the code   0 SNES Function norm 1.206289245288e+01   1 SNES Function norm 7.128802192789e+00   2 SNES Function norm 6.608812909525e+00 you can find that it only iterate 3 steps, and then do all the function evaluation and finally just stop the program.  I think it is not reasonble. I check my code, it is correct if I set it as a linear problem. it means my Jacobian and Residual function is correct. But when I set it as a nonlinear, the residual seems reduces as not expected.  I doubt that whether my understanding of the newton iteration is different from SNES's newton iteration process. David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?   -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: SNES_heat.cpp Type: application/octet-stream Size: 35836 bytes Desc: not available URL: From jed at jedbrown.org Thu Nov 21 09:05:27 2024 From: jed at jedbrown.org (Jed Brown) Date: Thu, 21 Nov 2024 08:05:27 -0700 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: <87a5dsbpag.fsf@jedbrown.org> You should add VecZeroEntries(f) near the top of your FormFunction (it's currently accumulating into whatever was there last) and MatZeroEntries(B) to FormJacobian. I reduced to nElem = 5 for ease of viewing. With these changes, I see quadratic convergence but the problem is still nonlinear. To explore further, consider using these diagnostics ./SNES_heat -{snes,ksp}_monitor -{snes,ksp}_converged_reason -snes_linesearch_monitor -ksp_view_mat with and without -snes_fd. For readability, I would suggest consistency in "u" vs "x". "David Jiawei LUO LIANG" <12431140 at mail.sustech.edu.cn> writes: > I am using the Newton iteration to solve a nonlinear 1D heat equation problem by using FEM. > > > I attached my source code named "SNES_heat.cpp"  > > > when I run the code > >   0 SNES Function norm 1.206289245288e+01 > >   1 SNES Function norm 7.128802192789e+00 > >   2 SNES Function norm 6.608812909525e+00 > > > > you can find that it only iterate 3 steps, and then do all the function evaluation and finally just stop the program.  > > > I think it is not reasonble. I check my code, it is correct if I set it as a linear problem. it means my Jacobian and Residual function is correct. > > > But when I set it as a nonlinear, the residual seems reduces as not expected.  > > > I doubt that whether my understanding of the newton iteration is different from SNES's newton iteration process. > > > > > > > > > David Jiawei LUO LIANG > > > > ??????/??/???/2024 > > > > ?????????????1088? 
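A minimal sketch of the zeroing fix Jed describes above, written against the standard SNES callback signatures. FormFunction, FormJacobian, and AppCtx are placeholder names and the assembly bodies are elided; this is illustrative, not the actual contents of SNES_heat.cpp.

```c
#include <petscsnes.h>

typedef struct {
  PetscInt nelem; /* placeholder for whatever problem data the real code carries */
} AppCtx;

/* Residual callback: zero f before accumulating element contributions,
   otherwise each call adds on top of whatever the previous call left behind. */
PetscErrorCode FormFunction(SNES snes, Vec u, Vec f, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(VecZeroEntries(f));
  /* ... loop over elements, VecSetValues(f, ..., ADD_VALUES) ... */
  PetscCall(VecAssemblyBegin(f));
  PetscCall(VecAssemblyEnd(f));
  PetscFunctionReturn(PETSC_SUCCESS);
}

/* Jacobian callback: the same issue applies to the assembled matrix B. */
PetscErrorCode FormJacobian(SNES snes, Vec u, Mat J, Mat B, void *ctx)
{
  PetscFunctionBeginUser;
  PetscCall(MatZeroEntries(B));
  /* ... loop over elements, MatSetValues(B, ..., ADD_VALUES) ... */
  PetscCall(MatAssemblyBegin(B, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(B, MAT_FINAL_ASSEMBLY));
  if (J != B) {
    PetscCall(MatAssemblyBegin(J, MAT_FINAL_ASSEMBLY));
    PetscCall(MatAssemblyEnd(J, MAT_FINAL_ASSEMBLY));
  }
  PetscFunctionReturn(PETSC_SUCCESS);
}
```

The diagnostics mentioned later in this thread (-snes_fd and -snes_test_jacobian) are a quick way to confirm whether a hand-coded Jacobian like this one is consistent with the residual.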
> > > > >   From bsmith at petsc.dev Thu Nov 21 09:19:51 2024 From: bsmith at petsc.dev (Barry Smith) Date: Thu, 21 Nov 2024 10:19:51 -0500 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: <6A4CB22C-6D05-497C-9A57-E5AB8B7C073F@petsc.dev> Start with https://urldefense.us/v3/__https://petsc.org/release/faq/*why-is-newton-s-method-snes-not-converging-or-converges-slowly__;Iw!!G_uCfscf7eWS!au7FVXP89CeLcvEPaqyMevQ8XXBThUgOilXB2BskyYlAyPKwckhOPoT_TGVv_IKuZQTSFDRMPe3F09zTuhtno2k$ Next use -snes_test_jacobian - compare the user provided Jacobian with one computed via finite differences to check for errors. If a threshold is given, display only those entries whose difference is greater than the threshold. -snes_test_jacobian_view - display the user provided Jacobian, the finite difference Jacobian and the difference between them to help users detect the location of errors in the user provided Jacobian. There are many, many reasons Newton can fail, usually they are due to bugs in the function evaluation or Jacobian evaluation. Occasionly they are due to it being a very difficult non-linear problem. You first need to use the tools above to verify there are no bugs anywhere. Barry > On Nov 21, 2024, at 7:11?AM, David Jiawei LUO LIANG <12431140 at mail.sustech.edu.cn> wrote: > > I am using the Newton iteration to solve a nonlinear 1D heat equation problem by using FEM. > > I attached my source code named "SNES_heat.cpp" > > when I run the code > 0 SNES Function norm 1.206289245288e+01 > 1 SNES Function norm 7.128802192789e+00 > 2 SNES Function norm 6.608812909525e+00 > > you can find that it only iterate 3 steps, and then do all the function evaluation and finally just stop the program. > > I think it is not reasonble. I check my code, it is correct if I set it as a linear problem. it means my Jacobian and Residual function is correct. > > But when I set it as a nonlinear, the residual seems reduces as not expected. > > I doubt that whether my understanding of the newton iteration is different from SNES's newton iteration process. > > > > > > David Jiawei LUO LIANG > ??????/??/???/2024 > ?????????????1088? > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:16:54 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 23:16:54 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: <87a5dsbpag.fsf@jedbrown.org> References: <87a5dsbpag.fsf@jedbrown.org> Message-ID: Thank you Jed. It works, and the result is identical to the exact solution!  Hope you best! David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?       ------------------ Original ------------------ From:  "Jed Brown" From knepley at gmail.com Thu Nov 21 09:21:29 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Nov 2024 10:21:29 -0500 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: On Thu, Nov 21, 2024 at 8:57?AM David Jiawei LUO LIANG < 12431140 at mail.sustech.edu.cn> wrote: > I am using the Newton iteration to solve a nonlinear 1D heat equation > problem by using FEM. 
> > I attached my source code named "SNES_heat.cpp" > > when I run the code > > 0 SNES Function norm 1.206289245288e+01 > > 1 SNES Function norm 7.128802192789e+00 > > 2 SNES Function norm 6.608812909525e+00 > > you can find that it only iterate 3 steps, and then do all the function > evaluation and finally just stop the program. > > I think it is not reasonble. I check my code, it is correct if I set it as > a linear problem. it means my Jacobian and Residual function is correct. > > But when I set it as a nonlinear, the residual seems reduces as not > expected. > > I doubt that whether my understanding of the newton iteration is different > from SNES's newton iteration process. > Here is what happens with the code as it is: master *:~/Downloads/tmp/Liang$ ./SNES_heat -snes_monitor -ksp_converged_reason -snes_converged_reason -pc_type lu -snes_view -snes_linesearch_monitor pp 1 nElem 10 nqp 2 n_np 11 n_en 2 n_eq 10 qp: 0.57735 -0.57735 wq: 1 1 IEN: 1 2 3 4 5 6 7 8 9 10 2 3 4 5 6 7 8 9 10 11 x_coor: 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 ID: 1 2 3 4 5 6 7 8 9 10 0 0 SNES Function norm 1.206289245288e+01 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: Using full step: fnorm 1.206289245288e+01 gnorm 7.128802192789e+00 1 SNES Function norm 7.128802192789e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: Using full step: fnorm 7.128802192789e+00 gnorm 6.608812909525e+00 2 SNES Function norm 6.608812909525e+00 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: gnorm after quadratic fit 1.265375106867e+01 Line search: Cubic step no good, shrinking lambda, current gnorm 1.328962011911e+01 lambda=1.7500506382162818e-02 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275802797864e+01 lambda=1.7500506382162819e-03 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327920917220e+01 lambda=1.7500506382162821e-04 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275906891232e+01 lambda=1.7500506382162820e-05 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910508109e+01 lambda=1.7500506382162821e-06 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907932147e+01 lambda=1.7500506382162822e-07 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910404018e+01 lambda=1.7500506382162823e-08 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907942556e+01 lambda=1.7500506382162823e-09 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910402977e+01 lambda=1.7500506382162824e-10 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907942660e+01 lambda=1.7500506382162825e-11 Line search: Cubic step no good, shrinking lambda, current gnorm 1.327910402966e+01 lambda=1.7500506382162826e-12 Line search: Cubic step no good, shrinking lambda, current gnorm 1.275907942661e+01 lambda=1.7500506382162828e-13 Line search: unable to find good step length! After 12 tries Line search: fnorm=6.6088129095253478e+00, gnorm=1.2759079426614502e+01, ynorm=5.3714153713436097e-01, minlambda=9.9999999999999998e-13, lambda=1.7500506382162828e-13, initial slope=-4.3676408073108860e+01 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 2 Usually, we suspect that the Jacobian is incorrect in this case. 
Thus we can have it formed automatically, master *:~/Downloads/tmp/Liang$ ./SNES_heat -snes_monitor -ksp_converged_reason -snes_converged_reason -snes_fd -pc_type lu -snes_view -snes_linesearch_monitor pp 1 nElem 10 nqp 2 n_np 11 n_en 2 n_eq 10 qp: 0.57735 -0.57735 wq: 1 1 IEN: 1 2 3 4 5 6 7 8 9 10 2 3 4 5 6 7 8 9 10 11 x_coor: 0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 ID: 1 2 3 4 5 6 7 8 9 10 0 0 SNES Function norm 1.206289245288e+01 Linear solve converged due to CONVERGED_RTOL iterations 1 Line search: Scaling step by 1.837216392007e-47 old ynorm 5.443016970405e+54 Line search: gnorm after quadratic fit 3.704240795372e+16 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704614960636e+16 lambda=1.0000000000000002e-02 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611219026e+16 lambda=1.0000000000000002e-03 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256438e+16 lambda=1.0000000000000003e-04 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256064e+16 lambda=1.0000000000000004e-05 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000004e-06 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000005e-07 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000005e-08 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000005e-09 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000006e-10 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000006e-11 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000006e-12 Line search: Cubic step no good, shrinking lambda, current gnorm 3.704611256068e+16 lambda=1.0000000000000007e-13 Line search: unable to find good step length! After 12 tries Line search: fnorm=1.2062892452882465e+01, gnorm=3.7046112560677824e+16, ynorm=1.0000000000000000e+08, minlambda=9.9999999999999998e-13, lambda=1.0000000000000007e-13, initial slope=-4.6079597780656769e+00 Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH iterations 0 So it is clear that the Jacobian do not match. Moreover, it appears that Newton is not going to converge from this initial guess. It suggests that the residual is wrong somehow. I suggest coding up a MMS to prove to yourself that the residual is correct. Thanks, Matt > David Jiawei LUO LIANG > > ??????/??/???/2024 > > ?????????????1088? > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!bqnXB8rtlhm_qgLy5xeRj_mY4Rqfdgmupvjaqg3sArtduMag3ojG26K4cpDZok4CHJJwjxsl6911GOeurhJ1$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:28:56 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 23:28:56 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: References: Message-ID: Hi Matt,     Yes, the residual and Jacobin function are both incorrect.  Both of the Vec f and Mat B haven't initialized as zeros. 
Jed caught that bug, thanks Jed. Anyway, thank you for your method to debug my program for the next time bug.. Hope you the best! David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?       ------------------ Original ------------------ From:  "Matthew Knepley" From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:31:45 2024 From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=) Date: Thu, 21 Nov 2024 23:31:45 +0800 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: <6A4CB22C-6D05-497C-9A57-E5AB8B7C073F@petsc.dev> References: <6A4CB22C-6D05-497C-9A57-E5AB8B7C073F@petsc.dev> Message-ID: Hi Barry, The problem is I forgot (or say that I didn't know) to initialize the Vec f in residual function and Mat B in Jacobian function. Anyway, thanks for sharing me the link, it is helpful for debugging the program next time.  Hope you the best! David Jiawei LUO LIANG ??????/??/???/2024 ?????????????1088?       ------------------ Original ------------------ From:  "Barry Smith" From knepley at gmail.com Thu Nov 21 09:37:37 2024 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 21 Nov 2024 10:37:37 -0500 Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES In-Reply-To: <87a5dsbpag.fsf@jedbrown.org> References: <87a5dsbpag.fsf@jedbrown.org> Message-ID: One more suggestion email. I solve the linear version myself in ts/tutorials/ex45.c Thanks, Matt On Thu, Nov 21, 2024 at 10:35?AM Jed Brown wrote: > You should add VecZeroEntries(f) near the top of your FormFunction (it's > currently accumulating into whatever was there last) and MatZeroEntries(B) > to FormJacobian. > > I reduced to nElem = 5 for ease of viewing. With these changes, I see > quadratic convergence but the problem is still nonlinear. To explore > further, consider using these diagnostics > > ./SNES_heat -{snes,ksp}_monitor -{snes,ksp}_converged_reason > -snes_linesearch_monitor -ksp_view_mat > > with and without -snes_fd. > > For readability, I would suggest consistency in "u" vs "x". > > "David Jiawei LUO LIANG" <12431140 at mail.sustech.edu.cn> writes: > > > I am using the Newton iteration to solve a nonlinear 1D heat equation > problem by using FEM. > > > > > > I attached my source code named "SNES_heat.cpp"  > > > > > > when I run the code > > > >   0 SNES Function norm 1.206289245288e+01 > > > >   1 SNES Function norm 7.128802192789e+00 > > > >   2 SNES Function norm 6.608812909525e+00 > > > > > > > > you can find that it only iterate 3 steps, and then do all the function > evaluation and finally just stop the program.  > > > > > > I think it is not reasonble. I check my code, it is correct if I set it > as a linear problem. it means my Jacobian and Residual function is correct. > > > > > > But when I set it as a nonlinear, the residual seems reduces as not > expected.  > > > > > > I doubt that whether my understanding of the newton iteration is > different from SNES's newton iteration process. > > > > > > > > > > > > > > > > > > David Jiawei LUO LIANG > > > > > > > > ??????/??/???/2024 > > > > > > > > ?????????????1088? > > > > > > > > > >   > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z_i28V-z4i339qw9rz21qC1sKBK1hY750356Y2SekU_d3pHw-mdIgh0mJCT_Qp5HPuu0XkvutxpF0oHTYsJE$
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From 12431140 at mail.sustech.edu.cn Thu Nov 21 09:45:04 2024
From: 12431140 at mail.sustech.edu.cn (=?utf-8?B?RGF2aWQgSmlhd2VpIExVTyBMSUFORw==?=)
Date: Thu, 21 Nov 2024 23:45:04 +0800
Subject: [petsc-users] Cannot iterate well when using Newton iteration of SNES
In-Reply-To:
References: <87a5dsbpag.fsf@jedbrown.org>
Message-ID:

Coincidentally, Matt, I am going to write the 2D transient heat problem next. I do not yet understand very well how the DM mesh works, so this is a good chance to learn DM by studying your example.

All the best!

David Jiawei LUO LIANG

------------------ Original ------------------
From: "Matthew Knepley"

From d.scott at epcc.ed.ac.uk Fri Nov 22 10:35:45 2024
From: d.scott at epcc.ed.ac.uk (David Scott)
Date: Fri, 22 Nov 2024 16:35:45 +0000
Subject: [petsc-users] Memory Used When Reading petscrc
Message-ID: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk>

Hello,

I am using the options mechanism of PETSc to configure my CFD code. I have introduced options describing the size of the domain etc. I have noticed that this consumes a lot of memory, and that the amount of memory used scales linearly with the number of MPI processes. This restricts the number of MPI processes that I can use.

Is there anything that I can do about this, or do I need to configure my code in a different way?

I have attached some code extracted from my application which demonstrates this, along with the output from running it on 2 MPI processes.

Best wishes,

David Scott

The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.
-------------- next part --------------
!! @author Prashant Valluri, Lennon O Naraigh, Iain Bethune,
!! David Scott, Toni Collis, Peter Spelt.
!! @version $Revision: 252 $
!! @copyright (c) 2013-2020, Prashant Valluri, Lennon O Naraigh,
!! Iain Bethune, David Scott, Toni Collis, Peter Spelt.
!! This program is distributed under the BSD Licence See LICENCE.txt
!! for details.
module test_configuration_options #include #include "petsc/finclude/petsc.h" use petsc implicit none PetscScalar :: dx PetscScalar :: dy PetscScalar :: dz PetscScalar :: dt logical :: boiling logical :: boiling_variant logical :: evaporation logical, SAVE :: periodic(3) DMBoundaryType, SAVE :: boundary(3) enum, bind(C) enumerator :: BC enumerator :: cyclic, neumann, dirichlet, quasi_dirichlet, inlet, outlet end enum integer(kind(BC)), SAVE :: bc_temp integer(kind(BC)), SAVE :: x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv, x_upper_bc_P integer(kind(BC)), SAVE :: x_upper_bc_u, x_upper_bc_v, x_upper_bc_w integer(kind(BC)), SAVE :: x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv, x_lower_bc_P integer(kind(BC)), SAVE :: x_lower_bc_u, x_lower_bc_v, x_lower_bc_w integer(kind(BC)), SAVE :: y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv, y_upper_bc_P integer(kind(BC)), SAVE :: y_upper_bc_u, y_upper_bc_v, y_upper_bc_w integer(kind(BC)), SAVE :: y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv, y_lower_bc_P integer(kind(BC)), SAVE :: y_lower_bc_u, y_lower_bc_v, y_lower_bc_w integer(kind(BC)), SAVE :: z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv, z_upper_bc_P integer(kind(BC)), SAVE :: z_upper_bc_u, z_upper_bc_v, z_upper_bc_w integer(kind(BC)), SAVE :: z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv, z_lower_bc_P integer(kind(BC)), SAVE :: z_lower_bc_u, z_lower_bc_v, z_lower_bc_w double precision, SAVE :: x_upper_bc_T_value, x_upper_bc_Cl_value, x_upper_bc_Cv_value, x_upper_bc_P_value double precision, SAVE :: x_lower_bc_T_value, x_lower_bc_Cl_value, x_lower_bc_Cv_value, x_lower_bc_P_value double precision, SAVE :: y_upper_bc_T_value, y_upper_bc_Cl_value, y_upper_bc_Cv_value, y_upper_bc_P_value double precision, SAVE :: y_lower_bc_T_value, y_lower_bc_Cl_value, y_lower_bc_Cv_value, y_lower_bc_P_value double precision, SAVE :: z_upper_bc_T_value, z_upper_bc_Cl_value, z_upper_bc_Cv_value, z_upper_bc_P_value double precision, SAVE :: z_lower_bc_T_value, z_lower_bc_Cl_value, z_lower_bc_Cv_value, z_lower_bc_P_value integer, parameter :: max_option_name_length = 30 integer, parameter :: max_msg_length = 2**max_option_name_length + 8 ! 
8 for 'Modified' contains subroutine read_initial_configuration_options(global_dim_x, global_dim_y, global_dim_z, & Re, Pe, We, Fr, Bod, Ja, mu_plus, mu_minus, mu_vap, rho_plus, rho_minus, rho_vap, & cp_plus, cp_minus, cp_vap, k_plus, k_minus, k_vap, beta_plus, beta_minus, beta_vap, & dpdx, gx, gz, epn, dTdx, T_ref, & Pref, Apsat, Bpsat, Cpsat, molMassRatio, PeT, PeMD, PeMDI, & x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv, x_upper_bc_u, x_upper_bc_v, x_upper_bc_w, & x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv, x_lower_bc_u, x_lower_bc_v, x_lower_bc_w, & y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv, y_upper_bc_u, y_upper_bc_v, y_upper_bc_w, & y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv, y_lower_bc_u, y_lower_bc_v, y_lower_bc_w, & z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv, z_upper_bc_u, z_upper_bc_v, z_upper_bc_w, & z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv, z_lower_bc_u, z_lower_bc_v, z_lower_bc_w, & liquid_limit, gaseous_limit, ierr) implicit none PetscInt, intent(out) :: global_dim_x, global_dim_y, global_dim_z double precision, intent(out) :: Re, Pe, We, Fr, Bod, Ja double precision, intent(out) :: mu_plus, mu_minus, mu_vap double precision, intent(out) :: rho_plus, rho_minus, rho_vap double precision, intent(out) :: cp_plus, cp_minus, cp_vap double precision, intent(out) :: k_plus, k_minus, k_vap double precision, intent(out) :: beta_plus, beta_minus, beta_vap double precision, intent(out) :: dpdx double precision, intent(out) :: gx, gz double precision, intent(out) :: epn double precision, intent(out) :: dTdx double precision, intent(out) :: T_ref double precision, intent(out) :: Pref double precision, intent(out) :: Apsat, Bpsat, Cpsat double precision, intent(out) :: molMassRatio double precision, intent(out) :: PeT double precision, intent(out) :: PeMD double precision, intent(out) :: PeMDI integer(kind(BC)), intent(out) :: x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv integer(kind(BC)), intent(out) :: x_upper_bc_u, x_upper_bc_v, x_upper_bc_w integer(kind(BC)), intent(out) :: x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv integer(kind(BC)), intent(out) :: x_lower_bc_u, x_lower_bc_v, x_lower_bc_w integer(kind(BC)), intent(out) :: y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv integer(kind(BC)), intent(out) :: y_upper_bc_u, y_upper_bc_v, y_upper_bc_w integer(kind(BC)), intent(out) :: y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv integer(kind(BC)), intent(out) :: y_lower_bc_u, y_lower_bc_v, y_lower_bc_w integer(kind(BC)), intent(out) :: z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv integer(kind(BC)), intent(out) :: z_upper_bc_u, z_upper_bc_v, z_upper_bc_w integer(kind(BC)), intent(out) :: z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv integer(kind(BC)), intent(out) :: z_lower_bc_u, z_lower_bc_v, z_lower_bc_w double precision, intent(out) :: liquid_limit double precision, intent(out) :: gaseous_limit PetscErrorCode, intent(out) :: ierr double precision :: MM_minus, MM_vap double precision :: Pr, Sc double precision :: PeCahnHilliardModifier ! double precision :: PeDiffusionModifier double precision :: Grav, alpha double precision :: pi = 4.0d0*atan(1.0d0) character(len = max_option_name_length) :: option_name character(len = max_msg_length) :: msg character(len = max_option_name_length) :: phenomenon logical :: found option_name = '-phenomenon' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, phenomenon, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(phenomenon) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (phenomenon .eq. 
'boiling') then boiling = .true. else boiling = .false. end if if (phenomenon .eq. 'boiling_variant') then boiling_variant = .true. else boiling_variant = .false. end if if (phenomenon .eq. 'evaporation') then evaporation = .true. else evaporation = .false. end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-global_dim_x' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, global_dim_x, found, ierr) if (found) then write(msg, *) option_name, '=', global_dim_x call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-global_dim_y' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, global_dim_y, found, ierr) if (found) then write(msg, *) option_name, '=', global_dim_y call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-global_dim_z' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, global_dim_z, found, ierr) if (found) then write(msg, *) option_name, '=', global_dim_z call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if(global_dim_z .le. 1) then write(msg, *) 'global_dim_z must be greater than 1.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-dt' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, dt, found, ierr) if (found) then write(msg, *) option_name, '=', dt call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (dt .gt. 0.0d0)) then write(msg, *) 'Error:', dt, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if dz = 1.0d0/dble(global_dim_z) dx = dz dy = dz write(msg, *) 'dx =', dx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'dy =', dy call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'dz =', dz call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'Lx =', dx*global_dim_x call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'Ly =', dy*global_dim_y call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'Lz =', dz*global_dim_z call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) option_name = '-epn' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, epn, found, ierr) if (found) then if (.not. (epn .gt. 0.0d0)) then epn = 0.5*dz if (boiling) then ! We need to allow for different values of Pe. PeCahnHilliardModifier = 1.0d0/(epn*epn) ! PeDiffusionModifier = 1.0d0/(epn*epn) else PeCahnHilliardModifier = 1.0d0/epn ! PeDiffusionModifier = 1.0d0/epn end if else PeCahnHilliardModifier = 1.0d0 ! 
PeDiffusionModifier = 1.0d0 end if write(msg, *) option_name, '=', epn call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Re' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Re, found, ierr) if (found) then write(msg, *) option_name, '=', Re call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Re .gt. 0.0d0)) then write(msg, *) 'Error:', Re, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Pe' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Pe, found, ierr) if (found) then write(msg, *) option_name, '=', Pe call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Pe .gt. 0.0d0)) then write(msg, *) 'Error:', Pe, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else Pe = Pe * PeCahnHilliardModifier write(msg, *) 'Modified Pe =', Pe call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Pr' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Pr, found, ierr) if (found) then write(msg, *) option_name, '=', Pr call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Pr .gt. 0.0d0)) then write(msg, *) 'Error:', Pr, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else PeT = Re*Pr write(msg, *) 'PeT =', PeT call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-We' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, We, found, ierr) if (found) then write(msg, *) option_name, '=', We call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (We .gt. 0.0d0)) then write(msg, *) 'Error:', We, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Fr' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Fr, found, ierr) if (found) then write(msg, *) option_name, '=', Fr call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Fr .gt. 0.0d0)) then write(msg, *) 'Error:', Fr, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Bod' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Bod, found, ierr) if (found) then write(msg, *) option_name, '=', Bod call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Bod .gt. 0.0d0)) then write(msg, *) 'Error:', Bod, 'is not a valid value.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Sc' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Sc, found, ierr) if (found) then write(msg, *) option_name, '=', Sc call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Sc .ge. 0.0d0)) then write(msg, *) 'Error:', Sc, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else PeMD = Re*Sc PeMDI = PeMD write(msg, *) 'PeMD =', PeMD call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'PeMDI =', PeMDI call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Ja' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Ja, found, ierr) if (found) then write(msg, *) option_name, '=', Ja call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Ja .gt. 0.0d0)) then write(msg, *) 'Error:', Ja, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-dTdx' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, dTdx, found, ierr) if (found) then write(msg, *) option_name, '=', dTdx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-T_ref' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, T_ref, found, ierr) if (found) then write(msg, *) option_name, '=', T_ref call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Pref' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Pref, found, ierr) if (found) then write(msg, *) option_name, '=', Pref call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Pref .ge. 0.0d0)) then write(msg, *) 'Error:', Pref, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Apsat' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Apsat, found, ierr) if (found) then write(msg, *) option_name, '=', Apsat call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Apsat .ge. 0.0d0)) then write(msg, *) 'Error:', Apsat, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Bpsat' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Bpsat, found, ierr) if (found) then write(msg, *) option_name, '=', Bpsat call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Bpsat .ge. 0.0d0)) then write(msg, *) 'Error:', Bpsat, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cpsat' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cpsat, found, ierr) if (found) then write(msg, *) option_name, '=', Cpsat call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Cpsat .ge. 0.0d0)) then write(msg, *) 'Error:', Cpsat, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-MM_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, MM_minus, found, ierr) if (found) then write(msg, *) option_name, '=', MM_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (MM_minus .gt. 0.0d0)) then write(msg, *) 'Error:', MM_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found so using the value 1.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) MM_minus = 1.0d0 end if option_name = '-MM_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, MM_vap, found, ierr) if (found) then write(msg, *) option_name, '=', MM_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (MM_vap .gt. 0.0d0)) then write(msg, *) 'Error:', MM_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found so using the value 0.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) MM_vap = 0.0d0 end if molMassRatio = MM_vap / MM_minus write(msg, *) 'molMassRatio =', molMassRatio call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) option_name = '-mu_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, mu_plus, found, ierr) if (found) then write(msg, *) option_name, '=', mu_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (mu_plus .gt. 0.0d0)) then write(msg, *) 'Error:', mu_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-mu_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, mu_minus, found, ierr) if (found) then write(msg, *) option_name, '=', mu_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (mu_minus .gt. 0.0d0)) then write(msg, *) 'Error:', mu_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-mu_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, mu_vap, found, ierr) if (found) then write(msg, *) option_name, '=', mu_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (mu_vap .gt. 0.0d0)) then write(msg, *) 'Error:', mu_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-rho_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, rho_plus, found, ierr) if (found) then write(msg, *) option_name, '=', rho_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (rho_plus .gt. 0.0d0)) then write(msg, *) 'Error:', rho_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-rho_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, rho_minus, found, ierr) if (found) then write(msg, *) option_name, '=', rho_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (rho_minus .gt. 0.0d0)) then write(msg, *) 'Error:', rho_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-rho_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, rho_vap, found, ierr) if (found) then write(msg, *) option_name, '=', rho_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (rho_vap .gt. 0.0d0)) then write(msg, *) 'Error:', rho_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-cp_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, cp_plus, found, ierr) if (found) then write(msg, *) option_name, '=', cp_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (cp_plus .gt. 0.0d0)) then write(msg, *) 'Error:', cp_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-cp_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, cp_minus, found, ierr) if (found) then write(msg, *) option_name, '=', cp_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (cp_minus .gt. 0.0d0)) then write(msg, *) 'Error:', cp_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-cp_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, cp_vap, found, ierr) if (found) then write(msg, *) option_name, '=', cp_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (cp_vap .gt. 0.0d0)) then write(msg, *) 'Error:', cp_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-k_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, k_plus, found, ierr) if (found) then write(msg, *) option_name, '=', k_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (k_plus .gt. 0.0d0)) then write(msg, *) 'Error:', k_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-k_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, k_minus, found, ierr) if (found) then write(msg, *) option_name, '=', k_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (k_minus .gt. 0.0d0)) then write(msg, *) 'Error:', k_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-k_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, k_vap, found, ierr) if (found) then write(msg, *) option_name, '=', k_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (k_vap .gt. 0.0d0)) then write(msg, *) 'Error:', k_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-beta_plus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, beta_plus, found, ierr) if (found) then write(msg, *) option_name, '=', beta_plus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (beta_plus .ge. 0.0d0)) then write(msg, *) 'Error:', beta_plus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-beta_minus' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, beta_minus, found, ierr) if (found) then write(msg, *) option_name, '=', beta_minus call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (beta_minus .ge. 0.0d0)) then write(msg, *) 'Error:', beta_minus, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-beta_vap' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, beta_vap, found, ierr) if (found) then write(msg, *) option_name, '=', beta_vap call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (beta_vap .ge. 0.0d0)) then write(msg, *) 'Error:', beta_vap, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-dpdx' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, dpdx, found, ierr) if (found) then write(msg, *) option_name, '=', dpdx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (dpdx .gt. 0.0d0) then write(msg, *) 'Counter current flow.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Grav' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Grav, found, ierr) if (found) then write(msg, *) option_name, '=', Grav call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. (Grav .ge. 0.0d0)) then write(msg, *) 'Error: Grav must be non-negative.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-alpha' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, alpha, found, ierr) if (found) then write(msg, *) option_name, '=', alpha call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((-pi .le. alpha) .and. (alpha .le. pi))) then write(msg, *) 'Error:', alpha, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if gz = Grav*sin(alpha) if (gz == -1.0d0) then gx = 0.0d0 ! In this case cos(alpha) = -3.8285686989269494E-016 else gx = Grav*cos(alpha) end if write(msg, *) 'gz =', gz call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) write(msg, *) 'gx =', gx call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! X BCs. 
option_name = '-x_upper_bc_T' call get_BC(option_name, x_upper_bc_T, x_upper_bc_T_value, 1) option_name = '-x_upper_bc_Cl' call get_BC_with_check(option_name, x_upper_bc_Cl, x_upper_bc_Cl_value, 1) option_name = '-x_upper_bc_Cv' call get_BC_with_check(option_name, x_upper_bc_Cv, x_upper_bc_Cv_value, 1) option_name = '-x_upper_bc_P' call get_BC_with_check(option_name, x_upper_bc_P, x_upper_bc_P_value, 1) option_name = '-x_upper_bc_u' call get_BC_with_check_uvw(option_name, x_upper_bc_u, 1) option_name = '-x_upper_bc_v' call get_BC_with_check_uvw(option_name, x_upper_bc_v, 1) option_name = '-x_upper_bc_w' call get_BC_with_check_uvw(option_name, x_upper_bc_w, 1) option_name = '-x_lower_bc_T' call get_BC_with_check(option_name, x_lower_bc_T, x_lower_bc_T_value, 1) option_name = '-x_lower_bc_Cl' call get_BC_with_check(option_name, x_lower_bc_Cl, x_lower_bc_Cl_value, 1) option_name = '-x_lower_bc_Cv' call get_BC_with_check(option_name, x_lower_bc_Cv, x_lower_bc_Cv_value, 1) option_name = '-x_lower_bc_P' call get_BC_with_check(option_name, x_lower_bc_P, x_lower_bc_P_value, 1) option_name = '-x_lower_bc_u' call get_BC_with_check_uvw(option_name, x_lower_bc_u, 1) option_name = '-x_lower_bc_v' call get_BC_with_check_uvw(option_name, x_lower_bc_v, 1) option_name = '-x_lower_bc_w' call get_BC_with_check_uvw(option_name, x_lower_bc_w, 1) ! Y BCs option_name = '-y_upper_bc_T' call get_BC(option_name, y_upper_bc_T, y_upper_bc_T_value, 2) option_name = '-y_upper_bc_Cl' call get_BC_with_check(option_name, y_upper_bc_Cl, y_upper_bc_Cl_value, 2) option_name = '-y_upper_bc_Cv' call get_BC_with_check(option_name, y_upper_bc_Cv, y_upper_bc_Cv_value, 2) option_name = '-y_upper_bc_P' call get_BC_with_check(option_name, y_upper_bc_P, y_upper_bc_P_value, 2) option_name = '-y_upper_bc_u' call get_BC_with_check_uvw(option_name, y_upper_bc_u, 2) option_name = '-y_upper_bc_v' call get_BC_with_check_uvw(option_name, y_upper_bc_v, 2) option_name = '-y_upper_bc_w' call get_BC_with_check_uvw(option_name, y_upper_bc_w, 2) option_name = '-y_lower_bc_T' call get_BC_with_check(option_name, y_lower_bc_T, y_lower_bc_T_value, 2) option_name = '-y_lower_bc_Cl' call get_BC_with_check(option_name, y_lower_bc_Cl, y_lower_bc_Cl_value, 2) option_name = '-y_lower_bc_Cv' call get_BC_with_check(option_name, y_lower_bc_Cv, y_lower_bc_Cv_value, 2) option_name = '-y_lower_bc_P' call get_BC_with_check(option_name, y_lower_bc_P, y_lower_bc_P_value, 2) option_name = '-y_lower_bc_u' call get_BC_with_check_uvw(option_name, y_lower_bc_u, 2) option_name = '-y_lower_bc_v' call get_BC_with_check_uvw(option_name, y_lower_bc_v, 2) option_name = '-y_lower_bc_w' call get_BC_with_check_uvw(option_name, y_lower_bc_w, 2) ! Z BCs. 
option_name = '-z_upper_bc_T' call get_BC(option_name, z_upper_bc_T, z_upper_bc_T_value, 3) option_name = '-z_upper_bc_Cl' call get_BC_with_check(option_name, z_upper_bc_Cl, z_upper_bc_Cl_value, 3) option_name = '-z_upper_bc_Cv' call get_BC_with_check(option_name, z_upper_bc_Cv, z_upper_bc_Cv_value, 3) option_name = '-z_upper_bc_P' call get_BC_with_check(option_name, z_upper_bc_P, z_upper_bc_P_value, 3) option_name = '-z_upper_bc_u' call get_BC_with_check_uvw(option_name, z_upper_bc_u, 3) option_name = '-z_upper_bc_v' call get_BC_with_check_uvw(option_name, z_upper_bc_v, 3) option_name = '-z_upper_bc_w' call get_BC_with_check_uvw(option_name, z_upper_bc_w, 3) option_name = '-z_lower_bc_T' call get_BC_with_check(option_name, z_lower_bc_T, z_lower_bc_T_value, 3) option_name = '-z_lower_bc_Cl' call get_BC_with_check(option_name, z_lower_bc_Cl, z_lower_bc_Cl_value, 3) option_name = '-z_lower_bc_Cv' call get_BC_with_check(option_name, z_lower_bc_Cv, z_lower_bc_Cv_value, 3) option_name = '-z_lower_bc_P' call get_BC_with_check(option_name, z_lower_bc_P, z_lower_bc_P_value, 3) option_name = '-z_lower_bc_u' call get_BC_with_check_uvw(option_name, z_lower_bc_u, 3) option_name = '-z_lower_bc_v' call get_BC_with_check_uvw(option_name, z_lower_bc_v, 3) option_name = '-z_lower_bc_w' call get_BC_with_check_uvw(option_name, z_lower_bc_w, 3) option_name = '-liquid_limit' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, liquid_limit, found, ierr) if (found) then write(msg, *) option_name, '=', liquid_limit call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((liquid_limit >= 0.0d0) .and. (liquid_limit <= 1.0d0))) then write(msg, *) 'Error:', liquid_limit, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-gaseous_limit' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, gaseous_limit, found, ierr) if (found) then write(msg, *) option_name, '=', gaseous_limit call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((gaseous_limit >= 0.0d0) .and. (gaseous_limit <= 1.0d0))) then write(msg, *) 'Error:', gaseous_limit, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (.not. (gaseous_limit < liquid_limit)) then write(msg, *) 'Error:', gaseous_limit, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine read_initial_configuration_options subroutine read_run_time_configuration_options(num_procs_x, num_procs_y, num_procs_z, & imex_Cl, imex_Cv, imex_T, imex_u, imex_v, imex_w, & petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T, & petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p, & Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on, & u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on, & iter_pres_first_100, iter_pres, iter_u, iter_v, iter_w, iter_dim, iter_Cv, & iter_T, Cl_write_frequency, Cv_write_frequency, u_write_frequency, v_write_frequency, & w_write_frequency, T_write_frequency, backup_frequency, num_timesteps, ierr) implicit none PetscInt, intent(out) :: num_procs_x, num_procs_y, num_procs_z character(len = max_option_name_length), intent(out) :: imex_Cl, imex_Cv, imex_T character(len = max_option_name_length), intent(out) :: imex_u, imex_v, imex_w logical, intent(out) :: petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T logical, intent(out) :: petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p ! To monitor or not to monitor. logical, intent(out) :: Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on logical, intent(out) :: u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on ! Pressure solver configuration PetscInt, intent(out) :: iter_pres_first_100, iter_pres ! Momentum equation solver configuration. PetscInt, intent(out) :: iter_u, iter_v, iter_w ! DIM equation solver configuration. PetscInt, intent(out) :: iter_dim ! Vapour equation solver configuration. PetscInt, intent(out) :: iter_Cv ! Temperature equation solver configuration. PetscInt, intent(out) :: iter_T ! Cl HDF5 file output frequency. PetscInt, intent(out) :: Cl_write_frequency PetscInt, intent(out) :: Cv_write_frequency ! u HDF5 file output frequency. PetscInt, intent(out) :: u_write_frequency ! v HDF5 file output frequency. PetscInt, intent(out) :: v_write_frequency ! w HDF5 file output frequency. PetscInt, intent(out) :: w_write_frequency ! T HDF5 file output frequency PetscInt, intent(out) :: T_write_frequency ! backup (restart) file output frequency PetscInt, intent(out) :: backup_frequency ! Number of timesteps. PetscInt, intent(out) :: num_timesteps PetscErrorCode, intent(out) :: ierr integer, parameter :: max_msg_length = 2*max_option_name_length character(len = max_option_name_length) :: option_name character(len = max_msg_length) :: msg logical :: found option_name = '-num_procs_x' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_procs_x, found, ierr) if (found) then write(msg, *) option_name, '=', num_procs_x call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-num_procs_y' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_procs_y, found, ierr) if (found) then write(msg, *) option_name, '=', num_procs_y call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-num_procs_z' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_procs_z, found, ierr) if (found) then write(msg, *) option_name, '=', num_procs_z call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_Cl' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_Cl, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_Cl) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_Cl .eq. 'CNAB2') .or. (imex_Cl .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_Cl, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_Cv' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_Cv, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_Cv) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_Cv .eq. 'CNAB2') .or. (imex_Cv .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_Cv, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_T' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_T, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_T) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_T .eq. 'CNAB2') .or. (imex_T .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_T, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_u' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_u, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_u) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_u .eq. 'CNAB3') .or. (imex_u .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_u, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_v' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_v, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_v) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_v .eq. 'CNAB3') .or. (imex_v .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_v, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-imex_w' call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, imex_w, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(imex_w) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) if (.not. ((imex_w .eq. 'CNAB3') .or. (imex_w .eq. 'SBDF'))) then write(msg, *) 'Error:', imex_w, 'is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_Cl' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_Cl, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_Cl call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_Cv' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_Cv, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_Cv call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_T' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_T, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_T call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_u' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_u, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_u call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_v' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_v, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_v call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_w' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_w, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_w call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-petsc_solver_p' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, petsc_solver_p, found, ierr) if (found) then write(msg, *) option_name, '=', petsc_solver_p call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cl_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cl_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', Cl_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cv_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cv_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', Cv_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-T_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, T_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', T_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-u_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, u_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', u_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-v_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, v_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', v_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-w_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, w_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', w_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-p_monitoring_on' call PetscOptionsGetBool(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, p_monitoring_on, found, ierr) if (found) then write(msg, *) option_name, '=', p_monitoring_on call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_pres_first_100' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_pres_first_100, found, ierr) if (found) then write(msg, *) option_name, '=', iter_pres_first_100 call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_pres' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_pres, found, ierr) if (found) then write(msg, *) option_name, '=', iter_pres call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_u' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_u, found, ierr) if (found) then write(msg, *) option_name, '=', iter_u call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_v' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_v, found, ierr) if (found) then write(msg, *) option_name, '=', iter_v call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_w' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_w, found, ierr) if (found) then write(msg, *) option_name, '=', iter_w call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_dim' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_dim, found, ierr) if (found) then write(msg, *) option_name, '=', iter_dim call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_Cv' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_Cv, found, ierr) if (found) then write(msg, *) option_name, '=', iter_Cv call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-iter_T' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, iter_T, found, ierr) if (found) then write(msg, *) option_name, '=', iter_T call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cl_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cl_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', Cl_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-Cv_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, Cv_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', Cv_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-u_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, u_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', u_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-v_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, v_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', v_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-w_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, w_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', w_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-T_write_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, T_write_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', T_write_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-backup_frequency' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, backup_frequency, found, ierr) if (found) then write(msg, *) option_name, '=', backup_frequency call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if option_name = '-num_timesteps' call PetscOptionsGetInt(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, num_timesteps, found, ierr) if (found) then write(msg, *) option_name, '=', num_timesteps call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine read_run_time_configuration_options subroutine get_BC(option_name, b_c, dirichlet_value, dim) implicit none ! Arguments. character(len = max_option_name_length), intent(in) :: option_name integer(kind(BC)), intent(out) :: b_c double precision, intent(out) :: dirichlet_value integer, intent(in) :: dim ! Local variables. character(len = max_option_name_length) :: boundary_condition character(len = max_option_name_length) :: dirichlet_option_name logical :: found character(len = max_msg_length) :: msg PetscErrorCode :: ierr call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, & boundary_condition, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(boundary_condition) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! Default values. periodic(dim) = .false. boundary(dim) = DM_BOUNDARY_NONE if (boundary_condition .eq. 'periodic') then periodic(dim) = .true. 
boundary(dim) = DM_BOUNDARY_PERIODIC b_c = cyclic else if (boundary_condition .eq. 'neumann') then b_c = neumann else if (boundary_condition .eq. 'dirichlet') then dirichlet_option_name = trim(option_name) // '_value' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, dirichlet_option_name, & dirichlet_value, found, ierr) if (found) then write(msg, *) dirichlet_option_name, '= ', dirichlet_value call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) b_c = dirichlet else b_c = quasi_dirichlet end if else write(msg, *) 'Error: ' // trim(boundary_condition) // ' is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine get_BC subroutine get_BC_with_check(option_name, b_c, dirichlet_value, dim) implicit none ! Arguments. character(len = max_option_name_length), intent(in) :: option_name integer(kind(BC)), intent(out) :: b_c double precision, intent(out) :: dirichlet_value integer, intent(in) :: dim ! Local variables. character(len = max_option_name_length) :: boundary_condition character(len = max_option_name_length) :: dirichlet_option_name logical :: found character(len = max_msg_length) :: msg PetscErrorCode :: ierr logical :: periodic_val call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, & boundary_condition, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(boundary_condition) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! Default values. periodic_val = .false. if (boundary_condition .eq. 'periodic') then periodic_val = .true. b_c = cyclic else if (boundary_condition .eq. 'neumann') then b_c = neumann else if (boundary_condition .eq. 'dirichlet') then dirichlet_option_name = trim(option_name) // '_value' call PetscOptionsGetReal(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, dirichlet_option_name, & dirichlet_value, found, ierr) if (found) then write(msg, *) dirichlet_option_name, '= ', dirichlet_value call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) b_c = dirichlet else b_c = quasi_dirichlet end if else if (boundary_condition .eq. 'inlet') then if (option_name .ne. 'x_lower_bc_u') then write(msg, *) 'Error: inlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if bc_temp = inlet b_c = inlet else if (boundary_condition .eq. 'outlet') then if (option_name .ne. 'x_upper_bc_u') then write(msg, *) 'Error: outlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (bc_temp .ne. inlet) then write(msg, *) 'Error: outlet is not a valid value for x_upper_bc_u as x_lower_bc_u is not an inlet.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if b_c = outlet else write(msg, *) 'Error: ' // trim(boundary_condition) // ' is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (periodic_val .neqv. periodic(dim)) then write(msg, *) 'Error: ' // 'cannot have a single periodic BC in dimenstion', dim call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' 
call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine get_BC_with_check subroutine get_BC_with_check_uvw(option_name, b_c, dim) implicit none ! Arguments. character(len = max_option_name_length), intent(in) :: option_name integer(kind(BC)), intent(out) :: b_c integer, intent(in) :: dim ! Local variables. character(len = max_option_name_length) :: boundary_condition logical :: found character(len = max_msg_length) :: msg PetscErrorCode :: ierr logical :: periodic_val call PetscOptionsGetString(PETSC_NULL_OPTIONS, PETSC_NULL_CHARACTER, option_name, & boundary_condition, found, ierr) if (found) then write(msg, *) option_name, '= ', trim(boundary_condition) call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) ! Default values. periodic_val = .false. if (boundary_condition .eq. 'periodic') then periodic_val = .true. b_c = cyclic else if (boundary_condition .eq. 'neumann') then b_c = neumann else if (boundary_condition .eq. 'dirichlet') then b_c = dirichlet else if (boundary_condition .eq. 'inlet') then if (option_name .ne. 'x_lower_bc_u') then write(msg, *) 'Error: inlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if bc_temp = inlet b_c = inlet else if (boundary_condition .eq. 'outlet') then if (option_name .ne. 'x_upper_bc_u') then write(msg, *) 'Error: outlet is not a valid value for' // option_name // '.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (bc_temp .ne. inlet) then write(msg, *) 'Error: outlet is not a valid value for x_upper_bc_u as x_lower_bc_u is not an inlet.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if b_c = outlet else write(msg, *) 'Error: ' // trim(boundary_condition) // ' is not a valid value.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if if (periodic_val .neqv. periodic(dim)) then write(msg, *) 'Error: ' // 'cannot have a single periodic BC in dimenstion', dim call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if else write(msg, *) 'Error:', option_name, 'not found.' call PetscPrintf(PETSC_COMM_WORLD, trim(msg) // NEW_LINE('a'), ierr) end if end subroutine get_BC_with_check_uvw end module test_configuration_options -------------- next part -------------- program test_program use petsc use test_configuration_options implicit none #include "petsc/finclude/petsc.h" PetscInt :: global_dim_x, global_dim_y, global_dim_z double precision :: gz, gx PetscInt :: num_procs_x, num_procs_y, num_procs_z double precision :: liquid_limit double precision :: gaseous_limit PetscErrorCode :: ierr character(len = max_option_name_length) :: imex_Cl, imex_Cv, imex_T character(len = max_option_name_length) :: imex_u, imex_v, imex_w ! Choices of PETSc or original solvers. logical :: petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T logical :: petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p ! To monitor or not to monitor. logical :: Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on logical :: u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on ! Pressure solver configuration PetscInt :: iter_pres_first_100, iter_pres ! Momentum equation solver configuration. PetscInt :: iter_u, iter_v, iter_w ! DIM equation solver configuration. PetscInt :: iter_dim PetscInt :: iter_Cv ! Temperature equation solver configuration. PetscInt :: iter_T ! Cl HDF5 file output frequency. 
PetscInt :: Cl_write_frequency PetscInt :: Cv_write_frequency ! u HDF5 file output frequency. PetscInt :: u_write_frequency ! v HDF5 file output frequency. PetscInt :: v_write_frequency ! w HDF5 file output frequency. PetscInt :: w_write_frequency ! T HDF5 file output frequency. PetscInt :: T_write_frequency ! backup (restart) file output frequency. PetscInt :: backup_frequency ! Number of timesteps. PetscInt :: num_timesteps double precision :: Re, Pe, PeT, We, Fr, Bod, Ja, Fr2, Fr2BodByRe double precision :: dpdx double precision :: mu_minus, mu_plus, mu_vap double precision :: rho_minus, rho_plus, rho_vap double precision :: cp_minus, cp_plus, cp_vap double precision :: rhocp_minus, rhocp_plus, rhocp_vap double precision :: k_plus, k_minus, k_vap double precision :: beta_plus, beta_minus, beta_vap double precision :: epn double precision :: T_ref double precision :: Pref double precision :: Apsat, Bpsat, Cpsat double precision :: molMassRatio double precision :: PeMD double precision :: PeMDI double precision :: dTdx PetscLogDouble :: mem integer :: mpi_err ! ****************************************************************************************** call PetscInitialize(PETSC_NULL_CHARACTER, ierr) call PetscMemoryGetCurrentUsage(mem, ierr) write(*, *) 'mem0 = ', mem call MPI_Barrier(PETSC_COMM_WORLD, mpi_err) call read_initial_configuration_options(global_dim_x, global_dim_y, global_dim_z, & Re, Pe, We, Fr, Bod, Ja, mu_plus, mu_minus, mu_vap, rho_plus, rho_minus, rho_vap, & cp_plus, cp_minus, cp_vap, k_plus, k_minus, k_vap, beta_plus, beta_minus, beta_vap, & dpdx, gx, gz, epn, dTdx, T_ref, & Pref, Apsat, Bpsat, Cpsat, molMassRatio, PeT, PeMD, PeMDI, & x_upper_bc_T, x_upper_bc_Cl, x_upper_bc_Cv, x_upper_bc_u, x_upper_bc_v, x_upper_bc_w, & x_lower_bc_T, x_lower_bc_Cl, x_lower_bc_Cv, x_lower_bc_u, x_lower_bc_v, x_lower_bc_w, & y_upper_bc_T, y_upper_bc_Cl, y_upper_bc_Cv, y_upper_bc_u, y_upper_bc_v, y_upper_bc_w, & y_lower_bc_T, y_lower_bc_Cl, y_lower_bc_Cv, y_lower_bc_u, y_lower_bc_v, y_lower_bc_w, & z_upper_bc_T, z_upper_bc_Cl, z_upper_bc_Cv, z_upper_bc_u, z_upper_bc_v, z_upper_bc_w, & z_lower_bc_T, z_lower_bc_Cl, z_lower_bc_Cv, z_lower_bc_u, z_lower_bc_v, z_lower_bc_w, & liquid_limit, gaseous_limit, ierr) call PetscMemoryGetCurrentUsage(mem, ierr) write(*, *) 'mem1 = ', mem call MPI_Barrier(PETSC_COMM_WORLD, mpi_err) call read_run_time_configuration_options(num_procs_x, num_procs_y, num_procs_z, & imex_Cl, imex_CV, imex_T, imex_U, imex_v, imex_w, & petsc_solver_Cl, petsc_solver_Cv, petsc_solver_T, & petsc_solver_u, petsc_solver_v, petsc_solver_w, petsc_solver_p, & Cl_monitoring_on, Cv_monitoring_on, T_monitoring_on, & u_monitoring_on, v_monitoring_on, w_monitoring_on, p_monitoring_on, & iter_pres_first_100, iter_pres, iter_u, iter_v, iter_w, iter_dim, iter_Cv, & iter_T, Cl_write_frequency, Cv_write_frequency, u_write_frequency, v_write_frequency, & w_write_frequency, T_write_frequency, backup_frequency, num_timesteps, ierr) call PetscMemoryGetCurrentUsage(mem, ierr) write(*, *) 'mem2 = ', mem call MPI_Barrier(PETSC_COMM_WORLD, mpi_err) call PetscFinalize(ierr) end program test_program -------------- next part -------------- -global_dim_x 240 -global_dim_y 240 -global_dim_z 320 -phenomenon boiling_variant -liquid_limit 0.9 -gaseous_limit 0.1 -epn 0.0 # This value causes epn to be computed by the TPLS program. -Re 221.46 -Pe 1.0 # The Peclet number for the Cahn-Hilliard equation. It is modified in the code. 
-Pr 8.4 -Ja 0.18 -Fr 1.01 -We 1.01 -Bod 1.0 -Sc 1.0 -Pref 1.0 # Relates partial pressure to mole fraction. -Apsat 1.0 -Bpsat 1.0 -Cpsat 1.0 -T_substrate 1.0 # Bespoke option. -T_ref 0.0 # Used to specify the saturation temperature (a.k.a. T_bulk). -th_layer 0.36 # Bespoke option. -dTdx 0.0 -Radius 0.5 # Bespoke option. -MM_minus 1.0 -MM_vap 1.0 # Properties of the liquid. -rho_plus 91.07 -mu_plus 32.6 -k_plus 3.94 -cp_plus 1.23 -beta_plus 1.0 # Properties of the inert gas. Boiling so UNUSED. -rho_minus 1.0 -mu_minus 1.0 -k_minus 1.0 -cp_minus 1.0 -beta_minus 1.0 # Properties of vapour corresponding to the liquid. -rho_vap 1.0 -mu_vap 1.0 -k_vap 1.0 -cp_vap 1.0 -beta_vap 1.0 -height 0.0 # Bespoke option. -dpdx 0.0 -Grav 1.0 -alpha -1.570796326794897 -dt 0.0005 -x_upper_bc_T periodic -x_upper_bc_Cl periodic -x_upper_bc_Cv periodic -x_upper_bc_P periodic -x_upper_bc_u periodic -x_upper_bc_v periodic -x_upper_bc_w periodic -x_lower_bc_T periodic -x_lower_bc_Cl periodic -x_lower_bc_Cv periodic -x_lower_bc_P periodic -x_lower_bc_u periodic -x_lower_bc_v periodic -x_lower_bc_w periodic -y_upper_bc_T periodic -y_upper_bc_Cl periodic -y_upper_bc_Cv periodic -y_upper_bc_P periodic -y_upper_bc_u periodic -y_upper_bc_v periodic -y_upper_bc_w periodic -y_lower_bc_T periodic -y_lower_bc_Cl periodic -y_lower_bc_Cv periodic -y_lower_bc_P periodic -y_lower_bc_u periodic -y_lower_bc_v periodic -y_lower_bc_w periodic -z_upper_bc_T dirichlet # Fixed temperature. -z_upper_bc_T_value 0.0 # Set to T_ref (i.e. T_sat). -z_upper_bc_Cl neumann # dCl/dz = 0. -z_upper_bc_Cv dirichlet # Initialise Cv to 1 everywhere. -z_upper_bc_P neumann # dP/dz = rho_wgrid*gz. -z_upper_bc_u dirichlet # Set to 0 initially, no slip. -z_upper_bc_v dirichlet # Set to 0 initially, no slip. -z_upper_bc_w neumann # dw/dz = 0, in or out flow is allowed. -z_lower_bc_T dirichlet # Fixed temperature. -z_lower_bc_T_value 1.0 # Set to T_substrate. -z_lower_bc_Cl dirichlet # Fixed composition determined by initialisation. -z_lower_bc_Cv dirichlet # Initialise Cv to 1 everywhere. -z_lower_bc_P neumann # dP/dz = rho_wgrid*gz. -z_lower_bc_u dirichlet # Set to 0 initially, no slip. -z_lower_bc_v dirichlet # Set to 0 initially, no slip. -z_lower_bc_w dirichlet # Set to 0 initially, no in or out flow. 
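# Run-time configuration options follow (process grid, solver choices, iteration
# limits, output frequencies); PETSc solver options such as -p_ksp_type come after them.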
-num_procs_x 2 -num_procs_y 2 -num_procs_z 2 -petsc_solver_u FALSE -petsc_solver_v FALSE -petsc_solver_w FALSE -petsc_solver_Cl FALSE -petsc_solver_Cv FALSE -petsc_solver_T FALSE -petsc_solver_p FALSE -imex_u CNAB3 -imex_v CNAB3 -imex_w CNAB3 -imex_Cl SBDF -imex_Cv CNAB -imex_T CNAB -u_monitoring_on FALSE -v_monitoring_on FALSE -w_monitoring_on FALSE -p_monitoring_on FALSE -Cl_monitoring_on FALSE -Cv_monitoring_on FALSE -T_monitoring_on FALSE -iter_pres_first_100 1000 -iter_pres 1000 -iter_u 30 -iter_v 30 -iter_w 30 -iter_dim 40 -iter_T 40 -Cl_write_frequency 250 -Cv_write_frequency 2500 -T_write_frequency 2500 -u_write_frequency 2500 -v_write_frequency 2500 -w_write_frequency 2500 -backup_frequency 10000 -num_timesteps 10000 -u_ksp_rtol 0.0000001 -u_ksp_view_final_residual -v_ksp_rtol 0.0000001 -v_ksp_view_final_residual -w_ksp_rtol 0.0000001 -w_ksp_view_final_residual -p_ksp_rtol 0.0000001 -p_ksp_type minres -p_pc_type sor -p_pc_sor_omega 1.5 -p_ksp_view_final_residual -options_left -------------- next part -------------- mem0 = 16420864.000000000 mem0 = 16117760.000000000 -phenomenon = boiling_variant -global_dim_x = 240 -global_dim_y = 240 -global_dim_z = 320 -dt = 5.0000000000000001E-004 dx = 3.1250000000000002E-003 dy = 3.1250000000000002E-003 dz = 3.1250000000000002E-003 Lx = 0.75000000000000000 Ly = 0.75000000000000000 Lz = 1.0000000000000000 -epn = 1.5625000000000001E-003 -Re = 221.46000000000001 -Pe = 1.0000000000000000 Modified Pe = 640.00000000000000 -Pr = 8.4000000000000004 PeT = 1860.2640000000001 -We = 1.0100000000000000 -Fr = 1.0100000000000000 -Bod = 1.0000000000000000 -Sc = 1.0000000000000000 PeMD = 221.46000000000001 PeMDI = 221.46000000000001 -Ja = 0.17999999999999999 -dTdx = 0.0000000000000000 -T_ref = 0.0000000000000000 -Pref = 1.0000000000000000 -Apsat = 1.0000000000000000 -Bpsat = 1.0000000000000000 -Cpsat = 1.0000000000000000 -MM_minus = 1.0000000000000000 -MM_vap = 1.0000000000000000 molMassRatio = 1.0000000000000000 -mu_plus = 32.600000000000001 -mu_minus = 1.0000000000000000 -mu_vap = 1.0000000000000000 -rho_plus = 91.069999999999993 -rho_minus = 1.0000000000000000 -rho_vap = 1.0000000000000000 -cp_plus = 1.2300000000000000 -cp_minus = 1.0000000000000000 -cp_vap = 1.0000000000000000 -k_plus = 3.9399999999999999 -k_minus = 1.0000000000000000 -k_vap = 1.0000000000000000 -beta_plus = 1.0000000000000000 -beta_minus = 1.0000000000000000 -beta_vap = 1.0000000000000000 -dpdx = 0.0000000000000000 -Grav = 1.0000000000000000 -alpha = -1.5707963267948970 gz = -1.0000000000000000 gx = 0.0000000000000000 -x_upper_bc_T = periodic -x_upper_bc_Cl = periodic -x_upper_bc_Cv = periodic -x_upper_bc_P = periodic -x_upper_bc_u = periodic -x_upper_bc_v = periodic -x_upper_bc_w = periodic -x_lower_bc_T = periodic -x_lower_bc_Cl = periodic -x_lower_bc_Cv = periodic -x_lower_bc_P = periodic -x_lower_bc_u = periodic -x_lower_bc_v = periodic -x_lower_bc_w = periodic -y_upper_bc_T = periodic -y_upper_bc_Cl = periodic -y_upper_bc_Cv = periodic -y_upper_bc_P = periodic -y_upper_bc_u = periodic -y_upper_bc_v = periodic -y_upper_bc_w = periodic -y_lower_bc_T = periodic -y_lower_bc_Cl = periodic -y_lower_bc_Cv = periodic -y_lower_bc_P = periodic -y_lower_bc_u = periodic -y_lower_bc_v = periodic -y_lower_bc_w = periodic -z_upper_bc_T = dirichlet -z_upper_bc_T_value = 0.0000000000000000 -z_upper_bc_Cl = neumann -z_upper_bc_Cv = dirichlet -z_upper_bc_P = neumann -z_upper_bc_u = dirichlet -z_upper_bc_v = dirichlet -z_upper_bc_w = neumann -z_lower_bc_T = dirichlet -z_lower_bc_T_value = 
1.0000000000000000 -z_lower_bc_Cl = dirichlet -z_lower_bc_Cv = dirichlet -z_lower_bc_P = neumann -z_lower_bc_u = dirichlet -z_lower_bc_v = dirichlet -z_lower_bc_w = dirichlet -liquid_limit = 0.90000000000000002 mem1 = 4311490560.0000000 -gaseous_limit = 0.10000000000000001 mem1 = 4311826432.0000000 -num_procs_x = 2 -num_procs_y = 2 -num_procs_z = 2 -imex_Cl = SBDF -imex_Cv = CNAB Error:CNAB is not a valid value. -imex_T = CNAB Error:CNAB is not a valid value. -imex_u = CNAB3 -imex_v = CNAB3 -imex_w = CNAB3 -petsc_solver_Cl = F -petsc_solver_Cv = F -petsc_solver_T = F -petsc_solver_u = F -petsc_solver_v = F -petsc_solver_w = F -petsc_solver_p = F -Cl_monitoring_on = F -Cv_monitoring_on = F -T_monitoring_on = F -u_monitoring_on = F -v_monitoring_on = F -w_monitoring_on = F -p_monitoring_on = F -iter_pres_first_100 = 1000 -iter_pres = 1000 -iter_u = 30 -iter_v = 30 -iter_w = 30 -iter_dim = 40 Error:-iter_Cv not found. -iter_T = 40 -Cl_write_frequency = 250 -Cv_write_frequency = 2500 -u_write_frequency = 2500 -v_write_frequency = 2500 -w_write_frequency = 2500 -T_write_frequency = 2500 -backup_frequency = 10000 -num_timesteps = 10000 mem2 = 4311490560.0000000 mem2 = 4311826432.0000000 Summary of Memory Usage in PETSc Current process memory: total 8.6236e+09 max 4.3120e+09 min 4.3116e+09 Current space PetscMalloc()ed: total 3.2736e+04 max 1.6368e+04 min 1.6368e+04 Run with -memory_view to get maximum memory usage #PETSc Option Table entries: -alpha -1.570796326794897 # (source: file) -Apsat 1.0 # (source: file) -backup_frequency 10000 # (source: file) -beta_minus 1.0 # (source: file) -beta_plus 1.0 # (source: file) -beta_vap 1.0 # (source: file) -Bod 1.0 # (source: file) -Bpsat 1.0 # (source: file) -Cl_monitoring_on FALSE # (source: file) -Cl_write_frequency 250 # (source: file) -cp_minus 1.0 # (source: file) -cp_plus 1.23 # (source: file) -cp_vap 1.0 # (source: file) -Cpsat 1.0 # (source: file) -Cv_monitoring_on FALSE # (source: file) -Cv_write_frequency 2500 # (source: file) -d initial_state # (source: command line) -dpdx 0.0 # (source: file) -dt 0.0005 # (source: file) -dTdx 0.0 # (source: file) -epn 0.0 # (source: file) -Fr 1.01 # (source: file) -gaseous_limit 0.1 # (source: file) -global_dim_x 240 # (source: file) -global_dim_y 240 # (source: file) -global_dim_z 320 # (source: file) -Grav 1.0 # (source: file) -height 0.0 # (source: file) -imex_Cl SBDF # (source: file) -imex_Cv CNAB # (source: file) -imex_T CNAB # (source: file) -imex_u CNAB3 # (source: file) -imex_v CNAB3 # (source: file) -imex_w CNAB3 # (source: file) -iter_dim 40 # (source: file) -iter_pres 1000 # (source: file) -iter_pres_first_100 1000 # (source: file) -iter_T 40 # (source: file) -iter_u 30 # (source: file) -iter_v 30 # (source: file) -iter_w 30 # (source: file) -Ja 0.18 # (source: file) -k_minus 1.0 # (source: file) -k_plus 3.94 # (source: file) -k_vap 1.0 # (source: file) -liquid_limit 0.9 # (source: file) -malloc_debug true # (source: command line) -malloc_dump # (source: command line) -malloc_view # (source: command line) -memory_view # (source: command line) -MM_minus 1.0 # (source: file) -MM_vap 1.0 # (source: file) -mu_minus 1.0 # (source: file) -mu_plus 32.6 # (source: file) -mu_vap 1.0 # (source: file) -num_procs_x 2 # (source: file) -num_procs_y 2 # (source: file) -num_procs_z 2 # (source: file) -num_timesteps 10000 # (source: file) -on_error_malloc_dump # (source: command line) -options_left # (source: file) -p_ksp_rtol 0.0000001 # (source: file) -p_ksp_type minres # (source: file) -p_ksp_view_final_residual 
# (source: file) -p_monitoring_on FALSE # (source: file) -p_pc_sor_omega 1.5 # (source: file) -p_pc_type sor # (source: file) -Pe 1.0 # (source: file) -petsc_solver_Cl FALSE # (source: file) -petsc_solver_Cv FALSE # (source: file) -petsc_solver_p FALSE # (source: file) -petsc_solver_T FALSE # (source: file) -petsc_solver_u FALSE # (source: file) -petsc_solver_v FALSE # (source: file) -petsc_solver_w FALSE # (source: file) -phenomenon boiling_variant # (source: file) -Pr 8.4 # (source: file) -Pref 1.0 # (source: file) -Radius 0.5 # (source: file) -Re 221.46 # (source: file) -rho_minus 1.0 # (source: file) -rho_plus 91.07 # (source: file) -rho_vap 1.0 # (source: file) -Sc 1.0 # (source: file) -T_monitoring_on FALSE # (source: file) -T_ref 0.0 # (source: file) -T_substrate 1.0 # (source: file) -T_write_frequency 2500 # (source: file) -th_layer 0.36 # (source: file) -u_ksp_rtol 0.0000001 # (source: file) -u_ksp_view_final_residual # (source: file) -u_monitoring_on FALSE # (source: file) -u_write_frequency 2500 # (source: file) -v_ksp_rtol 0.0000001 # (source: file) -v_ksp_view_final_residual # (source: file) -v_monitoring_on FALSE # (source: file) -v_write_frequency 2500 # (source: file) -w_ksp_rtol 0.0000001 # (source: file) -w_ksp_view_final_residual # (source: file) -w_monitoring_on FALSE # (source: file) -w_write_frequency 2500 # (source: file) -We 1.01 # (source: file) -x_lower_bc_Cl periodic # (source: file) -x_lower_bc_Cv periodic # (source: file) -x_lower_bc_P periodic # (source: file) -x_lower_bc_T periodic # (source: file) -x_lower_bc_u periodic # (source: file) -x_lower_bc_v periodic # (source: file) -x_lower_bc_w periodic # (source: file) -x_upper_bc_Cl periodic # (source: file) -x_upper_bc_Cv periodic # (source: file) -x_upper_bc_P periodic # (source: file) -x_upper_bc_T periodic # (source: file) -x_upper_bc_u periodic # (source: file) -x_upper_bc_v periodic # (source: file) -x_upper_bc_w periodic # (source: file) -y_lower_bc_Cl periodic # (source: file) -y_lower_bc_Cv periodic # (source: file) -y_lower_bc_P periodic # (source: file) -y_lower_bc_T periodic # (source: file) -y_lower_bc_u periodic # (source: file) -y_lower_bc_v periodic # (source: file) -y_lower_bc_w periodic # (source: file) -y_upper_bc_Cl periodic # (source: file) -y_upper_bc_Cv periodic # (source: file) -y_upper_bc_P periodic # (source: file) -y_upper_bc_T periodic # (source: file) -y_upper_bc_u periodic # (source: file) -y_upper_bc_v periodic # (source: file) -y_upper_bc_w periodic # (source: file) -z_lower_bc_Cl dirichlet # (source: file) -z_lower_bc_Cv dirichlet # (source: file) -z_lower_bc_P neumann # (source: file) -z_lower_bc_T dirichlet # (source: file) -z_lower_bc_T_value 1.0 # (source: file) -z_lower_bc_u dirichlet # (source: file) -z_lower_bc_v dirichlet # (source: file) -z_lower_bc_w dirichlet # (source: file) -z_upper_bc_Cl neumann # (source: file) -z_upper_bc_Cv dirichlet # (source: file) -z_upper_bc_P neumann # (source: file) -z_upper_bc_T dirichlet # (source: file) -z_upper_bc_T_value 0.0 # (source: file) -z_upper_bc_u dirichlet # (source: file) -z_upper_bc_v dirichlet # (source: file) -z_upper_bc_w neumann # (source: file) #End of PETSc Option Table entries WARNING! There are options you set that were not used! WARNING! could be spelling mistake, etc! There are 17 unused database options. 
They are: Option left: name:-d value: initial_state source: command line Option left: name:-height value: 0.0 source: file Option left: name:-on_error_malloc_dump (no value) source: command line Option left: name:-p_ksp_rtol value: 0.0000001 source: file Option left: name:-p_ksp_type value: minres source: file Option left: name:-p_ksp_view_final_residual (no value) source: file Option left: name:-p_pc_sor_omega value: 1.5 source: file Option left: name:-p_pc_type value: sor source: file Option left: name:-Radius value: 0.5 source: file Option left: name:-T_substrate value: 1.0 source: file Option left: name:-th_layer value: 0.36 source: file Option left: name:-u_ksp_rtol value: 0.0000001 source: file Option left: name:-u_ksp_view_final_residual (no value) source: file Option left: name:-v_ksp_rtol value: 0.0000001 source: file Option left: name:-v_ksp_view_final_residual (no value) source: file Option left: name:-w_ksp_rtol value: 0.0000001 source: file Option left: name:-w_ksp_view_final_residual (no value) source: file [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4312375296 [0] Memory usage sorted by function [0] 1 144 PetscBTCreate() [0] 4 128 PetscCommDuplicate() [0] 4 64 PetscFunctionListCreate_Private() [0] 2 528 PetscIntStackCreate() [0] 2 2064 PetscLogClassArrayCreate() [0] 2 2064 PetscLogEventArrayCreate() [0] 1 32 PetscLogRegistryCreate() [0] 2 80 PetscLogStageArrayCreate() [0] 1 48 PetscLogStateCreate() [0] 1 16 PetscOptionsHelpPrintedCreate() [0] 1 32 PetscPushSignalHandler() [0] 4 20096 PetscSegBufferCreate() [0] 190 8688 PetscStrallocpy() [0] 12 26144 PetscStrreplace() [0] 2 1312 PetscViewerCreate() [0] 2 224 PetscViewerCreate_ASCII() [0] 14 368 petscoptionsgetbool_() [0] 22 480 petscoptionsgetint_() [0] 43 768 petscoptionsgetreal_() [0] 49 784 petscoptionsgetstring_() [0] 140 7632 petscprintf_() [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4311990272 [1] Memory usage sorted by function [1] 1 144 PetscBTCreate() [1] 4 128 PetscCommDuplicate() [1] 4 64 PetscFunctionListCreate_Private() [1] 2 528 PetscIntStackCreate() [1] 2 2064 PetscLogClassArrayCreate() [1] 2 2064 PetscLogEventArrayCreate() [1] 1 32 PetscLogRegistryCreate() [1] 2 80 PetscLogStageArrayCreate() [1] 1 48 PetscLogStateCreate() [1] 1 16 PetscOptionsHelpPrintedCreate() [1] 1 32 PetscPushSignalHandler() [1] 4 20096 PetscSegBufferCreate() [1] 190 8688 PetscStrallocpy() [1] 12 26144 PetscStrreplace() [1] 2 1312 PetscViewerCreate() [1] 2 224 PetscViewerCreate_ASCII() [1] 14 368 petscoptionsgetbool_() [1] 22 480 petscoptionsgetint_() [1] 43 768 petscoptionsgetreal_() [1] 49 784 petscoptionsgetstring_() [1] 140 7632 petscprintf_() From knepley at gmail.com Fri Nov 22 10:53:47 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 22 Nov 2024 11:53:47 -0500 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: On Fri, Nov 22, 2024 at 11:36?AM David Scott wrote: > Hello, > > I am using the options mechanism of PETSc to configure my CFD code. I > have introduced options describing the size of the domain etc. I have > noticed that this consumes a lot of memory. I have found that the amount > of memory used scales linearly with the number of MPI processes used. > This restricts the number of MPI processes that I can use. 
>

There are two statements:

1) The memory scales linearly with P

2) This uses a lot of memory

Let's deal with 1) first. This seems to be trivially true. If I want every process to have
access to a given option value, that option value must be in the memory of every process.
The only alternative would be to communicate with some process in order to get values.
Few codes seem to be willing to make this tradeoff, and we do not offer it.

Now 2). Looking at the source, for each option we store a PetscOptionItem, which I count
as having size 37 bytes (12 pointers/ints and a char). However, there is data behind every
pointer, like the name, help text, available values (sometimes), I could see it being as large
as 4K. Suppose it is. If I had 256 options, that would be 1M. Is this a large amount of memory?

The way I read the SLURM output, 29K was malloced. Is this a large amount of memory?

I am trying to get an idea of the scale.

  Thanks,

     Matt

> Is there anything that I can do about this or do I need to configure my
> code in a different way?
>
> I have attached some code extracted from my application which
> demonstrates this along with the output from a running it on 2 MPI
> processes.
>
> Best wishes,
>
> David Scott
> The University of Edinburgh is a charitable body, registered in Scotland,
> with registration number SC005336. Is e buidheann carthannais a th' ann an
> Oilthigh Dhùn Èideann, clàraichte an Alba, àireamh clàraidh SC005336.

--
What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aDTOdIHWWilf4sShnRrU9KcJ987GlIrJ71v1EcIH4zje2tKZ7EBoEBD2TqNejin_X3-7DKujGeq-pXHvyHqF$
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
From d.scott at epcc.ed.ac.uk Fri Nov 22 11:56:53 2024
From: d.scott at epcc.ed.ac.uk (David Scott)
Date: Fri, 22 Nov 2024 17:56:53 +0000
Subject: [petsc-users] Memory Used When Reading petscrc
In-Reply-To:
References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk>
Message-ID:

Matt,

Thanks for the quick response.

Yes 1) is trivially true.

With regard to 2), from the SLURM output:
[0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4312375296
[1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process 4311990272
Yes only 29KB was malloced but the total figure was 4GB per process.

Looking at
 mem0 = 16420864.000000000
 mem0 = 16117760.000000000
 mem1 = 4311490560.0000000
 mem1 = 4311826432.0000000
 mem2 = 4311490560.0000000
 mem2 = 4311826432.0000000
mem0 is written after PetscInitialize.
mem1 is written roughly half way through the options being read.
mem2 is written on completion of the options being read.

The code does very little other than read configuration options. Why is so much memory used?

I do not understand what is going on and I may have expressed myself badly but I do have a problem as I certainly cannot use anywhere near 128 processes on a node with 128GB of RAM before I get an OOM error. (The code runs successfully on 32 processes but not 64.)

Regards,

David

On 22/11/2024 16:53, Matthew Knepley wrote:
> This email was sent to you by someone outside the University.
> You should only click on links or attachments if you are certain that the
> email is genuine and the content is safe.
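For reference, a minimal sketch of how the two figures being compared here can be printed side by side from within the program itself; it assumes the Fortran bindings for PetscMallocGetCurrentUsage and PetscMemoryGetCurrentUsage (the counters that should be behind the "Current" lines of -memory_view) are available in the PETSc build:

      program memory_probe_sketch
      use petsc
      implicit none
#include "petsc/finclude/petsc.h"
      PetscLogDouble :: malloced, rss
      PetscErrorCode :: ierr

      call PetscInitialize(PETSC_NULL_CHARACTER, ierr)
      ! Bytes currently obtained through PetscMalloc() on this rank.
      call PetscMallocGetCurrentUsage(malloced, ierr)
      ! Resident size of the whole process: binary, shared libraries,
      ! heap and stack, not just PETSc allocations.
      call PetscMemoryGetCurrentUsage(rss, ierr)
      write(*, *) 'PetscMalloc()ed bytes =', malloced, ' process memory =', rss
      call PetscFinalize(ierr)
      end program memory_probe_sketch

A large gap between the two numbers points at memory obtained outside PetscMalloc(), as discussed below.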
> On Fri, Nov 22, 2024 at 11:36?AM David Scott > wrote: > > Hello, > > I am using the options mechanism of PETSc to configure my CFD code. I > have introduced options describing the size of the domain etc. I have > noticed that this consumes a lot of memory. I have found that the > amount > of memory used scales linearly with the number of MPI processes used. > This restricts the number of MPI processes that I can use. > > > There are two statements: > > 1) The memory?scales linearly with P > > 2) This uses a lot of memory > > Let's deal with 1) first. This seems to be trivially true. If I want > every process to have > access to a given option value, that option value must be in the > memory of every process. > The only alternative would be to communicate with some process in > order to get values. > Few codes seem to be willing to make this tradeoff, and we do not > offer it. > > Now 2). Looking at the source, for each option we store > a?PetscOptionItem, which I count > as having size 37 bytes (12 pointers/ints and a char). However, there > is data behind every > pointer, like the name, help text, available values (sometimes), I > could see it being as large > as 4K. Suppose it is. If I had 256 options, that would be 1M. Is this > a large amount of memory? > > The way I read the SLURM output, 29K was malloced. Is this a large > amount of memory? > > I am trying to get an idea of the scale. > > ? Thanks, > > ? ? ? Matt > > Is there anything that I can do about this or do I need to > configure my > code in a different way? > > I have attached some code extracted from my application which > demonstrates this along with the output from a running it on 2 MPI > processes. > > Best wishes, > > David Scott > The University of Edinburgh is a charitable body, registered in > Scotland, with registration number SC005336. Is e buidheann > carthannais a th? ann an Oilthigh Dh?n ?ideann, cl?raichte an > Alba, ?ireamh cl?raidh SC005336. > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZSsmcyvmT1HYNbUdssH9wNf_bUXn64WJkcv6TgscRIX6mEcDPKI4LxvsUWu9JcgeYQjCchlmOm8y7thpEupDiyaLluA$ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Fri Nov 22 16:10:43 2024 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 22 Nov 2024 17:10:43 -0500 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: On Fri, Nov 22, 2024 at 12:57?PM David Scott wrote: > Matt, > > Thanks for the quick response. > > Yes 1) is trivially true. > > With regard to 2), from the SLURM output: > [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire process > 4312375296 > [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire process > 4311990272 > Yes only 29KB was malloced but the total figure was 4GB per process. > > Looking at > mem0 = 16420864.000000000 > mem0 = 16117760.000000000 > mem1 = 4311490560.0000000 > mem1 = 4311826432.0000000 > mem2 = 4311490560.0000000 > mem2 = 4311826432.0000000 > mem0 is written after PetscInitialize. > mem1 is written roughly half way through the options being read. > mem2 is written on completion of the options being read. > > The code does very little other than read configuration options. 
Why is so > much memory used? > This is not due to options processing, as that would fall under Petsc malloc allocations. I believe we are measuring this using RSS which includes the binary, all shared libraries which are paged in, and stack/heap allocations. I think you are seeing the shared libraries come in. You might be able to see all the libraries that come in using strace. Thanks, Matt > I do not understand what is going on and I may have expressed myself badly > but I do have a problem as I certainly cannot use anywhere near 128 > processes on a node with 128GB of RAM before I get an OOM error. (The code > runs successfully on 32 processes but not 64.) > > Regards, > > David > > On 22/11/2024 16:53, Matthew Knepley wrote: > > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that the > email is genuine and the content is safe. > On Fri, Nov 22, 2024 at 11:36?AM David Scott > wrote: > >> Hello, >> >> I am using the options mechanism of PETSc to configure my CFD code. I >> have introduced options describing the size of the domain etc. I have >> noticed that this consumes a lot of memory. I have found that the amount >> of memory used scales linearly with the number of MPI processes used. >> This restricts the number of MPI processes that I can use. >> > > There are two statements: > > 1) The memory scales linearly with P > > 2) This uses a lot of memory > > Let's deal with 1) first. This seems to be trivially true. If I want every > process to have > access to a given option value, that option value must be in the memory of > every process. > The only alternative would be to communicate with some process in order to > get values. > Few codes seem to be willing to make this tradeoff, and we do not offer it. > > Now 2). Looking at the source, for each option we store a PetscOptionItem, > which I count > as having size 37 bytes (12 pointers/ints and a char). However, there is > data behind every > pointer, like the name, help text, available values (sometimes), I could > see it being as large > as 4K. Suppose it is. If I had 256 options, that would be 1M. Is this a > large amount of memory? > > The way I read the SLURM output, 29K was malloced. Is this a large amount > of memory? > > I am trying to get an idea of the scale. > > Thanks, > > Matt > > >> Is there anything that I can do about this or do I need to configure my >> code in a different way? >> >> I have attached some code extracted from my application which >> demonstrates this along with the output from a running it on 2 MPI >> processes. >> >> Best wishes, >> >> David Scott >> The University of Edinburgh is a charitable body, registered in Scotland, >> with registration number SC005336. Is e buidheann carthannais a th? ann an >> Oilthigh Dh?n ?ideann, cl?raichte an Alba, ?ireamh cl?raidh SC005336. >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZLozvsSwhcAXnBdZ4m-ImNu-aXxMX8SpsHB7SM320hlG3hZEq3hw7UvNnb4c2hzs2_t_rAlLN5oIdM6Uw7ol$ > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. 
-- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZLozvsSwhcAXnBdZ4m-ImNu-aXxMX8SpsHB7SM320hlG3hZEq3hw7UvNnb4c2hzs2_t_rAlLN5oIdM6Uw7ol$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From d.scott at epcc.ed.ac.uk Fri Nov 22 19:16:22 2024 From: d.scott at epcc.ed.ac.uk (David Scott) Date: Sat, 23 Nov 2024 01:16:22 +0000 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: OK. I had started to wonder if that was the case. I'll do some further investigation. Thanks, David On 22/11/2024 22:10, Matthew Knepley wrote: > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that > the email is genuine and the content is safe. > On Fri, Nov 22, 2024 at 12:57?PM David Scott > wrote: > > Matt, > > Thanks for the quick response. > > Yes 1) is trivially true. > > With regard to 2), from the SLURM output: > [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire > process 4312375296 > [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire > process 4311990272 > Yes only 29KB was malloced but the total figure was 4GB per process. > > Looking at > ?mem0 =??? 16420864.000000000 > ?mem0 =??? 16117760.000000000 > ?mem1 =??? 4311490560.0000000 > ?mem1 =??? 4311826432.0000000 > ?mem2 =??? 4311490560.0000000 > ?mem2 =??? 4311826432.0000000 > mem0 is written after PetscInitialize. > mem1 is written roughly half way through the options being read. > mem2 is written on completion of the options being read. > > The code does very little other than read configuration options. > Why is so much memory used? > > > This is not due to options processing, as that would fall under Petsc > malloc allocations. I believe we are measuring this > using RSS which includes the binary, all shared libraries which are > paged in, and stack/heap allocations. I think you are > seeing the shared libraries come in. You might be able to see all the > libraries that come in using strace. > > ? Thanks, > > ? ? ?Matt > > I do not understand what is going on and I may have expressed > myself badly but I do have a problem as I certainly cannot use > anywhere near 128 processes on a node with 128GB of RAM before I > get an OOM error. (The code runs successfully on 32 processes but > not 64.) > > Regards, > > David > > On 22/11/2024 16:53, Matthew Knepley wrote: >> This email was sent to you by someone outside the University. >> You should only click on links or attachments if you are certain >> that the email is genuine and the content is safe. >> On Fri, Nov 22, 2024 at 11:36?AM David Scott >> wrote: >> >> Hello, >> >> I am using the options mechanism of PETSc to configure my CFD >> code. I >> have introduced options describing the size of the domain >> etc. I have >> noticed that this consumes a lot of memory. I have found that >> the amount >> of memory used scales linearly with the number of MPI >> processes used. >> This restricts the number of MPI processes that I can use. >> >> >> There are two statements: >> >> 1) The memory?scales linearly with P >> >> 2) This uses a lot of memory >> >> Let's deal with 1) first. This seems to be trivially true. If I >> want every process to have >> access to a given option value, that option value must be in the >> memory of every process. >> The only alternative would be to communicate with some process in >> order to get values. 
>> Few codes seem to be willing to make this tradeoff, and we do not >> offer it. >> >> Now 2). Looking at the source, for each option we store >> a?PetscOptionItem, which I count >> as having size 37 bytes (12 pointers/ints and a char). However, >> there is data behind every >> pointer, like the name, help text, available values (sometimes), >> I could see it being as large >> as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >> this a large amount of memory? >> >> The way I read the SLURM output, 29K was malloced. Is this a >> large amount of memory? >> >> I am trying to get an idea of the scale. >> >> ? Thanks, >> >> ? ? ? Matt >> >> Is there anything that I can do about this or do I need to >> configure my >> code in a different way? >> >> I have attached some code extracted from my application which >> demonstrates this along with the output from a running it on >> 2 MPI >> processes. >> >> Best wishes, >> >> David Scott >> The University of Edinburgh is a charitable body, registered >> in Scotland, with registration number SC005336. Is e >> buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >> cl?raichte an Alba, ?ireamh cl?raidh SC005336. >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to >> which their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >> > > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which > their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ > -------------- next part -------------- An HTML attachment was scrubbed... URL: From jed at jedbrown.org Sun Nov 24 23:27:45 2024 From: jed at jedbrown.org (Jed Brown) Date: Sun, 24 Nov 2024 22:27:45 -0700 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> Message-ID: <87h67v3msu.fsf@jedbrown.org> You're clearly doing almost all your allocation *not* using PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh yourself, you might be allocating a global amount on each rank, instead of strictly using scalable data structures (i.e., always partitioned). My favorite tool for understanding memory use is heaptrack. https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!agSDvRnjou_irVa09mE8tn11M8EkGEsPjrHe8yzMxmZyJkn-U6e0AxubboUT6qOgDuK4nIlW9w1Xr4TxxNk$ David Scott writes: > OK. > > I had started to wonder if that was the case. I'll do some further > investigation. > > Thanks, > > David > > On 22/11/2024 22:10, Matthew Knepley wrote: >> This email was sent to you by someone outside the University. >> You should only click on links or attachments if you are certain that >> the email is genuine and the content is safe. >> On Fri, Nov 22, 2024 at 12:57?PM David Scott >> wrote: >> >> Matt, >> >> Thanks for the quick response. >> >> Yes 1) is trivially true. 
>> >> With regard to 2), from the SLURM output: >> [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >> process 4312375296 >> [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >> process 4311990272 >> Yes only 29KB was malloced but the total figure was 4GB per process. >> >> Looking at >> ?mem0 =??? 16420864.000000000 >> ?mem0 =??? 16117760.000000000 >> ?mem1 =??? 4311490560.0000000 >> ?mem1 =??? 4311826432.0000000 >> ?mem2 =??? 4311490560.0000000 >> ?mem2 =??? 4311826432.0000000 >> mem0 is written after PetscInitialize. >> mem1 is written roughly half way through the options being read. >> mem2 is written on completion of the options being read. >> >> The code does very little other than read configuration options. >> Why is so much memory used? >> >> >> This is not due to options processing, as that would fall under Petsc >> malloc allocations. I believe we are measuring this >> using RSS which includes the binary, all shared libraries which are >> paged in, and stack/heap allocations. I think you are >> seeing the shared libraries come in. You might be able to see all the >> libraries that come in using strace. >> >> ? Thanks, >> >> ? ? ?Matt >> >> I do not understand what is going on and I may have expressed >> myself badly but I do have a problem as I certainly cannot use >> anywhere near 128 processes on a node with 128GB of RAM before I >> get an OOM error. (The code runs successfully on 32 processes but >> not 64.) >> >> Regards, >> >> David >> >> On 22/11/2024 16:53, Matthew Knepley wrote: >>> This email was sent to you by someone outside the University. >>> You should only click on links or attachments if you are certain >>> that the email is genuine and the content is safe. >>> On Fri, Nov 22, 2024 at 11:36?AM David Scott >>> wrote: >>> >>> Hello, >>> >>> I am using the options mechanism of PETSc to configure my CFD >>> code. I >>> have introduced options describing the size of the domain >>> etc. I have >>> noticed that this consumes a lot of memory. I have found that >>> the amount >>> of memory used scales linearly with the number of MPI >>> processes used. >>> This restricts the number of MPI processes that I can use. >>> >>> >>> There are two statements: >>> >>> 1) The memory?scales linearly with P >>> >>> 2) This uses a lot of memory >>> >>> Let's deal with 1) first. This seems to be trivially true. If I >>> want every process to have >>> access to a given option value, that option value must be in the >>> memory of every process. >>> The only alternative would be to communicate with some process in >>> order to get values. >>> Few codes seem to be willing to make this tradeoff, and we do not >>> offer it. >>> >>> Now 2). Looking at the source, for each option we store >>> a?PetscOptionItem, which I count >>> as having size 37 bytes (12 pointers/ints and a char). However, >>> there is data behind every >>> pointer, like the name, help text, available values (sometimes), >>> I could see it being as large >>> as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >>> this a large amount of memory? >>> >>> The way I read the SLURM output, 29K was malloced. Is this a >>> large amount of memory? >>> >>> I am trying to get an idea of the scale. >>> >>> ? Thanks, >>> >>> ? ? ? Matt >>> >>> Is there anything that I can do about this or do I need to >>> configure my >>> code in a different way? 
>>> >>> I have attached some code extracted from my application which >>> demonstrates this along with the output from a running it on >>> 2 MPI >>> processes. >>> >>> Best wishes, >>> >>> David Scott >>> The University of Edinburgh is a charitable body, registered >>> in Scotland, with registration number SC005336. Is e >>> buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >>> cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to >>> which their experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>> >> >> >> >> -- >> What most experimenters take for granted before they begin their >> experiments is infinitely more interesting than any results to which >> their experiments lead. >> -- Norbert Wiener >> >> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >> From d.scott at epcc.ed.ac.uk Mon Nov 25 02:32:19 2024 From: d.scott at epcc.ed.ac.uk (David Scott) Date: Mon, 25 Nov 2024 08:32:19 +0000 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: <87h67v3msu.fsf@jedbrown.org> References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> <87h67v3msu.fsf@jedbrown.org> Message-ID: <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> I'll have a look at heaptrack. The code that I am looking at the moment does not create a mesh. All it does is read a petscrc file. Thanks, David On 25/11/2024 05:27, Jed Brown wrote: > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that the email is genuine and the content is safe. > > You're clearly doing almost all your allocation *not* using PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh yourself, you might be allocating a global amount on each rank, instead of strictly using scalable data structures (i.e., always partitioned). > > My favorite tool for understanding memory use is heaptrack. > > https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!bM8Vs5Ljq0ZJOl_Zl88PpU1JJWw39UMiu50wgyt0zhG4ax6DxOvabmaDYbKrrCATTeWrKDmDR5C-3bDziLRcXp30NMQ$ > > David Scott writes: > >> OK. >> >> I had started to wonder if that was the case. I'll do some further >> investigation. >> >> Thanks, >> >> David >> >> On 22/11/2024 22:10, Matthew Knepley wrote: >>> This email was sent to you by someone outside the University. >>> You should only click on links or attachments if you are certain that >>> the email is genuine and the content is safe. >>> On Fri, Nov 22, 2024 at 12:57?PM David Scott >>> wrote: >>> >>> Matt, >>> >>> Thanks for the quick response. >>> >>> Yes 1) is trivially true. >>> >>> With regard to 2), from the SLURM output: >>> [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>> process 4312375296 >>> [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>> process 4311990272 >>> Yes only 29KB was malloced but the total figure was 4GB per process. 
>>> >>> Looking at >>> mem0 = 16420864.000000000 >>> mem0 = 16117760.000000000 >>> mem1 = 4311490560.0000000 >>> mem1 = 4311826432.0000000 >>> mem2 = 4311490560.0000000 >>> mem2 = 4311826432.0000000 >>> mem0 is written after PetscInitialize. >>> mem1 is written roughly half way through the options being read. >>> mem2 is written on completion of the options being read. >>> >>> The code does very little other than read configuration options. >>> Why is so much memory used? >>> >>> >>> This is not due to options processing, as that would fall under Petsc >>> malloc allocations. I believe we are measuring this >>> using RSS which includes the binary, all shared libraries which are >>> paged in, and stack/heap allocations. I think you are >>> seeing the shared libraries come in. You might be able to see all the >>> libraries that come in using strace. >>> >>> Thanks, >>> >>> Matt >>> >>> I do not understand what is going on and I may have expressed >>> myself badly but I do have a problem as I certainly cannot use >>> anywhere near 128 processes on a node with 128GB of RAM before I >>> get an OOM error. (The code runs successfully on 32 processes but >>> not 64.) >>> >>> Regards, >>> >>> David >>> >>> On 22/11/2024 16:53, Matthew Knepley wrote: >>>> This email was sent to you by someone outside the University. >>>> You should only click on links or attachments if you are certain >>>> that the email is genuine and the content is safe. >>>> On Fri, Nov 22, 2024 at 11:36?AM David Scott >>>> wrote: >>>> >>>> Hello, >>>> >>>> I am using the options mechanism of PETSc to configure my CFD >>>> code. I >>>> have introduced options describing the size of the domain >>>> etc. I have >>>> noticed that this consumes a lot of memory. I have found that >>>> the amount >>>> of memory used scales linearly with the number of MPI >>>> processes used. >>>> This restricts the number of MPI processes that I can use. >>>> >>>> >>>> There are two statements: >>>> >>>> 1) The memory scales linearly with P >>>> >>>> 2) This uses a lot of memory >>>> >>>> Let's deal with 1) first. This seems to be trivially true. If I >>>> want every process to have >>>> access to a given option value, that option value must be in the >>>> memory of every process. >>>> The only alternative would be to communicate with some process in >>>> order to get values. >>>> Few codes seem to be willing to make this tradeoff, and we do not >>>> offer it. >>>> >>>> Now 2). Looking at the source, for each option we store >>>> a PetscOptionItem, which I count >>>> as having size 37 bytes (12 pointers/ints and a char). However, >>>> there is data behind every >>>> pointer, like the name, help text, available values (sometimes), >>>> I could see it being as large >>>> as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >>>> this a large amount of memory? >>>> >>>> The way I read the SLURM output, 29K was malloced. Is this a >>>> large amount of memory? >>>> >>>> I am trying to get an idea of the scale. >>>> >>>> Thanks, >>>> >>>> Matt >>>> >>>> Is there anything that I can do about this or do I need to >>>> configure my >>>> code in a different way? >>>> >>>> I have attached some code extracted from my application which >>>> demonstrates this along with the output from a running it on >>>> 2 MPI >>>> processes. >>>> >>>> Best wishes, >>>> >>>> David Scott >>>> The University of Edinburgh is a charitable body, registered >>>> in Scotland, with registration number SC005336. Is e >>>> buidheann carthannais a th? 
ann an Oilthigh Dh?n ?ideann, >>>> cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to >>>> which their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>> >>> >>> >>> -- >>> What most experimenters take for granted before they begin their >>> experiments is infinitely more interesting than any results to which >>> their experiments lead. >>> -- Norbert Wiener >>> >>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>> From Fabian.Jakub at physik.uni-muenchen.de Mon Nov 25 02:45:30 2024 From: Fabian.Jakub at physik.uni-muenchen.de (Fabian.Jakub) Date: Mon, 25 Nov 2024 09:45:30 +0100 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> <87h67v3msu.fsf@jedbrown.org> <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> Message-ID: test_configuration_options.F90:l.55 max_msg_length is quite large.... I guess the pow() is a typo. Cheers, Fabian On 11/25/24 09:32, David Scott wrote: > I'll have a look at heaptrack. > > The code that I am looking at the moment does not create a mesh. All it > does is read a petscrc file. > > Thanks, > > David > > On 25/11/2024 05:27, Jed Brown wrote: >> This email was sent to you by someone outside the University. >> You should only click on links or attachments if you are certain that >> the email is genuine and the content is safe. >> >> You're clearly doing almost all your allocation *not* using >> PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh >> yourself, you might be allocating a global amount on each rank, >> instead of strictly using scalable data structures (i.e., always >> partitioned). >> >> My favorite tool for understanding memory use is heaptrack. >> >> https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!bM8Vs5Ljq0ZJOl_Zl88PpU1JJWw39UMiu50wgyt0zhG4ax6DxOvabmaDYbKrrCATTeWrKDmDR5C-3bDziLRcXp30NMQ$ >> David Scott writes: >> >>> OK. >>> >>> I had started to wonder if that was the case. I'll do some further >>> investigation. >>> >>> Thanks, >>> >>> David >>> >>> On 22/11/2024 22:10, Matthew Knepley wrote: >>>> This email was sent to you by someone outside the University. >>>> You should only click on links or attachments if you are certain that >>>> the email is genuine and the content is safe. >>>> On Fri, Nov 22, 2024 at 12:57?PM David Scott >>>> wrote: >>>> >>>> ???? Matt, >>>> >>>> ???? Thanks for the quick response. >>>> >>>> ???? Yes 1) is trivially true. >>>> >>>> ???? With regard to 2), from the SLURM output: >>>> ???? [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>> ???? process 4312375296 >>>> ???? [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>> ???? process 4311990272 >>>> ???? Yes only 29KB was malloced but the total figure was 4GB per >>>> process. >>>> >>>> ???? Looking at >>>> ????? mem0 =??? 16420864.000000000 >>>> ????? mem0 =??? 16117760.000000000 >>>> ????? mem1 =??? 4311490560.0000000 >>>> ????? mem1 =??? 4311826432.0000000 >>>> ????? mem2 =??? 
4311490560.0000000 >>>> ????? mem2 =??? 4311826432.0000000 >>>> ???? mem0 is written after PetscInitialize. >>>> ???? mem1 is written roughly half way through the options being read. >>>> ???? mem2 is written on completion of the options being read. >>>> >>>> ???? The code does very little other than read configuration options. >>>> ???? Why is so much memory used? >>>> >>>> >>>> This is not due to options processing, as that would fall under Petsc >>>> malloc allocations. I believe we are measuring this >>>> using RSS which includes the binary, all shared libraries which are >>>> paged in, and stack/heap allocations. I think you are >>>> seeing the shared libraries come in. You might be able to see all the >>>> libraries that come in using strace. >>>> >>>> ?? Thanks, >>>> >>>> ????? Matt >>>> >>>> ???? I do not understand what is going on and I may have expressed >>>> ???? myself badly but I do have a problem as I certainly cannot use >>>> ???? anywhere near 128 processes on a node with 128GB of RAM before I >>>> ???? get an OOM error. (The code runs successfully on 32 processes but >>>> ???? not 64.) >>>> >>>> ???? Regards, >>>> >>>> ???? David >>>> >>>> ???? On 22/11/2024 16:53, Matthew Knepley wrote: >>>>> ???? This email was sent to you by someone outside the University. >>>>> ???? You should only click on links or attachments if you are certain >>>>> ???? that the email is genuine and the content is safe. >>>>> ???? On Fri, Nov 22, 2024 at 11:36?AM David Scott >>>>> ???? wrote: >>>>> >>>>> ???????? Hello, >>>>> >>>>> ???????? I am using the options mechanism of PETSc to configure my CFD >>>>> ???????? code. I >>>>> ???????? have introduced options describing the size of the domain >>>>> ???????? etc. I have >>>>> ???????? noticed that this consumes a lot of memory. I have found that >>>>> ???????? the amount >>>>> ???????? of memory used scales linearly with the number of MPI >>>>> ???????? processes used. >>>>> ???????? This restricts the number of MPI processes that I can use. >>>>> >>>>> >>>>> ???? There are two statements: >>>>> >>>>> ???? 1) The memory scales linearly with P >>>>> >>>>> ???? 2) This uses a lot of memory >>>>> >>>>> ???? Let's deal with 1) first. This seems to be trivially true. If I >>>>> ???? want every process to have >>>>> ???? access to a given option value, that option value must be in the >>>>> ???? memory of every process. >>>>> ???? The only alternative would be to communicate with some process in >>>>> ???? order to get values. >>>>> ???? Few codes seem to be willing to make this tradeoff, and we do not >>>>> ???? offer it. >>>>> >>>>> ???? Now 2). Looking at the source, for each option we store >>>>> ???? a PetscOptionItem, which I count >>>>> ???? as having size 37 bytes (12 pointers/ints and a char). However, >>>>> ???? there is data behind every >>>>> ???? pointer, like the name, help text, available values (sometimes), >>>>> ???? I could see it being as large >>>>> ???? as 4K. Suppose it is. If I had 256 options, that would be 1M. Is >>>>> ???? this a large amount of memory? >>>>> >>>>> ???? The way I read the SLURM output, 29K was malloced. Is this a >>>>> ???? large amount of memory? >>>>> >>>>> ???? I am trying to get an idea of the scale. >>>>> >>>>> ?????? Thanks, >>>>> >>>>> ?????????? Matt >>>>> >>>>> ???????? Is there anything that I can do about this or do I need to >>>>> ???????? configure my >>>>> ???????? code in a different way? >>>>> >>>>> ???????? I have attached some code extracted from my application which >>>>> ???????? 
demonstrates this along with the output from a running it on >>>>> ???????? 2 MPI >>>>> ???????? processes. >>>>> >>>>> ???????? Best wishes, >>>>> >>>>> ???????? David Scott >>>>> ???????? The University of Edinburgh is a charitable body, registered >>>>> ???????? in Scotland, with registration number SC005336. Is e >>>>> ???????? buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >>>>> ???????? cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>>>> >>>>> >>>>> >>>>> ???? -- >>>>> ???? What most experimenters take for granted before they begin their >>>>> ???? experiments is infinitely more interesting than any results to >>>>> ???? which their experiments lead. >>>>> ???? -- Norbert Wiener >>>>> >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>>> >>>>> >>>> >>>> >>>> -- >>>> What most experimenters take for granted before they begin their >>>> experiments is infinitely more interesting than any results to which >>>> their experiments lead. >>>> -- Norbert Wiener >>>> >>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>> > From d.scott at epcc.ed.ac.uk Mon Nov 25 03:49:02 2024 From: d.scott at epcc.ed.ac.uk (David Scott) Date: Mon, 25 Nov 2024 09:49:02 +0000 Subject: [petsc-users] Memory Used When Reading petscrc In-Reply-To: References: <60ba6a27-9840-4009-a94e-90bf7a5cd317@epcc.ed.ac.uk> <87h67v3msu.fsf@jedbrown.org> <9c8bf7af-62c4-47bf-9d68-ead4392fc014@epcc.ed.ac.uk> Message-ID: Fabian, That is indeed a typo. Thanks very much for pointing it out. Cheers, David On 25/11/2024 08:45, Fabian.Jakub via petsc-users wrote: > This email was sent to you by someone outside the University. > You should only click on links or attachments if you are certain that > the email is genuine and the content is safe. > > test_configuration_options.F90:l.55 > max_msg_length is quite large.... I guess the pow() is a typo. > Cheers, > Fabian > > > On 11/25/24 09:32, David Scott wrote: >> I'll have a look at heaptrack. >> >> The code that I am looking at the moment does not create a mesh. All it >> does is read a petscrc file. >> >> Thanks, >> >> David >> >> On 25/11/2024 05:27, Jed Brown wrote: >>> This email was sent to you by someone outside the University. >>> You should only click on links or attachments if you are certain that >>> the email is genuine and the content is safe. >>> >>> You're clearly doing almost all your allocation *not* using >>> PetscMalloc (so not in a Vec or Mat). If you're managing your own mesh >>> yourself, you might be allocating a global amount on each rank, >>> instead of strictly using scalable data structures (i.e., always >>> partitioned). >>> >>> My favorite tool for understanding memory use is heaptrack. >>> >>> https://urldefense.us/v3/__https://github.com/KDE/heaptrack__;!!G_uCfscf7eWS!bM8Vs5Ljq0ZJOl_Zl88PpU1JJWw39UMiu50wgyt0zhG4ax6DxOvabmaDYbKrrCATTeWrKDmDR5C-3bDziLRcXp30NMQ$ >>> >>> David Scott writes: >>> >>>> OK. >>>> >>>> I had started to wonder if that was the case. I'll do some further >>>> investigation. >>>> >>>> Thanks, >>>> >>>> David >>>> >>>> On 22/11/2024 22:10, Matthew Knepley wrote: >>>>> This email was sent to you by someone outside the University. >>>>> You should only click on links or attachments if you are certain that >>>>> the email is genuine and the content is safe. 
>>>>> On Fri, Nov 22, 2024 at 12:57?PM David Scott >>>>> wrote: >>>>> >>>>> ???? Matt, >>>>> >>>>> ???? Thanks for the quick response. >>>>> >>>>> ???? Yes 1) is trivially true. >>>>> >>>>> ???? With regard to 2), from the SLURM output: >>>>> ???? [0] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>>> ???? process 4312375296 >>>>> ???? [1] Maximum memory PetscMalloc()ed 29552 maximum size of entire >>>>> ???? process 4311990272 >>>>> ???? Yes only 29KB was malloced but the total figure was 4GB per >>>>> process. >>>>> >>>>> ???? Looking at >>>>> ????? mem0 =??? 16420864.000000000 >>>>> ????? mem0 =??? 16117760.000000000 >>>>> ????? mem1 =??? 4311490560.0000000 >>>>> ????? mem1 =??? 4311826432.0000000 >>>>> ????? mem2 =??? 4311490560.0000000 >>>>> ????? mem2 =??? 4311826432.0000000 >>>>> ???? mem0 is written after PetscInitialize. >>>>> ???? mem1 is written roughly half way through the options being read. >>>>> ???? mem2 is written on completion of the options being read. >>>>> >>>>> ???? The code does very little other than read configuration options. >>>>> ???? Why is so much memory used? >>>>> >>>>> >>>>> This is not due to options processing, as that would fall under Petsc >>>>> malloc allocations. I believe we are measuring this >>>>> using RSS which includes the binary, all shared libraries which are >>>>> paged in, and stack/heap allocations. I think you are >>>>> seeing the shared libraries come in. You might be able to see all the >>>>> libraries that come in using strace. >>>>> >>>>> ?? Thanks, >>>>> >>>>> ????? Matt >>>>> >>>>> ???? I do not understand what is going on and I may have expressed >>>>> ???? myself badly but I do have a problem as I certainly cannot use >>>>> ???? anywhere near 128 processes on a node with 128GB of RAM before I >>>>> ???? get an OOM error. (The code runs successfully on 32 processes >>>>> but >>>>> ???? not 64.) >>>>> >>>>> ???? Regards, >>>>> >>>>> ???? David >>>>> >>>>> ???? On 22/11/2024 16:53, Matthew Knepley wrote: >>>>>> ???? This email was sent to you by someone outside the University. >>>>>> ???? You should only click on links or attachments if you are >>>>>> certain >>>>>> ???? that the email is genuine and the content is safe. >>>>>> ???? On Fri, Nov 22, 2024 at 11:36?AM David Scott >>>>>> ???? wrote: >>>>>> >>>>>> ???????? Hello, >>>>>> >>>>>> ???????? I am using the options mechanism of PETSc to configure >>>>>> my CFD >>>>>> ???????? code. I >>>>>> ???????? have introduced options describing the size of the domain >>>>>> ???????? etc. I have >>>>>> ???????? noticed that this consumes a lot of memory. I have found >>>>>> that >>>>>> ???????? the amount >>>>>> ???????? of memory used scales linearly with the number of MPI >>>>>> ???????? processes used. >>>>>> ???????? This restricts the number of MPI processes that I can use. >>>>>> >>>>>> >>>>>> ???? There are two statements: >>>>>> >>>>>> ???? 1) The memory scales linearly with P >>>>>> >>>>>> ???? 2) This uses a lot of memory >>>>>> >>>>>> ???? Let's deal with 1) first. This seems to be trivially true. If I >>>>>> ???? want every process to have >>>>>> ???? access to a given option value, that option value must be in >>>>>> the >>>>>> ???? memory of every process. >>>>>> ???? The only alternative would be to communicate with some >>>>>> process in >>>>>> ???? order to get values. >>>>>> ???? Few codes seem to be willing to make this tradeoff, and we >>>>>> do not >>>>>> ???? offer it. >>>>>> >>>>>> ???? Now 2). Looking at the source, for each option we store >>>>>> ???? 
a PetscOptionItem, which I count >>>>>> ???? as having size 37 bytes (12 pointers/ints and a char). However, >>>>>> ???? there is data behind every >>>>>> ???? pointer, like the name, help text, available values >>>>>> (sometimes), >>>>>> ???? I could see it being as large >>>>>> ???? as 4K. Suppose it is. If I had 256 options, that would be >>>>>> 1M. Is >>>>>> ???? this a large amount of memory? >>>>>> >>>>>> ???? The way I read the SLURM output, 29K was malloced. Is this a >>>>>> ???? large amount of memory? >>>>>> >>>>>> ???? I am trying to get an idea of the scale. >>>>>> >>>>>> ?????? Thanks, >>>>>> >>>>>> ?????????? Matt >>>>>> >>>>>> ???????? Is there anything that I can do about this or do I need to >>>>>> ???????? configure my >>>>>> ???????? code in a different way? >>>>>> >>>>>> ???????? I have attached some code extracted from my application >>>>>> which >>>>>> ???????? demonstrates this along with the output from a running >>>>>> it on >>>>>> ???????? 2 MPI >>>>>> ???????? processes. >>>>>> >>>>>> ???????? Best wishes, >>>>>> >>>>>> ???????? David Scott >>>>>> ???????? The University of Edinburgh is a charitable body, >>>>>> registered >>>>>> ???????? in Scotland, with registration number SC005336. Is e >>>>>> ???????? buidheann carthannais a th? ann an Oilthigh Dh?n ?ideann, >>>>>> ???????? cl?raichte an Alba, ?ireamh cl?raidh SC005336. >>>>>> >>>>>> >>>>>> >>>>>> ???? -- >>>>>> ???? What most experimenters take for granted before they begin >>>>>> their >>>>>> ???? experiments is infinitely more interesting than any results to >>>>>> ???? which their experiments lead. >>>>>> ???? -- Norbert Wiener >>>>>> >>>>>> >>>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>>>> >>>>>> >>>>>> >>>>> > >>>>> >>>>> >>>>> -- >>>>> What most experimenters take for granted before they begin their >>>>> experiments is infinitely more interesting than any results to which >>>>> their experiments lead. >>>>> -- Norbert Wiener >>>>> >>>>> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cH8SjJvsuVEK1zv8noUjNUJC0VnHFqems68PjB2E94pqxc3q55YprX1q2JXFvPAzXJkh40J1-erXPWdIvc-xrLkRIgg$ >>>>> >>>>> >>>> > >> > From bba at bgs.ac.uk Mon Nov 25 05:08:09 2024 From: bba at bgs.ac.uk (Brian Bainbridge - BGS) Date: Mon, 25 Nov 2024 11:08:09 +0000 Subject: [petsc-users] Unable to configure errors Message-ID: Hi there, I have downloaded petsc-.3.22.1 via git, but when I try to configure with the Intel compilers I get this message: [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-mpi-dir=/home/bba/intel/oneapi/mpi/2021.14/ ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- MPI compiler wrappers in /home/bba/intel/oneapi/mpi/2021.14/bin cannot be found or do not work. 
See https://urldefense.us/v3/__https://petsc.org/release/faq/*invalid-mpi-compilers__;Iw!!G_uCfscf7eWS!bfBTBf0cnZ3REM41nHH1ecadhkhaOZ6Rj28ZmTt_QbyHuPvU3zdMDIVstg8C0aON_dJ-vLme96MmkQ_-tKw$ ********************************************************************************************* But: [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin cpuinfo hydra_nameserver IMB-MPI1 IMB-MT IMB-P2P impi_cpuinfo mpicc mpiexec mpif77 mpifc mpigxx mpiicpc mpiicx mpiifx mpitune_fast hydra_bstrap_proxy hydra_pmi_proxy IMB-MPI1-GPU IMB-NBC IMB-RMA impi_info mpicxx mpiexec.hydra mpif90 mpigcc mpiicc mpiicpx mpiifort mpirun So it should work, but it doesn't! I tried to use --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx but that doesn't work: [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx ============================================================================================= Configuring PETSc to compile on your system ============================================================================================= TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) ********************************************************************************************* UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): --------------------------------------------------------------------------------------------- C compiler you provided with -with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx cannot be found or does not work. If the above linker messages do not indicate failure of the compiler you can rerun with the option --ignoreLinkOutput=1 ********************************************************************************************* [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx The compiler is there, can you please help me to configure the petsc please? Regards, Brian This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From stefano.zampini at gmail.com Mon Nov 25 11:11:18 2024 From: stefano.zampini at gmail.com (Stefano Zampini) Date: Mon, 25 Nov 2024 20:11:18 +0300 Subject: [petsc-users] Unable to configure errors In-Reply-To: References: Message-ID: Send configure.log On Mon, Nov 25, 2024, 20:09 Brian Bainbridge - BGS via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi there, > > I have downloaded petsc-.3.22.1 via git, but when I try to configure with > the Intel compilers I get this message: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc > --with-mpi-dir=/home/bba/intel/oneapi/mpi/2021.14/ > > ============================================================================================= > Configuring PETSc to compile on your system > > ============================================================================================= > TESTING: checkCCompiler from > config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > MPI compiler wrappers in /home/bba/intel/oneapi/mpi/2021.14/bin cannot > be found or do not > work. See https://urldefense.us/v3/__https://petsc.org/release/faq/*invalid-mpi-compilers__;Iw!!G_uCfscf7eWS!YuP2JPBRwPMjW-R0DlunxcaYcHx0zjy7uFz7aEsu_9Gj38klN7GMx7OmcFwbU9-V3EPvLhVEp9S0DLbjdZVNjgJ5KgODGVQ$ > > > ********************************************************************************************* > > But: > > [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin > cpuinfo hydra_nameserver IMB-MPI1 IMB-MT IMB-P2P > impi_cpuinfo mpicc mpiexec mpif77 mpifc mpigxx mpiicpc > mpiicx mpiifx mpitune_fast > hydra_bstrap_proxy hydra_pmi_proxy IMB-MPI1-GPU IMB-NBC IMB-RMA > impi_info mpicxx mpiexec.hydra mpif90 mpigcc mpiicc mpiicpx > mpiifort mpirun > > So it should work, but it doesn't! I tried to use > --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > but that doesn't work: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc > --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > ============================================================================================= > Configuring PETSc to compile on your system > > ============================================================================================= > TESTING: checkCCompiler from > config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for > details): > > --------------------------------------------------------------------------------------------- > C compiler you provided with > -with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > cannot be found or does not work. > If the above linker messages do not indicate failure of the compiler you > can rerun with > the option --ignoreLinkOutput=1 > > ********************************************************************************************* > > [bba at kwvmxbridgeHPC petsc]$ ls > /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > The compiler is there, can you please help me to configure the petsc > please? 
> > Regards, > Brian > > > > > This email and any attachments are intended solely for the use of the > named recipients. If you are not the intended recipient you must not use, > disclose, copy or distribute this email or any of its attachments and > should notify the sender immediately and delete this email from your > system. UK Research and Innovation (UKRI) has taken every reasonable > precaution to minimise risk of this email or any attachments containing > viruses or malware but the recipient should carry out its own virus and > malware checks before opening the attachments. UKRI does not accept any > liability for any losses or damages which the recipient may sustain due to > presence of any viruses. > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Mon Nov 25 11:13:32 2024 From: balay.anl at fastmail.org (Satish Balay) Date: Mon, 25 Nov 2024 11:13:32 -0600 (CST) Subject: [petsc-users] Unable to configure errors In-Reply-To: References: Message-ID: <8863bfc9-3983-31c0-447d-30a5d7ef9966@fastmail.org> Please check: "Intel MPI" section of https://urldefense.us/v3/__https://petsc.org/release/install/install/*mpi__;Iw!!G_uCfscf7eWS!efneLerzMMPsXXWgEwlvn6Pdk_8YCIiz8n_VrMaeIB4E4gWzmx9BZKrY5NGR48MVdQ60BU43oXs8ZhXN5BIN9zu690s$ - Likely you need to correctly set I_MPI_CC etc. for the MPI compiler wrappers to work. If you still have issues - send configure.log from this failure. Satish On Mon, 25 Nov 2024, Brian Bainbridge - BGS via petsc-users wrote: > Hi there, > > I have downloaded petsc-.3.22.1 via git, but when I try to configure with the Intel compilers I get this message: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-mpi-dir=/home/bba/intel/oneapi/mpi/2021.14/ > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > MPI compiler wrappers in /home/bba/intel/oneapi/mpi/2021.14/bin cannot be found or do not > work. See https://urldefense.us/v3/__https://petsc.org/release/faq/*invalid-mpi-compilers__;Iw!!G_uCfscf7eWS!bfBTBf0cnZ3REM41nHH1ecadhkhaOZ6Rj28ZmTt_QbyHuPvU3zdMDIVstg8C0aON_dJ-vLme96MmkQ_-tKw$ > ********************************************************************************************* > > But: > > [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin > cpuinfo hydra_nameserver IMB-MPI1 IMB-MT IMB-P2P impi_cpuinfo mpicc mpiexec mpif77 mpifc mpigxx mpiicpc mpiicx mpiifx mpitune_fast > hydra_bstrap_proxy hydra_pmi_proxy IMB-MPI1-GPU IMB-NBC IMB-RMA impi_info mpicxx mpiexec.hydra mpif90 mpigcc mpiicc mpiicpx mpiifort mpirun > > So it should work, but it doesn't! 
I tried to use --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > but that doesn't work: > > [bba at kwvmxbridgeHPC petsc]$ ./configure --prefix=/home/bba/bin/petsc --with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > ============================================================================================= > Configuring PETSc to compile on your system > ============================================================================================= > TESTING: checkCCompiler from config.setCompilers(config/BuildSystem/config/setCompilers.py:1457) > ********************************************************************************************* > UNABLE to CONFIGURE with GIVEN OPTIONS (see configure.log for details): > --------------------------------------------------------------------------------------------- > C compiler you provided with -with-cc=/home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > cannot be found or does not work. > If the above linker messages do not indicate failure of the compiler you can rerun with > the option --ignoreLinkOutput=1 > ********************************************************************************************* > > [bba at kwvmxbridgeHPC petsc]$ ls /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > /home/bba/intel/oneapi/mpi/2021.14/bin/mpiicx > > The compiler is there, can you please help me to configure the petsc please? > > Regards, > Brian > > > > > This email and any attachments are intended solely for the use of the named recipients. If you are not the intended recipient you must not use, disclose, copy or distribute this email or any of its attachments and should notify the sender immediately and delete this email from your system. UK Research and Innovation (UKRI) has taken every reasonable precaution to minimise risk of this email or any attachments containing viruses or malware but the recipient should carry out its own virus and malware checks before opening the attachments. UKRI does not accept any liability for any losses or damages which the recipient may sustain due to presence of any viruses. > > From matthew.thomas1 at anu.edu.au Tue Nov 26 19:06:21 2024 From: matthew.thomas1 at anu.edu.au (Matthew Thomas) Date: Wed, 27 Nov 2024 01:06:21 +0000 Subject: [petsc-users] Problem with MatMPIAIJSetPreallocation Message-ID: Hello, When I use MatMPIAIJSetPreallocation I get an argument out of range error, as below, [15]PETSC ERROR: Argument out of range [15]PETSC ERROR: New nonzero at (8,421) caused a malloc. Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check However, when I run the code again with identical parameters without MatMPIAIJSetPreallocation, there is no non-zero at the location that caused the error (8, 421). Using -mat_view I can see that row 8 only contains a single value at column 8, which is expected and what I have allocated for. I have also checked that every time I call MatSetValues that this location is not being set. I am very confident my dnnz and onnz arrays have been set correctly, do you have any idea why this new non-zero is created? I am using Petsc version 3.22.1 and Slepc version 3.22.1 with fortran. This issue does not occur with small number of processors, (1-8), however, this error is consistent when I use >8 processors. Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... 
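For reference, a minimal sketch of the preallocation call sequence under discussion (the poster's code is Fortran; this is the analogous C sequence, and nlocal, d_nnz and o_nnz are assumptions standing in for values the caller computes):

    /* Hedged sketch, not the poster's code: per-row preallocation of a MATMPIAIJ matrix. */
    Mat A;
    PetscCall(MatCreate(PETSC_COMM_WORLD, &A));
    PetscCall(MatSetSizes(A, nlocal, nlocal, PETSC_DETERMINE, PETSC_DETERMINE));
    PetscCall(MatSetType(A, MATMPIAIJ));
    /* d_nnz/o_nnz give, per locally owned row, the nonzero counts in the diagonal and
       off-diagonal blocks; the scalar arguments are ignored when the arrays are supplied */
    PetscCall(MatMPIAIJSetPreallocation(A, 0, d_nnz, 0, o_nnz));
    /* to keep running (at a performance cost) instead of erroring when a row was underestimated:
       MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE);                              */
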
URL: From matthew.thomas1 at anu.edu.au Tue Nov 26 20:41:39 2024 From: matthew.thomas1 at anu.edu.au (Matthew Thomas) Date: Wed, 27 Nov 2024 02:41:39 +0000 Subject: [petsc-users] Problem with MatMPIAIJSetPreallocation In-Reply-To: References: Message-ID: <4D8F95C1-E7C9-4353-A107-ABE1A8583526@anu.edu.au> Hello, I found a type elsewhere in my code and have fixed this issue. Thanks, Matt On 27 Nov 2024, at 12:06?PM, Matthew Thomas wrote: Hello, When I use MatMPIAIJSetPreallocation I get an argument out of range error, as below, [15]PETSC ERROR: Argument out of range [15]PETSC ERROR: New nonzero at (8,421) caused a malloc. Use MatSetOption(A, MAT_NEW_NONZERO_ALLOCATION_ERR, PETSC_FALSE) to turn off this check However, when I run the code again with identical parameters without MatMPIAIJSetPreallocation, there is no non-zero at the location that caused the error (8, 421). Using -mat_view I can see that row 8 only contains a single value at column 8, which is expected and what I have allocated for. I have also checked that every time I call MatSetValues that this location is not being set. I am very confident my dnnz and onnz arrays have been set correctly, do you have any idea why this new non-zero is created? I am using Petsc version 3.22.1 and Slepc version 3.22.1 with fortran. This issue does not occur with small number of processors, (1-8), however, this error is consistent when I use >8 processors. Thanks, Matt -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 27 08:47:29 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 27 Nov 2024 14:47:29 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> Message-ID: <0E25EA57-DE67-440A-955E-F380DC3BF7A6@us.es> Dear Barry: You were right!! The problem is I am using the background DMDA mesh for the domain partitioning of the DMSWarm as in ?dm/tutorials/swarm_ex3.c?. And then ?DMGetNeighbors? to locate the neighbour ranks, including those in the other side of the domain when I am using periodic bcc. Therefore, if I define the background DMDA to use periodic bcc the particle domain partitioning is uneven but I can locate precisely the periodic ranks. Thanks, Miguel On 20 Nov 2024, at 23:40, MIGUEL MOLINOS PEREZ wrote: I see? that might be the problem. I?ll check it tomorrow. Thank you! Miguel On 20 Nov 2024, at 22:57, Barry Smith wrote: ? On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? 
Say three "elements" per edge. If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. 
Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From mmolinos at us.es Wed Nov 27 09:38:07 2024 From: mmolinos at us.es (MIGUEL MOLINOS PEREZ) Date: Wed, 27 Nov 2024 15:38:07 +0000 Subject: [petsc-users] Doubt about mesh size distribution in DMDACreate3d using periodic boundary conditions In-Reply-To: <0E25EA57-DE67-440A-955E-F380DC3BF7A6@us.es> References: <56B31960-5928-4225-BE0D-B2A8E6214776@us.es> <533B9E59-5B74-433E-B7A9-89972A2C7A8D@petsc.dev> <3421F3B3-C775-4E1A-A572-9B22EF0942E9@us.es> <609D8006-8454-4FE3-A052-A0D95D9189F7@petsc.dev> <50D40867-05EE-4AD0-B356-BC4C2E1CCF1D@petsc.dev> <6B69A792-4004-4B32-AB0D-F9D4AFACA094@us.es> <83500C6C-6E3E-499A-8595-7EEFE3174028@petsc.dev> <14500884-FB4B-4872-9E06-207FE6482187@us.es> <969F00FA-3636-4B06-88EC-04DE57C8F492@petsc.dev> <300ECE6B-36F4-4212-9EA6-95716EBB06A2@us.es> <0E25EA57-DE67-440A-955E-F380DC3BF7A6@us.es> Message-ID: <7C4C298F-6E28-44DA-ADCF-10050154808D@us.es> I forgot to mention that the solution to the problem is to increase the number of divisions while keeping constant the number of processes. Sorry for the silly question! Thanks, Miguel On 27 Nov 2024, at 15:47, Miguel Molinos wrote: Dear Barry: You were right!! The problem is I am using the background DMDA mesh for the domain partitioning of the DMSWarm as in ?dm/tutorials/swarm_ex3.c?. And then ?DMGetNeighbors? to locate the neighbour ranks, including those in the other side of the domain when I am using periodic bcc. Therefore, if I define the background DMDA to use periodic bcc the particle domain partitioning is uneven but I can locate precisely the periodic ranks. Thanks, Miguel On 20 Nov 2024, at 23:40, MIGUEL MOLINOS PEREZ wrote: I see? that might be the problem. I?ll check it tomorrow. Thank you! Miguel On 20 Nov 2024, at 22:57, Barry Smith wrote: ? On Nov 20, 2024, at 2:38?PM, MIGUEL MOLINOS PEREZ wrote: Yes, I use the vertex (nodes) of the elements. Then the length between each vertex will be different between periodic and non-periodic case. With 10 points and non-periodic, it will be 1/9, and with periodic it will be 1/10th. Is this what you are asking about? I am using the DMDA as an auxiliar mesh to do the domain partitioning in the DMSWARM. Thanks, Miguel On 20 Nov 2024, at 19:54, Barry Smith wrote: Are you considering your degrees of freedom as vertex or cell-centered? Say three "elements" per edge. 
If vertex centered then discretization size is 1/3 if periodic and 1/2 if not periodic If cell-centered then each cell has width 1/3 for both periodic and not periodic but in both cases you can think of the discretization size as constant along the whole cube edge. Is this related to DMSWARM in particular? On Nov 20, 2024, at 12:56?PM, MIGUEL MOLINOS PEREZ wrote: I mean that if the dimensions of the cube are 1x1x1 (for example). And I want 10 elements per edge, the discretization size must be 0.1 constant over the whole cube edge. This is not in the code, I just impose the number of elements per edge. Thank you, Miguel On 20 Nov 2024, at 18:52, Barry Smith wrote: What do you mean by discretization size, and how do I see it in the code? Barry On Nov 20, 2024, at 12:48?PM, MIGUEL MOLINOS PEREZ wrote: Sorry, I meant that the discretisation size is not constant across the edges of the cube. Miguel On 20 Nov 2024, at 18:36, Barry Smith wrote: I am sorry, I don't understand the problem. When I run by default with -da_view I get Processor [0] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 0 2 Processor [3] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 0 2 Processor [4] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 3 Processor [5] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 0 2, Z range of indices: 2 3 Processor [6] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 3, Z range of indices: 2 3 Processor [7] M 3 N 3 P 3 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 3, Y range of indices: 2 3, Z range of indices: 2 3 which seems right because you are trying to have three cells in each direction. The distribution has to be uneven, hence 0 2 and 2 3 When I change the code to use ndiv_mesh_* = 4 and run with periodic or not I get $ PETSC_OPTIONS="" mpiexec -n 8 ./atoms-3D -dm_view DM Object: 8 MPI processes type: da Processor [0] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 0 2 Processor [1] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 0 2 Processor [2] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 0 2 Processor [3] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 0 2 Processor [4] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 0 2, Z range of indices: 2 4 Processor [5] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 0 2, Z range of indices: 2 4 Processor [6] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 0 2, Y range of indices: 2 4, Z range of indices: 2 4 Processor [7] M 4 N 4 P 4 m 2 n 2 p 2 w 1 s 1 X range of indices: 2 4, Y range of indices: 2 4, Z range of indices: 2 4 so it is splitting as expected each rank gets a 2 by 2 by 2 set of indices. Could you please let me know what the problem is that I should be seeing. Barry On Nov 20, 2024, at 7:06?AM, MIGUEL MOLINOS PEREZ wrote: Dear Barry, Please, find attached to this email a minimal example of the problem. 
Run it using 8 MPI processes. Thanks, Miguel On 20 Nov 2024, at 11:48, Miguel Molinos wrote: Hi Bary: I will check the example you suggest. Anyhow, I?ll send a reproducible example ASAP. Thanks, Miguel On 19 Nov 2024, at 18:55, Barry Smith wrote: I modify src/dm/tests/ex25.c and always see a nice even split when possible with both DM_BOUNDARY_NONE and DM_BOUNDARY_PERIODIC Can you please send a reproducible example? Thanks Barry On Nov 19, 2024, at 6:14?AM, MIGUEL MOLINOS PEREZ wrote: Dear all: It seems that if I mesh a cubic domain with ?DMDACreate3d? using 8 bricks for discretization and with periodic boundaries, each of the bricks has a different size. In contrast, if I use DM_BOUNDARY_NONE, all 8 bricks have the same size. I have used this together with the DMSWarm discretization. And as you can see the number of particles per rank is not evenly distributed: 210 420 366 732 420 840 732 1464 Am I missing something? Thanks, Miguel -------------- next part -------------- An HTML attachment was scrubbed... URL: From qiyuelu1 at gmail.com Fri Nov 29 20:56:14 2024 From: qiyuelu1 at gmail.com (Qiyue Lu) Date: Fri, 29 Nov 2024 20:56:14 -0600 Subject: [petsc-users] MatZeroRows costly while applying 1st-kind Boundary Conditions Message-ID: Hello, In the MPI context, after assembling the distributed matrix A (matmpiaij) and the right-hand-side b, I am trying to apply the 1st kind boundary condition using MatZeroRows() and VecSetValues(), for A and b respectively. The pseudo-code is: ========= for (int key = 0; key < BCNodes_Length; key++){ // retrieving the global row position pos = BCNodes[key]; // Set all elements in that row 0 except the one on the diagonal to be 1.0 MatZeroRows(A, 1, &pos, 1.0, NULL, NULL); } MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY); MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY); ========= For BCNodes_Length = 10^4, the FOR loop timing is 8 seconds. For BCNodes_Length = 15*10^4, the FOR loop timing is 3000 seconds. I am using two computational nodes and each having 12 cores. My questions are: 1) Is the timing plausible? Is the MatZeroRows() function so costly? 2) Any suggestions to apply the 1st kind boundary conditions for a better performance? Thanks, Qiyue Lu -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Fri Nov 29 21:57:12 2024 From: bsmith at petsc.dev (Barry Smith) Date: Fri, 29 Nov 2024 22:57:12 -0500 Subject: [petsc-users] MatZeroRows costly while applying 1st-kind Boundary Conditions In-Reply-To: References: Message-ID: <9EF8F6C3-5169-4FE2-984D-29D335798CB5@petsc.dev> You need to call MatZeroRows() once; passing all the rows you want zeroed, instead of once for each row. If you are running in parallel each MPI process should call MatZeroRows() once passing in a list of rows to be zeroed. Each process can pass in different rows than the other processes. BTW: You do not need to call MatAssemblyBegin/End() after MatZeroRows() Barry > On Nov 29, 2024, at 9:56?PM, Qiyue Lu wrote: > > Hello, > In the MPI context, after assembling the distributed matrix A (matmpiaij) and the right-hand-side b, I am trying to apply the 1st kind boundary condition using MatZeroRows() and VecSetValues(), for A and b respectively. 
> The pseudo-code is:
> =========
> for (int key = 0; key < BCNodes_Length; key++){
> // retrieving the global row position
> pos = BCNodes[key];
> // Set all elements in that row 0 except the one on the diagonal to be 1.0
> MatZeroRows(A, 1, &pos, 1.0, NULL, NULL);
> }
> MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY);
> MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY);
> =========
> 
> For BCNodes_Length = 10^4, the FOR loop timing is 8 seconds.
> For BCNodes_Length = 15*10^4, the FOR loop timing is 3000 seconds.
> I am using two computational nodes and each having 12 cores.
> 
> My questions are:
> 1) Is the timing plausible? Is the MatZeroRows() function so costly?
> 2) Any suggestions to apply the 1st kind boundary conditions for a better performance?
> 
> Thanks,
> Qiyue Lu -------------- next part -------------- An HTML attachment was scrubbed... URL: 
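For completeness, a minimal sketch of the batched approach Barry describes (not from the thread itself; BCNodes, BCNodes_Length and a prescribed value bc_value are assumptions carried over from the pseudo-code above, and each rank passes only the rows it owns):

    /* Hedged sketch: one collective MatZeroRows() call per rank instead of a loop. */
    PetscInt    *rows;
    PetscScalar *vals;
    PetscCall(PetscMalloc2(BCNodes_Length, &rows, BCNodes_Length, &vals));
    for (PetscInt key = 0; key < BCNodes_Length; key++) {
      rows[key] = BCNodes[key];   /* global row index owned by this rank */
      vals[key] = bc_value;       /* prescribed Dirichlet value          */
    }
    /* zero all listed rows at once, placing 1.0 on their diagonals;
       no MatAssemblyBegin/End is needed afterwards                   */
    PetscCall(MatZeroRows(A, BCNodes_Length, rows, 1.0, NULL, NULL));
    /* set the matching right-hand-side entries */
    PetscCall(VecSetValues(b, BCNodes_Length, rows, vals, INSERT_VALUES));
    PetscCall(VecAssemblyBegin(b));
    PetscCall(VecAssemblyEnd(b));
    PetscCall(PetscFree2(rows, vals));

MatZeroRows() also accepts optional solution and right-hand-side vectors as its last two arguments, in which case it adjusts the corresponding entries of b itself and the VecSetValues() step above is not needed.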