From aldo.bonfiglioli at unibas.it Sun Feb 2 14:59:12 2025
From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli)
Date: Sun, 2 Feb 2025 21:59:12 +0100
Subject: [petsc-users] markers
Message-ID: <5548e3b0-2c5a-40be-a734-c1e2a5084f38@unibas.it>

Dear all,
it is unclear to me what is being flagged with the "marker" flag when a mesh is created using DMPlexCreate. The 2D triangular mesh pictured in the enclosed pdf has the following features:

> DM Object: 2D plex 1 MPI process
> type: plex
> 2D plex in 2 dimensions:
> Number of 0-cells per rank: 12
> Number of 1-cells per rank: 23
> Number of 2-cells per rank: 12
> Labels:
> celltype: 3 strata with value/size (0 (12), 1 (23), 3 (12))
> depth: 3 strata with value/size (0 (12), 1 (23), 2 (12))
> marker: 1 strata with value/size (1 (20))
> Face Sets: 4 strata with value/size (1 (3), 2 (2), 3 (3), 4 (2))
>
i.e. 12 gridpoints, 23 edges and 12 triangular cells.

When I call DMGetStratumSize at stratum 0 to 2, this is what I get.

> CreateSectionAlternate DMGetStratumSize found 0 'marker' points @ depth 0 on PE# 0
> CreateSectionAlternate DMGetStratumSize found 20 'marker' points @ depth 1 on PE# 0
> CreateSectionAlternate DMGetStratumSize found 0 'marker' points @ depth 2 on PE# 0
>
Is the marker flagging boundary edges or boundary vertices (nodes)? In any case, why are there 20, instead of 10? Finally: I believe face sets refers to boundary faces, where each side of the square domain has been given a different flag. How do I access the face sets information?

Thanks,
Aldo

--
Dr. Aldo Bonfiglioli
Associate professor of Fluid Machines
Scuola di Ingegneria
Universita' della Basilicata
V.le dell'Ateneo lucano, 10 85100 Potenza ITALY
tel:+39.0971.205203 fax:+39.0971.205215
web:https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!c6_YYKL6k6FdMvXiZuhIJ2t5rbgB45aqicDR0WpLsZSgrs7NO0SMggi_L8KB2ZKd-w8JlcGQJTFm-uotq1RhZRDxD575sfToW2A$

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compressed_PDF_file.pdf
Type: application/pdf
Size: 69121 bytes
Desc: not available
URL:

From knepley at gmail.com Sun Feb 2 15:49:10 2025
From: knepley at gmail.com (Matthew Knepley)
Date: Sun, 2 Feb 2025 16:49:10 -0500
Subject: [petsc-users] markers
In-Reply-To: <5548e3b0-2c5a-40be-a734-c1e2a5084f38@unibas.it>
References: <5548e3b0-2c5a-40be-a734-c1e2a5084f38@unibas.it>
Message-ID:

On Sun, Feb 2, 2025 at 3:59 PM Aldo Bonfiglioli wrote:

> Dear all,
>
> it is unclear to me what is being flagged with the "marker" flag when a
> mesh is created using DMPlexCreate.
>
> The 2D triangular mesh pictured in the enclosed pdf has the following
> features:
>
> DM Object: 2D plex 1 MPI process
> type: plex
> 2D plex in 2 dimensions:
> Number of 0-cells per rank: 12
> Number of 1-cells per rank: 23
> Number of 2-cells per rank: 12
> Labels:
> celltype: 3 strata with value/size (0 (12), 1 (23), 3 (12))
> depth: 3 strata with value/size (0 (12), 1 (23), 2 (12))
> marker: 1 strata with value/size (1 (20))
> Face Sets: 4 strata with value/size (1 (3), 2 (2), 3 (3), 4 (2))
>
> i.e. 12 gridpoints, 23 edges and 12 triangular cells.
>
> When I call DMGetStratumSize at stratum 0 to 2, this is what I get.
> > CreateSectionAlternate DMGetStratumSize found 0 'marker' > points @ depth 0 on PE# 0 > CreateSectionAlternate DMGetStratumSize found 20 'marker' > points @ depth 1 on PE# 0 > CreateSectionAlternate DMGetStratumSize found 0 'marker' > points @ depth 2 on PE# 0 > > Is the marker flagging boundary edges or boundary vertices (nodes) ? In > any case, why are there 20, instead of 10? > By default the "marker" label marks all k-cells on the boundary. In this case it means 10 vertices + 10 edges = 20 points You can see what is in the label using -dm_view -dm_plex_view_labels marker with DMViewFromOptions(dm, NULL, "-dm_view") in your code. Finally: I believe face sets refers to boundary faces, where each side of > the square domain has been given a different flag. > > Yes. > How do I access the face sets information ? > DMLabel label; PetscCall(DMGetLabel(dm, "Face Sets", &label)); Thanks, Matt > Thanks, > > Aldo > > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Machines > Scuola di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!Zs0PBHmkwRS6C-HyPNWsfCXfxTmHZ51FTqXs6A-ujrSlUUYpguXpO2Cg1tShryNL4k0RYpZVhUjgM7T7j789$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Zs0PBHmkwRS6C-HyPNWsfCXfxTmHZ51FTqXs6A-ujrSlUUYpguXpO2Cg1tShryNL4k0RYpZVhUjgM-avqGtk$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Daniel.Abele at dlr.de Sun Feb 2 12:09:17 2025 From: Daniel.Abele at dlr.de (Daniel.Abele at dlr.de) Date: Sun, 2 Feb 2025 18:09:17 +0000 Subject: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm) Message-ID: Hi, we are solving a time dependent problem with a single KSP in every time step. We are debating which convergence criterion to use. Is there general guidance around when to use one of the norms with initial residual (ksp_converged_use_initial_residual_norm or ksp_converged_use_min_initial_residual_norm) over the default norm? If I understand the formulas correctly, the initial residual norm "norm(b - A * x0)" (maybe add preconditioning) means that if you have a very good initial guess (as is often the case in time dependent problems if you can use the result if the last time step as initial guess), the norm is much stricter than the default norm "norm(b)". Is this meant as a way to control error accumulation over time? Or does it have some other purpose? Thanks and Regards, Daniel -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Mon Feb 3 10:16:36 2025 From: bsmith at petsc.dev (Barry Smith) Date: Mon, 3 Feb 2025 11:16:36 -0500 Subject: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm) In-Reply-To: References: Message-ID: <6E5AF340-30B3-4C37-965A-26E0638BA64B@petsc.dev> > On Feb 2, 2025, at 1:09?PM, Daniel.Abele--- via petsc-users wrote: > > Hi, > we are solving a time dependent problem with a single KSP in every time step. We are debating which convergence criterion to use. 
> Is there general guidance around when to use one of the norms with initial residual (ksp_converged_use_initial_residual_norm or ksp_converged_use_min_initial_residual_norm) over the default norm?
> If I understand the formulas correctly, the initial residual norm "norm(b - A * x0)" (maybe add preconditioning) means that if you have a very good initial guess (as is often the case in time dependent problems if you can use the result if the last time step as initial guess), the norm is much stricter than the default norm "norm(b)".

The above statement is correct.

> Is this meant as a way to control error accumulation over time? Or does it have some other purpose?
> Thanks and Regards,
> Daniel

For splitting-type methods, you want the error in the linear system solve, e, to be on the same order as the maximum error from the splitting, the explicit time-step discretization, and the implicit time-step discretization. So you need some estimate of that value. Now

    || e ||_2 < || B(b - Ax) ||_2 / \lambda_min(BA).

For this you need a handle on \lambda_min(BA), which you can obtain with Lanczos using -ksp_monitor_singular_values. So the convergence criterion should really depend on setting the -ksp_atol using the required bound on || e ||_2 and \lambda_min(BA). PETSc should provide this convergence test but no one has gotten around to adding it.
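A test along those lines can be wired in today with KSPSetConvergenceTest(). Below is a rough, untested sketch; the struct, the function name, and the numbers are purely illustrative, and the estimate of \lambda_min(BA) has to come from you, for example read off from an earlier solve with -ksp_monitor_singular_values.

#include <petscksp.h>

typedef struct {
  PetscReal tol_e;      /* required bound on || e ||_2      */
  PetscReal lambda_min; /* your estimate of \lambda_min(BA) */
} ErrBoundCtx;

/* Declare convergence once rnorm / lambda_min, an upper bound on the error
   || e ||_2, drops below the requested tolerance. For left-preconditioned
   methods (the GMRES default) rnorm is || B(b - Ax) ||_2, so a good initial
   guess can already satisfy this at iteration 0. */
static PetscErrorCode KSPConvergedErrorBound(KSP ksp, PetscInt it, PetscReal rnorm, KSPConvergedReason *reason, void *ctx)
{
  ErrBoundCtx *ec = (ErrBoundCtx *)ctx;

  PetscFunctionBeginUser;
  *reason = KSP_CONVERGED_ITERATING;
  if (rnorm / ec->lambda_min <= ec->tol_e) *reason = KSP_CONVERGED_ATOL;
  PetscFunctionReturn(PETSC_SUCCESS);
}

and, before KSPSolve(),

  ErrBoundCtx ec = {1e-8, 1e-2}; /* illustrative values; ec must outlive the solves */
  PetscCall(KSPSetConvergenceTest(ksp, KSPConvergedErrorBound, &ec, NULL));

Note that this replaces the default rtol/atol/dtol logic entirely, so only do it when the error bound is really what you want to control.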
Barry

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From anna.dalklint at solid.lth.se Tue Feb 4 06:32:33 2025
From: anna.dalklint at solid.lth.se (Anna Dalklint)
Date: Tue, 4 Feb 2025 12:32:33 +0000
Subject: Re: [petsc-users] Visualizing higher order finite element output in ParaView
In-Reply-To: <87a5b6lvb2.fsf@jedbrown.org>
References: <875xlxgumq.fsf@jedbrown.org> <87a5b6lvb2.fsf@jedbrown.org>
Message-ID:

Thank you, that worked!

Best,
Anna

From: Jed Brown
Date: Friday, 31 January 2025 at 18:58
To: Anna Dalklint , Matthew Knepley
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] Visualizing higher order finite element output in ParaView

Anna Dalklint writes:
> I want to save e.g. the discretized displacement field obtained from a quasi-static non-linear finite element simulation using 10 node tetrahedral elements (i.e. which has edge dofs). As mentioned, I use PetscSection to add the additional dofs on edges. I have also written my own Newton solver, i.e. I do not use SNES. In conclusion, what I want is to be able to save the discretized displacement field in each outer iteration of the Newton loop (where I increase the pseudo-time, i.e. scaling of the load). I would then preferably be able to load a stack of these files (call them u001, u002, u003, ... for each 'load-step') and step in 'time' in ParaView.

Please use DMSetOutputSequenceNumber to record step number. You can either use one PetscViewer of type CGNS and call VecView in your loading loop or you can write a sequence of files by creating a new PetscViewer each time.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From Daniel.Abele at dlr.de Tue Feb 4 12:01:22 2025
From: Daniel.Abele at dlr.de (Daniel.Abele at dlr.de)
Date: Tue, 4 Feb 2025 18:01:22 +0000
Subject: Re: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm)
In-Reply-To: <6E5AF340-30B3-4C37-965A-26E0638BA64B@petsc.dev>
References: <6E5AF340-30B3-4C37-965A-26E0638BA64B@petsc.dev>
Message-ID:

Hi Barry,
thanks for the reply. We are not using any operator splitting method. Can I take from your answer that the initial residual norm is not recommended then? We are solving a diffusion equation with FD and implicit time stepping (mostly Euler, sometimes Crank-Nicolson).

Regards,
Daniel

From: Barry Smith
Sent: Monday, 3 February 2025 17:17
To: Abele, Daniel
Cc: petsc-users at mcs.anl.gov
Subject: Re: [petsc-users] KSP: when to use initial residual norm (ksp_converged_use_initial_residual_norm)

On Feb 2, 2025, at 1:09 PM, Daniel.Abele--- via petsc-users wrote:

Hi,
we are solving a time dependent problem with a single KSP in every time step. We are debating which convergence criterion to use. Is there general guidance around when to use one of the norms with initial residual (ksp_converged_use_initial_residual_norm or ksp_converged_use_min_initial_residual_norm) over the default norm? If I understand the formulas correctly, the initial residual norm "norm(b - A * x0)" (maybe add preconditioning) means that if you have a very good initial guess (as is often the case in time dependent problems if you can use the result if the last time step as initial guess), the norm is much stricter than the default norm "norm(b)".

The above statement is correct.

Is this meant as a way to control error accumulation over time? Or does it have some other purpose?
Thanks and Regards,
Daniel

For splitting-type methods, you want the error in the linear system solve, e, to be on the same order as the maximum error from the splitting, the explicit time-step discretization, and the implicit time-step discretization. So you need some estimate of that value. Now

    || e ||_2 < || B(b - Ax) ||_2 / \lambda_min(BA).

For this you need a handle on \lambda_min(BA), which you can obtain with Lanczos using -ksp_monitor_singular_values. So the convergence criterion should really depend on setting the -ksp_atol using the required bound on || e ||_2 and \lambda_min(BA). PETSc should provide this convergence test but no one has gotten around to adding it.

Barry

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From matteo.semplice at uninsubria.it Wed Feb 5 11:26:50 2025
From: matteo.semplice at uninsubria.it (Matteo Semplice)
Date: Wed, 5 Feb 2025 18:26:50 +0100
Subject: [petsc-users] DMUninterpolate of periodic mesh
Message-ID:

Dear all,
I have updated a code of mine to PETSc 3.22; when trying to uninterpolate a periodic mesh I get the following error

[0]PETSC ERROR: --------------------- Error Message --------------------------------------------------------------
[0]PETSC ERROR: No support for this operation for this object type
[0]PETSC ERROR: Missing local coordinate vector
[0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the program crashed before usage or a spelling mistake, etc!
[0]PETSC ERROR:   Option left: name:-dm_view (no value) source: command line
[0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!faxBLw4JxiiCgvgXrNR1BSgniTuJMn0UxlQ7HDbFsTvh1NfjdD1Od-Ac3koxw3kaxHCmCf6MByHS5ty49s-1W9xW81DQIhYr8dp4Zw$ for trouble shooting.
[0]PETSC ERROR: Petsc Release Version 3.22.0, unknown
[0]PETSC ERROR: ./testDM with 1 MPI process(es) and PETSC_ARCH on signalkuppe by matteo Wed Feb
5 18:21:24 2025 [0]PETSC ERROR: Configure options: --prefix=/home/matteo/software/petscsaved/3.22-opt/ PETSC_DIR=/home/matteo/software/petsc --PETSC_ARCH=opt --with-debugging=0 --COPTFLAGS="-O3 -march=native -mtune=native -mavx2" --CXXOPTFLAGS="-O3 -march=native -mtune=native -mavx2" --FOPTFLAGS="-O3 -march=native -mtune=native -mavx2" --with-strict-petscerrorcode --download-hdf5 --download-ml --with-metis --with-parmetis --with-gmsh --with-triangle --with-zlib --with-p4est-dir=~/software/p4est/local/ [0]PETSC ERROR: #1 DMLocalizeCoordinates() at /home/matteo/software/petsc/src/dm/interface/dmperiodicity.c:368 [0]PETSC ERROR: #2 DMPlexCopy_Internal() at /home/matteo/software/petsc/src/dm/impls/plex/plexcreate.c:37 [0]PETSC ERROR: #3 DMPlexUninterpolate() at /home/matteo/software/petsc/src/dm/impls/plex/plexinterpolate.c:1892 [0]PETSC ERROR: #4 main() at ../src/testDM.cpp:32 [0]PETSC ERROR: PETSc Option Table entries: [0]PETSC ERROR: -dm_view (source: command line) [0]PETSC ERROR: -dm_view_early (source: command line) [0]PETSC ERROR: ----------------End of Error Message -------send entire error message to petsc-maint at mcs.anl.gov---------- Basically I am creating a quad mesh on [0,1]x[0,1] with the command ? PetscCall( DMPlexCreateBoxMesh(MPI_COMM_WORLD, dim, simplex, faces, lower, upper, periodicity, interpolate, 0, PETSC_TRUE, &dmMesh) ); and then calling ? PetscCall( DMPlexUninterpolate(dmMesh, &dmMeshUnint) ); raises the error. This seems independent of the values that I pass for the new parameters localizationHeight and sparseLocalize. Do I need to change something else in addition to adding the new parameters in the DMPlexCreateBoxMesh call? Thanks in advance ??? Matteo From knepley at gmail.com Wed Feb 5 11:35:26 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 5 Feb 2025 12:35:26 -0500 Subject: [petsc-users] DMUninterpolate of periodic mesh In-Reply-To: References: Message-ID: On Wed, Feb 5, 2025 at 12:27?PM Matteo Semplice via petsc-users < petsc-users at mcs.anl.gov> wrote: > Dear all > > I have updated a code of mine to Petsc3.22, when trying to > uninterpolate a periodic mesh and get the following error > This looks like a bug. I will fix it. Note that you can get what you want by passing interpolate = PETSC_FALSE Thanks, Matt > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: No support for this operation for this object type > [0]PETSC ERROR: Missing local coordinate vector > [0]PETSC ERROR: WARNING! There are unused option(s) set! Could be the > program crashed before usage or a spelling mistake, etc! > [0]PETSC ERROR: Option left: name:-dm_view (no value) source: command > line > [0]PETSC ERROR: See > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!faxBLw4JxiiCgvgXrNR1BSgniTuJMn0UxlQ7HDbFsTvh1NfjdD1Od-Ac3koxw3kaxHCmCf6MByHS5ty49s-1W9xW81DQIhYr8dp4Zw$ > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.22.0, unknown > [0]PETSC ERROR: ./testDM with 1 MPI process(es) and PETSC_ARCH on > signalkuppe by matteo Wed Feb 5 18:21:24 2025 > [0]PETSC ERROR: Configure options: > --prefix=/home/matteo/software/petscsaved/3.22-opt/ > PETSC_DIR=/home/matteo/software/petsc --PETSC_ARCH=opt > --with-debugging=0 --COPTFLAGS="-O3 -march=native -mtune=native -mavx2" > --CXXOPTFLAGS="-O3 -march=native -mtune=native -mavx2" --FOPTFLAGS="-O3 > -march=native -mtune=native -mavx2" --with-strict-petscerrorcode > --download-hdf5 --download-ml --with-metis --with-parmetis --with-gmsh > --with-triangle --with-zlib --with-p4est-dir=~/software/p4est/local/ > [0]PETSC ERROR: #1 DMLocalizeCoordinates() at > /home/matteo/software/petsc/src/dm/interface/dmperiodicity.c:368 > [0]PETSC ERROR: #2 DMPlexCopy_Internal() at > /home/matteo/software/petsc/src/dm/impls/plex/plexcreate.c:37 > [0]PETSC ERROR: #3 DMPlexUninterpolate() at > /home/matteo/software/petsc/src/dm/impls/plex/plexinterpolate.c:1892 > [0]PETSC ERROR: #4 main() at ../src/testDM.cpp:32 > [0]PETSC ERROR: PETSc Option Table entries: > [0]PETSC ERROR: -dm_view (source: command line) > [0]PETSC ERROR: -dm_view_early (source: command line) > [0]PETSC ERROR: ----------------End of Error Message -------send entire > error message to petsc-maint at mcs.anl.gov---------- > > Basically I am creating a quad mesh on [0,1]x[0,1] with the command > > PetscCall( DMPlexCreateBoxMesh(MPI_COMM_WORLD, dim, simplex, faces, > lower, upper, periodicity, interpolate, 0, PETSC_TRUE, &dmMesh) ); > > and then calling > > PetscCall( DMPlexUninterpolate(dmMesh, &dmMeshUnint) ); > > raises the error. This seems independent of the values that I pass for > the new parameters localizationHeight and sparseLocalize. > > Do I need to change something else in addition to adding the new > parameters in the DMPlexCreateBoxMesh call? > > Thanks in advance > > Matteo > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YZMEed4K2IwyEQhBwJUypWI3aKs5Wmwawwo7BynU-0Vn_H-W4_qFcSC3h-JFkofKZsXlP63vBUj4owuXbY5T$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Fri Feb 7 04:05:56 2025 From: medane.tchakorom at univ-fcomte.fr (medane.tchakorom at univ-fcomte.fr) Date: Fri, 7 Feb 2025 11:05:56 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix Message-ID: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> Dear all, I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. 
---------------- PetscInt nlines = 8; // lines PetscInt ncols = 4; // columns PetscMPIInt rank; PetscMPIInt size; // Initialize PETSc PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); Mat R_full; Mat R_part; PetscInt idx_first_row = 0; PetscInt idx_one_plus_last_row = nlines / 2; PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); // Get sub matrix PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); // Add entries to sub matrix MatSetRandom(R_part, NULL); //View sub matrix PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); // Get array from sub matrix and print entries PetscScalar *buffer; PetscCall(MatDenseGetArray(R_part, &buffer)); PetscInt idx_end = (nlines/2) * ncols; for (int i = 0; i < idx_end; i++) { PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); } //Restore array to sub matrix PetscCall(MatDenseRestoreArray(R_part, &buffer)); // Restore sub matrix PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); // View the initial matrix PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); PetscCall(MatDestroy(&R_full)); PetscCall(PetscFinalize()); return 0; ---------------- Thanks Medane From pierre at joliv.et Fri Feb 7 04:34:36 2025 From: pierre at joliv.et (Pierre Jolivet) Date: Fri, 7 Feb 2025 11:34:36 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> Message-ID: > On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: > > > Dear all, > > I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. What is incoherent? Everything looks OK to me. 
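In case the confusion is about the raw array you get from MatDenseGetArray(): the sub-matrix returned by MatDenseGetSubMatrix() shares the storage of the parent matrix, so the array is column-major with the leading dimension of the parent (with your sizes, 8 local rows on one process), not the 4 rows of the sub-matrix. A small, untested sketch that walks only the sub-matrix entries (same variable names as in your code):

  PetscInt           m, n, lda, i, j;
  const PetscScalar *a;

  PetscCall(MatGetLocalSize(R_part, &m, &n));
  PetscCall(MatDenseGetLDA(R_part, &lda)); /* lda = 8 here, the parent's local row count */
  PetscCall(MatDenseGetArrayRead(R_part, &a));
  for (j = 0; j < n; j++)
    for (i = 0; i < m; i++)
      PetscCall(PetscPrintf(PETSC_COMM_SELF, "R_part(%" PetscInt_FMT ",%" PetscInt_FMT ") = %g\n", i, j, (double)PetscRealPart(a[i + j * lda])));
  PetscCall(MatDenseRestoreArrayRead(R_part, &a));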
Thanks, Pierre > ---------------- > > PetscInt nlines = 8; // lines > PetscInt ncols = 4; // columns > PetscMPIInt rank; > PetscMPIInt size; > > // Initialize PETSc > PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); > PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); > PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); > > Mat R_full; > Mat R_part; > PetscInt idx_first_row = 0; > PetscInt idx_one_plus_last_row = nlines / 2; > PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); > > // Get sub matrix > PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); > // Add entries to sub matrix > MatSetRandom(R_part, NULL); > //View sub matrix > PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); > > // Get array from sub matrix and print entries > PetscScalar *buffer; > PetscCall(MatDenseGetArray(R_part, &buffer)); > PetscInt idx_end = (nlines/2) * ncols; > > for (int i = 0; i < idx_end; i++) > { > PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); > } > > //Restore array to sub matrix > PetscCall(MatDenseRestoreArray(R_part, &buffer)); > // Restore sub matrix > PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); > // View the initial matrix > PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); > > PetscCall(MatDestroy(&R_full)); > > PetscCall(PetscFinalize()); > return 0; > > ---------------- > > > Thanks > Medane From medane.tchakorom at univ-fcomte.fr Fri Feb 7 04:49:38 2025 From: medane.tchakorom at univ-fcomte.fr (medane.tchakorom at univ-fcomte.fr) Date: Fri, 7 Feb 2025 11:49:38 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> Message-ID: <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Re: Please find below the output from the previous code, running on only one processor. 
Mat Object: 1 MPI process type: seqdense 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 buffer[0] = 7.200320e-01 buffer[1] = 6.179397e-02 buffer[2] = 1.002234e-02 buffer[3] = 1.446393e-01 buffer[4] = 0.000000e+00 buffer[5] = 0.000000e+00 buffer[6] = 0.000000e+00 buffer[7] = 0.000000e+00 buffer[8] = 3.977778e-01 buffer[9] = 7.303659e-02 buffer[10] = 1.038663e-01 buffer[11] = 2.507804e-01 buffer[12] = 0.000000e+00 buffer[13] = 0.000000e+00 buffer[14] = 0.000000e+00 buffer[15] = 0.000000e+00 Mat Object: 1 MPI process type: seqdense 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 I was expecting to get in ?buffer?, only the data entries from R_part. Please, let me know if this is the excepted behavior and I?am missing something. Thanks, Medane > On 7 Feb 2025, at 11:34, Pierre Jolivet wrote: > > > >> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: >> >> >> Dear all, >> >> I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. > > What is incoherent? > Everything looks OK to me. 
> > Thanks, > Pierre > >> ---------------- >> >> PetscInt nlines = 8; // lines >> PetscInt ncols = 4; // columns >> PetscMPIInt rank; >> PetscMPIInt size; >> >> // Initialize PETSc >> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); >> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); >> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); >> >> Mat R_full; >> Mat R_part; >> PetscInt idx_first_row = 0; >> PetscInt idx_one_plus_last_row = nlines / 2; >> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); >> >> // Get sub matrix >> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); >> // Add entries to sub matrix >> MatSetRandom(R_part, NULL); >> //View sub matrix >> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); >> >> // Get array from sub matrix and print entries >> PetscScalar *buffer; >> PetscCall(MatDenseGetArray(R_part, &buffer)); >> PetscInt idx_end = (nlines/2) * ncols; >> >> for (int i = 0; i < idx_end; i++) >> { >> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); >> } >> >> //Restore array to sub matrix >> PetscCall(MatDenseRestoreArray(R_part, &buffer)); >> // Restore sub matrix >> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); >> // View the initial matrix >> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); >> >> PetscCall(MatDestroy(&R_full)); >> >> PetscCall(PetscFinalize()); >> return 0; >> >> ---------------- >> >> >> Thanks >> Medane > > From jroman at dsic.upv.es Fri Feb 7 05:15:33 2025 From: jroman at dsic.upv.es (Jose E. Roman) Date: Fri, 7 Feb 2025 11:15:33 +0000 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Message-ID: This is expected. For dense matrices, MatDenseGetSubMatrix() does not duplicate the memory. You should interpret the array as a two-dimensional column-major array: buffer[i+j*lda] where i,j are row and column indices, and lda can be obtained with MatDenseGetLDA(). Jose > El 7 feb 2025, a las 11:49, medane.tchakorom at univ-fcomte.fr escribi?: > > Re: > Please find below the output from the previous code, running on only one processor. 
> > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 > > buffer[0] = 7.200320e-01 > buffer[1] = 6.179397e-02 > buffer[2] = 1.002234e-02 > buffer[3] = 1.446393e-01 > buffer[4] = 0.000000e+00 > buffer[5] = 0.000000e+00 > buffer[6] = 0.000000e+00 > buffer[7] = 0.000000e+00 > buffer[8] = 3.977778e-01 > buffer[9] = 7.303659e-02 > buffer[10] = 1.038663e-01 > buffer[11] = 2.507804e-01 > buffer[12] = 0.000000e+00 > buffer[13] = 0.000000e+00 > buffer[14] = 0.000000e+00 > buffer[15] = 0.000000e+00 > > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > > I was expecting to get in ?buffer?, only the data entries from R_part. Please, let me know if this is the excepted behavior and I?am missing something. > > Thanks, > Medane > > > >> On 7 Feb 2025, at 11:34, Pierre Jolivet wrote: >> >> >> >>> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: >>> >>> >>> Dear all, >>> >>> I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. >> >> What is incoherent? >> Everything looks OK to me. 
>> >> Thanks, >> Pierre >> >>> ---------------- >>> >>> PetscInt nlines = 8; // lines >>> PetscInt ncols = 4; // columns >>> PetscMPIInt rank; >>> PetscMPIInt size; >>> >>> // Initialize PETSc >>> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); >>> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); >>> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); >>> >>> Mat R_full; >>> Mat R_part; >>> PetscInt idx_first_row = 0; >>> PetscInt idx_one_plus_last_row = nlines / 2; >>> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); >>> >>> // Get sub matrix >>> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); >>> // Add entries to sub matrix >>> MatSetRandom(R_part, NULL); >>> //View sub matrix >>> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); >>> >>> // Get array from sub matrix and print entries >>> PetscScalar *buffer; >>> PetscCall(MatDenseGetArray(R_part, &buffer)); >>> PetscInt idx_end = (nlines/2) * ncols; >>> >>> for (int i = 0; i < idx_end; i++) >>> { >>> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); >>> } >>> >>> //Restore array to sub matrix >>> PetscCall(MatDenseRestoreArray(R_part, &buffer)); >>> // Restore sub matrix >>> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); >>> // View the initial matrix >>> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); >>> >>> PetscCall(MatDestroy(&R_full)); >>> >>> PetscCall(PetscFinalize()); >>> return 0; >>> >>> ---------------- >>> >>> >>> Thanks >>> Medane >> >> > From knepley at gmail.com Fri Feb 7 08:22:58 2025 From: knepley at gmail.com (Matthew Knepley) Date: Fri, 7 Feb 2025 09:22:58 -0500 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Message-ID: On Fri, Feb 7, 2025 at 8:20?AM medane.tchakorom at univ-fcomte.fr < medane.tchakorom at univ-fcomte.fr> wrote: > Re: > Please find below the output from the previous code, running on only one > processor. 
> > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 > 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 > 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 > 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 > 9.8905332488367748e-01 > > buffer[0] = 7.200320e-01 > buffer[1] = 6.179397e-02 > buffer[2] = 1.002234e-02 > buffer[3] = 1.446393e-01 > buffer[4] = 0.000000e+00 > buffer[5] = 0.000000e+00 > buffer[6] = 0.000000e+00 > buffer[7] = 0.000000e+00 > buffer[8] = 3.977778e-01 > buffer[9] = 7.303659e-02 > buffer[10] = 1.038663e-01 > buffer[11] = 2.507804e-01 > buffer[12] = 0.000000e+00 > buffer[13] = 0.000000e+00 > buffer[14] = 0.000000e+00 > buffer[15] = 0.000000e+00 > > Mat Object: 1 MPI process > type: seqdense > 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 > 1.4405427480322786e-01 > 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 > 9.9650445216117589e-01 > 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 > 1.0677308875937896e-01 > 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 > 9.8905332488367748e-01 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 > > > I was expecting to get in ?buffer?, only the data entries from R_part. > Please, let me know if this is the excepted behavior and I?am missing > something. > As Jose already pointed out, SubMatrix() does not copy. It gives you a Mat front end to the same data, but with changed sizes. In this case, the LDA is 4, not 2, so when you iterate over the values, you skip over the ones you don't want. Thanks, Matt > Thanks, > Medane > > > > > On 7 Feb 2025, at 11:34, Pierre Jolivet wrote: > > > > > > > >> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: > >> > >> > >> Dear all, > >> > >> I have been experiencing incoherent data entries from this code below, > when printing the array. Maybe I?am doing something wrong. > > > > What is incoherent? > > Everything looks OK to me. 
> > > > Thanks, > > Pierre > > > >> ---------------- > >> > >> PetscInt nlines = 8; // lines > >> PetscInt ncols = 4; // columns > >> PetscMPIInt rank; > >> PetscMPIInt size; > >> > >> // Initialize PETSc > >> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); > >> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); > >> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); > >> > >> Mat R_full; > >> Mat R_part; > >> PetscInt idx_first_row = 0; > >> PetscInt idx_one_plus_last_row = nlines / 2; > >> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, > PETSC_DECIDE, nlines, ncols, NULL, &R_full)); > >> > >> // Get sub matrix > >> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, > idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); > >> // Add entries to sub matrix > >> MatSetRandom(R_part, NULL); > >> //View sub matrix > >> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); > >> > >> // Get array from sub matrix and print entries > >> PetscScalar *buffer; > >> PetscCall(MatDenseGetArray(R_part, &buffer)); > >> PetscInt idx_end = (nlines/2) * ncols; > >> > >> for (int i = 0; i < idx_end; i++) > >> { > >> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); > >> } > >> > >> //Restore array to sub matrix > >> PetscCall(MatDenseRestoreArray(R_part, &buffer)); > >> // Restore sub matrix > >> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); > >> // View the initial matrix > >> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); > >> > >> PetscCall(MatDestroy(&R_full)); > >> > >> PetscCall(PetscFinalize()); > >> return 0; > >> > >> ---------------- > >> > >> > >> Thanks > >> Medane > > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!c2IunCO2oQ3I91jAU5GYm2XQbPfgQfcl0n_uf1fsjnqd7gGNf1YDMYee5YkTRcQAfGtUxSZxDS4kWs9Rs1qa$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From medane.tchakorom at univ-fcomte.fr Sat Feb 8 03:46:01 2025 From: medane.tchakorom at univ-fcomte.fr (medane.tchakorom at univ-fcomte.fr) Date: Sat, 8 Feb 2025 10:46:01 +0100 Subject: [petsc-users] Incoherent data entries in array from a dense sub matrix In-Reply-To: References: <50ED5DCF-A092-4C35-8E3D-F018A96AD56E@univ-fcomte.fr> <95DA5C68-39E1-4BDF-8DDD-120B2DB0E1DC@univ-fcomte.fr> Message-ID: <38CA73F0-1204-4909-8778-2C59BA41707B@univ-fcomte.fr> Dear petsc team, Thank you for all your answers. I really appreciate. Best regards, Medane > On 7 Feb 2025, at 15:22, Matthew Knepley wrote: > > On Fri, Feb 7, 2025 at 8:20?AM medane.tchakorom at univ-fcomte.fr > wrote: >> Re: >> Please find below the output from the previous code, running on only one processor. 
>> >> Mat Object: 1 MPI process >> type: seqdense >> 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 >> 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 >> 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 >> 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 >> >> buffer[0] = 7.200320e-01 >> buffer[1] = 6.179397e-02 >> buffer[2] = 1.002234e-02 >> buffer[3] = 1.446393e-01 >> buffer[4] = 0.000000e+00 >> buffer[5] = 0.000000e+00 >> buffer[6] = 0.000000e+00 >> buffer[7] = 0.000000e+00 >> buffer[8] = 3.977778e-01 >> buffer[9] = 7.303659e-02 >> buffer[10] = 1.038663e-01 >> buffer[11] = 2.507804e-01 >> buffer[12] = 0.000000e+00 >> buffer[13] = 0.000000e+00 >> buffer[14] = 0.000000e+00 >> buffer[15] = 0.000000e+00 >> >> Mat Object: 1 MPI process >> type: seqdense >> 7.2003197397953400e-01 3.9777780919128602e-01 9.8405227390177075e-01 1.4405427480322786e-01 >> 6.1793966542126100e-02 7.3036588248200474e-02 7.3851607000756303e-01 9.9650445216117589e-01 >> 1.0022337819588500e-02 1.0386628927366459e-01 4.0114727059134836e-01 1.0677308875937896e-01 >> 1.4463931936456476e-01 2.5078039364333193e-01 5.2764865382548720e-01 9.8905332488367748e-01 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 >> >> >> I was expecting to get in ?buffer?, only the data entries from R_part. Please, let me know if this is the excepted behavior and I?am missing something. > > As Jose already pointed out, SubMatrix() does not copy. It gives you a Mat front end to the same data, but with changed sizes. In this case, the LDA is 4, not 2, so when you iterate over the values, you skip over the ones you don't want. > > Thanks, > > Matt > >> Thanks, >> Medane >> >> >> >> > On 7 Feb 2025, at 11:34, Pierre Jolivet > wrote: >> > >> > >> > >> >> On 7 Feb 2025, at 11:05?AM, medane.tchakorom at univ-fcomte.fr wrote: >> >> >> >> >> >> Dear all, >> >> >> >> I have been experiencing incoherent data entries from this code below, when printing the array. Maybe I?am doing something wrong. >> > >> > What is incoherent? >> > Everything looks OK to me. 
>> > >> > Thanks, >> > Pierre >> > >> >> ---------------- >> >> >> >> PetscInt nlines = 8; // lines >> >> PetscInt ncols = 4; // columns >> >> PetscMPIInt rank; >> >> PetscMPIInt size; >> >> >> >> // Initialize PETSc >> >> PetscCall(PetscInitialize(&argc, &args, NULL, NULL)); >> >> PetscCallMPI(MPI_Comm_rank(MPI_COMM_WORLD, &rank)); >> >> PetscCallMPI(MPI_Comm_size(MPI_COMM_WORLD, &size)); >> >> >> >> Mat R_full; >> >> Mat R_part; >> >> PetscInt idx_first_row = 0; >> >> PetscInt idx_one_plus_last_row = nlines / 2; >> >> PetscCall(MatCreateDense(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, nlines, ncols, NULL, &R_full)); >> >> >> >> // Get sub matrix >> >> PetscCall(MatDenseGetSubMatrix(R_full, idx_first_row, idx_one_plus_last_row, PETSC_DECIDE, PETSC_DECIDE, &R_part)); >> >> // Add entries to sub matrix >> >> MatSetRandom(R_part, NULL); >> >> //View sub matrix >> >> PetscCall(MatView(R_part, PETSC_VIEWER_STDOUT_WORLD)); >> >> >> >> // Get array from sub matrix and print entries >> >> PetscScalar *buffer; >> >> PetscCall(MatDenseGetArray(R_part, &buffer)); >> >> PetscInt idx_end = (nlines/2) * ncols; >> >> >> >> for (int i = 0; i < idx_end; i++) >> >> { >> >> PetscPrintf(PETSC_COMM_SELF, "buffer[%d] = %e \n", i, buffer[i]); >> >> } >> >> >> >> //Restore array to sub matrix >> >> PetscCall(MatDenseRestoreArray(R_part, &buffer)); >> >> // Restore sub matrix >> >> PetscCall(MatDenseRestoreSubMatrix(R_full, &R_part)); >> >> // View the initial matrix >> >> PetscCall(MatView(R_full, PETSC_VIEWER_STDOUT_WORLD)); >> >> >> >> PetscCall(MatDestroy(&R_full)); >> >> >> >> PetscCall(PetscFinalize()); >> >> return 0; >> >> >> >> ---------------- >> >> >> >> >> >> Thanks >> >> Medane >> > >> > >> > > > > -- > What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aW67hycVSXmRHgmoXNGV0fVjZ4HM7XloTvLw0b1d9peGDnJGYOm6nKgJPy53qErREjKHhwJybk0bAXiSQY7f9RreQX1Aik1VZBB7CIDA$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From antonio.ghidoni at unibs.it Thu Feb 13 07:15:51 2025 From: antonio.ghidoni at unibs.it (ANTONIO GHIDONI) Date: Thu, 13 Feb 2025 14:15:51 +0100 Subject: [petsc-users] Problem with VecGhostUpdateBegin Message-ID: <476488F7-FC0B-4AC4-9CA4-C53877C89AD5@unibs.it> Hello, I am using Petsc 3.30.2. When I trie to update a ghost vector, I obtain the following error: [0]PETSC ERROR: --------------------- Error Message -------------------------------------------------------------- [0]PETSC ERROR: Object is in wrong state [0]PETSC ERROR: Outstanding operation has not been completed [0]PETSC ERROR: See https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MwayenCuQ$ for trouble shooting. 
[0]PETSC ERROR: Petsc Release Version 3.20.2, Nov 30, 2023 [0]PETSC ERROR: ./main2d.out on a linux-intel named node1 by cfdlab Thu Feb 13 14:08:53 2025 [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ --with-fc=gfortran --with-debugging=0 --with-pic COPTFLAGS=-O3 CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 F90OPTFLAGS=-O3 --download-fblaslapack --download-mpich [0]PETSC ERROR: #1 PetscSFReset_Basic() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/impls/basic/sfbasic.c:93 [0]PETSC ERROR: #2 PetscSFReset() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:119 [0]PETSC ERROR: #3 PetscSFDestroy() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:237 [0]PETSC ERROR: #4 VecScatterDestroy() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/vscat.c:483 [0]PETSC ERROR: #5 VecDestroy_MPI() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/impls/mpi/pdvec.c:38 [0]PETSC ERROR: #6 VecDestroy() at /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/interface/vector.c:579 It?s a strange behavior because with other vectors the same routine works properly. My routine is as follows: INTEGER(4) :: nb, nt(:), jtg(nb), nblk, nlfr INTEGER(4) :: fr(nlfr+nb*nblk) Vec iv_fr INTEGER(4), ALLOCATABLE :: igh(:) INTEGER(4) :: ierr ALLOCATE (igh(nt(1))) DO it = 1, nt(1) igh(it) = jtg(it) -1 ENDDO CALL VecCreateGhostBlockWithArray (PETSC_COMM_WORLD,nblk,nlfr, & PETSC_DECIDE,nt(1),igh,fr,iv_fr,ierr) CALL VecGhostUpdateBegin(iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) CALL VecGhostUpdateEnd (iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) CALL VecDestroy (iv_fr,ierr) DEALLOCATE (igh) Any suggestion about this strange error? Antonio -- Informativa sulla Privacy:?https://urldefense.us/v3/__https://www.unibs.it/it/node/1452__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MzDu5UMRg$ From knepley at gmail.com Thu Feb 13 09:33:31 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 13 Feb 2025 10:33:31 -0500 Subject: [petsc-users] Problem with VecGhostUpdateBegin In-Reply-To: <476488F7-FC0B-4AC4-9CA4-C53877C89AD5@unibs.it> References: <476488F7-FC0B-4AC4-9CA4-C53877C89AD5@unibs.it> Message-ID: On Thu, Feb 13, 2025 at 10:27?AM ANTONIO GHIDONI via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hello, > I am using Petsc 3.30.2. When I trie to update a ghost vector, I obtain > the following error: > > [0]PETSC ERROR: --------------------- Error Message > -------------------------------------------------------------- > [0]PETSC ERROR: Object is in wrong state > [0]PETSC ERROR: Outstanding operation has not been completed > [0]PETSC ERROR: See > https://urldefense.us/v3/__https://petsc.org/release/faq/__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MwayenCuQ$ > for trouble shooting. 
> [0]PETSC ERROR: Petsc Release Version 3.20.2, Nov 30, 2023 > [0]PETSC ERROR: ./main2d.out on a linux-intel named node1 by cfdlab Thu > Feb 13 14:08:53 2025 > [0]PETSC ERROR: Configure options --with-cc=gcc --with-cxx=g++ > --with-fc=gfortran --with-debugging=0 --with-pic COPTFLAGS=-O3 > CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 F90OPTFLAGS=-O3 --download-fblaslapack > --download-mpich > [0]PETSC ERROR: #1 PetscSFReset_Basic() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/impls/basic/sfbasic.c:93 > [0]PETSC ERROR: #2 PetscSFReset() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:119 > [0]PETSC ERROR: #3 PetscSFDestroy() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/sf.c:237 > [0]PETSC ERROR: #4 VecScatterDestroy() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/is/sf/interface/vscat.c:483 > [0]PETSC ERROR: #5 VecDestroy_MPI() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/impls/mpi/pdvec.c:38 > [0]PETSC ERROR: #6 VecDestroy() at > /home/cfdlab/Lib/petsc-3.20.2/src/vec/vec/interface/vector.c:579 > > > It?s a strange behavior because with other vectors the same routine works > properly. My routine is as follows: > > INTEGER(4) :: nb, nt(:), jtg(nb), nblk, nlfr > INTEGER(4) :: fr(nlfr+nb*nblk) > > Vec iv_fr > > INTEGER(4), ALLOCATABLE :: igh(:) > INTEGER(4) :: ierr > > > ALLOCATE (igh(nt(1))) > > DO it = 1, nt(1) > igh(it) = jtg(it) -1 > ENDDO > > CALL VecCreateGhostBlockWithArray (PETSC_COMM_WORLD,nblk,nlfr, & > PETSC_DECIDE,nt(1),igh,fr,iv_fr,ierr) > CALL VecGhostUpdateBegin(iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) > CALL VecGhostUpdateEnd (iv_fr,INSERT_VALUES,SCATTER_FORWARD,ierr) > > CALL VecDestroy (iv_fr,ierr) > > DEALLOCATE (igh) > > Any suggestion about this strange error? > You have a Begin() somewhere without an End(). It is hard to say anything else without the code. Thanks, Matt > Antonio > > > -- > > > > > Informativa sulla Privacy: > https://urldefense.us/v3/__https://www.unibs.it/it/node/1452__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MzDu5UMRg$ > > < > https://urldefense.us/v3/__https://www.unibs.it/it/node/1452__;!!G_uCfscf7eWS!e7HV8lVYESfvZo54Jin3YQ8o42CMuDL5AoF-o58a35sCLqd_0HhPSZMkoF20Zd0goGvAFFbbapxUFblcU12EitvG2MzDu5UMRg$ > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!aHE-77So2SAK4bBUqZtJoEI5p6AYoobJOERwt-ejOAIW_O9tKt10N3EzIsYw3bc_PTiu6nip8M2siVtrPfIz$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From mfadams at lbl.gov Fri Feb 14 07:28:07 2025 From: mfadams at lbl.gov (Mark Adams) Date: Fri, 14 Feb 2025 08:28:07 -0500 Subject: [petsc-users] DOE STTR partner opportunity In-Reply-To: <1340996571.133008.1739538340313@mail.yahoo.com> References: <1544782710.24203.1736797499791.ref@mail.yahoo.com> <1544782710.24203.1736797499791@mail.yahoo.com> <858303897.206391.1736863756977@mail.yahoo.com> <1654280882.6193.1736889607440@mail.yahoo.com> <1661786321.7226111.1738901906542@mail.yahoo.com> <1789024741.10078199.1739394891478@mail.yahoo.com> <1340996571.133008.1739538340313@mail.yahoo.com> Message-ID: cc'ing Rich and petsc-users. * I tried putting "petsc example stiffness matrix" into ChatGPT and it actually looked fine, 1D Laplacian C code with instructions to build and run it. 
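Roughly the kind of program that prompt produces; a minimal, untested sketch (not the actual generated code) that assembles the 1D Laplacian stiffness matrix and solves A x = b with KSP, with the size and right-hand side chosen arbitrarily for illustration:

#include <petscksp.h>

int main(int argc, char **argv)
{
  Mat      A;
  Vec      x, b;
  KSP      ksp;
  PetscInt i, n = 100, Istart, Iend;

  PetscCall(PetscInitialize(&argc, &argv, NULL, NULL));
  /* tridiagonal stiffness matrix of the 1D Laplacian (Dirichlet ends) */
  PetscCall(MatCreateAIJ(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, n, n, 3, NULL, 1, NULL, &A));
  PetscCall(MatGetOwnershipRange(A, &Istart, &Iend));
  for (i = Istart; i < Iend; i++) {
    PetscInt    col[3];
    PetscScalar v[3];
    PetscInt    nc = 0;
    if (i > 0)     { col[nc] = i - 1; v[nc] = -1.0; nc++; }
    col[nc] = i; v[nc] = 2.0; nc++;
    if (i < n - 1) { col[nc] = i + 1; v[nc] = -1.0; nc++; }
    PetscCall(MatSetValues(A, 1, &i, nc, col, v, INSERT_VALUES));
  }
  PetscCall(MatAssemblyBegin(A, MAT_FINAL_ASSEMBLY));
  PetscCall(MatAssemblyEnd(A, MAT_FINAL_ASSEMBLY));

  PetscCall(MatCreateVecs(A, &x, &b));
  PetscCall(VecSet(b, 1.0)); /* unit load at every node */
  PetscCall(KSPCreate(PETSC_COMM_WORLD, &ksp));
  PetscCall(KSPSetOperators(ksp, A, A));
  PetscCall(KSPSetFromOptions(ksp)); /* pick solver/preconditioner at run time */
  PetscCall(KSPSolve(ksp, b, x));

  PetscCall(KSPDestroy(&ksp));
  PetscCall(VecDestroy(&x));
  PetscCall(VecDestroy(&b));
  PetscCall(MatDestroy(&A));
  PetscCall(PetscFinalize());
  return 0;
}

The real FEA examples in the tutorials below go much further (unstructured meshes, PetscFE, boundary conditions), but the Mat/Vec/KSP pattern is the same.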
* But we have many tutorials that do this and you can browse them to find one that looks best for your interests at https://urldefense.us/v3/__https://petsc.org/release/tutorials/__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V4GdZtUls$ Good luck, Mark On Fri, Feb 14, 2025 at 8:05?AM Debiprasad Panda wrote: > Mark, > > Can you send me any link for FEA example using PETSC which will generate > the stiffness matrix so that I can play with you for porting into our FPGA. > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately > notify dpanda68 at yahoo.com . Unintended transmission > shall not constitute waiver of any privilege.* > > > On Wednesday, February 12, 2025 at 03:14:51 PM CST, Debiprasad Panda < > dpanda68 at yahoo.com> wrote: > > > Richard, > > Thanks for the detailed email. It certainly explains the limitation at > this time. Your email also very clearly explains about what kind of > collaboration you extend to the partnering company. I had in my mind that > neither you or Mark or Todd will write the code for us. I thought there > might be junior scientists/programmers who works in your team will do the > bulk of the work under your supervision. Now I understand that you would > like to participate only in specific issues which is not readily available > or needs further development in PETSc. As you say some issues may arise > while using it and if so, you would like to participate in resolving such > issues either in a future DOE proposal or through a self-generated project > by PETSc community. Correct me if my understanding is not right. We did > work with university and consultants as sub-contractor in the past from > this organization, but not directly with research labs. Your email > certainly provides some guideline on the process and timeline. > > As you know we did implement a complete FEA analysis in FPGA and the speed > up is significant. However, that was partly hardcoded. Thats why looking > for an interface which is already tested and just need to be streamlined > with our workflow. I thought that having a complete example of our interest > in PETSC and implementing the same by part/full in FPGA will give us a good > handle to continue development in that direction. As I mentioned we can do > that task ourselves - we do have people who used the same workflow as I > provided in my email, but it was for a different application. The main > problem for small business like us is lack of funding. An SBIR/STTR funding > will be very helpful ton accomplish this ground research on FPGA PETSC > interface. > > I know time is short and certainly this transition time is making things > more complicated. > > Let's plan for the next round and I believe the solicitation will be out > in first week of June and the submission of the final proposal will be in > October 2025. I will contact you in June. > > > Anyway, we will proceed with our proposal with another partner this time. 
> > In the mean it will be helpful if Adam or any of you can send a link to > any existing FEA example so that we can play with it. > > Thanks again for all your time and email discussion. > > > > Regards. > > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately > notify dpanda68 at yahoo.com . Unintended transmission > shall not constitute waiver of any privilege.* > > > On Wednesday, February 12, 2025 at 02:07:03 PM CST, Mills, Richard Tran < > rtmills at anl.gov> wrote: > > > Hi Debiprasad, > > I apologize for being slow to get back to you; I am severely > over-committed at the moment, and keeping up with email (among other > things) has been extremely difficult. > > I am sorry to have to disappoint you, but I do not think that it will be > possible for me, Todd, or Mark to partner with you on the STTR call this > time around. Let me try to explain the two major reasons why. > > First: For staff at the DOE National Laboratories, it is very > time-consuming to get approvals to participate as a subcontractor for > something like an SBIR or STTR project. I receive a small amount of funding > from an SBIR project right now, and it literally took weeks to get that > proposal through all of the required approvals, including an "letter of > commitment" signed by the the Laboratory's Director of Sponsored Research, > as well as approval from our Contracting Officer at the DOE Site Office in > Chicago. There are many steps of the review to ensure that proposed work is > consistent with the DOE and Argonne missions, that it does not adversely > impact DOE work at Argonne, and that it is not in direct competition with > the private sector. The laboratory's guidance on this approval process > state that we should allow a minimum of 15 business days for this process, > but, with the current upheaval due to the transition to the new > Administration, I suspect that more than 15 business days would be > required. I also note that a reasonably close to complete draft of the > proposal is required to be submitted at the beginning of the approval > process, so you need to factor in time to develop the proposal ahead of the > approval window if you want to respond to a future STTR or SBIR call and > partner with a DOE Laboratory. > > Second: The breakdown of work that you are proposing isn't really aligned > what with laboratory research scientists like Todd, Mark, and I are > expected to do. We are primarily researchers, and our output is judged > similarly to that of a professor at an R1 university, except that we have > no teaching load and engage in some programmatic work. What you have > proposed is having us develop a complete finite-element analysis code to > some specification you provide, which we will then hand to you (before you > implement part or all of it using FPGAs). For this sort of arrangement, it > sounds like what you are looking for is scientific programmers who work on > contract. That is not the role that we play. 
We do research on > computational mathematics and its applications, and we develop software to > aid this research and to enable the broader computing community to benefit > from our research and perhaps collaborate with us on further developments. > This has led to a widely-used piece of software, PETSc, which provides > useful computational building blocks that many teams have used to build > finite-element analysis applications, but when teams have used PETSc for > such work and have teamed with us, it has very much been in a collaborative > research relationship: others are doing much of the development of their > FEM code, but we help them because, say, they are modeling systems with > very difficult nonlinearities, discontinuous jumps in material > coefficients, strangely stretched elements, etc., that cause problems for > simple algebraic solvers, so we collaborate with them on developing new > solver techniques that are amenable to their problems. > > It may make sense for you to partner with us or other members of the PETSc > team in the future, but I think you need to take some time to lay more of > the groundwork before a future funding call. You can experiment with > porting a PETSc-based FEM code using your FPGA approach without needing > anything from us right now: There are numerous finite-element example codes > provided with PETSc (Mark has written a few of them, and might be able to > recommend some good ones to start with). You could start by playing with > these examples and then try porting bits of them to FPGAs. As I said in an > earlier message, based on my limited experience with FPGAs, I suspect that > you will run into several technical challenges. When you have had a chance > to identify these challenges, then it might make sense to come back to the > PETSc team to describe some of them ? you can start by emailing petsc-maint > or petsc-users about this ? and perhaps eventually develop a proposal that > aims to address them in collaboration with the team. > > Apologies if I have had to disappoint you, and best of luck. Perhaps later > there will be good opportunities to partner with us in the future. I > encourage you to experiment some with PETSc to determine whether it is the > right software toolkit to use for your FPGA-targeted applications, and to > not be shy about asking on the PETSc user lists as you uncover issues as > you experiment. > > Best regards, > Richard > > ------------------------------ > *From:* Debiprasad Panda > *Sent:* Thursday, February 6, 2025 8:18 PM > *To:* Mills, Richard Tran > *Cc:* Mark Adams ; Munson, Todd > *Subject:* Re: DOE STTR partner opportunity > > This Message Is From an External Sender > This message came from outside your organization. > > Dear All, > > Amidst all these organizational and administrative changes, I have good > news to share that our LOI has been accepted by DOE and the final proposal > submission is due on 26th February 2025. The proposal is about an FEA > thermal analysis using PETSc and porting it to FPGA for its real time > simulation. > > Given a mechanical drawing of an object, in PETSC a mesh will be generated > and then a thermal problem will be formulated using FEA theory and boundary > condition to generate a global stiffness matrix in the form of Ax =B, which > will be eventually solved using linear or non-linear solver. In Phase I, we > will concentrate only on linear system and only the solver part will be > implemented in FPGA to demonstrate the real time operation in part. 
In > Phase II, the entire FEA problem formulation with non-linearity as well as > solver will be implemented in FPGA to have a complete real time solution. > > We went through PETSc libraries and one of our team members has used it > extensively during his PhD. The steps we would like to follow to formulate > a FEA problem, and its solution is described in the attached document. > > We would like you to partnering with us in this DOE project and your > responsibility will be to create this FEA thermal model in PETSc following > the steps in the given document and then run it in a PC/server and > collect the result. We will take the responsibility of implementing the > same in our FPGA solver. > > I was thinking to write this email for some time but kept on hold till the > formal acceptance of LOI in order to justify your time. > > Please go through the attached document and then let's follow up with a > zoom call sometime early next week per your convenience for discussing it > for any question you may have. > > Please acknowledge receiving this email so that I know our communication > is going through. > > I will look forward to collaborating with you. > > Regards. > > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > **. Unintended transmission shall not constitute > waiver of any privilege.* > > > On Tuesday, January 14, 2025 at 03:20:07 PM CST, Debiprasad Panda < > dpanda68 at yahoo.com> wrote: > > > Richard, Mark, Todd > > I am submitting the LOI without ANL at this time. It seems we can include > ANL as STTR partner while submitting the full proposal if things look good > from both sides. So, we may have about six weeks from now to understand the > project. Let's discuss it over a zoom call sometime this week. > > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > **. Unintended transmission shall not constitute > waiver of any privilege.* > > > On Tuesday, January 14, 2025 at 12:50:53 PM CST, Mills, Richard Tran < > rtmills at anl.gov> wrote: > > > Hi Debiprasad, > > Apologies for the delay in my reply; the past few days have been > especially busy ones due to some internal proposal deadlines I had to rush > to meet, on top of several other things. 
> > Your project sounds interesting, but, unfortunately, I don't think that > there is time before your LOI is due for me to understand your application, > discuss whether PETSc is appropriate for it, or how you would map any > implementation using PETSc to FPGA hardware. PETSc is an extremely > complicated piece of software and a lot of effort is required from > algorithm selection and parallel problem decomposition on down to details > of individual microkernels when bringing it to and optimizing it for new > kinds of computing architectures. (I spent roughly six years working with > several others on getting solid GPU support in PETSc, for instance.) I have > a little bit of familiarity with FPGAs from my time at ORNL and Intel, and > I think that enabling PETSc to make efficient use of FPGAs is going to be a > highly non-trivial (though interesting!) project. Are you familiar at all > with PETSc, and do you have a particular reason that you think it would be > helpful to your work? You might be better served by using a different piece > of software as a starting point, if you do not need things like the > distributed memory-parallel implementations or the advanced, composable > solvers and preconditioners. If you do have a particular need for things > that PETSc provides, perhaps I or others from the PETSc team could discuss > this with you with future opportunities in mind. Best of luck to you if you > do submit an STTR proposal this time. > > Sincerely, > Richard > > ------------------------------ > *From:* Debiprasad Panda > *Sent:* Tuesday, January 14, 2025 6:09 AM > *To:* Mills, Richard Tran > *Subject:* Re: DOE STTR partner opportunity > > This Message Is From an External Sender > This message came from outside your organization. > > Richard, > Hope you received my previous email. I will appreciate if you let me know > if you would like to participate in this STTR project or not. I know its a > short notice and I will understand if that is not sufficient to make it a > "GO". > > I will still have good amount time to create and upload an LOI. > > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > **. Unintended transmission shall not constitute > waiver of any privilege.* > > > On Monday, January 13, 2025 at 01:44:59 PM CST, Debiprasad Panda < > dpanda68 at yahoo.com> wrote: > > > Richard, > I got your contact from Todd Munson. We are a small business located in > greater Milwaukee and working on a one-stop real time simulator where we > can simulate a large grid along with IBR in real time. In addition, we can > conduct a thermal and structural FEA analysis in real time for up to 1-5 > Million grid points. 
> A new DOE solicitation is out where we can propose a one stop solution for > solar power IBR where we can model an IBR with very low step size (20-40ns) > for its real time simulation and also, we can calculate thermal loss > through semiconductor switches and then provide a thermal footprint of the > IBR in real time employing a FEA analysis. We have implemented a thermal > analysis of a heat sink using our proprietary FPGA implementation in real > time with 52000 nodes and can extend it upto 1-5M. I am wondering if you > would like to take part as RI for our STTR application where you can > formulate the FEA problem using PETSC or any other software and then we can > implement the same in FPGA for its real time implementation. If so, let me > know by COB today. We do not have much time - the LOI is due tomorrow 4:00 > PM central time, and the full proposal is due on 26th February. If you > would like we can have a quick call to discuss. At this time an email > consent will be fine and then we can discuss the detailed scopes and > deliverable in next couple of weeks. The STTR > > Let me know if you will be interested. > > > Regards. > > Debiprasad Panda, PhD > President & CTO, > Universal Real Time Power Conversion LLC > Greater Milwaukee, WI > Tel:1-440-840-3393 (cell) > Web: https://urldefense.us/v3/__https://urtpc.com__;!!G_uCfscf7eWS!fNXzXlpQ-ukRC2r1EAR0vzOZCP4M_UUNOIfBsJXXEMXS3FrkItd2r4hJKy94MHlF6h7APDqb7-Pfc9V435nHato$ > > > NOTICE OF PROPRIETARY AND CONFIDENTIAL INFORMATION*:* > *This e-mail transmission and its attachments are privileged, proprietary > and confidential and is **for the review of the designated recipient only**. > If you have received this transmission in error, please immediately notify **dpanda68 at yahoo.com > * > *. Unintended transmission shall not constitute waiver of any privilege. * > -------------- next part -------------- An HTML attachment was scrubbed... URL: From aldo.bonfiglioli at unibas.it Sat Feb 15 10:37:18 2025 From: aldo.bonfiglioli at unibas.it (Aldo Bonfiglioli) Date: Sat, 15 Feb 2025 17:37:18 +0100 Subject: [petsc-users] Advice on identifying boundary vertices Message-ID: <20efa212-964c-4a34-a417-9a888d2a47c5@unibas.it> Dear all, I am trying to identify the boundary vertices that belong to a given stratum of the Face sets. This is going to be used to prescribe Dirichlet-type bcs. I can select the boundary faces (edges in 2D) of a given stratum using PetscCall(DMLabelGetStratumIS(label, stratum, user%bndryfaces(i), ierr)) and that looks ok; I then try to identify the boundary vertices by looping over the points (edges in 2D/faces in 3D) of the face set of a given stratum and retrieve the vertices that make up each individual edge/face. For reasons I fail to understand, the above procedure fails to identify certain vertices (those circled in the enclosed pdf where different colours mark different ranks) in a parallel environment. Questions: 1. is there an available function that does what I am trying to do? I know that the boundary points can be found in the "marker" label, but I need to discriminate among Face Sets of different strata. 2. what might be wrong in the aforementioned approach? Thanks, Aldo -- Dr. 
Aldo Bonfiglioli Associate professor of Fluid Machines Scuola di Ingegneria Universita' della Basilicata V.le dell'Ateneo lucano, 10 85100 Potenza ITALY tel:+39.0971.205203 fax:+39.0971.205215 web: https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!YLL7mzBlK9H_sEdtXob1AVklf8cVLhT3NTDdExIZsdI3xMfNHBoLXr92BmYzCXOuJqtm2L6OU4DJKV_81AkjvQf-Kra-Ym8P0tU$ -------------- next part -------------- A non-text attachment was scrubbed... Name: 2025-02-15-Nota-14-02_annotated.pdf Type: application/pdf Size: 60693 bytes Desc: not available URL: From knepley at gmail.com Sat Feb 15 20:03:24 2025 From: knepley at gmail.com (Matthew Knepley) Date: Sat, 15 Feb 2025 21:03:24 -0500 Subject: [petsc-users] Advice on identifying boundary vertices In-Reply-To: <20efa212-964c-4a34-a417-9a888d2a47c5@unibas.it> References: <20efa212-964c-4a34-a417-9a888d2a47c5@unibas.it> Message-ID: On Sat, Feb 15, 2025 at 11:40?AM Aldo Bonfiglioli < aldo.bonfiglioli at unibas.it> wrote: > Dear all, > > I am trying to identify the boundary vertices that belong to a given > stratum of the Face sets. This is going to be used to prescribe > Dirichlet-type bcs. > The label "Face Sets" is designed to only contain faces. > I can select the boundary faces (edges in 2D) of a given stratum using > PetscCall(DMLabelGetStratumIS(label, stratum, user%bndryfaces(i), ierr)) > and that looks ok; > Yes, you get the faces marked with some BC value. > I then try to identify the boundary vertices by looping over the points > (edges in 2D/faces in 3D) of the face set of a given stratum and > retrieve the vertices that make up each individual edge/face. > You can do that. You can also do something like DMLabelDuplicate(faceSets, &newLabel); DMPlexLabelComplete(dm, newLabel); which will put in all the points in the transitive closure (such as vertices). Then you can just loop over the points in the label, and check for vertices using DMPlexGetPointDepth(). > For reasons I fail to understand, the above procedure fails to identify > certain vertices (those circled in the enclosed pdf where different > colours mark different ranks) in a parallel environment. > I do this all the time, so it should not happen. If the above fails, can you send a small reproducer? Thanks, Matt > Questions: > > 1. is there an available function that does what I am trying to do? I > know that the boundary points can be found in the "marker" label, but I > need to discriminate among Face Sets of different strata. > > 2. what might be wrong in the aforementioned approach? > > Thanks, > > Aldo > > -- > Dr. Aldo Bonfiglioli > Associate professor of Fluid Machines > Scuola di Ingegneria > Universita' della Basilicata > V.le dell'Ateneo lucano, 10 85100 Potenza ITALY > tel:+39.0971.205203 fax:+39.0971.205215 > web: > https://urldefense.us/v3/__http://docenti.unibas.it/site/home/docente.html?m=002423__;!!G_uCfscf7eWS!YLL7mzBlK9H_sEdtXob1AVklf8cVLhT3NTDdExIZsdI3xMfNHBoLXr92BmYzCXOuJqtm2L6OU4DJKV_81AkjvQf-Kra-Ym8P0tU$ > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!cwtigEBvRpcpoLeV3aRNqP6fHwoAHN_2SW4Rjbh_ZqKelJ54Nvgncprg0QimeBjNfY5Ox-8qG6AzP5ZqCg-M$ -------------- next part -------------- An HTML attachment was scrubbed... 
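[Editorial addition, not part of the original messages: the sketch below spells out the completed-label approach Matt suggests above, gathering the vertices that belong to one "Face Sets" value. The function name GetFaceSetVertices, the use of PETSC_COMM_SELF, and the guard for an empty stratum are assumptions; only the routines named in the thread (DMLabelDuplicate, DMPlexLabelComplete, DMLabelGetStratumIS, DMPlexGetPointDepth) plus ordinary IS handling are used. Completing a duplicate rather than "Face Sets" itself keeps that label containing only faces, as described above.]

#include <petscdmplex.h>

static PetscErrorCode GetFaceSetVertices(DM dm, PetscInt value, IS *vertIS)
{
  DMLabel         faceSets, closure;
  IS              pointIS = NULL;
  const PetscInt *points;
  PetscInt       *verts, np, p, nv = 0;

  PetscFunctionBeginUser;
  PetscCall(DMGetLabel(dm, "Face Sets", &faceSets));
  PetscCall(DMLabelDuplicate(faceSets, &closure));
  PetscCall(DMPlexLabelComplete(dm, closure));  /* add edges/vertices in the closure of each marked face */
  PetscCall(DMLabelGetStratumIS(closure, value, &pointIS));
  if (!pointIS) {                               /* this rank holds no points with this value */
    PetscCall(ISCreateGeneral(PETSC_COMM_SELF, 0, NULL, PETSC_COPY_VALUES, vertIS));
    PetscCall(DMLabelDestroy(&closure));
    PetscFunctionReturn(PETSC_SUCCESS);
  }
  PetscCall(ISGetLocalSize(pointIS, &np));
  PetscCall(ISGetIndices(pointIS, &points));
  PetscCall(PetscMalloc1(np, &verts));
  for (p = 0; p < np; ++p) {
    PetscInt depth;
    PetscCall(DMPlexGetPointDepth(dm, points[p], &depth));
    if (depth == 0) verts[nv++] = points[p];    /* keep only the vertices */
  }
  PetscCall(ISRestoreIndices(pointIS, &points));
  PetscCall(ISDestroy(&pointIS));
  PetscCall(DMLabelDestroy(&closure));
  PetscCall(ISCreateGeneral(PETSC_COMM_SELF, nv, verts, PETSC_OWN_POINTER, vertIS));
  PetscFunctionReturn(PETSC_SUCCESS);
}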
URL: From silvia.preda at uninsubria.it Wed Feb 19 10:18:19 2025 From: silvia.preda at uninsubria.it (Preda Silvia) Date: Wed, 19 Feb 2025 16:18:19 +0000 Subject: [petsc-users] grid adaptivity with dmforest Message-ID: Hi, I'm using the toycode below to manage adaptivity using a DMForest, based on p4est. The code simply performs three adaptivity steps on a square grid [0,1]x[0,1], uniformly refined at level 2 at the beginning. The minimum and the maximum level of refinement are set to 1 and 5, respectively. During the adaptivity procedure, a quadrant is refined if its centroid is inside the circle of radius 0.25, centred in (0.5,0.5). I'm facing two main issues: * As far as I understood, when the label asks to refine a quadrant which is already at the maximum level, the adaption procedure errors out instead of ignoring the request. Thus I need to check the quadrant level before setting the adaptivity label. How can I get access to the quadrant-level of a cell? * Once the mesh is adapted, data need to be projected on the new mesh and in my method I'll need to write an elaborate ad-hoc procedure for that. How can I access the map that provides the correspondence between the indexes of outcoming quadrants and the indexes of incoming ones? Below is the code, to be run with these options: -dm_type p4est -dm_forest_topology brick -dm_p4est_brick_size 1,1 -dm_view vtk:brick.vtu -dm_forest_initial_refinement 2 -dm_forest_minimum_refinement 1 -dm_forest_maximum_refinement 5 static char help[] = "Create and view a forest mesh\n\n"; #include #include #include static PetscErrorCode CreateAdaptLabel(DM dm, DM dmConv, PetscInt *nAdaptLoc) { DMLabel adaptLabel; PetscInt cStart, cEnd, c; PetscFunctionBeginUser; PetscCall(DMGetCoordinatesLocalSetUp(dm)); PetscCall(DMCreateLabel(dm, "adaptLabel")); PetscCall(DMGetLabel(dm, "adaptLabel", &adaptLabel)); PetscCall(DMForestGetCellChart(dm, &cStart, &cEnd)); for (c = cStart; c < cEnd; ++c) { PetscReal centroid[3], volume, x, y; PetscCall(DMPlexComputeCellGeometryFVM(dmConv, c, &volume, centroid, NULL)); x = centroid[0]; y = centroid[1]; if (std::sqrt((x-0.5)*(x-0.5)+(y-0.5)*(y-0.5))<0.25) { PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_REFINE)); ++nAdaptLoc[0]; } else { PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_KEEP)); ++nAdaptLoc[1]; } } PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode ForestToPlex(DM *dm, DM *dmConv) { PetscFunctionBeginUser; PetscCall(DMConvert(*dm, DMPLEX, dmConv)); PetscCall(DMLocalizeCoordinates(*dmConv)); PetscCall(DMViewFromOptions(*dmConv, NULL, "-dm_conv_view")); PetscCall(DMPlexCheckCellShape(*dmConv, PETSC_FALSE, PETSC_DETERMINE)); PetscFunctionReturn(PETSC_SUCCESS); } static PetscErrorCode AdaptMesh(DM *dm) { DM dmCur = *dm; PetscBool hasLabel=PETSC_FALSE, adapt=PETSC_TRUE; PetscInt adaptIter=0, maxAdaptIter=3; PetscFunctionBeginUser; while (adapt) { DM dmAdapt; DMLabel adaptLabel; PetscInt nAdaptLoc[2]={0,0}, nAdapt[2]={0,0}; ++adaptIter; PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPT ITER %d\n",adaptIter)); DM dmConv; PetscCall(ForestToPlex(&dmCur,&dmConv)); PetscCall(CreateAdaptLabel(dmCur,dmConv,nAdaptLoc)); PetscCallMPI(MPIU_Allreduce(&nAdaptLoc, &nAdapt, 2, MPIU_INT, MPI_SUM, PetscObjectComm((PetscObject)dmCur))); PetscCall(DMGetLabel(dmCur, "adaptLabel", &adaptLabel)); PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to refine = %d\n",nAdapt[0])); PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to keep = %d\n",nAdapt[1])); if (nAdapt[0]) { PetscCall(DMAdaptLabel(dmCur, adaptLabel, &dmAdapt)); 
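      /* Editorial note, not in the original message: as reported above, this
         DMAdaptLabel call errors out when adaptLabel marks DM_ADAPT_REFINE on
         a cell that is already at -dm_forest_maximum_refinement, so the cell
         level would have to be checked while CreateAdaptLabel builds the
         label. */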
PetscCall(DMHasLabel(dmAdapt, "adaptLabel", &hasLabel)); PetscCall(DMDestroy(&dmCur)); PetscCall(DMViewFromOptions(dmAdapt, NULL, "-adapt_dm_view")); dmCur = dmAdapt; } //PetscCall(DMLabelDestroy(&adaptLabel)); PetscCall(DMDestroy(&dmConv)); if (adaptIter==maxAdaptIter) adapt=PETSC_FALSE; } *dm = dmCur; PetscFunctionReturn(PETSC_SUCCESS); } int main(int argc, char **argv) { DM dm; char typeString[256] = {'\0'}; PetscViewer viewer = NULL; PetscFunctionBeginUser; PetscCall(PetscInitialize(&argc, &argv, NULL, help)); PetscCall(DMCreate(PETSC_COMM_WORLD, &dm)); PetscCall(PetscStrncpy(typeString, DMFOREST, 256)); PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "DM Forest example options", NULL); PetscCall(PetscOptionsString("-dm_type", "The type of the dm", NULL, DMFOREST, typeString, sizeof(typeString), NULL)); PetscOptionsEnd(); PetscCall(PetscPrintf(PETSC_COMM_SELF,"\n ==== TOY CODE DMFOREST WITH AMR ====\n")); PetscCall(DMSetType(dm, (DMType)typeString)); PetscCall(DMSetFromOptions(dm)); PetscCall(DMSetUp(dm)); /* Adapt */ PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE STARTED\n")); PetscCall(AdaptMesh(&dm)); PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE ENDED\n\n")); PetscCall(DMViewFromOptions(dm, NULL, "-dm_view")); PetscCall(PetscViewerDestroy(&viewer)); PetscCall(DMDestroy(&dm)); PetscCall(PetscFinalize()); return 0; } Thanks a lot, Silvia -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 19 13:13:50 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 19 Feb 2025 14:13:50 -0500 Subject: [petsc-users] grid adaptivity with dmforest In-Reply-To: References: Message-ID: On Wed, Feb 19, 2025 at 11:18?AM Preda Silvia via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > > > I?m using the toycode below to manage adaptivity using a DMForest, based > on p4est. > > > > The code simply performs three adaptivity steps on a square grid > [0,1]x[0,1], uniformly refined at level 2 at the beginning. The minimum and > the maximum level of refinement are set to 1 and 5, respectively. During > the adaptivity procedure, a quadrant is refined if its centroid is inside > the circle of radius 0.25, centred in (0.5,0.5). > > > > I?m facing two main issues: > > - As far as I understood, when the label asks to refine a quadrant > which is already at the maximum level, the adaption procedure errors out > instead of ignoring the request. Thus I need to check the quadrant level > before setting the adaptivity label. How can I get access to the > quadrant-level of a cell? > - Once the mesh is adapted, data need to be projected on the new mesh > and in my method I?ll need to write an elaborate ad-hoc procedure for that. > How can I access the map that provides the correspondence between the > indexes of outcoming quadrants and the indexes of incoming ones? > > Toby understands this better than me. Toby, two questions: 1) Can we catch that error and just return? 2) Will DMProjectField() work in the case of multiple refinements like this? 
Thanks, Matt > Below is the code, to be run with these options: > > -dm_type p4est > > -dm_forest_topology brick > > -dm_p4est_brick_size 1,1 > > -dm_view vtk:brick.vtu > > -dm_forest_initial_refinement 2 > > -dm_forest_minimum_refinement 1 > > -dm_forest_maximum_refinement 5 > > > > static char help[] = "Create and view a forest mesh\n\n"; > > > > #include > > #include > > #include > > > > static PetscErrorCode CreateAdaptLabel(DM dm, DM dmConv, PetscInt > *nAdaptLoc) > > { > > DMLabel adaptLabel; > > PetscInt cStart, cEnd, c; > > > > PetscFunctionBeginUser; > > PetscCall(DMGetCoordinatesLocalSetUp(dm)); > > PetscCall(DMCreateLabel(dm, "adaptLabel")); > > PetscCall(DMGetLabel(dm, "adaptLabel", &adaptLabel)); > > PetscCall(DMForestGetCellChart(dm, &cStart, &cEnd)); > > for (c = cStart; c < cEnd; ++c) { > > PetscReal centroid[3], volume, x, y; > > PetscCall(DMPlexComputeCellGeometryFVM(dmConv, c, &volume, centroid, > NULL)); > > x = centroid[0]; > > y = centroid[1]; > > if (std::sqrt((x-0.5)*(x-0.5)+(y-0.5)*(y-0.5))<0.25) { > > PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_REFINE)); > > ++nAdaptLoc[0]; > > } else { > > PetscCall(DMLabelSetValue(adaptLabel, c, DM_ADAPT_KEEP)); > > ++nAdaptLoc[1]; > > } > > } > > PetscFunctionReturn(PETSC_SUCCESS); > > } > > > > static PetscErrorCode ForestToPlex(DM *dm, DM *dmConv) > > { > > PetscFunctionBeginUser; > > PetscCall(DMConvert(*dm, DMPLEX, dmConv)); > > PetscCall(DMLocalizeCoordinates(*dmConv)); > > PetscCall(DMViewFromOptions(*dmConv, NULL, "-dm_conv_view")); > > PetscCall(DMPlexCheckCellShape(*dmConv, PETSC_FALSE, PETSC_DETERMINE)); > > PetscFunctionReturn(PETSC_SUCCESS); > > } > > > > static PetscErrorCode AdaptMesh(DM *dm) > > { > > DM dmCur = *dm; > > PetscBool hasLabel=PETSC_FALSE, adapt=PETSC_TRUE; > > PetscInt adaptIter=0, maxAdaptIter=3; > > > > PetscFunctionBeginUser; > > while (adapt) { > > DM dmAdapt; > > DMLabel adaptLabel; > > PetscInt nAdaptLoc[2]={0,0}, nAdapt[2]={0,0}; > > > > ++adaptIter; > > PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPT ITER %d\n",adaptIter)); > > > > DM dmConv; > > PetscCall(ForestToPlex(&dmCur,&dmConv)); > > PetscCall(CreateAdaptLabel(dmCur,dmConv,nAdaptLoc)); > > PetscCallMPI(MPIU_Allreduce(&nAdaptLoc, &nAdapt, 2, MPIU_INT, MPI_SUM, > PetscObjectComm((PetscObject)dmCur))); > > PetscCall(DMGetLabel(dmCur, "adaptLabel", &adaptLabel)); > > PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to refine = > %d\n",nAdapt[0])); > > PetscCall(PetscPrintf(PETSC_COMM_WORLD,"Cell to keep = > %d\n",nAdapt[1])); > > > > if (nAdapt[0]) { > > PetscCall(DMAdaptLabel(dmCur, adaptLabel, &dmAdapt)); > > PetscCall(DMHasLabel(dmAdapt, "adaptLabel", &hasLabel)); > > PetscCall(DMDestroy(&dmCur)); > > PetscCall(DMViewFromOptions(dmAdapt, NULL, "-adapt_dm_view")); > > dmCur = dmAdapt; > > } > > //PetscCall(DMLabelDestroy(&adaptLabel)); > > PetscCall(DMDestroy(&dmConv)); > > if (adaptIter==maxAdaptIter) adapt=PETSC_FALSE; > > } > > *dm = dmCur; > > PetscFunctionReturn(PETSC_SUCCESS); > > } > > > > int main(int argc, char **argv) > > { > > DM dm; > > char typeString[256] = {'\0'}; > > PetscViewer viewer = NULL; > > > > PetscFunctionBeginUser; > > PetscCall(PetscInitialize(&argc, &argv, NULL, help)); > > PetscCall(DMCreate(PETSC_COMM_WORLD, &dm)); > > PetscCall(PetscStrncpy(typeString, DMFOREST, 256)); > > PetscOptionsBegin(PETSC_COMM_WORLD, NULL, "DM Forest example options", > NULL); > > PetscCall(PetscOptionsString("-dm_type", "The type of the dm", NULL, > DMFOREST, typeString, sizeof(typeString), NULL)); > > PetscOptionsEnd(); > > > > 
PetscCall(PetscPrintf(PETSC_COMM_SELF,"\n ==== TOY CODE DMFOREST WITH > AMR ====\n")); > > > > PetscCall(DMSetType(dm, (DMType)typeString)); > > PetscCall(DMSetFromOptions(dm)); > > PetscCall(DMSetUp(dm)); > > > > /* Adapt */ > > PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE STARTED\n")); > > PetscCall(AdaptMesh(&dm)); > > PetscCall(PetscPrintf(PETSC_COMM_SELF,"\nADAPTIVITY PHASE ENDED\n\n")); > > > > PetscCall(DMViewFromOptions(dm, NULL, "-dm_view")); > > PetscCall(PetscViewerDestroy(&viewer)); > > > > PetscCall(DMDestroy(&dm)); > > PetscCall(PetscFinalize()); > > return 0; > > } > > > > > > Thanks a lot, > > > > Silvia > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZSCq5o18nPkr_fesXZsTE1dDngSGlwHv1_gJXKWrkgdaUqTlCeNxjF-VSyRK8bjgaHXIu9LJw-fH5fP0nQ8Z$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Wed Feb 19 17:17:49 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 19 Feb 2025 23:17:49 +0000 Subject: [petsc-users] kokkos and include flags Message-ID: Hi I'm trying to build my application code (which includes C and Fortran files) with a Makefile based off $PETSC_DIR/share/petsc/Makefile.basic.user by using the variables and rules defined in ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as well as another library, and hence I have to add some extra include statements pointing at the other library during compilation. Currently I have been doing: # Read in the petsc compile/linking variables and makefile rules include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules # Add the extra include files PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) which works very well, with the correct include flags from INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C files. If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by calling "make adv_1dk"), the extra flags are not included. If I instead call "make adv_1dk.kokkos", the rule for cxx files is instead triggered and correctly includes the include flags, but this just calls the c++ wrapper, rather than the nvcc_wrapper and therefore breaks when kokkos has been built with cuda (or hip, etc). Just wondering if there is something I have missed, from what I can tell the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. Thanks for all your help Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Wed Feb 19 18:32:08 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Wed, 19 Feb 2025 18:32:08 -0600 (CST) Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via PETSC_FC_INCLUDES]. I think kokkos compile targets [for *.kokkos.cxx sources] should pick up one of them. for ex: >>> CPPFLAGS = -Wall FPPFLAGS = -Wall CXXPPFLAGS = -Wall include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules ... 
<<< Satish On Wed, 19 Feb 2025, Steven Dargaville wrote: > Hi > > I'm trying to build my application code (which includes C and Fortran > files) with a Makefile based off $PETSC_DIR/share/petsc/Makefile.basic.user > by using the variables and rules defined in > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as well as > another library, and hence I have to add some extra include statements > pointing at the other library during compilation. Currently I have been > doing: > > # Read in the petsc compile/linking variables and makefile rules > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > # Add the extra include files > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) > > > which works very well, with the correct include flags from > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C > files. > > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by > calling "make adv_1dk"), the extra flags are not included. If I instead > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered and > correctly includes the include flags, but this just calls the c++ wrapper, > rather than the nvcc_wrapper and therefore breaks when kokkos has been > built with cuda (or hip, etc). > > Just wondering if there is something I have missed, from what I can tell > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. > > Thanks for all your help > Steven > From dargaville.steven at gmail.com Thu Feb 20 05:36:11 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Thu, 20 Feb 2025 11:36:11 +0000 Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: Thanks for the reply! I've tried that and again it doesn't seem to work for the kokkos files. I went a bit overboard and set every variable I could find but it doesn't seem to change the kokkos compilation, despite some of those flags definitely being present in the kokkos compile targets. CPPFLAGS = $(INCLUDE) FPPFLAGS = $(INCLUDE) CPPFLAGS = $(INCLUDE) CXXPPFLAGS = $(INCLUDE) CXXCPPFLAGS = $(INCLUDE) CUDAC_FLAGS = $(INCLUDE) HIPC_FLAGS = $(INCLUDE) SYCLC_FLAGS = $(INCLUDE) PETSC_CXXCPPFLAGS = $(INCLUDE) PETSC_CCPPFLAGS = $(INCLUDE) PETSC_FCPPFLAGS = $(INCLUDE) PETSC_CUDACPPFLAGS = $(INCLUDE) MPICXX_INCLUDES = $(INCLUDE) # Read in the petsc compile/linking variables and makefile rules include ${PETSC_DIR}/lib/petsc/conf/variables include ${PETSC_DIR}/lib/petsc/conf/rules The strangest thing is if I echo the value of PETSC_KOKKOSCOMPILE_SINGLE before building, it seems to have the correct flags in it. 
# Build the tests build_tests: $(OUT) echo $(PETSC_KOKKOSCOMPILE_SINGLE) @for t in $(TEST_TARGETS); do \ $(MAKE) -C tests $$t; \ done for example the echo gives (where I've bolded the flags I need added): mpicxx -o .o -c -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 -fPIC *-I/home/sdargavi/projects/PFLARE -Iinclude* but the actual command that is called when the build is happening is (which doesn't have the includes I need): mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 -fPIC -I/home/sdargavi/projects/dependencies/petsc-3.22.0/include -I/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/include adv_1dk.kokkos.cxx -L/home/sdargavi/projects/PFLARE/lib -lpflare -Wl,-rpath,/home/sdargavi/projects/PFLARE/lib:-Wl,-rpath,/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib -L/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran -L/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lkokkoskernels -lkokkoscontainers -lkokkoscore -lkokkossimd -lflapack -lfblas -lparmetis -lmetis -lm -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lgfortran -lm -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lstdc++ -lquadmath -o adv_1dk On Thu, 20 Feb 2025 at 00:32, Satish Balay wrote: > Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via PETSC_FC_INCLUDES]. > > I think kokkos compile targets [for *.kokkos.cxx sources] should pick up > one of them. > > for ex: > > >>> > CPPFLAGS = -Wall > FPPFLAGS = -Wall > CXXPPFLAGS = -Wall > > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > ... > <<< > > Satish > > On Wed, 19 Feb 2025, Steven Dargaville wrote: > > > Hi > > > > I'm trying to build my application code (which includes C and Fortran > > files) with a Makefile based off > $PETSC_DIR/share/petsc/Makefile.basic.user > > by using the variables and rules defined in > > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as well > as > > another library, and hence I have to add some extra include statements > > pointing at the other library during compilation. Currently I have been > > doing: > > > > # Read in the petsc compile/linking variables and makefile rules > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > # Add the extra include files > > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) > > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) > > > > > > which works very well, with the correct include flags from > > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C > > files. 
> > > > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by > > calling "make adv_1dk"), the extra flags are not included. If I instead > > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered > and > > correctly includes the include flags, but this just calls the c++ > wrapper, > > rather than the nvcc_wrapper and therefore breaks when kokkos has been > > built with cuda (or hip, etc). > > > > Just wondering if there is something I have missed, from what I can tell > > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. > > > > Thanks for all your help > > Steven > > > > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Thu Feb 20 05:38:44 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Thu, 20 Feb 2025 11:38:44 +0000 Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: My apologies, that does seem to have fixed the problem, the build rules in my tests were overriding the new variables. Thanks for your help in sorting that out! Steven On Thu, 20 Feb 2025 at 11:36, Steven Dargaville wrote: > Thanks for the reply! I've tried that and again it doesn't seem to work > for the kokkos files. I went a bit overboard and set every variable I could > find but it doesn't seem to change the kokkos compilation, despite some of > those flags definitely being present in the kokkos compile targets. > > CPPFLAGS = $(INCLUDE) > FPPFLAGS = $(INCLUDE) > CPPFLAGS = $(INCLUDE) > CXXPPFLAGS = $(INCLUDE) > CXXCPPFLAGS = $(INCLUDE) > CUDAC_FLAGS = $(INCLUDE) > HIPC_FLAGS = $(INCLUDE) > SYCLC_FLAGS = $(INCLUDE) > PETSC_CXXCPPFLAGS = $(INCLUDE) > PETSC_CCPPFLAGS = $(INCLUDE) > PETSC_FCPPFLAGS = $(INCLUDE) > PETSC_CUDACPPFLAGS = $(INCLUDE) > MPICXX_INCLUDES = $(INCLUDE) > > # Read in the petsc compile/linking variables and makefile rules > include ${PETSC_DIR}/lib/petsc/conf/variables > include ${PETSC_DIR}/lib/petsc/conf/rules > > The strangest thing is if I echo the value of PETSC_KOKKOSCOMPILE_SINGLE > before building, it seems to have the correct flags in it. 
> > # Build the tests > build_tests: $(OUT) > echo $(PETSC_KOKKOSCOMPILE_SINGLE) > @for t in $(TEST_TARGETS); do \ > $(MAKE) -C tests $$t; \ > done > > for example the echo gives (where I've bolded the flags I need added): > > mpicxx -o .o -c -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > -fPIC *-I/home/sdargavi/projects/PFLARE -Iinclude* > > but the actual command that is called when the build is happening is > (which doesn't have the includes I need): > > mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings > -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi > -fstack-protector -g -O0 -Wall -Wwrite-strings -Wno-strict-aliasing > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > -fPIC -I/home/sdargavi/projects/dependencies/petsc-3.22.0/include > -I/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/include > adv_1dk.kokkos.cxx -L/home/sdargavi/projects/PFLARE/lib -lpflare > -Wl,-rpath,/home/sdargavi/projects/PFLARE/lib:-Wl,-rpath,/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > -L/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > -L/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 > -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lkokkoskernels > -lkokkoscontainers -lkokkoscore -lkokkossimd -lflapack -lfblas -lparmetis > -lmetis -lm -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lgfortran -lm > -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lstdc++ -lquadmath -o > adv_1dk > > > > On Thu, 20 Feb 2025 at 00:32, Satish Balay wrote: > >> Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via >> PETSC_FC_INCLUDES]. >> >> I think kokkos compile targets [for *.kokkos.cxx sources] should pick up >> one of them. >> >> for ex: >> >> >>> >> CPPFLAGS = -Wall >> FPPFLAGS = -Wall >> CXXPPFLAGS = -Wall >> >> include ${PETSC_DIR}/lib/petsc/conf/variables >> include ${PETSC_DIR}/lib/petsc/conf/rules >> >> ... >> <<< >> >> Satish >> >> On Wed, 19 Feb 2025, Steven Dargaville wrote: >> >> > Hi >> > >> > I'm trying to build my application code (which includes C and Fortran >> > files) with a Makefile based off >> $PETSC_DIR/share/petsc/Makefile.basic.user >> > by using the variables and rules defined in >> > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as >> well as >> > another library, and hence I have to add some extra include statements >> > pointing at the other library during compilation. 
Currently I have been >> > doing: >> > >> > # Read in the petsc compile/linking variables and makefile rules >> > include ${PETSC_DIR}/lib/petsc/conf/variables >> > include ${PETSC_DIR}/lib/petsc/conf/rules >> > >> > # Add the extra include files >> > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) >> > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) >> > >> > >> > which works very well, with the correct include flags from >> > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C >> > files. >> > >> > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by >> > calling "make adv_1dk"), the extra flags are not included. If I instead >> > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered >> and >> > correctly includes the include flags, but this just calls the c++ >> wrapper, >> > rather than the nvcc_wrapper and therefore breaks when kokkos has been >> > built with cuda (or hip, etc). >> > >> > Just wondering if there is something I have missed, from what I can tell >> > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. >> > >> > Thanks for all your help >> > Steven >> > >> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Thu Feb 20 08:44:02 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 08:44:02 -0600 (CST) Subject: [petsc-users] kokkos and include flags In-Reply-To: References: Message-ID: <9432ed9e-384a-4860-62b9-b7fb569bf457@fastmail.org> I'm glad you have it working now! Thanks for the update! Satish On Thu, 20 Feb 2025, Steven Dargaville wrote: > My apologies, that does seem to have fixed the problem, the build rules in > my tests were overriding the new variables. > > Thanks for your help in sorting that out! > Steven > > On Thu, 20 Feb 2025 at 11:36, Steven Dargaville > wrote: > > > Thanks for the reply! I've tried that and again it doesn't seem to work > > for the kokkos files. I went a bit overboard and set every variable I could > > find but it doesn't seem to change the kokkos compilation, despite some of > > those flags definitely being present in the kokkos compile targets. > > > > CPPFLAGS = $(INCLUDE) > > FPPFLAGS = $(INCLUDE) > > CPPFLAGS = $(INCLUDE) > > CXXPPFLAGS = $(INCLUDE) > > CXXCPPFLAGS = $(INCLUDE) > > CUDAC_FLAGS = $(INCLUDE) > > HIPC_FLAGS = $(INCLUDE) > > SYCLC_FLAGS = $(INCLUDE) > > PETSC_CXXCPPFLAGS = $(INCLUDE) > > PETSC_CCPPFLAGS = $(INCLUDE) > > PETSC_FCPPFLAGS = $(INCLUDE) > > PETSC_CUDACPPFLAGS = $(INCLUDE) > > MPICXX_INCLUDES = $(INCLUDE) > > > > # Read in the petsc compile/linking variables and makefile rules > > include ${PETSC_DIR}/lib/petsc/conf/variables > > include ${PETSC_DIR}/lib/petsc/conf/rules > > > > The strangest thing is if I echo the value of PETSC_KOKKOSCOMPILE_SINGLE > > before building, it seems to have the correct flags in it. 
> > > > # Build the tests > > build_tests: $(OUT) > > echo $(PETSC_KOKKOSCOMPILE_SINGLE) > > @for t in $(TEST_TARGETS); do \ > > $(MAKE) -C tests $$t; \ > > done > > > > for example the echo gives (where I've bolded the flags I need added): > > > > mpicxx -o .o -c -Wall -Wwrite-strings -Wno-strict-aliasing > > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > > -fPIC *-I/home/sdargavi/projects/PFLARE -Iinclude* > > > > but the actual command that is called when the build is happening is > > (which doesn't have the includes I need): > > > > mpicxx -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -Wall -Wwrite-strings > > -Wno-strict-aliasing -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi > > -fstack-protector -g -O0 -Wall -Wwrite-strings -Wno-strict-aliasing > > -Wno-unknown-pragmas -Wno-lto-type-mismatch -Wno-psabi -fstack-protector > > -Wall -Wwrite-strings -Wno-strict-aliasing -Wno-unknown-pragmas > > -Wno-lto-type-mismatch -Wno-psabi -fstack-protector -g -O0 -std=gnu++17 > > -fPIC -I/home/sdargavi/projects/dependencies/petsc-3.22.0/include > > -I/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/include > > adv_1dk.kokkos.cxx -L/home/sdargavi/projects/PFLARE/lib -lpflare > > -Wl,-rpath,/home/sdargavi/projects/PFLARE/lib:-Wl,-rpath,/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > > -L/home/sdargavi/projects/dependencies/petsc-3.22.0/arch-linux-c-debug/lib > > -Wl,-rpath,/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > > -L/usr/lib/x86_64-linux-gnu/openmpi/lib/fortran/gfortran > > -Wl,-rpath,/usr/lib/gcc/x86_64-linux-gnu/11 > > -L/usr/lib/gcc/x86_64-linux-gnu/11 -lpetsc -lkokkoskernels > > -lkokkoscontainers -lkokkoscore -lkokkossimd -lflapack -lfblas -lparmetis > > -lmetis -lm -lX11 -lmpi_usempif08 -lmpi_usempi_ignore_tkr -lmpi_mpifh -lmpi > > -lopen-rte -lopen-pal -lhwloc -levent_core -levent_pthreads -lgfortran -lm > > -lz -lgfortran -lm -lgfortran -lgcc_s -lquadmath -lstdc++ -lquadmath -o > > adv_1dk > > > > > > > > On Thu, 20 Feb 2025 at 00:32, Satish Balay wrote: > > > >> Try setting CPPFLAGS, FPPFLAGS, CXXPPFLAGS [and not via > >> PETSC_FC_INCLUDES]. > >> > >> I think kokkos compile targets [for *.kokkos.cxx sources] should pick up > >> one of them. > >> > >> for ex: > >> > >> >>> > >> CPPFLAGS = -Wall > >> FPPFLAGS = -Wall > >> CXXPPFLAGS = -Wall > >> > >> include ${PETSC_DIR}/lib/petsc/conf/variables > >> include ${PETSC_DIR}/lib/petsc/conf/rules > >> > >> ... > >> <<< > >> > >> Satish > >> > >> On Wed, 19 Feb 2025, Steven Dargaville wrote: > >> > >> > Hi > >> > > >> > I'm trying to build my application code (which includes C and Fortran > >> > files) with a Makefile based off > >> $PETSC_DIR/share/petsc/Makefile.basic.user > >> > by using the variables and rules defined in > >> > ${PETSC_DIR}/lib/petsc/conf/variables. My application uses petsc as > >> well as > >> > another library, and hence I have to add some extra include statements > >> > pointing at the other library during compilation. 
Currently I have been > >> > doing: > >> > > >> > # Read in the petsc compile/linking variables and makefile rules > >> > include ${PETSC_DIR}/lib/petsc/conf/variables > >> > include ${PETSC_DIR}/lib/petsc/conf/rules > >> > > >> > # Add the extra include files > >> > PETSC_FC_INCLUDES += $(INCLUDE_OTHER_LIB) > >> > PETSC_CC_INCLUDES += $(INCLUDE_OTHER_LIB) > >> > > >> > > >> > which works very well, with the correct include flags from > >> > INCLUDE_OTHER_LIBS being added to the compilation of both fortran and C > >> > files. > >> > > >> > If however I try and compile a kokkos file, named adv_1dk.kokkos.cxx (by > >> > calling "make adv_1dk"), the extra flags are not included. If I instead > >> > call "make adv_1dk.kokkos", the rule for cxx files is instead triggered > >> and > >> > correctly includes the include flags, but this just calls the c++ > >> wrapper, > >> > rather than the nvcc_wrapper and therefore breaks when kokkos has been > >> > built with cuda (or hip, etc). > >> > > >> > Just wondering if there is something I have missed, from what I can tell > >> > the kokkos rules don't use the PETSC_CC_INCLUDES during compilation. > >> > > >> > Thanks for all your help > >> > Steven > >> > > >> > >> > From schaferk at bellsouth.net Thu Feb 20 12:22:24 2025 From: schaferk at bellsouth.net (Michael Schaferkotter) Date: Thu, 20 Feb 2025 12:22:24 -0600 Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> Message-ID: <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> build petsc-3.20.3 with llvm, clang, clang++, gfortran CFLAGS='-std=c++11' CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' LDLIBS += -lstdc++ $PETSC_ARCH arch-linux-c-opt MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc CLANG = clang FC = gfortran Petsc libraries are built; /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* The configure is this: cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ --with-cc=clang \ --with-cxx=clang++ \ --with-fc=gfortran \ --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ --download-sowing \ --with-debugging=$(PETSC_DBG) \ --with-shared-libraries=1 \ CFLAGS='-std=c11' \ CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ LDFLAGS='-L$(LLVM_LIB)' \ LIBS='-lstdc++? 
\ --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) Here is the make: $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all Check-petsc is: $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test Here is the log file for test: make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' clang: error: linker command failed with exit code 1 (use -v to see invocation) make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) There are many errors of the ilk: std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() [lib]$ nm -A libpetsc.so | grep basic_ostringstream libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. Clearly something is amiss. Any ideas appreciated. Michael -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Thu Feb 20 12:52:05 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 12:52:05 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> Message-ID: <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> Any particular reason to use these flags? What clang version? OS? 
Best if you can send build logs [perhaps to petsc-maint] Can you try a simpler build and see if it works: ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check or: ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check Satish On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > CFLAGS='-std=c++11' > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > LDLIBS += -lstdc++ > > $PETSC_ARCH arch-linux-c-opt > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > CLANG = clang > FC = gfortran > > > Petsc libraries are built; > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > The configure is this: > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > --with-cc=clang \ > --with-cxx=clang++ \ > --with-fc=gfortran \ > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > --download-sowing \ > --with-debugging=$(PETSC_DBG) \ > --with-shared-libraries=1 \ > CFLAGS='-std=c11' \ > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > LDFLAGS='-L$(LLVM_LIB)' \ > LIBS='-lstdc++? \ > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > Here is the make: > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > Check-petsc is: > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > Here is the log file for test: > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > clang: error: linker command failed with exit code 1 (use -v to see invocation) > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > There are many errors of the ilk: > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > libpetsc.so: U 
_ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > Clearly something is amiss. > > Any ideas appreciated. > > Michael > > > From balay.anl at fastmail.org Thu Feb 20 13:18:11 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 13:18:11 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> Message-ID: Actually, simpler: ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. Satish On Thu, 20 Feb 2025, Satish Balay wrote: > > Any particular reason to use these flags? What clang version? OS? > > Best if you can send build logs [perhaps to petsc-maint] > > Can you try a simpler build and see if it works: > > ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > or: > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > Satish > > On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > > > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > > > CFLAGS='-std=c++11' > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > > LDLIBS += -lstdc++ > > > > $PETSC_ARCH arch-linux-c-opt > > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > > CLANG = clang > > FC = gfortran > > > > > > Petsc libraries are built; > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > > > > The configure is this: > > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > > --with-cc=clang \ > > --with-cxx=clang++ \ > > --with-fc=gfortran \ > > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > > --download-sowing \ > > --with-debugging=$(PETSC_DBG) \ > > --with-shared-libraries=1 \ > > CFLAGS='-std=c11' \ > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > > LDFLAGS='-L$(LLVM_LIB)' \ > > LIBS='-lstdc++? 
\ > > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > > > > Here is the make: > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > > > > Check-petsc is: > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > > > Here is the log file for test: > > > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > > clang: error: linker command failed with exit code 1 (use -v to see invocation) > > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > > > > There are many errors of the ilk: > > > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > > > Clearly something is amiss. > > > > Any ideas appreciated. > > > > Michael > > > > > > > From balay.anl at fastmail.org Thu Feb 20 14:08:44 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 14:08:44 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> Message-ID: <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> Ok - I see this issue on CentOS [Stream/9]. What I have is: >>> [balay at frog petsc]$ clang --version clang version 19.1.7 (CentOS 19.1.7-1.el9) Target: x86_64-redhat-linux-gnu Thread model: posix InstalledDir: /usr/bin Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg [balay at frog petsc]$ gfortran --version GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) Copyright (C) 2021 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
<<<< Now I build: >>> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check ********************************************************************************* clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' <<<< Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. >>>>>>>> [root at frog ~]# yum remove gcc-toolset-14-runtime Dependencies resolved. ================================================================================ Package Arch Version Repository Size ================================================================================ Removing: gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k Removing dependent packages: clang x86_64 19.1.7-1.el9 @appstream 181 k clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M Removing unused dependencies: clang-libs x86_64 19.1.7-1.el9 @appstream 413 M clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M libomp x86_64 19.1.7-1.el9 @appstream 1.9 M libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M Transaction Summary ================================================================================ Remove 12 Packages Freed space: 763 M Is this ok [y/N]: <<<<< So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. >>>> [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran <<<< Now retry build: >>> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check Running PETSc check examples to verify correct installation Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process Completed PETSc check examples [balay at frog petsc]$ <<<< Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! 
But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] >>>> [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH [balay at frog petsc]$ gfortran --version GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) Copyright (C) 2024 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 ========================================= Now to check if the libraries are working do: make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check ========================================= Running PETSc check examples to verify correct installation Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process Completed PETSc check examples [balay at frog petsc]$ <<<< So that worked! Satish On Thu, 20 Feb 2025, Satish Balay wrote: > Actually, simpler: > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld > > Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. > > Satish > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > > > Any particular reason to use these flags? What clang version? OS? 
> > > > Best if you can send build logs [perhaps to petsc-maint] > > > > Can you try a simpler build and see if it works: > > > > ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > or: > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > > Satish > > > > On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > > > > > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > > > > > CFLAGS='-std=c++11' > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > > > LDLIBS += -lstdc++ > > > > > > $PETSC_ARCH arch-linux-c-opt > > > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > > > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > > > CLANG = clang > > > FC = gfortran > > > > > > > > > Petsc libraries are built; > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > > > > > > > The configure is this: > > > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > > > --with-cc=clang \ > > > --with-cxx=clang++ \ > > > --with-fc=gfortran \ > > > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > > > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > > > --download-sowing \ > > > --with-debugging=$(PETSC_DBG) \ > > > --with-shared-libraries=1 \ > > > CFLAGS='-std=c11' \ > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > > > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > > > LDFLAGS='-L$(LLVM_LIB)' \ > > > LIBS='-lstdc++? 
\ > > > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > > > > > > > Here is the make: > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > > > > > > > Check-petsc is: > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > > > > > Here is the log file for test: > > > > > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > > > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > > > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > > > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > > > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > > > clang: error: linker command failed with exit code 1 (use -v to see invocation) > > > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > > > > > > > There are many errors of the ilk: > > > > > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > > > > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > > > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > > > > > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > > > > > Clearly something is amiss. > > > > > > Any ideas appreciated. > > > > > > Michael > > > > > > > > > > > > From balay.anl at fastmail.org Thu Feb 20 14:44:15 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Thu, 20 Feb 2025 14:44:15 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> Message-ID: A couple of alternates (if mixing compiler versions can't be avoided): - don't need to use petsc from fortran: [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=0 --with-mpi=0 --download-f2cblaslapack && make && make check - don't use c++: [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=0 --with-fc=gfortran --with-mpi=0 && make && make check - add in v14 -lstdc++ location ahead in the search path - so that even when -lgfortran is found in v11, v14 -lstdc++ gets picked up correctly. 
[balay at frog petsc]$ ./configure LDFLAGS=-L/opt/rh/gcc-toolset-14/root/usr/lib/gcc/x86_64-redhat-linux/14/ --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check Satish On Thu, 20 Feb 2025, Satish Balay wrote: > Ok - I see this issue on CentOS [Stream/9]. > > What I have is: > >>> > [balay at frog petsc]$ clang --version > clang version 19.1.7 (CentOS 19.1.7-1.el9) > Target: x86_64-redhat-linux-gnu > Thread model: posix > InstalledDir: /usr/bin > Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg > [balay at frog petsc]$ gfortran --version > GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) > Copyright (C) 2021 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > <<<< > > Now I build: > >>> > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > ********************************************************************************* > clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 > /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' > > <<<< > > Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. > >>>>>>>> > [root at frog ~]# yum remove gcc-toolset-14-runtime > Dependencies resolved. 
> ================================================================================ > Package Arch Version Repository Size > ================================================================================ > Removing: > gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k > Removing dependent packages: > clang x86_64 19.1.7-1.el9 @appstream 181 k > clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M > gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M > Removing unused dependencies: > clang-libs x86_64 19.1.7-1.el9 @appstream 413 M > clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k > compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M > gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M > gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M > gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M > libomp x86_64 19.1.7-1.el9 @appstream 1.9 M > libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M > > Transaction Summary > ================================================================================ > Remove 12 Packages > > Freed space: 763 M > Is this ok [y/N]: > <<<<< > > So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. > >>>> > [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran > <<<< > > Now retry build: > >>> > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > Running PETSc check examples to verify correct installation > Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > Completed PETSc check examples > [balay at frog petsc]$ > <<<< > > Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] > >>>> > [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH > [balay at frog petsc]$ gfortran --version > GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) > Copyright (C) 2024 Free Software Foundation, Inc. > This is free software; see the source for copying conditions. There is NO > warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 > ========================================= > Now to check if the libraries are working do: > make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check > ========================================= > Running PETSc check examples to verify correct installation > Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > Completed PETSc check examples > [balay at frog petsc]$ > <<<< > > So that worked! 
> > Satish > > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > Actually, simpler: > > > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld > > > > Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. > > > > Satish > > > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > > > > > > Any particular reason to use these flags? What clang version? OS? > > > > > > Best if you can send build logs [perhaps to petsc-maint] > > > > > > Can you try a simpler build and see if it works: > > > > > > ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > or: > > > ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > > > > > > Satish > > > > > > On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > > > > > > > build petsc-3.20.3 with llvm, clang, clang++, gfortran > > > > > > > > CFLAGS='-std=c++11' > > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > > > > LDLIBS += -lstdc++ > > > > > > > > $PETSC_ARCH arch-linux-c-opt > > > > MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > > > > MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > > > > CLANG = clang > > > > FC = gfortran > > > > > > > > > > > > Petsc libraries are built; > > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > > > > /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > > > > > > > > > > > > The configure is this: > > > > cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > > > > --with-cc=clang \ > > > > --with-cxx=clang++ \ > > > > --with-fc=gfortran \ > > > > --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > > > > --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > > > > --download-sowing \ > > > > --with-debugging=$(PETSC_DBG) \ > > > > --with-shared-libraries=1 \ > > > > CFLAGS='-std=c11' \ > > > > CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > > > > CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > > > > LDFLAGS='-L$(LLVM_LIB)' \ > > > > LIBS='-lstdc++? 
\ > > > > --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > > > > > > > > > > > > Here is the make: > > > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > > > > > > > > > > > > Check-petsc is: > > > > > > > > $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > > > > > > > > Here is the log file for test: > > > > > > > > make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > > > > /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > > > > Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > > > > CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > > > > CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > > > > /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > > > > clang: error: linker command failed with exit code 1 (use -v to see invocation) > > > > make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > > > > > > > > > > > > There are many errors of the ilk: > > > > > > > > std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > > > > > > > > [lib]$ nm -A libpetsc.so | grep basic_ostringstream > > > > libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > > > > libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > > > > > > > > > > > > I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > > > > > > > > Clearly something is amiss. > > > > > > > > Any ideas appreciated. > > > > > > > > Michael > > > > > > > > > > > > > > > > > From schaferk at bellsouth.net Fri Feb 21 08:02:36 2025 From: schaferk at bellsouth.net (Michael Schaferkotter) Date: Fri, 21 Feb 2025 08:02:36 -0600 Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> Message-ID: <03DE8DD9-B858-4D21-A2A8-CBC3BC1BE9C4@bellsouth.net> Satish; Thank you for the masterful demonstration. One of the alternatives caught my eye: ?with-cxx=0 (I remember I had to do that ages ago on my macOS Darwin machine. I cleared *FLAGS and successfully completed make && make check as suggested; #======================================================================= It may be moot now. 
Here is the requested OS, compiler information: cat /etc/os-release NAME="Red Hat Enterprise Linux" VERSION="8.8 (Ootpa)" ID="rhel" ID_LIKE="fedora" VERSION_ID="8.8" PLATFORM_ID="platform:el8" PRETTY_NAME="Red Hat Enterprise Linux 8.8 (Ootpa)" ANSI_COLOR="0;31" CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos" HOME_URL="https://urldefense.us/v3/__https://www.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpV-0tjnQ$ " DOCUMENTATION_URL="https://urldefense.us/v3/__https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5Roh1jkm1w$ " BUG_REPORT_URL="https://urldefense.us/v3/__https://bugzilla.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpG5qdUaQ$ " REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8" REDHAT_BUGZILLA_PRODUCT_VERSION=8.8 REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" REDHAT_SUPPORT_PRODUCT_VERSION="8.8" clang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /tmp/build_release/bin Clang version: 20.0.0git LLVM version: LLVM version 20.0.0git (48d0ef1a07993139e1acf65910704255443103a5 C++ standard: Host target: x86_64-unknown-linux-gnu Supported targets: Registered Targets: x86 - 32-bit X86: Pentium-Pro and above x86-64 - 64-bit X86: EM64T and AMD64 flang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /tmp/build_release/bin Flang version: 20.0.0git Host target: x86_64-unknown-linux-gnu Fortran compiler: GNU Fortran (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4) Compiler path: /usr/bin/gfortran Version: -std= Assume that the input sources are for . Default flags: No default flags information available Target: x86_64-redhat-linux #======================================================================= Thank you again. > On Feb 20, 2025, at 2:44 PM, Satish Balay wrote: > > A couple of alternates (if mixing compiler versions can't be avoided): > > - don't need to use petsc from fortran: > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=0 --with-mpi=0 --download-f2cblaslapack && make && make check > > - don't use c++: > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=0 --with-fc=gfortran --with-mpi=0 && make && make check > > - add in v14 -lstdc++ location ahead in the search path - so that even when -lgfortran is found in v11, v14 -lstdc++ gets picked up correctly. > [balay at frog petsc]$ ./configure LDFLAGS=-L/opt/rh/gcc-toolset-14/root/usr/lib/gcc/x86_64-redhat-linux/14/ --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > Satish > > On Thu, 20 Feb 2025, Satish Balay wrote: > >> Ok - I see this issue on CentOS [Stream/9]. 
>> >> What I have is: >>>>> >> [balay at frog petsc]$ clang --version >> clang version 19.1.7 (CentOS 19.1.7-1.el9) >> Target: x86_64-redhat-linux-gnu >> Thread model: posix >> InstalledDir: /usr/bin >> Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg >> [balay at frog petsc]$ gfortran --version >> GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) >> Copyright (C) 2021 Free Software Foundation, Inc. >> This is free software; see the source for copying conditions. There is NO >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. >> <<<< >> >> Now I build: >>>>> >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check >> >> ********************************************************************************* >> clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 >> /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' >> >> <<<< >> >> Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. >>>>>>>>>> >> [root at frog ~]# yum remove gcc-toolset-14-runtime >> Dependencies resolved. >> ================================================================================ >> Package Arch Version Repository Size >> ================================================================================ >> Removing: >> gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k >> Removing dependent packages: >> clang x86_64 19.1.7-1.el9 @appstream 181 k >> clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M >> gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M >> Removing unused dependencies: >> clang-libs x86_64 19.1.7-1.el9 @appstream 413 M >> clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k >> compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M >> gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M >> gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M >> gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M >> libomp x86_64 19.1.7-1.el9 @appstream 1.9 M >> libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M >> >> Transaction Summary >> ================================================================================ >> Remove 12 Packages >> >> Freed space: 763 M >> Is this ok [y/N]: >> <<<<< >> >> So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. 
>>>>>> >> [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran >> <<<< >> >> Now retry build: >>>>> >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check >> >> Running PETSc check examples to verify correct installation >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process >> Completed PETSc check examples >> [balay at frog petsc]$ >> <<<< >> >> Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] >>>>>> >> [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH >> [balay at frog petsc]$ gfortran --version >> GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) >> Copyright (C) 2024 Free Software Foundation, Inc. >> This is free software; see the source for copying conditions. There is NO >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. >> >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check >> >> CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 >> ========================================= >> Now to check if the libraries are working do: >> make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check >> ========================================= >> Running PETSc check examples to verify correct installation >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process >> Completed PETSc check examples >> [balay at frog petsc]$ >> <<<< >> >> So that worked! >> >> Satish >> >> >> On Thu, 20 Feb 2025, Satish Balay wrote: >> >>> Actually, simpler: >>> >>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check >>> >>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld >>> >>> Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. >>> >>> Satish >>> >>> On Thu, 20 Feb 2025, Satish Balay wrote: >>> >>>> >>>> Any particular reason to use these flags? What clang version? OS? 
>>>> >>>> Best if you can send build logs [perhaps to petsc-maint] >>>> >>>> Can you try a simpler build and see if it works: >>>> >>>> ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check >>>> or: >>>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check >>>> >>>> Satish >>>> >>>> On Thu, 20 Feb 2025, Michael Schaferkotter wrote: >>>> >>>>> build petsc-3.20.3 with llvm, clang, clang++, gfortran >>>>> >>>>> CFLAGS='-std=c++11' >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' >>>>> LDLIBS += -lstdc++ >>>>> >>>>> $PETSC_ARCH arch-linux-c-opt >>>>> MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 >>>>> MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc >>>>> CLANG = clang >>>>> FC = gfortran >>>>> >>>>> >>>>> Petsc libraries are built; >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* >>>>> >>>>> >>>>> The configure is this: >>>>> cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ >>>>> --with-cc=clang \ >>>>> --with-cxx=clang++ \ >>>>> --with-fc=gfortran \ >>>>> --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ >>>>> --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ >>>>> --download-sowing \ >>>>> --with-debugging=$(PETSC_DBG) \ >>>>> --with-shared-libraries=1 \ >>>>> CFLAGS='-std=c11' \ >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ >>>>> CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ >>>>> LDFLAGS='-L$(LLVM_LIB)' \ >>>>> LIBS='-lstdc++? 
\ >>>>> --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) >>>>> >>>>> >>>>> Here is the make: >>>>> >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all >>>>> >>>>> >>>>> Check-petsc is: >>>>> >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test >>>>> >>>>> Here is the log file for test: >>>>> >>>>> make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' >>>>> /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" >>>>> Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 >>>>> CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o >>>>> CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' >>>>> clang: error: linker command failed with exit code 1 (use -v to see invocation) >>>>> make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) >>>>> >>>>> >>>>> There are many errors of the ilk: >>>>> >>>>> std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() >>>>> >>>>> [lib]$ nm -A libpetsc.so | grep basic_ostringstream >>>>> libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 >>>>> >>>>> >>>>> I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. >>>>> >>>>> Clearly something is amiss. >>>>> >>>>> Any ideas appreciated. >>>>> >>>>> Michael >>>>> >>>>> >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From balay.anl at fastmail.org Fri Feb 21 09:38:57 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Fri, 21 Feb 2025 09:38:57 -0600 (CST) Subject: [petsc-users] Info; build petsc-3.20.2 with llvm check fails In-Reply-To: <03DE8DD9-B858-4D21-A2A8-CBC3BC1BE9C4@bellsouth.net> References: <288707F5-8A3C-4973-AE12-99F370C39C5A.ref@bellsouth.net> <288707F5-8A3C-4973-AE12-99F370C39C5A@bellsouth.net> <209ececc-d6bb-734a-e928-e12b73b2dc15@fastmail.org> <618386bb-0919-0fee-656e-b89ba7eaa08e@fastmail.org> <03DE8DD9-B858-4D21-A2A8-CBC3BC1BE9C4@bellsouth.net> Message-ID: <5d274e31-ee10-cbeb-93e2-1bdbf8decc74@fastmail.org> I'm glad you now have a working build! BTW: Since you have flang installed - a build with it - i.e. 
with clang/clang++/flang might also work [instead of clang/clang++/gfortran] However flang usage is still nascent (likely some fortran examples fail with it) - if you are using PETSc from fortran - a build with gfortran is the preferred option Satish On Fri, 21 Feb 2025, Michael Schaferkotter wrote: > Satish; > > Thank you for the masterful demonstration. > > One of the alternatives caught my eye: ?with-cxx=0 (I remember I had to do that ages ago on my macOS Darwin machine. > > I cleared *FLAGS and successfully completed make && make check as suggested; > > #======================================================================= > It may be moot now. Here is the requested OS, compiler information: > > cat /etc/os-release > > NAME="Red Hat Enterprise Linux" > VERSION="8.8 (Ootpa)" > ID="rhel" > ID_LIKE="fedora" > VERSION_ID="8.8" > PLATFORM_ID="platform:el8" > PRETTY_NAME="Red Hat Enterprise Linux 8.8 (Ootpa)" > ANSI_COLOR="0;31" > CPE_NAME="cpe:/o:redhat:enterprise_linux:8::baseos" > HOME_URL="https://urldefense.us/v3/__https://www.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpV-0tjnQ$ " > DOCUMENTATION_URL="https://urldefense.us/v3/__https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5Roh1jkm1w$ " > BUG_REPORT_URL="https://urldefense.us/v3/__https://bugzilla.redhat.com/__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RpG5qdUaQ$ " > > REDHAT_BUGZILLA_PRODUCT="Red Hat Enterprise Linux 8" > REDHAT_BUGZILLA_PRODUCT_VERSION=8.8 > REDHAT_SUPPORT_PRODUCT="Red Hat Enterprise Linux" > REDHAT_SUPPORT_PRODUCT_VERSION="8.8" > > clang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) > Target: x86_64-unknown-linux-gnu > Thread model: posix > InstalledDir: /tmp/build_release/bin > Clang version: 20.0.0git > LLVM version: LLVM version 20.0.0git (48d0ef1a07993139e1acf65910704255443103a5 > C++ standard: > Host target: x86_64-unknown-linux-gnu > Supported targets: > Registered Targets: > x86 - 32-bit X86: Pentium-Pro and above > x86-64 - 64-bit X86: EM64T and AMD64 > > flang version 20.0.0git (https://urldefense.us/v3/__https://github.com/llvm/llvm-project.git__;!!G_uCfscf7eWS!Z9OsinaKusqhhdPuDCNknHJq6f6UGZt17SofPYc-BvWQvrlqpeDbEEucEHNxioN04anLOPsjW0v_aCHG5RoxqM2a5g$ 48d0ef1a07993139e1acf65910704255443103a5) > Target: x86_64-unknown-linux-gnu > Thread model: posix > InstalledDir: /tmp/build_release/bin > Flang version: 20.0.0git > Host target: x86_64-unknown-linux-gnu > > > Fortran compiler: GNU Fortran (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4) > Compiler path: /usr/bin/gfortran > Version: > -std= Assume that the input sources are for . > Default flags: No default flags information available > Target: x86_64-redhat-linux > > #======================================================================= > > Thank you again. 
> > > On Feb 20, 2025, at 2:44 PM, Satish Balay wrote: > > > > A couple of alternates (if mixing compiler versions can't be avoided): > > > > - don't need to use petsc from fortran: > > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=0 --with-mpi=0 --download-f2cblaslapack && make && make check > > > > - don't use c++: > > [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=0 --with-fc=gfortran --with-mpi=0 && make && make check > > > > - add in v14 -lstdc++ location ahead in the search path - so that even when -lgfortran is found in v11, v14 -lstdc++ gets picked up correctly. > > [balay at frog petsc]$ ./configure LDFLAGS=-L/opt/rh/gcc-toolset-14/root/usr/lib/gcc/x86_64-redhat-linux/14/ --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > > > > Satish > > > > On Thu, 20 Feb 2025, Satish Balay wrote: > > > >> Ok - I see this issue on CentOS [Stream/9]. > >> > >> What I have is: > >>>>> > >> [balay at frog petsc]$ clang --version > >> clang version 19.1.7 (CentOS 19.1.7-1.el9) > >> Target: x86_64-redhat-linux-gnu > >> Thread model: posix > >> InstalledDir: /usr/bin > >> Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg > >> [balay at frog petsc]$ gfortran --version > >> GNU Fortran (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) > >> Copyright (C) 2021 Free Software Foundation, Inc. > >> This is free software; see the source for copying conditions. There is NO > >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > >> <<<< > >> > >> Now I build: > >>>>> > >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > >> > >> ********************************************************************************* > >> clang -fPIC -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -Wall -Wwrite-strings -Wno-unknown-pragmas -Wconversion -Wno-sign-conversion -Wno-float-conversion -Wno-implicit-float-conversion -fstack-protector -Qunused-arguments -fvisibility=hidden -g3 -O0 -I/home/balay/petsc/include -I/home/balay/petsc/arch-linux-c-debug/include -Wl,-export-dynamic ex19.c -Wl,-rpath,/home/balay/petsc/arch-linux-c-debug/lib -L/home/balay/petsc/arch-linux-c-debug/lib -Wl,-rpath,/usr/lib/gcc/x86_64-redhat-linux/11 -L/usr/lib/gcc/x86_64-redhat-linux/11 -lpetsc -llapack -lblas -lm -lX11 -lgfortran -lm -lgfortran -lm -lgcc_s -lquadmath -lstdc++ -o ex19 > >> /opt/rh/gcc-toolset-14/root//usr/lib/gcc/x86_64-redhat-linux/14/../../../../bin/ld: /home/balay/petsc/arch-linux-c-debug/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_string, std::allocator >::_M_replace_cold(char*, unsigned long, char const*, unsigned long, unsigned long)' > >> > >> <<<< > >> > >> Ok some v11 compiler libraries are getting mixed up (likely from -lgfortran) causing grief. > >>>>>>>>>> > >> [root at frog ~]# yum remove gcc-toolset-14-runtime > >> Dependencies resolved. 
> >> ================================================================================ > >> Package Arch Version Repository Size > >> ================================================================================ > >> Removing: > >> gcc-toolset-14-runtime x86_64 14.0-1.el9 @appstream 11 k > >> Removing dependent packages: > >> clang x86_64 19.1.7-1.el9 @appstream 181 k > >> clang-tools-extra x86_64 19.1.7-1.el9 @appstream 69 M > >> gcc-toolset-14-binutils x86_64 2.41-3.el9 @appstream 27 M > >> Removing unused dependencies: > >> clang-libs x86_64 19.1.7-1.el9 @appstream 413 M > >> clang-resource-filesystem x86_64 19.1.7-1.el9 @appstream 15 k > >> compiler-rt x86_64 19.1.7-1.el9 @appstream 37 M > >> gcc-toolset-14-gcc x86_64 14.2.1-7.1.el9 @appstream 122 M > >> gcc-toolset-14-gcc-c++ x86_64 14.2.1-7.1.el9 @appstream 39 M > >> gcc-toolset-14-libstdc++-devel x86_64 14.2.1-7.1.el9 @appstream 22 M > >> libomp x86_64 19.1.7-1.el9 @appstream 1.9 M > >> libomp-devel x86_64 19.1.7-1.el9 @appstream 31 M > >> > >> Transaction Summary > >> ================================================================================ > >> Remove 12 Packages > >> > >> Freed space: 763 M > >> Is this ok [y/N]: > >> <<<<< > >> > >> So this install of clang depends-on/requires gcc-toolset-14-gcc. Also gfortran-14 is missing. Try installing it. > >>>>>> > >> [root at frog ~]# yum install gcc-toolset-14-gcc-gfortran > >> <<<< > >> > >> Now retry build: > >>>>> > >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > >> > >> Running PETSc check examples to verify correct installation > >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > >> Completed PETSc check examples > >> [balay at frog petsc]$ > >> <<<< > >> > >> Hm - Using gfortran-11 here [with gfortran-14 installed] somehow worked! But perhaps its better to use gfortran-14 [as this install of clang requires g++-14] > >>>>>> > >> [balay at frog petsc]$ export PATH=/opt/rh/gcc-toolset-14/root/usr/bin:$PATH > >> [balay at frog petsc]$ gfortran --version > >> GNU Fortran (GCC) 14.2.1 20250110 (Red Hat 14.2.1-7) > >> Copyright (C) 2024 Free Software Foundation, Inc. > >> This is free software; see the source for copying conditions. There is NO > >> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. > >> > >> [balay at frog petsc]$ ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 && make && make check > >> > >> CLINKER arch-linux-c-debug/lib/libpetsc.so.3.22.3 > >> ========================================= > >> Now to check if the libraries are working do: > >> make PETSC_DIR=/home/balay/petsc PETSC_ARCH=arch-linux-c-debug check > >> ========================================= > >> Running PETSc check examples to verify correct installation > >> Using PETSC_DIR=/home/balay/petsc and PETSC_ARCH=arch-linux-c-debug > >> C/C++ example src/snes/tutorials/ex19 run successfully with 1 MPI process > >> Fortran example src/snes/tutorials/ex5f run successfully with 1 MPI process > >> Completed PETSc check examples > >> [balay at frog petsc]$ > >> <<<< > >> > >> So that worked! 
> >> > >> Satish > >> > >> > >> On Thu, 20 Feb 2025, Satish Balay wrote: > >> > >>> Actually, simpler: > >>> > >>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --with-mpi=0 --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > >>> > >>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld > >>> > >>> Hm - there was in issue with some (clang versions?) incompatibilities with gcc-12 - I think using gcc-11 (system default in that use case) worked. I'm not sure if you are seeing the same issue here. > >>> > >>> Satish > >>> > >>> On Thu, 20 Feb 2025, Satish Balay wrote: > >>> > >>>> > >>>> Any particular reason to use these flags? What clang version? OS? > >>>> > >>>> Best if you can send build logs [perhaps to petsc-maint] > >>>> > >>>> Can you try a simpler build and see if it works: > >>>> > >>>> ./configure --with-mpi-dir=/PATH_TO/models/src/v2021.03-2.0.3-llvm --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > >>>> or: > >>>> ./configure --with-cc=clang --with-cxx=clang++ --with-fc=gfortran --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" && make && make check > >>>> > >>>> Satish > >>>> > >>>> On Thu, 20 Feb 2025, Michael Schaferkotter wrote: > >>>> > >>>>> build petsc-3.20.3 with llvm, clang, clang++, gfortran > >>>>> > >>>>> CFLAGS='-std=c++11' > >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' > >>>>> LDLIBS += -lstdc++ > >>>>> > >>>>> $PETSC_ARCH arch-linux-c-opt > >>>>> MPIF90 = ./models/src/v2021.03-2.0.3-llvm/bin/mpif90 > >>>>> MPICC = ./models/src/v2021.03-2.0.3-llvm/bin/mpicc > >>>>> CLANG = clang > >>>>> FC = gfortran > >>>>> > >>>>> > >>>>> Petsc libraries are built; > >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so@ > >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020@ > >>>>> /models/src/v2021.03-2.0.3-llvm/lib/libpetsc.so.3.020.3* > >>>>> > >>>>> > >>>>> The configure is this: > >>>>> cd $(PETSC_SRC) && unset CXX CC FC F77 && $(PYTHON2) ./configure --prefix=$(PREFIX) \ > >>>>> --with-cc=clang \ > >>>>> --with-cxx=clang++ \ > >>>>> --with-fc=gfortran \ > >>>>> --download-mpich="$(DIR_SRC)/mpich-$(MPICH_VERSION).tar.gz" \ > >>>>> --download-fblaslapack="$(DIR_SRC)/fblaslapack-$(FBLASLAPACK_VERSION).tar.gz" \ > >>>>> --download-sowing \ > >>>>> --with-debugging=$(PETSC_DBG) \ > >>>>> --with-shared-libraries=1 \ > >>>>> CFLAGS='-std=c11' \ > >>>>> CXXFLAGS='-std=c++11 -D_GLIBCXX_USE_CXX11_ABI=1' \ > >>>>> CPPFLAGS='-D_GLIBCXX_USE_CXX11_ABI=1' \ > >>>>> LDFLAGS='-L$(LLVM_LIB)' \ > >>>>> LIBS='-lstdc++? 
\ > >>>>> --COPTFLAGS=$(COPTFLAGS) --CXXOPTFLAGS=$(CXXOPTFLAGS) --FOPTFLAGS=$(FOPTFLAGS) > >>>>> > >>>>> > >>>>> Here is the make: > >>>>> > >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) all > >>>>> > >>>>> > >>>>> Check-petsc is: > >>>>> > >>>>> $(MAKE) -C $(PETSC_SRC) PETSC_DIR=$(PETSC_SRC) PETSC_ARCH=$(PETSC_ARCH) test > >>>>> > >>>>> Here is the log file for test: > >>>>> > >>>>> make[1]: Entering directory '/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3' > >>>>> /usr/bin/python3 /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/config/gmakegentest.py --petsc-dir=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 --petsc-arch=arch-linux-c-opt --testdir=./arch-linux-c-opt/tests --srcdir /models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3/src --pkg-pkgs "sys vec mat dm ksp snes ts tao" > >>>>> Using MAKEFLAGS: iw -- PETSC_ARCH=arch-linux-c-opt PETSC_DIR=/models/src/v2021.03-2.0.3-llvm/build/petsc/petsc-3.20.3 > >>>>> CC arch-linux-c-opt/tests/sys/classes/draw/tests/ex1.o > >>>>> CLINKER arch-linux-c-opt/tests/sys/classes/draw/tests/ex1 > >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream()' > >>>>> /opt/rh/gcc-toolset-12/root/usr/lib/gcc/x86_64-redhat-linux/12/../../../../bin/ld: arch-linux-c-opt/lib/libpetsc.so: undefined reference to `std::__throw_bad_array_new_length()' > >>>>> clang: error: linker command failed with exit code 1 (use -v to see invocation) > >>>>> make[1]: [gmakefile.test:273: arch-linux-c-opt/tests/sys/classes/draw/tests/ex1] Error 1 (ignored) > >>>>> > >>>>> > >>>>> There are many errors of the ilk: > >>>>> > >>>>> std::__cxx11::basic_ostringstream, std::allocator >::basic_ostringstream() > >>>>> > >>>>> [lib]$ nm -A libpetsc.so | grep basic_ostringstream > >>>>> libpetsc.so: U _ZNKSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEE3strEv at GLIBCXX_3.4.21 > >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEEC1Ev > >>>>> libpetsc.so: U _ZNSt7__cxx1119basic_ostringstreamIcSt11char_traitsIcESaIcEED1Ev at GLIBCXX_3.4.21 > >>>>> > >>>>> > >>>>> I/m new to llvm and this is the first time to compile petsc.3.20.3 with llvm compilers. > >>>>> > >>>>> Clearly something is amiss. > >>>>> > >>>>> Any ideas appreciated. > >>>>> > >>>>> Michael > >>>>> > >>>>> > >>>>> > >>>> > > From eirik.hoydalsvik at sintef.no Mon Feb 24 07:41:09 2025 From: eirik.hoydalsvik at sintef.no (=?Windows-1252?Q?Eirik_Jaccheri_H=F8ydalsvik?=) Date: Mon, 24 Feb 2025 13:41:09 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM Message-ID: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? 
Best regards,

Eirik Høydalsvik
SINTEF ER/NTNU

Error message:

[Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created.
t 0 of 1 with dt = 0.2
0 TS dt 0.2 time 0.
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08
Traceback (most recent call last):
  File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in
    return_dict1d = get_tank_composition_1d(tank_params)
  File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d
    ts.solve(u=x)
  File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve
petsc4py.PETSc.Error: error code 91
[0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072
[0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440
[0] TSStep has failed due to DIVERGED_STEP_REJECTED

Options for solver:

COMM = PETSc.COMM_WORLD
da = PETSc.DMDA().create(
    dim=(N_vertical,),
    dof=3,
    stencil_type=PETSc.DMDA().StencilType.STAR,
    stencil_width=1,
    # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED,
)
x = da.createGlobalVec()
x_old = da.createGlobalVec()
f = da.createGlobalVec()
J = da.createMat()

rho_ref = rho_m[0]  # kg/m3
e_ref = e_m[0]  # J/mol
p_ref = p0  # Pa
x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten())
x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten())

optsDB = PETSc.Options()
optsDB["snes_lag_preconditioner_persists"] = False
optsDB["snes_lag_jacobian"] = 1
optsDB["snes_lag_jacobian_persists"] = False
optsDB["snes_lag_preconditioner"] = 1
optsDB["ksp_type"] = "gmres"  # "gmres" # gmres"
optsDB["pc_type"] = "ilu"  # "lu" # "ilu"
optsDB["snes_type"] = "newtonls"
optsDB["ksp_rtol"] = 1e-7
optsDB["ksp_atol"] = 1e-7
optsDB["ksp_max_it"] = 100
optsDB["snes_rtol"] = 1e-5
optsDB["snes_atol"] = 1e-5
optsDB["snes_stol"] = 1e-5
optsDB["snes_max_it"] = 100
optsDB["snes_mf"] = False
optsDB["ts_max_time"] = t_end
optsDB["ts_type"] = "beuler"  # "bdf" #
optsDB["ts_max_snes_failures"] = -1
optsDB["ts_monitor"] = ""
optsDB["ts_adapt_monitor"] = ""
# optsDB["snes_monitor"] = ""
# optsDB["ksp_monitor"] = ""
optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + 
self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 24 07:53:41 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Feb 2025 08:53:41 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users < petsc-users at mcs.anl.gov> wrote: > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > Yes, it speeds up the computation of the FD Jacobian. 
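To make the stencil point concrete, here is a minimal, generic sketch -- not your model; the grid size N, the name ifunction, and the toy right-hand side are made up purely for illustration -- of an IFunction on a 1D DMDA with dof=3 and stencil width 1, where every residual entry at cell i reads only x[i-1], x[i], and x[i+1]:

    # Hypothetical sketch: only the DMDA/ghost-update pattern matters here.
    from petsc4py import PETSc

    N = 100                                    # illustrative grid size
    da = PETSc.DMDA().create(dim=(N,), dof=3, stencil_width=1)
    xl = da.createLocalVec()                   # scratch vector holding ghost cells

    def ifunction(ts, t, X, Xdot, F):
        da.globalToLocal(X, xl)                # fill ghost cells of the local vector
        x = da.getVecArray(xl)                 # ghosted view of the state
        xdot = da.getVecArray(Xdot)            # time derivative (owned cells only)
        f = da.getVecArray(F)                  # residual (owned cells only)
        (i0, i1), = da.getRanges()             # locally owned index range
        for i in range(i0, i1):
            left = x[i - 1] if i > 0 else x[i]        # clamp at the physical boundary
            right = x[i + 1] if i < N - 1 else x[i]
            # F[i] touches only x[i-1], x[i], x[i+1], matching the declared stencil
            f[i] = xdot[i] + 2.0 * x[i] - left - right

If some F[i] also depends on values far from cell i (for example a boundary quantity computed from the state at the other end of the domain), the coloring misses those couplings, so the colored FD Jacobian differs from the brute-force one even though the residual itself is unchanged.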
Thanks, Matt > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > 
optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > 
rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] > > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # 
J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? > > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZxDCDT2_bjnnDIclXM4ZGttKBfwEZhFamWy_uuk1tJpgQYOZv6UzsOeafuUZ_Zln7sTsEo0uKwot1T29bxWQ$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eirik.hoydalsvik at sintef.no Mon Feb 24 07:56:20 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Mon, 24 Feb 2025 13:56:20 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: 1. Thank you for the quick answer, I think this sounds reasonable? 
Is there any way to compare the brute-force jacobian to the one computed using the coloring information? From: Matthew Knepley Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. 
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # 
ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], 
self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!eAK-pHiCaIieRkr0rA56mmobvaBQrGUzDnAK0WEafTKv_sK1uhYaYWFHIfy9HHlLuLFD0KExd3ovrR0DF4AFc6XURpvLDDW3jjA$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Mon Feb 24 08:00:28 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Feb 2025 09:00:28 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? > > The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. 
It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = 
PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N 
+ 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] > > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, 
self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? > > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > 
> > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dcoz7MDn7GnxkOOW8KrkFC-3TAKZKmVlbtBSOJqC2xpb8AuzPeBKTeVUS-nWxQhzkrKp4wQF9njzok2yPpno$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dcoz7MDn7GnxkOOW8KrkFC-3TAKZKmVlbtBSOJqC2xpb8AuzPeBKTeVUS-nWxQhzkrKp4wQF9njzok2yPpno$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From eirik.hoydalsvik at sintef.no Mon Feb 24 07:35:01 2025 From: eirik.hoydalsvik at sintef.no (=?Windows-1252?Q?Eirik_Jaccheri_H=F8ydalsvik?=) Date: Mon, 24 Feb 2025 13:35:01 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM Message-ID: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. 
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # 
ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], 
self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
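# Note (descriptive comment, inferred from the code below): lz is the axial
# length of one grid cell; when tank_params["heat_transfer"] is enabled,
# T_wall and Q_wall are re-solved implicitly once per new TS time, and
# Q_wall later enters the energy residual f[:, 2] as a source term.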
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Feb 24 10:07:27 2025 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 24 Feb 2025 11:07:27 -0500 Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN Message-ID: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> Hello, We would like to make the libraries generated from PETSc compilation and installation more easily relocatable. Currently, we work around this limitation by using LD_LIBRARY_PATH in the environment and manually modifying the RPATH recorded in the libraries to remove it. Recently, we discovered an interesting approach: during the build process, we can set a relative RPATH using the $ORIGIN variable, which corresponds to the directory containing the library. This allows libpetsc.so dependencies to be referenced relatively instead of absolutely, making the library "movable" without requiring LD_LIBRARY_PATH modifications. We can also apply the same approach to our binaries. To avoid manually modifying the libraries after their creation, we were wondering if there is a way to configure PETSc to use a relative RPATH with $ORIGIN directly? I looked through PETSc's configuration options and files but couldn't find anything mentioning $ORIGIN, and very little related to RPATH. A SPACK newbie question: is this achievable with SPACK? Thanks in advance for your help! Eric -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? 
Laval (418) 656-2131 poste 41 22 42 From balay.anl at fastmail.org Mon Feb 24 10:57:42 2025 From: balay.anl at fastmail.org (Satish Balay) Date: Mon, 24 Feb 2025 10:57:42 -0600 (CST) Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN In-Reply-To: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> References: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> Message-ID: I see you are referring to https://urldefense.us/v3/__https://www.baeldung.com/linux/rpath-change-in-binary__;!!G_uCfscf7eWS!f7FgqvyBSVkObSDsRA2uRpI9SqxQm_45esiwkTgdj0eKkrGQn5C-j-bU-DR-T7stFHid-jUbqwu9_gXjsiU_4KPp7pg$ I suspect its easier to do this by "manually modifying the libraries after their creation" than fix up build tools to support it. configure accepts 'LIBS' option that can potentially be used - but I suspect the '$ORIGIN' might not survive different layers of shell escapes that might occur. ./configure LIBS=-Wl,-rpath,'$ORIGIN'/foo1 Satish On Mon, 24 Feb 2025, Eric Chamberland via petsc-users wrote: > Hello, > > We would like to make the libraries generated from PETSc compilation and > installation more easily relocatable. Currently, we work around this > limitation by using LD_LIBRARY_PATH in the environment and manually modifying > the RPATH recorded in the libraries to remove it. > > Recently, we discovered an interesting approach: during the build process, we > can set a relative RPATH using the $ORIGIN variable, which corresponds to the > directory containing the library. This allows libpetsc.so dependencies to be > referenced relatively instead of absolutely, making the library "movable" > without requiring LD_LIBRARY_PATH modifications. We can also apply the same > approach to our binaries. > > To avoid manually modifying the libraries after their creation, we were > wondering if there is a way to configure PETSc to use a relative RPATH with > $ORIGIN directly? > > I looked through PETSc's configuration options and files but couldn't find > anything mentioning $ORIGIN, and very little related to RPATH. > > A SPACK newbie question: is this achievable with SPACK? > > Thanks in advance for your help! > > Eric > > From jed at jedbrown.org Mon Feb 24 11:13:48 2025 From: jed at jedbrown.org (Jed Brown) Date: Mon, 24 Feb 2025 10:13:48 -0700 Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN In-Reply-To: References: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> Message-ID: <87y0xvgt9v.fsf@jedbrown.org> I think fixing up the build tools would be more reliable. We can't change paths for libraries that are not installed in the same directory as libpetsc.so. What if we just substituted (perhaps only in the Makefile) $ORIGIN for the $PETSC_DIR/$PETSC_ARCH/lib prefix? Satish Balay writes: > I see you are referring to https://urldefense.us/v3/__https://www.baeldung.com/linux/rpath-change-in-binary__;!!G_uCfscf7eWS!f7FgqvyBSVkObSDsRA2uRpI9SqxQm_45esiwkTgdj0eKkrGQn5C-j-bU-DR-T7stFHid-jUbqwu9_gXjsiU_4KPp7pg$ > > I suspect its easier to do this by "manually modifying the libraries after their creation" than fix up build tools to support it. > > configure accepts 'LIBS' option that can potentially be used - but I suspect the '$ORIGIN' might not survive different layers > of shell escapes that might occur. 
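For reference, the escaping concern is about keeping the literal string $ORIGIN intact all the way down to the linker. A rough sketch of the quoting that is usually needed (whether it survives configure's own layers is exactly the open question here):

    # shell: single quotes keep $ORIGIN from being expanded
    cc -o app app.o -L./lib -lfoo -Wl,-rpath,'$ORIGIN/../lib'
    # make: the dollar must be doubled so make passes a literal $ to the shell
    LDFLAGS += -Wl,-rpath,'$$ORIGIN/../lib'
    # check what was actually recorded in the binary
    readelf -d app | grep -iE 'rpath|runpath'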
> > ./configure LIBS=-Wl,-rpath,'$ORIGIN'/foo1 > > Satish > > On Mon, 24 Feb 2025, Eric Chamberland via petsc-users wrote: > >> Hello, >> >> We would like to make the libraries generated from PETSc compilation and >> installation more easily relocatable. Currently, we work around this >> limitation by using LD_LIBRARY_PATH in the environment and manually modifying >> the RPATH recorded in the libraries to remove it. >> >> Recently, we discovered an interesting approach: during the build process, we >> can set a relative RPATH using the $ORIGIN variable, which corresponds to the >> directory containing the library. This allows libpetsc.so dependencies to be >> referenced relatively instead of absolutely, making the library "movable" >> without requiring LD_LIBRARY_PATH modifications. We can also apply the same >> approach to our binaries. >> >> To avoid manually modifying the libraries after their creation, we were >> wondering if there is a way to configure PETSc to use a relative RPATH with >> $ORIGIN directly? >> >> I looked through PETSc's configuration options and files but couldn't find >> anything mentioning $ORIGIN, and very little related to RPATH. >> >> A SPACK newbie question: is this achievable with SPACK? >> >> Thanks in advance for your help! >> >> Eric >> >> From aduarteg at utexas.edu Mon Feb 24 14:09:49 2025 From: aduarteg at utexas.edu (Alfredo J Duarte Gomez) Date: Mon, 24 Feb 2025 14:09:49 -0600 Subject: [petsc-users] Interpreting flamegraph and profiling logs Message-ID: Good afternoon PETSC team, I am doing some profiling on one of my applications (petsc 3.15) using the command options: -log_view :performance.out -log_view :flame.out:ascii_flamegraph. I want to understand the specifics of the gnereated flamegraphs and I am attaching performance.out and flame.out to this email. I am assuming that the flamegraphs represent what is labeled "MPI Messages" in the regular output file (i.e. performance.out) instead of time or is it some other quantity? Since the "unit" of these flamegraphs is not time when I load into speedscope app and quantities most closely resemble the order of magnitude of the messages sent. My application being an unsteady solver, I can understand that in time percentage, the amount of time spent in SNESsolve is almost equal to the amount spent in TSStep (~1.63e02 s, see performance.out). However, that percentage is not represented in the flamegraph (97% vs 82% in flame.out). How would I interpret this difference in percentage? Thanks, Alfredo -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: performance.out Type: application/octet-stream Size: 24069 bytes Desc: not available URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: flame.out Type: application/octet-stream Size: 14255 bytes Desc: not available URL: From knepley at gmail.com Mon Feb 24 14:56:09 2025 From: knepley at gmail.com (Matthew Knepley) Date: Mon, 24 Feb 2025 15:56:09 -0500 Subject: [petsc-users] Interpreting flamegraph and profiling logs In-Reply-To: References: Message-ID: On Mon, Feb 24, 2025 at 3:10?PM Alfredo J Duarte Gomez wrote: > Good afternoon PETSC team, > > I am doing some profiling on one of my applications (petsc 3.15) using the > command options: > -log_view :performance.out > -log_view :flame.out:ascii_flamegraph. 
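As a side note on the output itself: flame.out written by the ascii_flamegraph format is folded-stack text, one line per call path such as "TSStep;SNESSolve;KSPSolve 123456", and the trailing number is a time-based weight rather than an MPI message count. A minimal sketch of listing the heaviest paths offline, assuming that layout and the file name used above:

    # rank the call paths recorded in flame.out by their weight
    rows = []
    with open("flame.out") as fh:
        for line in fh:
            path, _, weight = line.strip().rpartition(" ")
            if path:
                rows.append((float(weight), path))
    for weight, path in sorted(rows, reverse=True)[:10]:
        print(f"{weight:12.0f}  {path}")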
> > I want to understand the specifics of the gnereated flamegraphs and I am > attaching performance.out and flame.out to this email. > > I am assuming that the flamegraphs represent what is labeled "MPI > Messages" in the regular output file (i.e. performance.out) instead of time > or is it some other quantity? Since the "unit" of these flamegraphs is not > time when I load into speedscope app and quantities most closely resemble > the order of magnitude of the messages sent. > No. The flame graphs represent time. I forget what the normalization, but I think you will see that the ratios match the ratios in the performance.out file. > My application being an unsteady solver, I can understand that in time > percentage, the amount of time spent in SNESsolve is almost equal to the > amount spent in TSStep (~1.63e02 s, see performance.out). However, that > percentage is not represented in the flamegraph (97% vs 82% in flame.out). > How would I interpret this difference in percentage? > SNESSolve is nested in TSSolve, as you see in the flamegraph. Do you have multiple SNESolves? (I cannot look at the flamegraph right now) Thanks, Matt > Thanks, > > Alfredo > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!c_-aeuJAFXYfHQTColMAQdUWfWjw1ddAzBpk8EeqF_v9bfTtRQ8PKjSF8mYiYDWi3GkkWQa9991vePUQPPTm$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From Eric.Chamberland at giref.ulaval.ca Mon Feb 24 16:54:39 2025 From: Eric.Chamberland at giref.ulaval.ca (Eric Chamberland) Date: Mon, 24 Feb 2025 17:54:39 -0500 Subject: [petsc-users] Configuring PETSc to use a relative RPATH with $ORIGIN In-Reply-To: <87y0xvgt9v.fsf@jedbrown.org> References: <501b9850-68ab-4740-9d11-58249bb45873@giref.ulaval.ca> <87y0xvgt9v.fsf@jedbrown.org> Message-ID: <0e3764f5-872c-4f4e-b3e7-b922302570ca@giref.ulaval.ca> Hi Jed, Satish, Thank you both for your insights. I think Jed is right -- it seems this would require changes in PETSc's build tools and scripts. On our side, the CMake modification we made was along these lines: SET(CMAKE_SKIP_BUILD_RPATH? FALSE) SET(CMAKE_BUILD_WITH_INSTALL_RPATH TRUE) SET(CMAKE_INSTALL_RPATH "$ORIGIN/../lib/${LIB_BUILD_TYPE}") However, it's unclear to me how this approach would work for all of PETSc?s external packages. Can they also be relocated using $ORIGIN, or would they need a different solution? I suspect many libraries might not support this mechanism. In any case, you?ve answered my question -- thank you again for your help! Best, Eric On 2025-02-24 12:13, Jed Brown wrote: > I think fixing up the build tools would be more reliable. We can't change paths for libraries that are not installed in the same directory as libpetsc.so. What if we just substituted (perhaps only in the Makefile) $ORIGIN for the $PETSC_DIR/$PETSC_ARCH/lib prefix? > > Satish Balay writes: > >> I see you are referring to https://urldefense.us/v3/__https://www.baeldung.com/linux/rpath-change-in-binary__;!!G_uCfscf7eWS!f7FgqvyBSVkObSDsRA2uRpI9SqxQm_45esiwkTgdj0eKkrGQn5C-j-bU-DR-T7stFHid-jUbqwu9_gXjsiU_4KPp7pg$ >> >> I suspect its easier to do this by "manually modifying the libraries after their creation" than fix up build tools to support it. 
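A minimal sketch of that post-creation route, assuming patchelf is available; the install prefix and binary name are illustrative only, and since patchelf rewrites the ELF in place it is reasonable to try it on a copy first:

    # record a relative RPATH in the installed library and in an executable
    patchelf --set-rpath '$ORIGIN' $PREFIX/lib/libpetsc.so
    patchelf --set-rpath '$ORIGIN/../lib' $PREFIX/bin/myapp
    # confirm what is now recorded
    patchelf --print-rpath $PREFIX/lib/libpetsc.so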
>> >> configure accepts 'LIBS' option that can potentially be used - but I suspect the '$ORIGIN' might not survive different layers >> of shell escapes that might occur. >> >> ./configure LIBS=-Wl,-rpath,'$ORIGIN'/foo1 >> >> Satish >> >> On Mon, 24 Feb 2025, Eric Chamberland via petsc-users wrote: >> >>> Hello, >>> >>> We would like to make the libraries generated from PETSc compilation and >>> installation more easily relocatable. Currently, we work around this >>> limitation by using LD_LIBRARY_PATH in the environment and manually modifying >>> the RPATH recorded in the libraries to remove it. >>> >>> Recently, we discovered an interesting approach: during the build process, we >>> can set a relative RPATH using the $ORIGIN variable, which corresponds to the >>> directory containing the library. This allows libpetsc.so dependencies to be >>> referenced relatively instead of absolutely, making the library "movable" >>> without requiring LD_LIBRARY_PATH modifications. We can also apply the same >>> approach to our binaries. >>> >>> To avoid manually modifying the libraries after their creation, we were >>> wondering if there is a way to configure PETSc to use a relative RPATH with >>> $ORIGIN directly? >>> >>> I looked through PETSc's configuration options and files but couldn't find >>> anything mentioning $ORIGIN, and very little related to RPATH. >>> >>> A SPACK newbie question: is this achievable with SPACK? >>> >>> Thanks in advance for your help! >>> >>> Eric >>> >>> -- Eric Chamberland, ing., M. Ing Professionnel de recherche GIREF/Universit? Laval (418) 656-2131 poste 41 22 42 From eirik.hoydalsvik at sintef.no Tue Feb 25 02:19:10 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Tue, 25 Feb 2025 08:19:10 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: Thanks again for the quick response, I tried prining the jacobians with -ksp_view_mat as you suggested, with a system of only 3 cells (I am studying at a 1d problem). Printing the jacobian in the first timestep I got the two matrices attached at the end of this email. The jacobians are in general agreement, with some small diviations, like the final element of the matrix being 1.6e-5 in the sparse case and 3.7 In the full case. Questions: 1. Are differences on the order of 1e-5 expected when computing the jacobians in different ways? 2. Do you think these differences can be the cause of my problems? Any suggestions for furtner debugging strategies? Eirik ! sparse jacobian row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, -13.1186) (5, 1.3237e-08) row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, -1846.04) (5, 7.08762e-08) row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, -7023.46) (5, 6.48896e-06) row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, 13.1186) (8, 3.32334e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, 18178.2) (8, 1.61503e-05) ! 
full jacobian 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 From: Matthew Knepley Date: Monday, February 24, 2025 at 15:00 To: Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik > wrote: 1. Thank you for the quick answer, I think this sounds reasonable? Is there any way to compare the brute-force jacobian to the one computed using the coloring information? The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt 1. From: Matthew Knepley > Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. 
When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 
optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in 
enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YHklSzZKWeUyyqPeA2cK5USsTLG_tjjMs0innpBaQeO9qFz_YqnRf-SHAGLf2GZ-Ku35MtOoNkWCBhU44E22_6AQznSlT6sU48M$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YHklSzZKWeUyyqPeA2cK5USsTLG_tjjMs0innpBaQeO9qFz_YqnRf-SHAGLf2GZ-Ku35MtOoNkWCBhU44E22_6AQznSlT6sU48M$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Tue Feb 25 08:26:49 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Feb 2025 09:26:49 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > Thanks again for the quick response, > > I tried prining the jacobians with -ksp_view_mat as you suggested, with a > system of only 3 cells (I am studying at a 1d problem). Printing the > jacobian in the first timestep I got the two matrices attached at the end > of this email. 
The jacobians are in general agreement, with some small > diviations, like the final element of the matrix being 1.6e-5 in the sparse > case and 3.7 In the full case. > We usually expect to see single precision accuracy (1e-7), so this indicates that your condition number is high. If you use LU (-pc_type lu) to solve the linear system, do you get similar results? Thanks, Matt > Questions: > > 1. Are differences on the order of 1e-5 expected when computing the > jacobians in different ways? > > 2. Do you think these differences can be the cause of my problems? Any > suggestions for furtner debugging strategies? > > Eirik > > ! sparse jacobian > > row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, > -13.1186) (5, 1.3237e-08) > > row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, > -1846.04) (5, 7.08762e-08) > > row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, > -7023.46) (5, 6.48896e-06) > > row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, > -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, > -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) > > row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, > -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) > > row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, > 13.1186) (8, 3.32334e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, > 18178.2) (8, 1.61503e-05) > > > > ! full jacobian > > 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 > -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 > 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 > > -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 > -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 > 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 > > -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 > 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 > -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > > -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 > 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 > -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 > > -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 > -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 > 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 > -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 > 
5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 > > > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 15:00 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? > > > > The easiest way we have is to print them both out: > > > > -ksp_view_mat > > > > on both runs. We have a way to compare the analytic and FD Jacobians > (-snes_test_jacobian), but > > not two different FDs. > > > > Thanks, > > > > Matt > > > > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. 
> > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = 
residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] 
> > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
> > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dF3HkSM-0mmYDI4gDB-9LOH6k5M9HjOvEIw1XdzlTas2-tKkCBjmhfjgtouzKAhXWqb--ieWjn0Q8bbsceAb$ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dF3HkSM-0mmYDI4gDB-9LOH6k5M9HjOvEIw1XdzlTas2-tKkCBjmhfjgtouzKAhXWqb--ieWjn0Q8bbsceAb$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dF3HkSM-0mmYDI4gDB-9LOH6k5M9HjOvEIw1XdzlTas2-tKkCBjmhfjgtouzKAhXWqb--ieWjn0Q8bbsceAb$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From eirik.hoydalsvik at sintef.no Tue Feb 25 08:51:08 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Tue, 25 Feb 2025 14:51:08 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: Hi, After sending you the email, I rescaled the residual function and got the two jacobians to agree down to e-7. I have tried with ?lu? and ?ilu? as preconditioners, and this does not work. However, I just tried to use ?sor? as a preconditioner, and using sor using the da jacobian works just fine! Why should it work with sor and not with ilu or lu? Eirik Jacobians: row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, -6402.63) (5, 6.19796e-08) row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) (5, 3.63946e-07) row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, -6985.51) (5, 6.19796e-08) row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, 6402.63) (8, -5.80354e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, 18140.2) (8, -5.88806e-08) 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 From: Matthew Knepley Date: Tuesday, February 25, 2025 at 15:27 To: 
Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik > wrote: Thanks again for the quick response, I tried prining the jacobians with -ksp_view_mat as you suggested, with a system of only 3 cells (I am studying at a 1d problem). Printing the jacobian in the first timestep I got the two matrices attached at the end of this email. The jacobians are in general agreement, with some small diviations, like the final element of the matrix being 1.6e-5 in the sparse case and 3.7 In the full case. We usually expect to see single precision accuracy (1e-7), so this indicates that your condition number is high. If you use LU (-pc_type lu) to solve the linear system, do you get similar results? Thanks, Matt Questions: 1. Are differences on the order of 1e-5 expected when computing the jacobians in different ways? 2. Do you think these differences can be the cause of my problems? Any suggestions for furtner debugging strategies? Eirik ! sparse jacobian row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, -13.1186) (5, 1.3237e-08) row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, -1846.04) (5, 7.08762e-08) row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, -7023.46) (5, 6.48896e-06) row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, 13.1186) (8, 3.32334e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, 18178.2) (8, 1.61503e-05) ! 
full jacobian 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 From: Matthew Knepley > Date: Monday, February 24, 2025 at 15:00 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik > wrote: 1. Thank you for the quick answer, I think this sounds reasonable? Is there any way to compare the brute-force jacobian to the one computed using the coloring information? The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt 1. From: Matthew Knepley > Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. 
When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 
optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in 
enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!evJjDgXHf6wRJbFLstjldUYC62cqJenkgdUFFAX5UBv8sQkHtWaP4fKcVADFipZFOIuAjgirRECHzDaLL0lsqa7hkri9PlmBCHA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!evJjDgXHf6wRJbFLstjldUYC62cqJenkgdUFFAX5UBv8sQkHtWaP4fKcVADFipZFOIuAjgirRECHzDaLL0lsqa7hkri9PlmBCHA$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!evJjDgXHf6wRJbFLstjldUYC62cqJenkgdUFFAX5UBv8sQkHtWaP4fKcVADFipZFOIuAjgirRECHzDaLL0lsqa7hkri9PlmBCHA$ -------------- next part -------------- An HTML attachment was scrubbed... 
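For reference, the monitor and convergence-reason flags discussed in this thread can be switched on through the same PETSc.Options() database the script already uses, with no code changes. A minimal sketch (option names assume a recent PETSc release; note the true-residual monitor is spelled -ksp_monitor_true_residual):

from petsc4py import PETSc

opts = PETSc.Options()
opts["snes_monitor"] = ""               # nonlinear residual norm at every Newton step
opts["snes_converged_reason"] = ""      # why SNES stopped
opts["snes_linesearch_monitor"] = ""    # line-search progress
opts["ksp_monitor_true_residual"] = ""  # preconditioned and unpreconditioned residual norms
opts["ksp_converged_reason"] = ""       # why KSP stopped
# the flags take effect when the solver objects call setFromOptions() before ts.solve()
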
URL: From knepley at gmail.com Tue Feb 25 13:48:02 2025 From: knepley at gmail.com (Matthew Knepley) Date: Tue, 25 Feb 2025 14:48:02 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Tue, Feb 25, 2025 at 9:51?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > Hi, > > After sending you the email, I rescaled the residual function and got the > two jacobians to agree down to e-7. > > I have tried with ?lu? and ?ilu? as preconditioners, and this does not > work. However, I just tried to use ?sor? as a preconditioner, and using sor > using the da jacobian works just fine! > > Why should it work with sor and not with ilu or lu? > ILU fails all the time, so that is not surprising. However, I do not understand why SOR would succeed and LU would fail, except that SOR is functioning as a kind of globalization by solving very inexactly. Can you run with -snes_monitor -snes_converged_reason -ksp_monitor_true_solution -ksp_converged_reason -snes_linesearch_monitor and send the output? Thanks, Matt > Eirik > > Jacobians: > row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, > -6402.63) (5, 6.19796e-08) > > row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) > (5, 3.63946e-07) > > row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, > -6985.51) (5, 6.19796e-08) > > row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, > -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, > -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) > > row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, > -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) > > row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, > 6402.63) (8, -5.80354e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) 
> > row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, > 18140.2) (8, -5.88806e-08) > > 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 > -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 > -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 > > -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 > 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 > 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 > > 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 > -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 > -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 > > -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 > 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 > -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 > 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 > > -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 > 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 > -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 > > -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 > -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 > 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 > -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 > 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 > > > > > *From: *Matthew Knepley > *Date: *Tuesday, February 25, 2025 at 15:27 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > Thanks again for the quick response, > > I tried prining the jacobians with -ksp_view_mat as you suggested, with a > system of only 3 cells (I am studying at a 1d problem). Printing the > jacobian in the first timestep I got the two matrices attached at the end > of this email. The jacobians are in general agreement, with some small > diviations, like the final element of the matrix being 1.6e-5 in the sparse > case and 3.7 In the full case. > > > > We usually expect to see single precision accuracy (1e-7), so this > indicates that your condition number is high. > > > > If you use LU (-pc_type lu) to solve the linear system, do you get similar > results? > > > > Thanks, > > > > Matt > > > > Questions: > > 1. Are differences on the order of 1e-5 expected when computing the > jacobians in different ways? > > 2. Do you think these differences can be the cause of my problems? Any > suggestions for furtner debugging strategies? > > Eirik > > ! 
sparse jacobian > > row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, > -13.1186) (5, 1.3237e-08) > > row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, > -1846.04) (5, 7.08762e-08) > > row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, > -7023.46) (5, 6.48896e-06) > > row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, > -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, > -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) > > row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, > -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) > > row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, > 13.1186) (8, 3.32334e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, > 18178.2) (8, 1.61503e-05) > > > > ! full jacobian > > 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 > -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 > 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 > > -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 > -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 > 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 > > -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 > 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 > -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > > -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 > 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 > -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 > > -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 > -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 > 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 > -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 > 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 > > > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 15:00 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? 
> > > > The easiest way we have is to print them both out: > > > > -ksp_view_mat > > > > on both runs. We have a way to compare the analytic and FD Jacobians > (-snes_test_jacobian), but > > not two different FDs. > > > > Thanks, > > > > Matt > > > > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. 
> > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > > x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = 
residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() > > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] 
> > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
> > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. 
> -- Norbert Wiener > > > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ > > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!Z3QqpIz5I2DwEJ5HG6GhoIsIxTOL2irfY5RMgPjrfCc99V9ZTfdxb5k_Tx49NclrnyAR3XQI-OmF1Y9QeBdH$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Tue Feb 25 15:39:03 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Tue, 25 Feb 2025 21:39:03 +0000 Subject: [petsc-users] building kokkos matrices on the device Message-ID: Hi I'm just wondering if there is any possibility of making: MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx publicly accessible outside of petsc, or if there is an interface I have missed for creating Kokkos matrices entirely on the device? MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so I can't link to it. I've currently just copied the code inside of those methods so that I can build without any preallocation on the host (e.g., through the COO interface) and it works really well. Thanks for your help Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Feb 25 16:16:37 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 25 Feb 2025 16:16:37 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Hi, Steven, MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a private data type Mat_SeqAIJKokkos, so it can not be directly made public. If you already use COO, then why not directly make the matrix of type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? So I am confused by your needs. Thanks! --Junchao Zhang On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Hi > > I'm just wondering if there is any possibility of making: > MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in > src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx > MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in > src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx > > publicly accessible outside of petsc, or if there is an interface I have > missed for creating Kokkos matrices entirely on the device? > MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so > I can't link to it. > > I've currently just copied the code inside of those methods so that I can > build without any preallocation on the host (e.g., through the COO > interface) and it works really well. > > Thanks for your help > Steven > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Tue Feb 25 17:35:18 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Tue, 25 Feb 2025 23:35:18 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Thanks for the response! 
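To make the stencil point from this thread concrete: a coloring-based finite-difference Jacobian is only correct when f[i] reads x[j] exclusively for j inside the stencil declared on the DMDA. Below is a minimal, self-contained petsc4py sketch (a hypothetical 1-D diffusion toy problem, not the tank model above) whose IFunction only touches i-1, i, i+1, so ts.setDM() plus the colored FD Jacobian is safe:

from petsc4py import PETSc
import numpy as np

n = 32
da = PETSc.DMDA().create(dim=(n,), dof=1, stencil_width=1)

def ifunction(ts, t, X, Xdot, F):
    da = ts.getDM()
    xl = da.createLocalVec()
    da.globalToLocal(X, xl)          # fill ghost points from neighboring ranks
    x = da.getVecArray(xl)           # global indexing, ghosts included
    xdot = da.getVecArray(Xdot)
    f = da.getVecArray(F)
    (xs, xe), = da.getRanges()
    mx = da.getSizes()[0]
    h = 1.0 / (mx - 1)
    for i in range(xs, xe):
        if i == 0 or i == mx - 1:
            f[i] = x[i]              # Dirichlet rows: u = 0 at both ends
        else:
            # only i-1, i, i+1 are read, matching stencil_width=1,
            # so the coloring sees the true sparsity of the Jacobian
            f[i] = xdot[i] - (x[i - 1] - 2.0 * x[i] + x[i + 1]) / h**2

ts = PETSc.TS().create()
ts.setDM(da)                         # no IJacobian is given, so TS/SNES differences it
ts.setIFunction(ifunction, None)     # using the DMDA coloring
ts.setType("beuler")
ts.setTimeStep(1e-3)
ts.setMaxTime(1e-2)
ts.setFromOptions()

u = da.createGlobalVec()
ua = da.getVecArray(u)
(xs, xe), = da.getRanges()
for i in range(xs, xe):
    ua[i] = np.sin(np.pi * i / (n - 1))   # smooth initial bump
ts.solve(u)

Running the same sketch without ts.setDM(da) falls back to brute-force differencing of the full Jacobian, the slow path mentioned earlier in the thread; if the residual reached beyond the declared stencil, only the brute-force version would be correct.
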
Although MatSetValuesCOO happens on the device if the input coo_v pointer is device memory, I believe MatSetPreallocationCOO requires host pointers for coo_i and coo_j, and the preallocation (and construction of the COO structures) happens on the host and is then copied onto the device? I need to be able to create a matrix object with minimal work on the host (like many of the routines in aijkok.kokkos.cxx do internally). I originally used the COO interface to build the matrices I need, but that was around 5x slower than constructing the aij structures myself on the device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made public is that the Mat_SeqAIJKokkos constructors are already publicly accessible? In particular one of those constructors takes in pointers to the Kokkos dual views which store a,i,j, and hence one can build a sequential matrix with nothing (or very little) occuring on the host. The only change I can see that would be necessary is for MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to be public is to change the PETSC_INTERN to PETSC_EXTERN? For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is required is declaring the method in the .hpp, as it's already defined as static in mpiaijkok.kokkos.cxx. In particular, the comments above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the off-diagonal block B needs to be built with global column ids, with mpiaij->garray constructed on the host along with the rewriting of the global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but checking the code there shows that if you pass in a non-null garray to MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and compatification is skipped, meaning B can be built with local column ids as long as garray is provided on the host (which I also build on the device and then just copy to the host). Again this is what some of the internal Kokkos routines rely on, like the matrix-product. I am happy to try doing this and submitting a request to the petsc gitlab if this seems sensible, I just wanted to double check that I wasn't missing something important? Thanks Steven On Tue, 25 Feb 2025 at 22:16, Junchao Zhang wrote: > Hi, Steven, > MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a > private data type Mat_SeqAIJKokkos, so it can not be directly made public. > If you already use COO, then why not directly make the matrix of type > MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? > So I am confused by your needs. > > Thanks! > --Junchao Zhang > > > On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < > dargaville.steven at gmail.com> wrote: > >> Hi >> >> I'm just wondering if there is any possibility of making: >> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in >> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >> >> publicly accessible outside of petsc, or if there is an interface I have >> missed for creating Kokkos matrices entirely on the device? >> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >> I can't link to it. >> >> I've currently just copied the code inside of those methods so that I can >> build without any preallocation on the host (e.g., through the COO >> interface) and it works really well. 
>> >> Thanks for your help >> Steven >> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Tue Feb 25 21:35:07 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Tue, 25 Feb 2025 21:35:07 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Mat_SeqAIJKokkos is private because it is in a private header src/mat/impls/aij/seq/kokkos/aijkok.hpp Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() might be right. The comment - B - the offdiag matrix using global col ids is out of date. Perhaps it should be "the offdiag matrix uses local column indices and garray contains the local to global mapping". But I need to double check it. Since you use Kokkos, I think we could provide these two constructors for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A) - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, PetscInt *garray, Mat *mat) // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); // A and B are MATSEQAIJKOKKOS matrices and use local column indices Do they meet your needs? --Junchao Zhang On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Thanks for the response! > > Although MatSetValuesCOO happens on the device if the input coo_v pointer > is device memory, I believe MatSetPreallocationCOO requires host pointers > for coo_i and coo_j, and the preallocation (and construction of the COO > structures) happens on the host and is then copied onto the device? I need > to be able to create a matrix object with minimal work on the host (like > many of the routines in aijkok.kokkos.cxx do internally). I originally used > the COO interface to build the matrices I need, but that was around 5x > slower than constructing the aij structures myself on the device and then > just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. > > The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made > public is that the Mat_SeqAIJKokkos constructors are already publicly > accessible? In particular one of those constructors takes in pointers to > the Kokkos dual views which store a,i,j, and hence one can build a > sequential matrix with nothing (or very little) occuring on the host. The > only change I can see that would be necessary is for > MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to > be public is to change the PETSC_INTERN to PETSC_EXTERN? > > For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is > required is declaring the method in the .hpp, as it's already defined as > static in mpiaijkok.kokkos.cxx. In particular, the comments > above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the > off-diagonal block B needs to be built with global column ids, with > mpiaij->garray constructed on the host along with the rewriting of the > global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but > checking the code there shows that if you pass in a non-null garray to > MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and > compatification is skipped, meaning B can be built with local column ids as > long as garray is provided on the host (which I also build on the device > and then just copy to the host). 
Again this is what some of the internal > Kokkos routines rely on, like the matrix-product. > > I am happy to try doing this and submitting a request to the petsc gitlab > if this seems sensible, I just wanted to double check that I wasn't missing > something important? > Thanks > Steven > > On Tue, 25 Feb 2025 at 22:16, Junchao Zhang > wrote: > >> Hi, Steven, >> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a >> private data type Mat_SeqAIJKokkos, so it can not be directly made public. >> If you already use COO, then why not directly make the matrix of type >> MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >> So I am confused by your needs. >> >> Thanks! >> --Junchao Zhang >> >> >> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >>> Hi >>> >>> I'm just wondering if there is any possibility of making: >>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in >>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>> >>> publicly accessible outside of petsc, or if there is an interface I have >>> missed for creating Kokkos matrices entirely on the device? >>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>> I can't link to it. >>> >>> I've currently just copied the code inside of those methods so that I >>> can build without any preallocation on the host (e.g., through the COO >>> interface) and it works really well. >>> >>> Thanks for your help >>> Steven >>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From eirik.hoydalsvik at sintef.no Wed Feb 26 02:21:38 2025 From: eirik.hoydalsvik at sintef.no (=?utf-8?B?RWlyaWsgSmFjY2hlcmkgSMO4eWRhbHN2aWs=?=) Date: Wed, 26 Feb 2025 08:21:38 +0000 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: Hi, Here is the output when I run with ?lu? and the settings you suggested: 0 TS dt 0.1 time 0. 
0 SNES Function norm 2.982668991189e-01 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.000e-01 retrying with dt=2.500e-02 0 SNES Function norm 7.456672477972e-02 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.500e-02 retrying with dt=6.250e-03 0 SNES Function norm 1.864168119493e-02 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 6.250e-03 retrying with dt=1.563e-03 0 SNES Function norm 4.660420298733e-03 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.563e-03 retrying with dt=3.906e-04 0 SNES Function norm 1.165105074683e-03 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.906e-04 retrying with dt=9.766e-05 0 SNES Function norm 2.912762686708e-04 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 9.766e-05 retrying with dt=2.441e-05 0 SNES Function norm 7.281906716770e-05 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.441e-05 retrying with dt=6.104e-06 0 SNES Function norm 1.820476679192e-05 Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 PC failed due to FACTOR_NUMERIC_ZEROPIVOT Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 6.104e-06 retrying with dt=1.526e-06 0 SNES Function norm 4.551191697981e-06 Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 0 TSAdapt none beuler 0: step 0 accepted t=0 + 1.526e-06 dt=1.526e-06 1 TS dt 1.52588e-06 time 1.52588e-06 From: Matthew Knepley Date: Tuesday, February 25, 2025 at 20:48 To: Eirik Jaccheri H?ydalsvik Cc: petsc-users at mcs.anl.gov Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Tue, Feb 25, 2025 at 9:51?AM Eirik Jaccheri H?ydalsvik > wrote: Hi, After sending you the email, I rescaled the residual function and got the two jacobians to agree down to e-7. I have tried with ?lu? and ?ilu? as preconditioners, and this does not work. However, I just tried to use ?sor? as a preconditioner, and using sor using the da jacobian works just fine! Why should it work with sor and not with ilu or lu? 
ILU fails all the time, so that is not surprising. However, I do not understand why SOR would succeed and LU would fail, except that SOR is functioning as a kind of globalization by solving very inexactly. Can you run with -snes_monitor -snes_converged_reason -ksp_monitor_true_solution -ksp_converged_reason -snes_linesearch_monitor and send the output? Thanks, Matt Eirik Jacobians: row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, -6402.63) (5, 6.19796e-08) row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) (5, 3.63946e-07) row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, -6985.51) (5, 6.19796e-08) row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, 6402.63) (8, -5.80354e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, 18140.2) (8, -5.88806e-08) 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 From: Matthew Knepley > Date: Tuesday, February 25, 2025 at 15:27 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik > wrote: Thanks again for the quick response, I tried prining the jacobians with 
-ksp_view_mat as you suggested, with a system of only 3 cells (I am studying at a 1d problem). Printing the jacobian in the first timestep I got the two matrices attached at the end of this email. The jacobians are in general agreement, with some small diviations, like the final element of the matrix being 1.6e-5 in the sparse case and 3.7 In the full case. We usually expect to see single precision accuracy (1e-7), so this indicates that your condition number is high. If you use LU (-pc_type lu) to solve the linear system, do you get similar results? Thanks, Matt Questions: 1. Are differences on the order of 1e-5 expected when computing the jacobians in different ways? 2. Do you think these differences can be the cause of my problems? Any suggestions for furtner debugging strategies? Eirik ! sparse jacobian row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, -13.1186) (5, 1.3237e-08) row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, -1846.04) (5, 7.08762e-08) row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, -7023.46) (5, 6.48896e-06) row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, 13.1186) (8, 3.32334e-08) row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, 18178.2) (8, 1.61503e-05) ! full jacobian 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 -4.9773414819747863e+01 5.1373776293551586e+04 
-1.2673397275364130e+02 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 From: Matthew Knepley > Date: Monday, February 24, 2025 at 15:00 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik > wrote: 1. Thank you for the quick answer, I think this sounds reasonable? Is there any way to compare the brute-force jacobian to the one computed using the coloring information? The easiest way we have is to print them both out: -ksp_view_mat on both runs. We have a way to compare the analytic and FD Jacobians (-snes_test_jacobian), but not two different FDs. Thanks, Matt 1. From: Matthew Knepley > Date: Monday, February 24, 2025 at 14:53 To: Eirik Jaccheri H?ydalsvik > Cc: petsc-users at mcs.anl.gov > Subject: Re: [petsc-users] TS Solver stops working when including ts.setDM On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: Hi, I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to obtain the jacobian for my equations, so I do not provide a jacobian function. The code is given at the end of the email. When I comment out the function call ?ts.setDM(da)?, the code runs and gives reasonable results. However, when I add this line of code, the program crashes with the error message provided at the end of the email. Questions: 1. Do you know why adding this line of code can make the SNES solver diverge? Any suggestions for how to debug the issue? I will not know until I run it, but here is my guess. When the DMDA is specified, PETSc uses coloring to produce the Jacobian. When it is not, it just brute-forces the entire J. My guess is that your residual does not respect the stencil in the DMDA, so the coloring is wrong, making a wrong Jacobian. 2. What is the advantage of adding the DMDA object to the ts solver? Will this speed up the calculation of the finite difference jacobian? Yes, it speeds up the computation of the FD Jacobian. Thanks, Matt Best regards, Eirik H?ydalsvik SINTEF ER/NTNU Error message: [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while determining whether or not /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could be created. t 0 of 1 with dt = 0.2 0 TS dt 0.2 time 0. 
TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 TSAdapt none step 0 stage rejected (SNES reason DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 Traceback (most recent call last): File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in return_dict1d = get_tank_composition_1d(tank_params) File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in get_tank_composition_1d ts.solve(u=x) File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve petsc4py.PETSc.Error: error code 91 [0] TSSolve() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 [0] TSStep() at /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 [0] TSStep has failed due to DIVERGED_STEP_REJECTED Options for solver: COMM = PETSc.COMM_WORLD da = PETSc.DMDA().create( dim=(N_vertical,), dof=3, stencil_type=PETSc.DMDA().StencilType.STAR, stencil_width=1, # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, ) x = da.createGlobalVec() x_old = da.createGlobalVec() f = da.createGlobalVec() J = da.createMat() rho_ref = rho_m[0] # kg/m3 e_ref = e_m[0] # J/mol p_ref = p0 # Pa x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) optsDB = PETSc.Options() optsDB["snes_lag_preconditioner_persists"] = False optsDB["snes_lag_jacobian"] = 1 optsDB["snes_lag_jacobian_persists"] = False optsDB["snes_lag_preconditioner"] = 1 optsDB["ksp_type"] = "gmres" # "gmres" # gmres" optsDB["pc_type"] = "ilu" # "lu" # "ilu" optsDB["snes_type"] = "newtonls" optsDB["ksp_rtol"] = 1e-7 optsDB["ksp_atol"] = 1e-7 optsDB["ksp_max_it"] = 100 optsDB["snes_rtol"] = 1e-5 optsDB["snes_atol"] = 1e-5 optsDB["snes_stol"] = 1e-5 optsDB["snes_max_it"] = 100 optsDB["snes_mf"] = False optsDB["ts_max_time"] = t_end optsDB["ts_type"] = "beuler" # "bdf" # optsDB["ts_max_snes_failures"] = -1 optsDB["ts_monitor"] = "" optsDB["ts_adapt_monitor"] = "" # optsDB["snes_monitor"] = "" # optsDB["ksp_monitor"] = "" optsDB["ts_atol"] = 1e-4 x0 = x_old residual_wrap = residual_ts( eos, x0, N_vertical, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_L, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ) # optsDB["ts_adapt_type"] = "none" ts = PETSc.TS().create(comm=COMM) # TODO: Figure out why DM crashes the code # 
ts.setDM(residual_wrap.da) ts.setIFunction(residual_wrap.residual_ts, None) ts.setTimeStep(dt) ts.setMaxSteps(-1) ts.setTime(t_start) # s ts.setMaxTime(t_end) # s ts.setMaxSteps(1e5) ts.setStepLimits(1e-3, 1e5) ts.setFromOptions() ts.solve(u=x) Residual function: class residual_ts: def __init__( self, eos, x0, N, g, pos, z, mw, dt, dx, p_amb, A_nozzle, r_tank_inner, mph_uv_flsh_l, rho_ref, e_ref, p_ref, closed_tank, J, f, da, drift_func, T_wall, tank_params, ): self.eos = eos self.x0 = x0 self.N = N self.g = g self.pos = pos self.z = z self.mw = mw self.dt = dt self.dx = dx self.p_amb = p_amb self.A_nozzle = A_nozzle self.r_tank_inner = r_tank_inner self.mph_uv_flsh_L = mph_uv_flsh_l self.rho_ref = rho_ref self.e_ref = e_ref self.p_ref = p_ref self.closed_tank = closed_tank self.J = J self.f = f self.da = da self.drift_func = drift_func self.T_wall = T_wall self.tank_params = tank_params self.Q_wall = np.zeros(N) self.n_iter = 0 self.t_current = [0.0] self.s_top = [0.0] self.p_choke = [0.0] # setting interp func # TODO figure out how to generalize this method self._interp_func = _jalla_upwind # allocate space for new params self.p = np.zeros(N) # Pa self.T = np.zeros(N) # K self.alpha = np.zeros((2, N)) self.rho = np.zeros((2, N)) self.e = np.zeros((2, N)) # allocate space for ghost cells self.alpha_ghost = np.zeros((2, N + 2)) self.rho_ghost = np.zeros((2, N + 2)) self.rho_m_ghost = np.zeros(N + 2) self.u_m_ghost = np.zeros(N + 1) self.u_ghost = np.zeros((2, N + 1)) self.e_ghost = np.zeros((2, N + 2)) self.pos_ghost = np.zeros(N + 2) self.h_ghost = np.zeros((2, N + 2)) # allocate soace for local X and Xdot self.X_LOCAL = da.createLocalVec() self.XDOT_LOCAL = da.createLocalVec() def residual_ts(self, ts, t, X, XDOT, F): self.n_iter += 1 # TODO: Estimate time use """ Caculate residuals for equations (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 P_x = - g \rho """ n_phase = 2 self.da.globalToLocal(X, self.X_LOCAL) self.da.globalToLocal(XDOT, self.XDOT_LOCAL) x = self.da.getVecArray(self.X_LOCAL) xdot = self.da.getVecArray(self.XDOT_LOCAL) f = self.da.getVecArray(F) T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa rho_m = x[:, 0] * self.rho_ref # kg/m3 e_m = x[:, 1] * self.e_ref # J/mol u_m = x[:-1, 2] # m/s # derivatives rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 dt = ts.getTimeStep() # s for i in range(self.N): # get new parameters self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] ) betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # mol/mol beta = [betaL, betaV] if betaS != 0.0: print("there is a solid phase which is not accounted for") self.T[i], self.p[i] = _get_tank_temperature_pressure( self.mph_uv_flsh_L[i] ) # K, Pa) for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): # new parameters self.rho_ghost[:, 1:-1][j][i] = ( self.mw / self.eos.specific_volume(self.T[i], self.p[i], self.z, phase)[0] ) # kg/m3 self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], self.z, phase )[ 0 ] # J/mol self.h_ghost[:, 1:-1][j][i] = ( self.e_ghost[:, 1:-1][j][i] + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] ) # J/mol self.alpha_ghost[:, 1:-1][j][i] = ( beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] ) # m3/m3 # calculate drift velocity for i in range(self.N - 1): self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( calc_drift_velocity( u_m[i], self._interp_func( self.rho_ghost[:, 1:-1][0][i], 
self.rho_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.rho_ghost[:, 1:-1][1][i], self.rho_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.g, self._interp_func(self.T[i], self.T[i + 1], u_m[i]), T_c, self.r_tank_inner, self._interp_func( self.alpha_ghost[:, 1:-1][0][i], self.alpha_ghost[:, 1:-1][0][i + 1], u_m[i], ), self._interp_func( self.alpha_ghost[:, 1:-1][1][i], self.alpha_ghost[:, 1:-1][1][i + 1], u_m[i], ), self.drift_func, ) ) # liq m / s , vapour m / s u_bottom = 0 if self.closed_tank: u_top = 0.0 # m/s else: # calc phase to skip env_isentrope_cross if ( self.mph_uv_flsh_L[-1].liquid != None and self.mph_uv_flsh_L[-1].vapour == None and self.mph_uv_flsh_L[-1].solid == None ): phase_env = self.eos.LIQPH else: phase_env = self.eos.TWOPH self.h_m = e_m + self.p * self.mw / rho_m # J / mol self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) # J / mol / K mdot, self.p_choke[0] = calc_mass_outflow( self.eos, self.z, self.h_m[-1], self.s_top[0], self.p[-1], self.p_amb, self.A_nozzle, self.mw, phase_env, debug_plot=False, ) # mol / s , Pa u_top = -mdot * self.mw / rho_m[-1] / (np.pi * self.r_tank_inner**2) # m/s # assemble vectors with ghost cells self.alpha_ghost[:, 0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 self.rho_m_ghost[0] = rho_m[0] # kg/m3 self.rho_m_ghost[1:-1] = rho_m # kg/m3 self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 # u_ghost[:, 1:-1] = u # m/s self.u_ghost[:, 0] = u_bottom # m/s self.u_ghost[:, -1] = u_top # m/s self.u_m_ghost[0] = u_bottom # m/s self.u_m_ghost[1:-1] = u_m # m/s self.u_m_ghost[-1] = u_top # m/s self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol self.pos_ghost[1:-1] = self.pos # m self.pos_ghost[0] = self.pos[0] # m self.pos_ghost[-1] = self.pos[-1] # m self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol # recalculate wall temperature and heat flux # TODO ARE WE DOING THE STAGGERING CORRECTLY? 
lz = self.tank_params["lz_tank"] / self.N # m if ts.getTime() != self.t_current[0] and self.tank_params["heat_transfer"]: self.t_current[0] = ts.getTime() for i in range(self.N): self.T_wall[i], self.Q_wall[i], h_ht = ( solve_radial_heat_conduction_implicit( self.tank_params, self.T[i], self.T_wall[i], (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, self.rho_m_ghost[i + 1], self.mph_uv_flsh_L[i], lz, dt, ) ) # K, J/s, W/m2K # Calculate residuals f[:, :] = 0.0 f[:, 0] = dt * rho_m_dot # kg/m3 f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * rho_m[0:-1] # Pa/m f[:, 2] = ( dt * ( rho_m_dot * (e_m / self.mw + self.g * self.pos) + rho_m * e_m_dot / self.mw ) - rho_m_dot * e_m_dot / self.mw * dt**2 - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt ) # J / m3 # add contribution from space for i in range(n_phase): e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s for j in range(1, self.N + 1): if self.u_ghost[i][j] >= 0.0: rho_flux_new = _rho_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.u_ghost[i][j] ) e_flux_new = _e_flux( self.alpha_ghost[i][j], self.rho_ghost[i][j], self.h_ghost[i][j], self.mw, self.g, self.pos_ghost[j], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new # kg/m2/s e_flux_i[j] = e_flux_new # J/m3 m/s else: rho_flux_new = _rho_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.u_ghost[i][j], ) e_flux_new = _e_flux( self.alpha_ghost[i][j + 1], self.rho_ghost[i][j + 1], self.h_ghost[i][j + 1], self.mw, self.g, self.pos_ghost[j + 1], self.u_ghost[i][j], ) # backward euler rho_flux_i[j] = rho_flux_new e_flux_i[j] = e_flux_new # mass eq f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - rho_flux_i[:-1]) # kg/m3 # energy eq f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # J/m3 f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref f[:, 0] /= f1_ref f[:-1, 1] /= f2_ref f[:, 2] /= f3_ref # dummy eq f[-1, 1] = x[-1, 2] -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!YLsa1SRpJkLsV8gFq3kXXxcNwmZ3lx9qJMebtTjTnxhU0-43s0nvM3UbT2FOmlqyxa9bU2er_jIx3OYk_Ds4kW542MX-Tp9IKBQ$ -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From dargaville.steven at gmail.com Wed Feb 26 06:26:01 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 26 Feb 2025 12:26:01 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Those two constructors would definitely meet my needs, thanks! Also I should note that the comment about garray and B in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is passed in as NULL, it's just that if you pass in a completed garray it doesn't bother creating one or changing the column indices of B. So I would suggest the comment be: "if garray is NULL the offdiag matrix B should have global col ids; if garray is not NULL the offdiag matrix B should have local col ids" On Wed, 26 Feb 2025 at 03:35, Junchao Zhang wrote: > Mat_SeqAIJKokkos is private because it is in a private header > src/mat/impls/aij/seq/kokkos/aijkok.hpp > > Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() > might be right. The comment > > - B - the offdiag matrix using global col ids > > is out of date. Perhaps it should be "the offdiag matrix uses local column > indices and garray contains the local to global mapping". But I need to > double check it. > > Since you use Kokkos, I think we could provide these two constructors for > MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively > > - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, > KokkosCsrMatrix csr, Mat *A) > > > - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, > PetscInt *garray, Mat *mat) > > // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, > Mat A, Mat B, const PetscInt garray[], Mat *mat); > // A and B are MATSEQAIJKOKKOS matrices and use local column > indices > > Do they meet your needs? > > --Junchao Zhang > > > On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < > dargaville.steven at gmail.com> wrote: > >> Thanks for the response! >> >> Although MatSetValuesCOO happens on the device if the input coo_v pointer >> is device memory, I believe MatSetPreallocationCOO requires host pointers >> for coo_i and coo_j, and the preallocation (and construction of the COO >> structures) happens on the host and is then copied onto the device? I need >> to be able to create a matrix object with minimal work on the host (like >> many of the routines in aijkok.kokkos.cxx do internally). I originally used >> the COO interface to build the matrices I need, but that was around 5x >> slower than constructing the aij structures myself on the device and then >> just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. >> >> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made >> public is that the Mat_SeqAIJKokkos constructors are already publicly >> accessible? In particular one of those constructors takes in pointers to >> the Kokkos dual views which store a,i,j, and hence one can build a >> sequential matrix with nothing (or very little) occuring on the host. The >> only change I can see that would be necessary is for >> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >> be public is to change the PETSC_INTERN to PETSC_EXTERN? >> >> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that >> is required is declaring the method in the .hpp, as it's already defined as >> static in mpiaijkok.kokkos.cxx. 
In particular, the comments >> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >> off-diagonal block B needs to be built with global column ids, with >> mpiaij->garray constructed on the host along with the rewriting of the >> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >> checking the code there shows that if you pass in a non-null garray to >> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >> compatification is skipped, meaning B can be built with local column ids as >> long as garray is provided on the host (which I also build on the device >> and then just copy to the host). Again this is what some of the internal >> Kokkos routines rely on, like the matrix-product. >> >> I am happy to try doing this and submitting a request to the petsc gitlab >> if this seems sensible, I just wanted to double check that I wasn't missing >> something important? >> Thanks >> Steven >> >> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >> wrote: >> >>> Hi, Steven, >>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a >>> private data type Mat_SeqAIJKokkos, so it can not be directly made public. >>> If you already use COO, then why not directly make the matrix of type >>> MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>> So I am confused by your needs. >>> >>> Thanks! >>> --Junchao Zhang >>> >>> >>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>>> Hi >>>> >>>> I'm just wondering if there is any possibility of making: >>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix >>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>> >>>> publicly accessible outside of petsc, or if there is an interface I >>>> have missed for creating Kokkos matrices entirely on the device? >>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>> I can't link to it. >>>> >>>> I've currently just copied the code inside of those methods so that I >>>> can build without any preallocation on the host (e.g., through the COO >>>> interface) and it works really well. >>>> >>>> Thanks for your help >>>> Steven >>>> >>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Wed Feb 26 08:52:05 2025 From: knepley at gmail.com (Matthew Knepley) Date: Wed, 26 Feb 2025 09:52:05 -0500 Subject: [petsc-users] TS Solver stops working when including ts.setDM In-Reply-To: References: Message-ID: On Wed, Feb 26, 2025 at 3:21?AM Eirik Jaccheri H?ydalsvik < eirik.hoydalsvik at sintef.no> wrote: > Hi, > > Here is the output when I run with ?lu? and the settings you suggested: > > 0 TS dt 0.1 time 0. > > 0 SNES Function norm 2.982668991189e-01 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > Okay, your Jacobian is rank deficient. Did you know this? This usually indicates an error either in the implementation or formulation. Is it supposed to be rank deficient? 
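If it is not supposed to be rank deficient, one quick way to see how singular it is on a problem this small is to swap LU for the dense SVD "preconditioner", which factors the Jacobian and reports its singular values (standard PETSc options, shown here just as a debugging sketch):

  -ksp_type preonly -pc_type svd -pc_svd_monitor

Singular values at (or near) zero give the dimension of the null space, which usually points at a missing or duplicated closure/boundary equation in the residual.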
Thanks, Matt > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.000e-01 retrying with dt=2.500e-02 > > 0 SNES Function norm 7.456672477972e-02 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.500e-02 retrying with dt=6.250e-03 > > 0 SNES Function norm 1.864168119493e-02 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 6.250e-03 retrying with dt=1.563e-03 > > 0 SNES Function norm 4.660420298733e-03 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.563e-03 retrying with dt=3.906e-04 > > 0 SNES Function norm 1.165105074683e-03 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.906e-04 retrying with dt=9.766e-05 > > 0 SNES Function norm 2.912762686708e-04 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 9.766e-05 retrying with dt=2.441e-05 > > 0 SNES Function norm 7.281906716770e-05 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.441e-05 retrying with dt=6.104e-06 > > 0 SNES Function norm 1.820476679192e-05 > > Linear solve did not converge due to DIVERGED_PC_FAILED iterations 0 > > PC failed due to FACTOR_NUMERIC_ZEROPIVOT > > Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE > iterations 0 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 6.104e-06 retrying with dt=1.526e-06 > > 0 SNES Function norm 4.551191697981e-06 > > Nonlinear solve converged due to CONVERGED_FNORM_ABS iterations 0 > > TSAdapt none beuler 0: step 0 accepted t=0 + 1.526e-06 > dt=1.526e-06 > > 1 TS dt 1.52588e-06 time 1.52588e-06 > > > > *From: *Matthew Knepley > *Date: *Tuesday, February 25, 2025 at 20:48 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Tue, Feb 25, 2025 at 9:51?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > Hi, > > After sending you the email, I rescaled the residual function and got the > two jacobians to agree down to e-7. > > I have tried with ?lu? and ?ilu? as preconditioners, and this does not > work. 
However, I just tried to use ?sor? as a preconditioner, and using sor > using the da jacobian works just fine! > > Why should it work with sor and not with ilu or lu? > > > > ILU fails all the time, so that is not surprising. However, I do not > understand why SOR would succeed and LU would fail, > > except that SOR is functioning as a kind of globalization by solving very > inexactly. Can you run with > > > > -snes_monitor -snes_converged_reason -ksp_monitor_true_solution > -ksp_converged_reason -snes_linesearch_monitor > > > > and send the output? > > > > Thanks, > > > > Matt > > > > Eirik > > Jacobians: > row 0: (0, 1.1012) (1, -51356.3) (2, 0.258649) (3, -0.0644364) (4, > -6402.63) (5, 6.19796e-08) > > row 1: (0, -0.445291) (1, 901708.) (2, 0.) (3, 0.44529) (4, -901708.) > (5, 3.63946e-07) > > row 2: (0, 1.10139) (1, -40239.6) (2, 0.258157) (3, -0.0642761) (4, > -6985.51) (5, 6.19796e-08) > > row 3: (0, -0.101197) (1, 51356.3) (2, -0.258649) (3, 1.16563) (4, > -44953.7) (5, 0.258649) (6, -0.0644364) (7, -6402.63) (8, 8.23293e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44529) (4, 901708.) (5, > -3.63946e-07) (6, 0.44529) (7, -901708.) (8, -2.8832e-07) > > row 5: (0, -0.101388) (1, 51394.3) (2, -0.258157) (3, 1.16566) (4, > -33254.1) (5, 0.258157) (6, -0.0642762) (7, -6985.51) (8, 8.27566e-08) > > row 6: (3, -0.101197) (4, 51356.3) (5, -0.258649) (6, 1.06444) (7, > 6402.63) (8, -5.80354e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -0.101388) (4, 51394.3) (5, -0.258157) (6, 1.06428) (7, > 18140.2) (8, -5.88806e-08) > > 1.1011966721737030e+00 -5.1356338141763350e+04 2.5864916418200712e-01 > -6.4436434390614972e-02 -6.4026259743175688e+03 7.0149583230222760e-08 > -1.2114185055821600e-08 7.4938912205780135e-08 -1.2114185055821600e-08 > > -4.4529045442108389e-01 9.0170829570587131e+05 -3.6394551278146074e-07 > 4.4529038352260736e-01 -9.0170829570607911e+05 -2.8832047116453381e-07 > 2.8832047116453381e-07 -2.8832047116453381e-07 2.8832047116453381e-07 > > 1.1013882301195626e+00 -4.0239613965471210e+04 2.5815705499526392e-01 > -6.4276195275552644e-02 -6.9855091616075770e+03 7.0994758931791700e-08 > -1.2114185055821600e-08 7.5502362673492770e-08 -1.2114185055821600e-08 > > -1.0119660581208668e-01 5.1356338141756765e+04 -2.5864909514315410e-01 > 1.1656331102401210e+00 -4.4953712167476042e+04 2.5864905556513557e-01 > -6.4436357255260146e-02 -6.4026259743998889e+03 8.6207799859128616e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4529067184307852e-01 9.0170829570636747e+05 0.0000000000000000e+00 > 4.4529045442108389e-01 -9.0170829570587131e+05 3.6394551278146074e-07 > > -1.0138834089854361e-01 5.1394313808568040e+04 -2.5815698520133573e-01 > 1.1656640380374783e+00 -3.3254086573823952e+04 2.5815685372183378e-01 > -6.4276095304361430e-02 -6.9855068889361701e+03 8.6288440103988163e-08 > > -6.9304407528653807e-08 0.0000000000000000e+00 -6.9304407528653807e-08 > -1.0119667848877271e-01 5.1356338141793494e+04 -2.5864912586737532e-01 > 1.0644363666594443e+00 6.4026259743251758e+03 -7.4093736504211182e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -6.9586132762510124e-08 0.0000000000000000e+00 -6.9586132762510124e-08 > -1.0138837786069978e-01 5.1394295578534489e+04 -2.5815692427475545e-01 > 1.0642752170084262e+00 1.8140206731963925e+04 -7.4375461738067500e-08 > > 
> > *From: *Matthew Knepley > *Date: *Tuesday, February 25, 2025 at 15:27 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Tue, Feb 25, 2025 at 3:19?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > Thanks again for the quick response, > > I tried prining the jacobians with -ksp_view_mat as you suggested, with a > system of only 3 cells (I am studying at a 1d problem). Printing the > jacobian in the first timestep I got the two matrices attached at the end > of this email. The jacobians are in general agreement, with some small > diviations, like the final element of the matrix being 1.6e-5 in the sparse > case and 3.7 In the full case. > > > > We usually expect to see single precision accuracy (1e-7), so this > indicates that your condition number is high. > > > > If you use LU (-pc_type lu) to solve the linear system, do you get similar > results? > > > > Thanks, > > > > Matt > > > > Questions: > > 1. Are differences on the order of 1e-5 expected when computing the > jacobians in different ways? > > 2. Do you think these differences can be the cause of my problems? Any > suggestions for furtner debugging strategies? > > Eirik > > ! sparse jacobian > > row 0: (0, 1.1012) (1, -104.568) (2, 0.258649) (3, -0.0644364) (4, > -13.1186) (5, 1.3237e-08) > > row 1: (0, -0.44489) (1, 1846.04) (2, 2.12629e-07) (3, 0.445291) (4, > -1846.04) (5, 7.08762e-08) > > row 2: (0, 540.692) (1, -40219.1) (2, 126.734) (3, -31.5544) (4, > -7023.46) (5, 6.48896e-06) > > row 3: (0, -0.101197) (1, 104.568) (2, -0.258649) (3, 1.16563) (4, > -91.4489) (5, 0.258649) (6, -0.0644365) (7, -13.1186) (8, -4.4809e-08) > > row 4: (0, 0.) (1, 0.) (2, 0.) (3, -0.44489) (4, 1846.04) (5, > -2.17357e-07) (6, 0.445291) (7, -1846.04) (8, 2.17355e-07) > > row 5: (0, -49.7734) (1, 51373.8) (2, -126.734) (3, 572.246) (4, > -33195.6) (5, 126.734) (6, -31.5544) (7, -7023.46) (8, -2.19026e-05) > > row 6: (3, -0.101197) (4, 104.568) (5, -0.258649) (6, 1.06444) (7, > 13.1186) (8, 3.32334e-08) > > row 7: (3, 0.) (4, 0.) (5, 0.) (6, 0.) (7, 0.) (8, 1.) > > row 8: (3, -49.7734) (4, 51373.8) (5, -126.734) (6, 522.472) (7, > 18178.2) (8, 1.61503e-05) > > > > ! 
full jacobian > > 1.1011966827009450e+00 -1.0456754702270389e+02 2.5864915220241336e-01 > -6.4436436239323838e-02 -1.3118626729240630e+01 7.6042484344402957e-08 > 2.9290438414140398e-08 2.5347494781467651e-08 7.2381179542635411e-08 > > -4.4488897562431995e-01 1.8460406897256150e+03 -5.0558790784552242e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > > 5.4069168701069862e+02 -4.0219094660763396e+04 1.2673402711669499e+02 > -3.1554364136687809e+01 -7.0234605760797031e+03 3.7635960251523168e-05 > 1.4708306305192963e-05 1.2833718246687978e-05 3.5617173111594722e-05 > > -1.0119659898285956e-01 1.0456754705230254e+02 -2.5864912672499502e-01 > 1.1656331184446040e+00 -9.1448920317937109e+01 2.5864905777109459e-01 > -6.4436443447843730e-02 -1.3118626754633008e+01 3.8772800266974086e-09 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > -4.4488904650049199e-01 1.8460406899429699e+03 -2.8823314009443733e-07 > 4.4529051871980491e-01 -1.8460406898012175e+03 0.0000000000000000e+00 > > -4.9773392795905657e+01 5.1373794518018862e+04 -1.2673401444613337e+02 > 5.7224585844509852e+02 -3.3195615874594827e+04 1.2673393520690603e+02 > -3.1554356503250116e+01 -7.0234583029005144e+03 1.9105655626471588e-06 > > -8.1675260962506883e-08 -2.9290438414140398e-08 -2.5347494781467651e-08 > -1.0119667997558363e-01 1.0456754704720647e+02 -2.5864913361425051e-01 > 1.0644364161519400e+00 1.3118626729240630e+01 -7.6042484344402957e-08 > > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 0.0000000000000000e+00 > 0.0000000000000000e+00 0.0000000000000000e+00 1.0000000000000000e+00 > > -4.0087344635721997e-05 -1.4564107223769502e-05 -1.2401121002417596e-05 > -4.9773414819747863e+01 5.1373776293551586e+04 -1.2673397275364130e+02 > 5.2247224727851687e+02 1.8178158133060850e+04 -3.7347562088676249e-05 > > > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 15:00 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:56?AM Eirik Jaccheri H?ydalsvik < > eirik.hoydalsvik at sintef.no> wrote: > > > 1. Thank you for the quick answer, I think this sounds reasonable? Is > there any way to compare the brute-force jacobian to the one computed using > the coloring information? > > > > The easiest way we have is to print them both out: > > > > -ksp_view_mat > > > > on both runs. We have a way to compare the analytic and FD Jacobians > (-snes_test_jacobian), but > > not two different FDs. > > > > Thanks, > > > > Matt > > > > > 1. > > *From: *Matthew Knepley > *Date: *Monday, February 24, 2025 at 14:53 > *To: *Eirik Jaccheri H?ydalsvik > *Cc: *petsc-users at mcs.anl.gov > *Subject: *Re: [petsc-users] TS Solver stops working when including > ts.setDM > > On Mon, Feb 24, 2025 at 8:41?AM Eirik Jaccheri H?ydalsvik via petsc-users > wrote: > > Hi, > > I am using the petsc4py.ts timestepper to solve a PDE. It is not easy to > obtain the jacobian for my equations, so I do not provide a jacobian > function. The code is given at the end of the email. > > When I comment out the function call ?ts.setDM(da)?, the code runs and > gives reasonable results. > > However, when I add this line of code, the program crashes with the error > message provided at the end of the email. > > Questions: > > 1. 
Do you know why adding this line of code can make the SNES solver > diverge? Any suggestions for how to debug the issue? > > > > I will not know until I run it, but here is my guess. When the DMDA is > specified, PETSc uses coloring to produce the Jacobian. When it is not, it > just brute-forces the entire J. My guess is that your residual does not > respect the stencil in the DMDA, so the coloring is wrong, making a wrong > Jacobian. > > > > 2. What is the advantage of adding the DMDA object to the ts solver? Will > this speed up the calculation of the finite difference jacobian? > > > > Yes, it speeds up the computation of the FD Jacobian. > > > > Thanks, > > > > Matt > > > > Best regards, > > Eirik H?ydalsvik > > SINTEF ER/NTNU > > Error message: > > [Eiriks-MacBook-Pro.local:26384] shmem: mmap: an error occurred while > determining whether or not > /var/folders/w1/35jw9y4n7lsbw0dhjqdhgzz80000gn/T//ompi.Eiriks-MacBook-Pro.501/jf.0/2046361600/sm_segment.Eiriks-MacBook-Pro.501.79f90000.0 could > be created. > > t 0 of 1 with dt = 0.2 > > 0 TS dt 0.2 time 0. > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 2.000e-01 retrying with dt=5.000e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 5.000e-02 retrying with dt=1.250e-02 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.250e-02 retrying with dt=3.125e-03 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.125e-03 retrying with dt=7.813e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.813e-04 retrying with dt=1.953e-04 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.953e-04 retrying with dt=4.883e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 4.883e-05 retrying with dt=1.221e-05 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.221e-05 retrying with dt=3.052e-06 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 3.052e-06 retrying with dt=7.629e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 7.629e-07 retrying with dt=1.907e-07 > > TSAdapt none step 0 stage rejected (SNES reason > DIVERGED_LINEAR_SOLVE) t=0 + 1.907e-07 retrying with dt=4.768e-08 > > Traceback (most recent call last): > > File "/Users/iept1445/Code/tank_model/closed_tank.py", line 200, in > > > return_dict1d = get_tank_composition_1d(tank_params) > > File "/Users/iept1445/Code/tank_model/src/tank_model1d.py", line 223, in > get_tank_composition_1d > > ts.solve(u=x) > > File "petsc4py/PETSc/TS.pyx", line 2478, in petsc4py.PETSc.TS.solve > > petsc4py.PETSc.Error: error code 91 > > [0] TSSolve() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:4072 > > [0] TSStep() at > /Users/iept1445/.cache/uv/sdists-v6/pypi/petsc/3.22.1/svV0XnlR4s3LB4lmsaKjR/src/src/ts/interface/ts.c:3440 > > [0] TSStep has failed due to DIVERGED_STEP_REJECTED > > Options for solver: > > COMM = PETSc.COMM_WORLD > > > > da = PETSc.DMDA().create( > > dim=(N_vertical,), > > dof=3, > > stencil_type=PETSc.DMDA().StencilType.STAR, > > stencil_width=1, > > # boundary_type=PETSc.DMDA.BoundaryType.GHOSTED, > > ) > > x = da.createGlobalVec() > > x_old = da.createGlobalVec() > > f = da.createGlobalVec() > > J = da.createMat() > > rho_ref = rho_m[0] # kg/m3 > > e_ref = e_m[0] # J/mol > > p_ref = p0 # Pa > 
> x.setArray(np.array([rho_m / rho_ref, e_m / e_ref, ux_m]).T.flatten()) > > x_old.setArray(np.array([rho_m / rho_ref, e_m / e_ref, > ux_m]).T.flatten()) > > > > optsDB = PETSc.Options() > > optsDB["snes_lag_preconditioner_persists"] = False > > optsDB["snes_lag_jacobian"] = 1 > > optsDB["snes_lag_jacobian_persists"] = False > > optsDB["snes_lag_preconditioner"] = 1 > > optsDB["ksp_type"] = "gmres" # "gmres" # gmres" > > optsDB["pc_type"] = "ilu" # "lu" # "ilu" > > optsDB["snes_type"] = "newtonls" > > optsDB["ksp_rtol"] = 1e-7 > > optsDB["ksp_atol"] = 1e-7 > > optsDB["ksp_max_it"] = 100 > > optsDB["snes_rtol"] = 1e-5 > > optsDB["snes_atol"] = 1e-5 > > optsDB["snes_stol"] = 1e-5 > > optsDB["snes_max_it"] = 100 > > optsDB["snes_mf"] = False > > optsDB["ts_max_time"] = t_end > > optsDB["ts_type"] = "beuler" # "bdf" # > > optsDB["ts_max_snes_failures"] = -1 > > optsDB["ts_monitor"] = "" > > optsDB["ts_adapt_monitor"] = "" > > # optsDB["snes_monitor"] = "" > > # optsDB["ksp_monitor"] = "" > > optsDB["ts_atol"] = 1e-4 > > > > x0 = x_old > > residual_wrap = residual_ts( > > eos, > > x0, > > N_vertical, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_L, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ) > > > > # optsDB["ts_adapt_type"] = "none" > > > > ts = PETSc.TS().create(comm=COMM) > > # TODO: Figure out why DM crashes the code > > # ts.setDM(residual_wrap.da) > > ts.setIFunction(residual_wrap.residual_ts, None) > > ts.setTimeStep(dt) > > ts.setMaxSteps(-1) > > ts.setTime(t_start) # s > > ts.setMaxTime(t_end) # s > > ts.setMaxSteps(1e5) > > ts.setStepLimits(1e-3, 1e5) > > ts.setFromOptions() > > ts.solve(u=x) > > > > Residual function: > > class residual_ts: > > def __init__( > > self, > > eos, > > x0, > > N, > > g, > > pos, > > z, > > mw, > > dt, > > dx, > > p_amb, > > A_nozzle, > > r_tank_inner, > > mph_uv_flsh_l, > > rho_ref, > > e_ref, > > p_ref, > > closed_tank, > > J, > > f, > > da, > > drift_func, > > T_wall, > > tank_params, > > ): > > self.eos = eos > > self.x0 = x0 > > self.N = N > > self.g = g > > self.pos = pos > > self.z = z > > self.mw = mw > > self.dt = dt > > self.dx = dx > > self.p_amb = p_amb > > self.A_nozzle = A_nozzle > > self.r_tank_inner = r_tank_inner > > self.mph_uv_flsh_L = mph_uv_flsh_l > > self.rho_ref = rho_ref > > self.e_ref = e_ref > > self.p_ref = p_ref > > self.closed_tank = closed_tank > > self.J = J > > self.f = f > > self.da = da > > self.drift_func = drift_func > > self.T_wall = T_wall > > self.tank_params = tank_params > > self.Q_wall = np.zeros(N) > > self.n_iter = 0 > > self.t_current = [0.0] > > self.s_top = [0.0] > > self.p_choke = [0.0] > > > > # setting interp func # TODO figure out how to generalize this > method > > self._interp_func = _jalla_upwind > > > > # allocate space for new params > > self.p = np.zeros(N) # Pa > > self.T = np.zeros(N) # K > > self.alpha = np.zeros((2, N)) > > self.rho = np.zeros((2, N)) > > self.e = np.zeros((2, N)) > > > > # allocate space for ghost cells > > self.alpha_ghost = np.zeros((2, N + 2)) > > self.rho_ghost = np.zeros((2, N + 2)) > > self.rho_m_ghost = np.zeros(N + 2) > > self.u_m_ghost = np.zeros(N + 1) > > self.u_ghost = np.zeros((2, N + 1)) > > self.e_ghost = np.zeros((2, N + 2)) > > self.pos_ghost = np.zeros(N + 2) > > self.h_ghost = np.zeros((2, N + 2)) > > > > # allocate soace for local X and Xdot > > self.X_LOCAL = da.createLocalVec() > > self.XDOT_LOCAL = da.createLocalVec() 
> > > > def residual_ts(self, ts, t, X, XDOT, F): > > self.n_iter += 1 > > # TODO: Estimate time use > > """ > > Caculate residuals for equations > > (rho, rho (e + gx))_t + (rho u, rho u (h + gx))_x = 0 > > P_x = - g \rho > > """ > > n_phase = 2 > > self.da.globalToLocal(X, self.X_LOCAL) > > self.da.globalToLocal(XDOT, self.XDOT_LOCAL) > > x = self.da.getVecArray(self.X_LOCAL) > > xdot = self.da.getVecArray(self.XDOT_LOCAL) > > f = self.da.getVecArray(F) > > > > T_c, v_c, p_c = self.eos.critical(self.z) # K, m3/mol, Pa > > rho_m = x[:, 0] * self.rho_ref # kg/m3 > > e_m = x[:, 1] * self.e_ref # J/mol > > u_m = x[:-1, 2] # m/s > > > > # derivatives > > rho_m_dot = xdot[:, 0] * self.rho_ref # kg/m3 > > e_m_dot = xdot[:, 1] * self.e_ref # kg/m3 > > dt = ts.getTimeStep() # s > > > > for i in range(self.N): > > # get new parameters > > self.mph_uv_flsh_L[i] = self.eos.multi_phase_uvflash( > > self.z, e_m[i], self.mw / rho_m[i], self.mph_uv_flsh_L[i] > > ) > > > > betaL, betaV, betaS = _get_betas(self.mph_uv_flsh_L[i]) # > mol/mol > > beta = [betaL, betaV] > > if betaS != 0.0: > > print("there is a solid phase which is not accounted for") > > self.T[i], self.p[i] = _get_tank_temperature_pressure( > > self.mph_uv_flsh_L[i] > > ) # K, Pa) > > for j, phase in enumerate([self.eos.LIQPH, self.eos.VAPPH]): > > # new parameters > > self.rho_ghost[:, 1:-1][j][i] = ( > > self.mw > > / self.eos.specific_volume(self.T[i], self.p[i], > self.z, phase)[0] > > ) # kg/m3 > > self.e_ghost[:, 1:-1][j][i] = self.eos.internal_energy_tv( > > self.T[i], self.mw / self.rho_ghost[:, 1:-1][j][i], > self.z, phase > > )[ > > 0 > > ] # J/mol > > self.h_ghost[:, 1:-1][j][i] = ( > > self.e_ghost[:, 1:-1][j][i] > > + self.p[i] * self.mw / self.rho_ghost[:, 1:-1][j][i] > > ) # J/mol > > self.alpha_ghost[:, 1:-1][j][i] = ( > > beta[j] / self.rho_ghost[:, 1:-1][j][i] * rho_m[i] > > ) # m3/m3 > > > > # calculate drift velocity > > for i in range(self.N - 1): > > self.u_ghost[:, 1:-1][0][i], self.u_ghost[:, 1:-1][1][i] = ( > > calc_drift_velocity( > > u_m[i], > > self._interp_func( > > self.rho_ghost[:, 1:-1][0][i], > > self.rho_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.rho_ghost[:, 1:-1][1][i], > > self.rho_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.g, > > self._interp_func(self.T[i], self.T[i + 1], u_m[i]), > > T_c, > > self.r_tank_inner, > > self._interp_func( > > self.alpha_ghost[:, 1:-1][0][i], > > self.alpha_ghost[:, 1:-1][0][i + 1], > > u_m[i], > > ), > > self._interp_func( > > self.alpha_ghost[:, 1:-1][1][i], > > self.alpha_ghost[:, 1:-1][1][i + 1], > > u_m[i], > > ), > > self.drift_func, > > ) > > ) # liq m / s , vapour m / s > > > > u_bottom = 0 > > if self.closed_tank: > > u_top = 0.0 # m/s > > else: > > # calc phase to skip env_isentrope_cross > > if ( > > self.mph_uv_flsh_L[-1].liquid != None > > and self.mph_uv_flsh_L[-1].vapour == None > > and self.mph_uv_flsh_L[-1].solid == None > > ): > > phase_env = self.eos.LIQPH > > else: > > phase_env = self.eos.TWOPH > > > > self.h_m = e_m + self.p * self.mw / rho_m # J / mol > > self.s_top[0] = _get_s_tank(self.eos, self.mph_uv_flsh_L[-1]) > # J / mol / K > > mdot, self.p_choke[0] = calc_mass_outflow( > > self.eos, > > self.z, > > self.h_m[-1], > > self.s_top[0], > > self.p[-1], > > self.p_amb, > > self.A_nozzle, > > self.mw, > > phase_env, > > debug_plot=False, > > ) # mol / s , Pa > > u_top = -mdot * self.mw / rho_m[-1] / (np.pi * > self.r_tank_inner**2) # m/s > > > > # assemble vectors with ghost cells > > self.alpha_ghost[:, 
0] = self.alpha_ghost[:, 1:-1][:, 0] # m3/m3 > > self.alpha_ghost[:, -1] = self.alpha_ghost[:, 1:-1][:, -1] # m3/m3 > > self.rho_ghost[:, 0] = self.rho_ghost[:, 1:-1][:, 0] # kg/m3 > > self.rho_ghost[:, -1] = self.rho_ghost[:, 1:-1][:, -1] # kg/m3 > > self.rho_m_ghost[0] = rho_m[0] # kg/m3 > > self.rho_m_ghost[1:-1] = rho_m # kg/m3 > > self.rho_m_ghost[-1] = rho_m[-1] # kg/m3 > > # u_ghost[:, 1:-1] = u # m/s > > self.u_ghost[:, 0] = u_bottom # m/s > > self.u_ghost[:, -1] = u_top # m/s > > self.u_m_ghost[0] = u_bottom # m/s > > self.u_m_ghost[1:-1] = u_m # m/s > > self.u_m_ghost[-1] = u_top # m/s > > self.e_ghost[:, 0] = self.e_ghost[:, 1:-1][:, 0] # J/mol > > self.e_ghost[:, -1] = self.e_ghost[:, 1:-1][:, -1] # J/mol > > self.pos_ghost[1:-1] = self.pos # m > > self.pos_ghost[0] = self.pos[0] # m > > self.pos_ghost[-1] = self.pos[-1] # m > > self.h_ghost[:, 0] = self.h_ghost[:, 1] # J/mol > > self.h_ghost[:, -1] = self.h_ghost[:, -2] # J/mol > > > > # recalculate wall temperature and heat flux > > # TODO ARE WE DOING THE STAGGERING CORRECTLY? > > lz = self.tank_params["lz_tank"] / self.N # m > > if ts.getTime() != self.t_current[0] and > self.tank_params["heat_transfer"]: > > self.t_current[0] = ts.getTime() > > for i in range(self.N): > > self.T_wall[i], self.Q_wall[i], h_ht = ( > > solve_radial_heat_conduction_implicit( > > self.tank_params, > > self.T[i], > > self.T_wall[i], > > (self.u_m_ghost[i] + self.u_m_ghost[i + 1]) / 2, > > self.rho_m_ghost[i + 1], > > self.mph_uv_flsh_L[i], > > lz, > > dt, > > ) > > ) # K, J/s, W/m2K > > > > # Calculate residuals > > f[:, :] = 0.0 > > f[:, 0] = dt * rho_m_dot # kg/m3 > > f[:-1, 1] = self.p[1:] - self.p[:-1] + self.dx * self.g * > rho_m[0:-1] # Pa/m > > f[:, 2] = ( > > dt > > * ( > > rho_m_dot * (e_m / self.mw + self.g * self.pos) > > + rho_m * e_m_dot / self.mw > > ) > > - rho_m_dot * e_m_dot / self.mw * dt**2 > > - self.Q_wall / (np.pi * self.r_tank_inner**2 * lz) * dt > > ) # J / m3 > > > > # add contribution from space > > for i in range(n_phase): > > e_flux_i = np.zeros_like(self.u_ghost[i]) # J/m3 m/s > > rho_flux_i = np.zeros_like(self.u_ghost[i]) # kg/m2/s > > for j in range(1, self.N + 1): > > if self.u_ghost[i][j] >= 0.0: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j], self.rho_ghost[i][j], > self.u_ghost[i][j] > > ) > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j], > > self.rho_ghost[i][j], > > self.h_ghost[i][j], > > self.mw, > > self.g, > > self.pos_ghost[j], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new # kg/m2/s > > e_flux_i[j] = e_flux_new # J/m3 m/s > > > > else: > > rho_flux_new = _rho_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.u_ghost[i][j], > > ) > > > > e_flux_new = _e_flux( > > self.alpha_ghost[i][j + 1], > > self.rho_ghost[i][j + 1], > > self.h_ghost[i][j + 1], > > self.mw, > > self.g, > > self.pos_ghost[j + 1], > > self.u_ghost[i][j], > > ) > > > > # backward euler > > rho_flux_i[j] = rho_flux_new > > e_flux_i[j] = e_flux_new > > > > # mass eq > > f[:, 0] += (dt / self.dx) * (rho_flux_i[1:] - > rho_flux_i[:-1]) # kg/m3 > > > > # energy eq > > f[:, 2] += (dt / self.dx) * (e_flux_i[1:] - e_flux_i[:-1]) # > J/m3 > > > > f1_ref, f2_ref, f3_ref = self.rho_ref, self.p_ref, self.e_ref > > f[:, 0] /= f1_ref > > f[:-1, 1] /= f2_ref > > f[:, 2] /= f3_ref > > # dummy eq > > f[-1, 1] = x[-1, 2] > > > > > > > > > -- > > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to 
which their
> experiments lead.
> -- Norbert Wiener
>
> https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dMjDqMTlQm8dMz4SogywwfFxLuGNm4cejXWCgg3wmaKDTGtTcYGgbOlRFLDU1IkFMEeg4UxIiNnMeGQclIz3$

--
What most experimenters take for granted before they begin their
experiments is infinitely more interesting than any results to which their
experiments lead.
-- Norbert Wiener

https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!dMjDqMTlQm8dMz4SogywwfFxLuGNm4cejXWCgg3wmaKDTGtTcYGgbOlRFLDU1IkFMEeg4UxIiNnMeGQclIz3$
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From junchao.zhang at gmail.com Wed Feb 26 10:11:16 2025
From: junchao.zhang at gmail.com (Junchao Zhang)
Date: Wed, 26 Feb 2025 10:11:16 -0600
Subject: [petsc-users] building kokkos matrices on the device
In-Reply-To:
References:
Message-ID:

This function MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const
PetscInt garray[], Mat *mat) is rarely used. To compute the global
matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is
a waste, as the caller usually knows M, N already. So I think we can
depart from it and have a new one:

MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)

* M, N are the global row/col sizes of mat
* A, B are MATSEQAIJKOKKOS
* M, N could be PETSC_DECIDE; if so, petsc will compute mat's M, N from A, i.e., M = Sum of A's M, N = Sum of A's N
* if garray is NULL, B uses global column indices (and B's N should be equal to the output mat's N)
* if garray is not NULL, B uses local column indices; garray[] was allocated by PetscMalloc() and after the call, garray will be owned by mat (the user should not free garray afterwards)

What do you think? If you agree, could you contribute an MR?

BTW, I think we need to create a new header, petscmat_kokkos.hpp, to declare

PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A)

but

PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat)

can be in petscmat.h as it uses only C types.

Barry, what do you think of the two new APIs?
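For concreteness, here is a rough sketch of how a caller might use the proposed MPI constructor (illustrative only: neither routine exists in PETSc yet, and the names and semantics simply follow the description above):

  #include <petscmat.h>   /* the proposed MPI constructor could be declared here */

  Mat       Ad, Ao, C;    /* diag/offdiag blocks, both MATSEQAIJKOKKOS, assembled on the device */
  PetscInt *garray;       /* PetscMalloc'd local-to-global column map for Ao, or NULL */

  /* ... build Ad, Ao (and garray) entirely on the device ... */

  /* PETSC_DECIDE lets PETSc deduce the global sizes from the diagonal blocks;
     a non-NULL garray means Ao already uses local column indices and
     ownership of garray passes to C */
  PetscCall(MatCreateMPIAIJKokkosWithSeqAIJKokkos(PETSC_COMM_WORLD, PETSC_DECIDE, PETSC_DECIDE, Ad, Ao, garray, &C));

and similarly, for a purely sequential matrix built from an existing device CSR,

  PetscCall(MatCreateSeqAIJKokkosWithKokkosCsrMatrix(PETSC_COMM_SELF, csr, &A));

where csr is a KokkosCsrMatrix that already lives in device memory.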
--Junchao Zhang On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Those two constructors would definitely meet my needs, thanks! > > Also I should note that the comment about garray and B in > MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is > passed in as NULL, it's just that if you pass in a completed garray it > doesn't bother creating one or changing the column indices of B. So I would > suggest the comment be: "if garray is NULL the offdiag matrix B should > have global col ids; if garray is not NULL the offdiag matrix B should have > local col ids" > > On Wed, 26 Feb 2025 at 03:35, Junchao Zhang > wrote: > >> Mat_SeqAIJKokkos is private because it is in a private header >> src/mat/impls/aij/seq/kokkos/aijkok.hpp >> >> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >> might be right. The comment >> >> - B - the offdiag matrix using global col ids >> >> is out of date. Perhaps it should be "the offdiag matrix uses local >> column indices and garray contains the local to global mapping". But I >> need to double check it. >> >> Since you use Kokkos, I think we could provide these two constructors for >> MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >> >> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >> KokkosCsrMatrix csr, Mat *A) >> >> >> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, >> PetscInt *garray, Mat *mat) >> >> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >> // A and B are MATSEQAIJKOKKOS matrices and use local column >> indices >> >> Do they meet your needs? >> >> --Junchao Zhang >> >> >> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >>> Thanks for the response! >>> >>> Although MatSetValuesCOO happens on the device if the input coo_v >>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>> pointers for coo_i and coo_j, and the preallocation (and construction of >>> the COO structures) happens on the host and is then copied onto the device? >>> I need to be able to create a matrix object with minimal work on the host >>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>> originally used the COO interface to build the matrices I need, but that >>> was around 5x slower than constructing the aij structures myself on the >>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>> type methods. >>> >>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>> accessible? In particular one of those constructors takes in pointers to >>> the Kokkos dual views which store a,i,j, and hence one can build a >>> sequential matrix with nothing (or very little) occuring on the host. The >>> only change I can see that would be necessary is for >>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>> >>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that >>> is required is declaring the method in the .hpp, as it's already defined as >>> static in mpiaijkok.kokkos.cxx. 
In particular, the comments >>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>> off-diagonal block B needs to be built with global column ids, with >>> mpiaij->garray constructed on the host along with the rewriting of the >>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >>> checking the code there shows that if you pass in a non-null garray to >>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>> compatification is skipped, meaning B can be built with local column ids as >>> long as garray is provided on the host (which I also build on the device >>> and then just copy to the host). Again this is what some of the internal >>> Kokkos routines rely on, like the matrix-product. >>> >>> I am happy to try doing this and submitting a request to the petsc >>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>> missing something important? >>> Thanks >>> Steven >>> >>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>> wrote: >>> >>>> Hi, Steven, >>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>> public. >>>> If you already use COO, then why not directly make the matrix of type >>>> MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>> So I am confused by your needs. >>>> >>>> Thanks! >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>> dargaville.steven at gmail.com> wrote: >>>> >>>>> Hi >>>>> >>>>> I'm just wondering if there is any possibility of making: >>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix >>>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>> >>>>> publicly accessible outside of petsc, or if there is an interface I >>>>> have missed for creating Kokkos matrices entirely on the device? >>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>> I can't link to it. >>>>> >>>>> I've currently just copied the code inside of those methods so that I >>>>> can build without any preallocation on the host (e.g., through the COO >>>>> interface) and it works really well. >>>>> >>>>> Thanks for your help >>>>> Steven >>>>> >>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Wed Feb 26 10:54:32 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 26 Feb 2025 16:54:32 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: I think that sounds great, I'm happy to put together an MR (likely next week) for review. On Wed, 26 Feb 2025 at 16:11, Junchao Zhang wrote: > This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, > const PetscInt garray[], Mat *mat) *is rarely used. To compute the global > matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is > a waste, as the caller usually knows M, N already. 
So I think we can depart > from it and have a new one: > > MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt > N, Mat A, Mat B, const PetscInt garray[], Mat *mat) > * M, N are global row/col size of mat > * A, B are MATSEQAIJKOKKOS > * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from A, > i.e., M = Sum of A's M, N= Sum of A's N > * if garray is NULL, B uses global column indices (and B's N should be > equal to the output mat's N) > * if garray is not NULL, B uses local column indices; garray[] was > allocated by PetscMalloc() and after the call, garray will be owned by mat > (user should not free garray afterwards). > > What do you think? If you agree, could you contribute an MR? > > BTW, I think we need to create a new header, petscmat_kokkos.hpp to declare > PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, > KokkosCsrMatrix csr, Mat *A) > but > PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, > PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) > can be in petscmat.h as it uses only C types. > > Barry, what do you think of the two new APIs? > > --Junchao Zhang > > > On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < > dargaville.steven at gmail.com> wrote: > >> Those two constructors would definitely meet my needs, thanks! >> >> Also I should note that the comment about garray and B in >> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is >> passed in as NULL, it's just that if you pass in a completed garray it >> doesn't bother creating one or changing the column indices of B. So I would >> suggest the comment be: "if garray is NULL the offdiag matrix B should >> have global col ids; if garray is not NULL the offdiag matrix B should have >> local col ids" >> >> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >> wrote: >> >>> Mat_SeqAIJKokkos is private because it is in a private header >>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>> >>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>> might be right. The comment >>> >>> - B - the offdiag matrix using global col ids >>> >>> is out of date. Perhaps it should be "the offdiag matrix uses local >>> column indices and garray contains the local to global mapping". But I >>> need to double check it. >>> >>> Since you use Kokkos, I think we could provide these two constructors >>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>> >>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>> KokkosCsrMatrix csr, Mat *A) >>> >>> >>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, >>> PetscInt *garray, Mat *mat) >>> >>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>> indices >>> >>> Do they meet your needs? >>> >>> --Junchao Zhang >>> >>> >>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>>> Thanks for the response! >>>> >>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>> the COO structures) happens on the host and is then copied onto the device? 
>>>> I need to be able to create a matrix object with minimal work on the host >>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>> originally used the COO interface to build the matrices I need, but that >>>> was around 5x slower than constructing the aij structures myself on the >>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>> type methods. >>>> >>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>> accessible? In particular one of those constructors takes in pointers to >>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>> sequential matrix with nothing (or very little) occuring on the host. The >>>> only change I can see that would be necessary is for >>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>> >>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that >>>> is required is declaring the method in the .hpp, as it's already defined as >>>> static in mpiaijkok.kokkos.cxx. In particular, the comments >>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>>> off-diagonal block B needs to be built with global column ids, with >>>> mpiaij->garray constructed on the host along with the rewriting of the >>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >>>> checking the code there shows that if you pass in a non-null garray to >>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>>> compatification is skipped, meaning B can be built with local column ids as >>>> long as garray is provided on the host (which I also build on the device >>>> and then just copy to the host). Again this is what some of the internal >>>> Kokkos routines rely on, like the matrix-product. >>>> >>>> I am happy to try doing this and submitting a request to the petsc >>>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>>> missing something important? >>>> Thanks >>>> Steven >>>> >>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>>> wrote: >>>> >>>>> Hi, Steven, >>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>>> public. >>>>> If you already use COO, then why not directly make the matrix of >>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>> So I am confused by your needs. >>>>> >>>>> Thanks! >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>>> dargaville.steven at gmail.com> wrote: >>>>> >>>>>> Hi >>>>>> >>>>>> I'm just wondering if there is any possibility of making: >>>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix >>>>>> in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>> >>>>>> publicly accessible outside of petsc, or if there is an interface I >>>>>> have missed for creating Kokkos matrices entirely on the device? >>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>>> I can't link to it. 
>>>>>> >>>>>> I've currently just copied the code inside of those methods so that I >>>>>> can build without any preallocation on the host (e.g., through the COO >>>>>> interface) and it works really well. >>>>>> >>>>>> Thanks for your help >>>>>> Steven >>>>>> >>>>> -------------- next part -------------- An HTML attachment was scrubbed... URL: From bsmith at petsc.dev Wed Feb 26 11:02:16 2025 From: bsmith at petsc.dev (Barry Smith) Date: Wed, 26 Feb 2025 12:02:16 -0500 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: The new function doesn't seem to have anything to do with Kokkos so why have any new functions? Just have MatCreateMPIAIJWithSeqAIJ() work properly when the two matrices are Kokkos (or CUDA or HIP). Or if you want to eliminate the global reduction maybe make your new function MatCreateMPIWithSeq() and have it work for any type of submatrix and eventually we could deprecate the MatCreateMPIAIJWithSeqAIJ() Barry > On Feb 26, 2025, at 11:54?AM, Steven Dargaville wrote: > > I think that sounds great, I'm happy to put together an MR (likely next week) for review. > > On Wed, 26 Feb 2025 at 16:11, Junchao Zhang > wrote: >> This fuction MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat) is rarely used. To compute the global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think it is a waste, as the caller usually knows M, N already. So I think we can depart from it and have a new one: >> >> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> * M, N are global row/col size of mat >> * A, B are MATSEQAIJKOKKOS >> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from A, i.e., M = Sum of A's M, N= Sum of A's N >> * if garray is NULL, B uses global column indices (and B's N should be equal to the output mat's N) >> * if garray is not NULL, B uses local column indices; garray[] was allocated by PetscMalloc() and after the call, garray will be owned by mat (user should not free garray afterwards). >> >> What do you think? If you agree, could you contribute an MR? >> >> BTW, I think we need to create a new header, petscmat_kokkos.hpp to declare >> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A) >> but >> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> can be in petscmat.h as it uses only C types. >> >> Barry, what do you think of the two new APIs? >> >> --Junchao Zhang >> >> >> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville > wrote: >>> Those two constructors would definitely meet my needs, thanks! >>> >>> Also I should note that the comment about garray and B in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is passed in as NULL, it's just that if you pass in a completed garray it doesn't bother creating one or changing the column indices of B. So I would suggest the comment be: "if garray is NULL the offdiag matrix B should have global col ids; if garray is not NULL the offdiag matrix B should have local col ids" >>> >>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang > wrote: >>>> Mat_SeqAIJKokkos is private because it is in a private header src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>> >>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() might be right. 
The comment >>>> >>>> - B - the offdiag matrix using global col ids >>>> >>>> is out of date. Perhaps it should be "the offdiag matrix uses local column indices and garray contains the local to global mapping". But I need to double check it. >>>> >>>> Since you use Kokkos, I think we could provide these two constructors for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>> MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A) >>>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat B, PetscInt *garray, Mat *mat) >>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>> // A and B are MATSEQAIJKOKKOS matrices and use local column indices >>>> >>>> Do they meet your needs? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville > wrote: >>>>> Thanks for the response! >>>>> >>>>> Although MatSetValuesCOO happens on the device if the input coo_v pointer is device memory, I believe MatSetPreallocationCOO requires host pointers for coo_i and coo_j, and the preallocation (and construction of the COO structures) happens on the host and is then copied onto the device? I need to be able to create a matrix object with minimal work on the host (like many of the routines in aijkok.kokkos.cxx do internally). I originally used the COO interface to build the matrices I need, but that was around 5x slower than constructing the aij structures myself on the device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix type methods. >>>>> >>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be made public is that the Mat_SeqAIJKokkos constructors are already publicly accessible? In particular one of those constructors takes in pointers to the Kokkos dual views which store a,i,j, and hence one can build a sequential matrix with nothing (or very little) occuring on the host. The only change I can see that would be necessary is for MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>> >>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all that is required is declaring the method in the .hpp, as it's already defined as static in mpiaijkok.kokkos.cxx. In particular, the comments above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the off-diagonal block B needs to be built with global column ids, with mpiaij->garray constructed on the host along with the rewriting of the global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but checking the code there shows that if you pass in a non-null garray to MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and compatification is skipped, meaning B can be built with local column ids as long as garray is provided on the host (which I also build on the device and then just copy to the host). Again this is what some of the internal Kokkos routines rely on, like the matrix-product. >>>>> >>>>> I am happy to try doing this and submitting a request to the petsc gitlab if this seems sensible, I just wanted to double check that I wasn't missing something important? >>>>> Thanks >>>>> Steven >>>>> >>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang > wrote: >>>>>> Hi, Steven, >>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses a private data type Mat_SeqAIJKokkos, so it can not be directly made public. 
>>>>>> If you already use COO, then why not directly make the matrix of type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>>> So I am confused by your needs. >>>>>> >>>>>> Thanks! >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville > wrote: >>>>>>> Hi >>>>>>> >>>>>>> I'm just wondering if there is any possibility of making: >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix or MatCreateSeqAIJKokkosWithCSRMatrix in src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>>> >>>>>>> publicly accessible outside of petsc, or if there is an interface I have missed for creating Kokkos matrices entirely on the device? MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so I can't link to it. >>>>>>> >>>>>>> I've currently just copied the code inside of those methods so that I can build without any preallocation on the host (e.g., through the COO interface) and it works really well. >>>>>>> >>>>>>> Thanks for your help >>>>>>> Steven -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Feb 26 11:15:27 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 26 Feb 2025 11:15:27 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: That is a good idea. Perhaps a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) since garray[] is only meaningful for MATAIJ and subclasses. --Junchao Zhang On Wed, Feb 26, 2025 at 11:02?AM Barry Smith wrote: > > The new function doesn't seem to have anything to do with Kokkos so > why have any new functions? Just have *MatCreateMPIAIJWithSeqAIJ() work > properly when the two matrices are Kokkos (or CUDA or HIP). Or if you > want to eliminate the global reduction maybe make your new function > MatCreateMPIWithSeq() and have it work for any type of submatrix and > eventually we could deprecate the **MatCreateMPIAIJWithSeqAIJ() * > > * Barry* > > > > > On Feb 26, 2025, at 11:54?AM, Steven Dargaville < > dargaville.steven at gmail.com> wrote: > > I think that sounds great, I'm happy to put together an MR (likely next > week) for review. > > On Wed, 26 Feb 2025 at 16:11, Junchao Zhang > wrote: > >> This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, >> const PetscInt garray[], Mat *mat) *is rarely used. To compute the >> global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think >> it is a waste, as the caller usually knows M, N already. So I think we can >> depart from it and have a new one: >> >> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, PetscInt >> N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> * M, N are global row/col size of mat >> * A, B are MATSEQAIJKOKKOS >> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from >> A, i.e., M = Sum of A's M, N= Sum of A's N >> * if garray is NULL, B uses global column indices (and B's N should be >> equal to the output mat's N) >> * if garray is not NULL, B uses local column indices; garray[] was >> allocated by PetscMalloc() and after the call, garray will be owned by mat >> (user should not free garray afterwards). >> >> What do you think? If you agree, could you contribute an MR? 
>> >> BTW, I think we need to create a new header, petscmat_kokkos.hpp to >> declare >> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >> KokkosCsrMatrix csr, Mat *A) >> but >> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, >> PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >> can be in petscmat.h as it uses only C types. >> >> Barry, what do you think of the two new APIs? >> >> --Junchao Zhang >> >> >> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >>> Those two constructors would definitely meet my needs, thanks! >>> >>> Also I should note that the comment about garray and B in >>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray is >>> passed in as NULL, it's just that if you pass in a completed garray it >>> doesn't bother creating one or changing the column indices of B. So I would >>> suggest the comment be: "if garray is NULL the offdiag matrix B should >>> have global col ids; if garray is not NULL the offdiag matrix B should have >>> local col ids" >>> >>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >>> wrote: >>> >>>> Mat_SeqAIJKokkos is private because it is in a private header >>>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>> >>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>>> might be right. The comment >>>> >>>> - B - the offdiag matrix using global col ids >>>> >>>> is out of date. Perhaps it should be "the offdiag matrix uses local >>>> column indices and garray contains the local to global mapping". But I >>>> need to double check it. >>>> >>>> Since you use Kokkos, I think we could provide these two constructors >>>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>> >>>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>>> KokkosCsrMatrix csr, Mat *A) >>>> >>>> >>>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat >>>> B, PetscInt *garray, Mat *mat) >>>> >>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>>> indices >>>> >>>> Do they meet your needs? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>>> dargaville.steven at gmail.com> wrote: >>>> >>>>> Thanks for the response! >>>>> >>>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>>> the COO structures) happens on the host and is then copied onto the device? >>>>> I need to be able to create a matrix object with minimal work on the host >>>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>>> originally used the COO interface to build the matrices I need, but that >>>>> was around 5x slower than constructing the aij structures myself on the >>>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>>> type methods. >>>>> >>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>>> accessible? In particular one of those constructors takes in pointers to >>>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>>> sequential matrix with nothing (or very little) occuring on the host. 
The >>>>> only change I can see that would be necessary is for >>>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>> >>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all >>>>> that is required is declaring the method in the .hpp, as it's already >>>>> defined as static in mpiaijkok.kokkos.cxx. In particular, the comments >>>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>>>> off-diagonal block B needs to be built with global column ids, with >>>>> mpiaij->garray constructed on the host along with the rewriting of the >>>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but >>>>> checking the code there shows that if you pass in a non-null garray to >>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>>>> compatification is skipped, meaning B can be built with local column ids as >>>>> long as garray is provided on the host (which I also build on the device >>>>> and then just copy to the host). Again this is what some of the internal >>>>> Kokkos routines rely on, like the matrix-product. >>>>> >>>>> I am happy to try doing this and submitting a request to the petsc >>>>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>>>> missing something important? >>>>> Thanks >>>>> Steven >>>>> >>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>>>> wrote: >>>>> >>>>>> Hi, Steven, >>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>>>> public. >>>>>> If you already use COO, then why not directly make the matrix of >>>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>>> So I am confused by your needs. >>>>>> >>>>>> Thanks! >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>>>> dargaville.steven at gmail.com> wrote: >>>>>> >>>>>>> Hi >>>>>>> >>>>>>> I'm just wondering if there is any possibility of making: >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix >>>>>>> or MatCreateSeqAIJKokkosWithCSRMatrix in >>>>>>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>>> >>>>>>> publicly accessible outside of petsc, or if there is an interface I >>>>>>> have missed for creating Kokkos matrices entirely on the device? >>>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>>>> I can't link to it. >>>>>>> >>>>>>> I've currently just copied the code inside of those methods so that >>>>>>> I can build without any preallocation on the host (e.g., through the COO >>>>>>> interface) and it works really well. >>>>>>> >>>>>>> Thanks for your help >>>>>>> Steven >>>>>>> >>>>>> > -------------- next part -------------- An HTML attachment was scrubbed... URL: From dargaville.steven at gmail.com Wed Feb 26 11:26:54 2025 From: dargaville.steven at gmail.com (Steven Dargaville) Date: Wed, 26 Feb 2025 17:26:54 +0000 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: Ok so just to double check the things I should do: 1. Create a new header for MatCreateSeqAIJKokkosWithCSRMatrix (and declare it PETSC_EXTERN) so users can call the existing method and build a seqaijkokkos matrix with no host involvement. 2. 
Modify *MatCreateMPIAIJWithSeqAIJ (*or equivalent*) *so it does the same thing as MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in the case that A and B are seqaijkokkos matrices. 3. Potentially remove MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices given it would be redundant? On Wed, 26 Feb 2025 at 17:15, Junchao Zhang wrote: > That is a good idea. Perhaps a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm > comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat > *mat) since garray[] is only meaningful for MATAIJ and subclasses. > > --Junchao Zhang > > > On Wed, Feb 26, 2025 at 11:02?AM Barry Smith wrote: > >> >> The new function doesn't seem to have anything to do with Kokkos so >> why have any new functions? Just have *MatCreateMPIAIJWithSeqAIJ() work >> properly when the two matrices are Kokkos (or CUDA or HIP). Or if you >> want to eliminate the global reduction maybe make your new function >> MatCreateMPIWithSeq() and have it work for any type of submatrix and >> eventually we could deprecate the **MatCreateMPIAIJWithSeqAIJ() * >> >> * Barry* >> >> >> >> >> On Feb 26, 2025, at 11:54?AM, Steven Dargaville < >> dargaville.steven at gmail.com> wrote: >> >> I think that sounds great, I'm happy to put together an MR (likely next >> week) for review. >> >> On Wed, 26 Feb 2025 at 16:11, Junchao Zhang >> wrote: >> >>> This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, >>> const PetscInt garray[], Mat *mat) *is rarely used. To compute the >>> global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think >>> it is a waste, as the caller usually knows M, N already. So I think we can >>> depart from it and have a new one: >>> >>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, >>> PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>> * M, N are global row/col size of mat >>> * A, B are MATSEQAIJKOKKOS >>> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from >>> A, i.e., M = Sum of A's M, N= Sum of A's N >>> * if garray is NULL, B uses global column indices (and B's N should be >>> equal to the output mat's N) >>> * if garray is not NULL, B uses local column indices; garray[] was >>> allocated by PetscMalloc() and after the call, garray will be owned by mat >>> (user should not free garray afterwards). >>> >>> What do you think? If you agree, could you contribute an MR? >>> >>> BTW, I think we need to create a new header, petscmat_kokkos.hpp to >>> declare >>> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm >>> comm, KokkosCsrMatrix csr, Mat *A) >>> but >>> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, >>> PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>> can be in petscmat.h as it uses only C types. >>> >>> Barry, what do you think of the two new APIs? >>> >>> --Junchao Zhang >>> >>> >>> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>>> Those two constructors would definitely meet my needs, thanks! >>>> >>>> Also I should note that the comment about garray and B in >>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray >>>> is passed in as NULL, it's just that if you pass in a completed garray it >>>> doesn't bother creating one or changing the column indices of B. 
So I would >>>> suggest the comment be: "if garray is NULL the offdiag matrix B should >>>> have global col ids; if garray is not NULL the offdiag matrix B should have >>>> local col ids" >>>> >>>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >>>> wrote: >>>> >>>>> Mat_SeqAIJKokkos is private because it is in a private header >>>>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>>> >>>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>>>> might be right. The comment >>>>> >>>>> - B - the offdiag matrix using global col ids >>>>> >>>>> is out of date. Perhaps it should be "the offdiag matrix uses local >>>>> column indices and garray contains the local to global mapping". But I >>>>> need to double check it. >>>>> >>>>> Since you use Kokkos, I think we could provide these two constructors >>>>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>>> >>>>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>>>> KokkosCsrMatrix csr, Mat *A) >>>>> >>>>> >>>>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat >>>>> B, PetscInt *garray, Mat *mat) >>>>> >>>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>>>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>>>> indices >>>>> >>>>> Do they meet your needs? >>>>> >>>>> --Junchao Zhang >>>>> >>>>> >>>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>>>> dargaville.steven at gmail.com> wrote: >>>>> >>>>>> Thanks for the response! >>>>>> >>>>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>>>> the COO structures) happens on the host and is then copied onto the device? >>>>>> I need to be able to create a matrix object with minimal work on the host >>>>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>>>> originally used the COO interface to build the matrices I need, but that >>>>>> was around 5x slower than constructing the aij structures myself on the >>>>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>>>> type methods. >>>>>> >>>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>>>> accessible? In particular one of those constructors takes in pointers to >>>>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>>>> sequential matrix with nothing (or very little) occuring on the host. The >>>>>> only change I can see that would be necessary is for >>>>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>>> >>>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all >>>>>> that is required is declaring the method in the .hpp, as it's already >>>>>> defined as static in mpiaijkok.kokkos.cxx. In particular, the comments >>>>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the >>>>>> off-diagonal block B needs to be built with global column ids, with >>>>>> mpiaij->garray constructed on the host along with the rewriting of the >>>>>> global column indices in B. 
This happens in MatSetUpMultiply_MPIAIJ, but >>>>>> checking the code there shows that if you pass in a non-null garray to >>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and >>>>>> compatification is skipped, meaning B can be built with local column ids as >>>>>> long as garray is provided on the host (which I also build on the device >>>>>> and then just copy to the host). Again this is what some of the internal >>>>>> Kokkos routines rely on, like the matrix-product. >>>>>> >>>>>> I am happy to try doing this and submitting a request to the petsc >>>>>> gitlab if this seems sensible, I just wanted to double check that I wasn't >>>>>> missing something important? >>>>>> Thanks >>>>>> Steven >>>>>> >>>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang >>>>>> wrote: >>>>>> >>>>>>> Hi, Steven, >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses >>>>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made >>>>>>> public. >>>>>>> If you already use COO, then why not directly make the matrix of >>>>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()? >>>>>>> So I am confused by your needs. >>>>>>> >>>>>>> Thanks! >>>>>>> --Junchao Zhang >>>>>>> >>>>>>> >>>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville < >>>>>>> dargaville.steven at gmail.com> wrote: >>>>>>> >>>>>>>> Hi >>>>>>>> >>>>>>>> I'm just wondering if there is any possibility of making: >>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix >>>>>>>> or MatCreateSeqAIJKokkosWithCSRMatrix in >>>>>>>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx >>>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in >>>>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx >>>>>>>> >>>>>>>> publicly accessible outside of petsc, or if there is an interface I >>>>>>>> have missed for creating Kokkos matrices entirely on the device? >>>>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so >>>>>>>> I can't link to it. >>>>>>>> >>>>>>>> I've currently just copied the code inside of those methods so that >>>>>>>> I can build without any preallocation on the host (e.g., through the COO >>>>>>>> interface) and it works really well. >>>>>>>> >>>>>>>> Thanks for your help >>>>>>>> Steven >>>>>>>> >>>>>>> >> -------------- next part -------------- An HTML attachment was scrubbed... URL: From junchao.zhang at gmail.com Wed Feb 26 12:00:23 2025 From: junchao.zhang at gmail.com (Junchao Zhang) Date: Wed, 26 Feb 2025 12:00:23 -0600 Subject: [petsc-users] building kokkos matrices on the device In-Reply-To: References: Message-ID: On Wed, Feb 26, 2025 at 11:27?AM Steven Dargaville < dargaville.steven at gmail.com> wrote: > Ok so just to double check the things I should do: > > 1. Create a new header for MatCreateSeqAIJKokkosWithCSRMatrix (and declare > it PETSC_EXTERN) so users can call the existing method and build a > seqaijkokkos matrix with no host involvement. > No, We already have a private MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, KokkosCsrMatrix csr, Mat *A), you just need to make it public in a new header petscmat_kokkos.hpp. BTW, I am also thinking MatCreateSeqAIJKokkosWithKokkosViews(MPI_Comm comm, PetscInt m, PetscInt n, Kokkos::View i, Kokkos::View j, Kokkos::View a, Mat *mat), as we already have MatCreateSeqAIJWithArrays(MPI_Comm comm, PetscInt m, PetscInt n, PetscInt i[], PetscInt j[], PetscScalar a[], Mat *mat) The benefit is that we don't need to include in petscmat_kokkos.hpp, to decouple petsc and kokkos to the least. 
But either is fine with me. > > 2. Modify *MatCreateMPIAIJWithSeqAIJ (*or equivalent*) *so it does the > same thing as MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in the case > that A and B are seqaijkokkos matrices. > Keep the existing MatCreateMPIAIJWithSeqAIJ() but depreciate it in favor of a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat). The new function should handle cases that the A, B are MATSEQAIJKOKKOS. > > 3. Potentially remove MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices > given it would be redundant? > Yes, remove it and use the new API at places calling it. > > On Wed, 26 Feb 2025 at 17:15, Junchao Zhang > wrote: > >> That is a good idea. Perhaps a new MatCreateMPIXAIJWithSeqXAIJ(MPI_Comm >> comm, PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat >> *mat) since garray[] is only meaningful for MATAIJ and subclasses. >> >> --Junchao Zhang >> >> >> On Wed, Feb 26, 2025 at 11:02?AM Barry Smith wrote: >> >>> >>> The new function doesn't seem to have anything to do with Kokkos so >>> why have any new functions? Just have *MatCreateMPIAIJWithSeqAIJ() work >>> properly when the two matrices are Kokkos (or CUDA or HIP). Or if you >>> want to eliminate the global reduction maybe make your new function >>> MatCreateMPIWithSeq() and have it work for any type of submatrix and >>> eventually we could deprecate the **MatCreateMPIAIJWithSeqAIJ() * >>> >>> * Barry* >>> >>> >>> >>> >>> On Feb 26, 2025, at 11:54?AM, Steven Dargaville < >>> dargaville.steven at gmail.com> wrote: >>> >>> I think that sounds great, I'm happy to put together an MR (likely next >>> week) for review. >>> >>> On Wed, 26 Feb 2025 at 16:11, Junchao Zhang >>> wrote: >>> >>>> This fuction *MatCreateMPIAIJWithSeqAIJ(MPI_Comm comm, Mat A, Mat B, >>>> const PetscInt garray[], Mat *mat) *is rarely used. To compute the >>>> global matrix's row/col size M, N, it has to do an MPI_Allreduce(). I think >>>> it is a waste, as the caller usually knows M, N already. So I think we can >>>> depart from it and have a new one: >>>> >>>> MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, PetscInt M, >>>> PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>>> * M, N are global row/col size of mat >>>> * A, B are MATSEQAIJKOKKOS >>>> * M, N could be PETSC_DECIDE, if so, petsc will compute mat's M, N from >>>> A, i.e., M = Sum of A's M, N= Sum of A's N >>>> * if garray is NULL, B uses global column indices (and B's N should be >>>> equal to the output mat's N) >>>> * if garray is not NULL, B uses local column indices; garray[] was >>>> allocated by PetscMalloc() and after the call, garray will be owned by mat >>>> (user should not free garray afterwards). >>>> >>>> What do you think? If you agree, could you contribute an MR? >>>> >>>> BTW, I think we need to create a new header, petscmat_kokkos.hpp to >>>> declare >>>> PetscErrorCode MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm >>>> comm, KokkosCsrMatrix csr, Mat *A) >>>> but >>>> PetscErrorCode MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, >>>> PetscInt M, PetscInt N, Mat A, Mat B, const PetscInt garray[], Mat *mat) >>>> can be in petscmat.h as it uses only C types. >>>> >>>> Barry, what do you think of the two new APIs? >>>> >>>> --Junchao Zhang >>>> >>>> >>>> On Wed, Feb 26, 2025 at 6:26?AM Steven Dargaville < >>>> dargaville.steven at gmail.com> wrote: >>>> >>>>> Those two constructors would definitely meet my needs, thanks! 
>>>>> >>>>> Also I should note that the comment about garray and B in >>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices is correct if garray >>>>> is passed in as NULL, it's just that if you pass in a completed garray it >>>>> doesn't bother creating one or changing the column indices of B. So I would >>>>> suggest the comment be: "if garray is NULL the offdiag matrix B >>>>> should have global col ids; if garray is not NULL the offdiag matrix B >>>>> should have local col ids" >>>>> >>>>> On Wed, 26 Feb 2025 at 03:35, Junchao Zhang >>>>> wrote: >>>>> >>>>>> Mat_SeqAIJKokkos is private because it is in a private header >>>>>> src/mat/impls/aij/seq/kokkos/aijkok.hpp >>>>>> >>>>>> Your observation about the garray in MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices() >>>>>> might be right. The comment >>>>>> >>>>>> - B - the offdiag matrix using global col ids >>>>>> >>>>>> is out of date. Perhaps it should be "the offdiag matrix uses local >>>>>> column indices and garray contains the local to global mapping". But I >>>>>> need to double check it. >>>>>> >>>>>> Since you use Kokkos, I think we could provide these two constructors >>>>>> for MATSEQAIJKOKKOS and MATMPIAIJKOKKOS respectively >>>>>> >>>>>> - MatCreateSeqAIJKokkosWithKokkosCsrMatrix(MPI_Comm comm, >>>>>> KokkosCsrMatrix csr, Mat *A) >>>>>> >>>>>> >>>>>> - MatCreateMPIAIJKokkosWithSeqAIJKokkos(MPI_Comm comm, Mat A, Mat >>>>>> B, PetscInt *garray, Mat *mat) >>>>>> >>>>>> // To mimic the existing MatCreateMPIAIJWithSeqAIJ(MPI_Comm >>>>>> comm, Mat A, Mat B, const PetscInt garray[], Mat *mat); >>>>>> // A and B are MATSEQAIJKOKKOS matrices and use local column >>>>>> indices >>>>>> >>>>>> Do they meet your needs? >>>>>> >>>>>> --Junchao Zhang >>>>>> >>>>>> >>>>>> On Tue, Feb 25, 2025 at 5:35?PM Steven Dargaville < >>>>>> dargaville.steven at gmail.com> wrote: >>>>>> >>>>>>> Thanks for the response! >>>>>>> >>>>>>> Although MatSetValuesCOO happens on the device if the input coo_v >>>>>>> pointer is device memory, I believe MatSetPreallocationCOO requires host >>>>>>> pointers for coo_i and coo_j, and the preallocation (and construction of >>>>>>> the COO structures) happens on the host and is then copied onto the device? >>>>>>> I need to be able to create a matrix object with minimal work on the host >>>>>>> (like many of the routines in aijkok.kokkos.cxx do internally). I >>>>>>> originally used the COO interface to build the matrices I need, but that >>>>>>> was around 5x slower than constructing the aij structures myself on the >>>>>>> device and then just directly using the MatSetSeqAIJKokkosWithCSRMatrix >>>>>>> type methods. >>>>>>> >>>>>>> The reason I thought MatSetSeqAIJKokkosWithCSRMatrix could easily be >>>>>>> made public is that the Mat_SeqAIJKokkos constructors are already publicly >>>>>>> accessible? In particular one of those constructors takes in pointers to >>>>>>> the Kokkos dual views which store a,i,j, and hence one can build a >>>>>>> sequential matrix with nothing (or very little) occuring on the host. The >>>>>>> only change I can see that would be necessary is for >>>>>>> MatSetSeqAIJKokkosWithCSRMatrix (or MatCreateSeqAIJKokkosWithCSRMatrix) to >>>>>>> be public is to change the PETSC_INTERN to PETSC_EXTERN? >>>>>>> >>>>>>> For MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, I believe all >>>>>>> that is required is declaring the method in the .hpp, as it's already >>>>>>> defined as static in mpiaijkok.kokkos.cxx. 
In particular, the comments
>>>>>>> above MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices suggest that the
>>>>>>> off-diagonal block B needs to be built with global column ids, with
>>>>>>> mpiaij->garray constructed on the host along with the rewriting of the
>>>>>>> global column indices in B. This happens in MatSetUpMultiply_MPIAIJ, but
>>>>>>> checking the code there shows that if you pass in a non-null garray to
>>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices, the construction and
>>>>>>> compatification is skipped, meaning B can be built with local column ids as
>>>>>>> long as garray is provided on the host (which I also build on the device
>>>>>>> and then just copy to the host). Again this is what some of the internal
>>>>>>> Kokkos routines rely on, like the matrix-product.
>>>>>>>
>>>>>>> I am happy to try doing this and submitting a request to the petsc
>>>>>>> gitlab if this seems sensible, I just wanted to double check that I wasn't
>>>>>>> missing something important?
>>>>>>> Thanks
>>>>>>> Steven
>>>>>>>
>>>>>>> On Tue, 25 Feb 2025 at 22:16, Junchao Zhang
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi, Steven,
>>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix(Mat A, Mat_SeqAIJKokkos *akok) uses
>>>>>>>> a private data type Mat_SeqAIJKokkos, so it can not be directly made
>>>>>>>> public.
>>>>>>>> If you already use COO, then why not directly make the matrix of
>>>>>>>> type MATAIJKOKKOS and call MatSetPreallocationCOO() and MatSetValuesCOO()?
>>>>>>>> So I am confused by your needs.
>>>>>>>>
>>>>>>>> Thanks!
>>>>>>>> --Junchao Zhang
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Feb 25, 2025 at 3:39?PM Steven Dargaville <
>>>>>>>> dargaville.steven at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi
>>>>>>>>>
>>>>>>>>> I'm just wondering if there is any possibility of making:
>>>>>>>>> MatSetSeqAIJKokkosWithCSRMatrix
>>>>>>>>> or MatCreateSeqAIJKokkosWithCSRMatrix in
>>>>>>>>> src/mat/impls/aij/seq/kokkos/aijkok.kokkos.cxx
>>>>>>>>> MatSetMPIAIJKokkosWithSplitSeqAIJKokkosMatrices in
>>>>>>>>> src/mat/impls/aij/mpi/kokkos/mpiaijkok.kokkos.cxx
>>>>>>>>>
>>>>>>>>> publicly accessible outside of petsc, or if there is an interface
>>>>>>>>> I have missed for creating Kokkos matrices entirely on the device?
>>>>>>>>> MatCreateSeqAIJKokkosWithCSRMatrix for example is marked as PETSC_INTERN so
>>>>>>>>> I can't link to it.
>>>>>>>>>
>>>>>>>>> I've currently just copied the code inside of those methods so
>>>>>>>>> that I can build without any preallocation on the host (e.g., through the
>>>>>>>>> COO interface) and it works really well.
>>>>>>>>>
>>>>>>>>> Thanks for your help
>>>>>>>>> Steven
>>>>>>>>>
>>>>>>>>
>>> -------------- next part --------------
An HTML attachment was scrubbed...
URL: 
From liufield at gmail.com  Thu Feb 27 17:12:11 2025
From: liufield at gmail.com (neil liu)
Date: Thu, 27 Feb 2025 18:12:11 -0500
Subject: [petsc-users] Inquiry about resetting a petscsection for a dmplex
Message-ID:

Dear PETSc community,

I am currently working on a 3D adaptive vector FEM solver. In my case, I
need to solve two systems: one for the primal equation using a low-order
discretization and another for the adjoint equation using a high-order
discretization.

Afterward, I need to reset the section associated with the DMPlex.
Whichever is set first, 20 DOFs (second-order) or 6 DOFs (first-order), the
final mapping always follows that of the first-defined configuration.

Did I miss something?
Thanks, Xiaodong PetscErrorCode DMManage::SetupSection(CaseInfo &objCaseInfo){ PetscSection s; PetscInt edgeStart, edgeEnd, pStart, pEnd; PetscInt cellStart, cellEnd; PetscInt faceStart, faceEnd; PetscFunctionBeginUser; DMPlexGetChart(dm, &pStart, &pEnd); DMPlexGetHeightStratum(dm, 0, &cellStart, &cellEnd); DMPlexGetHeightStratum(dm, 1, &faceStart, &faceEnd); DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd); /* edges */; PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s); PetscSectionSetNumFields(s, 1); PetscSectionSetFieldComponents(s, 0, 1); if (objCaseInfo.getnumberDof_local() == 6){ PetscSectionSetChart(s, edgeStart, edgeEnd); for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); PetscSectionSetFieldDof(s, edgeIndex, 0, 1); } } else if(objCaseInfo.getnumberDof_local() == 20){ PetscSectionSetChart(s, faceStart, edgeEnd); for (PetscInt faceIndex = faceStart; faceIndex < faceEnd; ++faceIndex) { PetscSectionSetDof(s, faceIndex, objCaseInfo.numdofPerFace); PetscSectionSetFieldDof(s, faceIndex, 0, 1); } //Test for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); PetscSectionSetFieldDof(s, edgeIndex, 0, 1); } } // PetscSectionSetUp(s); DMSetLocalSection(dm, s); PetscSectionDestroy(&s); //Output map for check ISLocalToGlobalMapping ltogm; const PetscInt *g_idx; DMGetLocalToGlobalMapping(dm, <ogm); ISLocalToGlobalMappingView(ltogm, PETSC_VIEWER_STDOUT_WORLD); ISLocalToGlobalMappingGetIndices(ltogm, &g_idx); PetscFunctionReturn(PETSC_SUCCESS); } -------------- next part -------------- An HTML attachment was scrubbed... URL: From knepley at gmail.com Thu Feb 27 20:16:31 2025 From: knepley at gmail.com (Matthew Knepley) Date: Thu, 27 Feb 2025 21:16:31 -0500 Subject: [petsc-users] Inquiry about resetting a petscsection for a dmplex In-Reply-To: References: Message-ID: On Thu, Feb 27, 2025 at 6:12?PM neil liu wrote: > Dear Pestc community, > > I am currently working on a 3D adaptive vector FEM solver. In my case, I > need to solve two systems: one for the primal equation using a low-order > discretization and another for the adjoint equation using a high-order > discretization. > > Afterward, I need to reset the section associated with the DMPlex. > Whichever is set first?20 DOFs (second-order) or 6 DOFs (first-order)?the > final mapping always follows that of the first-defined configuration. > > Did I miss something? > > When solving two systems like this on the same mesh, I recommend using DMClone(). What this does is create you a new DM with the same backend topology (Plex), but a different function space (Section). This is how I do everything internally in Plex. Does that make sense? 
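A minimal sketch (the Build*Section helpers below are only stand-ins for the two
branches of your SetupSection code, one clone per discretization):

  DM dmPrimal, dmAdjoint;

  PetscCall(DMClone(dm, &dmPrimal));          /* same Plex topology as dm */
  PetscCall(DMClone(dm, &dmAdjoint));
  PetscCall(BuildEdgeSection(dmPrimal));      /* hypothetical helper: 6-dof, first-order section */
  PetscCall(BuildFaceEdgeSection(dmAdjoint)); /* hypothetical helper: 20-dof, second-order section */

  /* Each clone keeps its own local section, so the two mappings no longer overwrite each other */
  Mat Kprimal, Kadjoint;
  PetscCall(DMCreateMatrix(dmPrimal, &Kprimal));
  PetscCall(DMCreateMatrix(dmAdjoint, &Kadjoint));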
Thanks, Matt > Thanks, > > > Xiaodong > > PetscErrorCode DMManage::SetupSection(CaseInfo &objCaseInfo){ > PetscSection s; > PetscInt edgeStart, edgeEnd, pStart, pEnd; > PetscInt cellStart, cellEnd; > PetscInt faceStart, faceEnd; > > PetscFunctionBeginUser; > DMPlexGetChart(dm, &pStart, &pEnd); > DMPlexGetHeightStratum(dm, 0, &cellStart, &cellEnd); > DMPlexGetHeightStratum(dm, 1, &faceStart, &faceEnd); > DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd); /* edges */; > PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s); > PetscSectionSetNumFields(s, 1); > PetscSectionSetFieldComponents(s, 0, 1); > if (objCaseInfo.getnumberDof_local() == 6){ > PetscSectionSetChart(s, edgeStart, edgeEnd); > for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { > PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); > PetscSectionSetFieldDof(s, edgeIndex, 0, 1); > } > } > else if(objCaseInfo.getnumberDof_local() == 20){ > PetscSectionSetChart(s, faceStart, edgeEnd); > for (PetscInt faceIndex = faceStart; faceIndex < faceEnd; ++faceIndex) { > PetscSectionSetDof(s, faceIndex, objCaseInfo.numdofPerFace); > PetscSectionSetFieldDof(s, faceIndex, 0, 1); > } > //Test > for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { > PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); > PetscSectionSetFieldDof(s, edgeIndex, 0, 1); > } > } > // > PetscSectionSetUp(s); > DMSetLocalSection(dm, s); > PetscSectionDestroy(&s); > > //Output map for check > ISLocalToGlobalMapping ltogm; > const PetscInt *g_idx; > DMGetLocalToGlobalMapping(dm, <ogm); > ISLocalToGlobalMappingView(ltogm, PETSC_VIEWER_STDOUT_WORLD); > ISLocalToGlobalMappingGetIndices(ltogm, &g_idx); > > PetscFunctionReturn(PETSC_SUCCESS); > } > -- What most experimenters take for granted before they begin their experiments is infinitely more interesting than any results to which their experiments lead. -- Norbert Wiener https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZoOMxv_X_kOSiYdGQlrrfxNqBqX-JwvUe4wg8Lmx9ICyyEgKROX7IMg4jQIW9310TtkewqWflxHLqw8Z6USM$ -------------- next part -------------- An HTML attachment was scrubbed... URL: From liufield at gmail.com Fri Feb 28 21:56:45 2025 From: liufield at gmail.com (neil liu) Date: Fri, 28 Feb 2025 22:56:45 -0500 Subject: [petsc-users] Inquiry about resetting a petscsection for a dmplex In-Reply-To: References: Message-ID: Thanks a lot, Matt! It works well. I have another question regarding future p-adaptivity. Will the section support defining different DOFs for each face and edge? Maybe I should try this. Thanks, Xiaodong On Thu, Feb 27, 2025 at 9:16?PM Matthew Knepley wrote: > On Thu, Feb 27, 2025 at 6:12?PM neil liu wrote: > >> Dear Pestc community, >> >> I am currently working on a 3D adaptive vector FEM solver. In my case, I >> need to solve two systems: one for the primal equation using a low-order >> discretization and another for the adjoint equation using a high-order >> discretization. >> >> Afterward, I need to reset the section associated with the DMPlex. >> Whichever is set first?20 DOFs (second-order) or 6 DOFs (first-order)?the >> final mapping always follows that of the first-defined configuration. >> >> Did I miss something? >> >> When solving two systems like this on the same mesh, I recommend using > DMClone(). What this does is create you a new > DM with the same backend topology (Plex), but a different function space > (Section). This is how I do everything internally in Plex. Does that make > sense? 
> > Thanks, > > Matt > >> Thanks, >> >> >> Xiaodong >> >> PetscErrorCode DMManage::SetupSection(CaseInfo &objCaseInfo){ >> PetscSection s; >> PetscInt edgeStart, edgeEnd, pStart, pEnd; >> PetscInt cellStart, cellEnd; >> PetscInt faceStart, faceEnd; >> >> PetscFunctionBeginUser; >> DMPlexGetChart(dm, &pStart, &pEnd); >> DMPlexGetHeightStratum(dm, 0, &cellStart, &cellEnd); >> DMPlexGetHeightStratum(dm, 1, &faceStart, &faceEnd); >> DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd); /* edges */; >> PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s); >> PetscSectionSetNumFields(s, 1); >> PetscSectionSetFieldComponents(s, 0, 1); >> if (objCaseInfo.getnumberDof_local() == 6){ >> PetscSectionSetChart(s, edgeStart, edgeEnd); >> for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { >> PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); >> PetscSectionSetFieldDof(s, edgeIndex, 0, 1); >> } >> } >> else if(objCaseInfo.getnumberDof_local() == 20){ >> PetscSectionSetChart(s, faceStart, edgeEnd); >> for (PetscInt faceIndex = faceStart; faceIndex < faceEnd; ++faceIndex) { >> PetscSectionSetDof(s, faceIndex, objCaseInfo.numdofPerFace); >> PetscSectionSetFieldDof(s, faceIndex, 0, 1); >> } >> //Test >> for (PetscInt edgeIndex = edgeStart; edgeIndex < edgeEnd; ++edgeIndex) { >> PetscSectionSetDof(s, edgeIndex, objCaseInfo.numdofPerEdge); >> PetscSectionSetFieldDof(s, edgeIndex, 0, 1); >> } >> } >> // >> PetscSectionSetUp(s); >> DMSetLocalSection(dm, s); >> PetscSectionDestroy(&s); >> >> //Output map for check >> ISLocalToGlobalMapping ltogm; >> const PetscInt *g_idx; >> DMGetLocalToGlobalMapping(dm, <ogm); >> ISLocalToGlobalMappingView(ltogm, PETSC_VIEWER_STDOUT_WORLD); >> ISLocalToGlobalMappingGetIndices(ltogm, &g_idx); >> >> PetscFunctionReturn(PETSC_SUCCESS); >> } >> > > > -- > What most experimenters take for granted before they begin their > experiments is infinitely more interesting than any results to which their > experiments lead. > -- Norbert Wiener > > https://urldefense.us/v3/__https://www.cse.buffalo.edu/*knepley/__;fg!!G_uCfscf7eWS!ZTmIg2QHRHk4rNyrPzO0mJpbo5uTsYN7umDaXzGGtb2o3qeMrQtB0zvmFa55nwwfw-UtpYaOFEvs3PMXfa-oIQ$ > > -------------- next part -------------- An HTML attachment was scrubbed... URL:
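A note on the p-adaptivity question above: PetscSectionSetDof() assigns the dof
count point by point, so a single section can already carry a different number of
dofs on every edge or face. A minimal sketch, reusing the stratum calls from
SetupSection above and assuming a per-edge order[] array supplied by the error
indicator:

  PetscSection s;
  PetscInt     edgeStart, edgeEnd;

  PetscCall(DMPlexGetHeightStratum(dm, 2, &edgeStart, &edgeEnd)); /* edges of the 3D mesh */
  PetscCall(PetscSectionCreate(PetscObjectComm((PetscObject)dm), &s));
  PetscCall(PetscSectionSetChart(s, edgeStart, edgeEnd));
  for (PetscInt e = edgeStart; e < edgeEnd; ++e) {
    /* order[] is assumed data: 1 dof on p=1 edges, 2 dofs on p=2 edges */
    PetscCall(PetscSectionSetDof(s, e, order[e - edgeStart] == 1 ? 1 : 2));
  }
  PetscCall(PetscSectionSetUp(s));
  PetscCall(DMSetLocalSection(dm, s));
  PetscCall(PetscSectionDestroy(&s));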